如何删除重复的行?

How can I remove duplicate rows?(如何删除重复的行?)
本文介绍了如何删除重复的行?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

从相当大的 SQL Server 表(即 300,000+ 行)中删除重复行的最佳方法是什么?

What is the best way to remove duplicate rows from a fairly large SQL Server table (i.e. 300,000+ rows)?

当然,由于 RowID 标识字段的存在,这些行不会完全重复.

The rows, of course, will not be perfect duplicates because of the existence of the RowID identity field.

MyTable

RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null

推荐答案

假设没有空值,你 GROUP BY 唯一列,SELECT MIN (或 MAX) RowId 作为要保留的行.然后,删除所有没有行 ID 的内容:

Assuming no nulls, you GROUP BY the unique columns, and SELECT the MIN (or MAX) RowId as the row to keep. Then, just delete everything that didn't have a row id:

DELETE FROM MyTable
LEFT OUTER JOIN (
   SELECT MIN(RowId) as RowId, Col1, Col2, Col3 
   FROM MyTable 
   GROUP BY Col1, Col2, Col3
) as KeepRows ON
   MyTable.RowId = KeepRows.RowId
WHERE
   KeepRows.RowId IS NULL

如果你有一个 GUID 而不是整数,你可以替换

In case you have a GUID instead of an integer, you can replace

MIN(RowId)

CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn)))

这篇关于如何删除重复的行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!

本站部分内容来源互联网,如果有图片或者内容侵犯您的权益请联系我们删除!

相关文档推荐

Execute complex raw SQL query in EF6(在EF6中执行复杂的原始SQL查询)
SSIS: Model design issue causing duplications - can two fact tables be connected?(SSIS:模型设计问题导致重复-两个事实表可以连接吗?)
SQL Server Graph Database - shortest path using multiple edge types(SQL Server图形数据库-使用多种边类型的最短路径)
Invalid column name when using EF Core filtered includes(使用EF核心过滤包括时无效的列名)
How should make faster SQL Server filtering procedure with many parameters(如何让多参数的SQL Server过滤程序更快)
How can I generate an entity–relationship (ER) diagram of a database using Microsoft SQL Server Management Studio?(如何使用Microsoft SQL Server Management Studio生成数据库的实体关系(ER)图?)