T-SQL:删除所有重复的行但保留一个

T-SQL: Deleting all duplicate rows but keeping one(T-SQL:删除所有重复的行但保留一个)
本文介绍了T-SQL:删除所有重复的行但保留一个的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

I have a table with a very large amount of rows. Duplicates are not allowed but due to a problem with how the rows were created I know there are some duplicates in this table. I need to eliminate the extra rows from the perspective of the key columns. Some other columns may have slightly different data but I do not care about that. I still need to keep one of these rows however. SELECT DISTINCT won't work because it operates on all columns and I need to suppress duplicates based on the key columns.

How can I delete the extra rows but still keep one efficiently?

解决方案

You didn't say what version you were using, but in SQL 2005 and above, you can use a common table expression with the OVER Clause. It goes a little something like this:

WITH cte AS (
  SELECT[foo], [bar], 
     row_number() OVER(PARTITION BY foo, bar ORDER BY baz) AS [rn]
  FROM TABLE
)
DELETE cte WHERE [rn] > 1

Play around with it and see what you get.

(Edit: In an attempt to be helpful, someone edited the ORDER BY clause within the CTE. To be clear, you can order by anything you want here, it needn't be one of the columns returned by the cte. In fact, a common use-case here is that "foo, bar" are the group identifier and "baz" is some sort of time stamp. In order to keep the latest, you'd do ORDER BY baz desc)

这篇关于T-SQL:删除所有重复的行但保留一个的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!

本站部分内容来源互联网,如果有图片或者内容侵犯您的权益请联系我们删除!

相关文档推荐

Execute complex raw SQL query in EF6(在EF6中执行复杂的原始SQL查询)
Hibernate reactive No Vert.x context active in aws rds(AWS RDS中的休眠反应性非Vert.x上下文处于活动状态)
Bulk insert with mysql2 and NodeJs throws 500(使用mysql2和NodeJS的大容量插入抛出500)
Flask + PyMySQL giving error no attribute #39;settimeout#39;(FlASK+PyMySQL给出错误,没有属性#39;setTimeout#39;)
auto_increment column for a group of rows?(一组行的AUTO_INCREMENT列?)
Sort by ID DESC(按ID代码排序)