Question
Why would someone use GROUP BY rather than DISTINCT when the query does no aggregation?
Also, does anyone know the performance considerations for GROUP BY versus DISTINCT in MySQL and SQL Server? My guess is that SQL Server has the better optimizer, so the two may be close to equivalent there, but in MySQL I expect DISTINCT to have a significant performance advantage.
I'm interested in answers from DBAs.
Bill's post is interesting, but not applicable. Let me be more specific...
select a, b, c
from table x
group by a, b, c
versus
select distinct a, b, c
from table x
Recommended answer
A little (VERY little) empirical data from MS SQL Server, taken from a couple of random tables in our DB.
For the patterns:
SELECT col1, col2 FROM table GROUP BY col1, col2
and
SELECT DISTINCT col1, col2 FROM table
When there was no covering index for the query, both forms produced the following query plan:
  |--Sort(DISTINCT ORDER BY:([table].[col1] ASC, [table].[col2] ASC))
       |--Clustered Index Scan(OBJECT:([db].[dbo].[table].[IX_some_index]))
and when there was a covering index, both produced:
  |--Stream Aggregate(GROUP BY:([table].[col1], [table].[col2]))
       |--Index Scan(OBJECT:([db].[dbo].[table].[IX_some_index]), ORDERED FORWARD)
So, from that very small sample, SQL Server treats both forms the same way.
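If you want to repeat this kind of comparison without a SQL Server instance to hand, here is a minimal sketch using Python's built-in sqlite3 module. It is only a stand-in: SQLite is neither SQL Server nor MySQL, so its plans will not match the ones above, and the table x, its columns and the sample rows are made up for illustration. All it shows is that the two query forms return the same set of rows, and how to print each form's plan side by side.

```python
import sqlite3


def distinct_vs_group_by(rows):
    """Run both query forms on an in-memory SQLite table; return results and plans."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and columns, mirroring the a, b, c example in the question.
    conn.execute("CREATE TABLE x (a INTEGER, b INTEGER, c INTEGER)")
    conn.executemany("INSERT INTO x (a, b, c) VALUES (?, ?, ?)", rows)

    queries = {
        "group_by": "SELECT a, b, c FROM x GROUP BY a, b, c",
        "distinct": "SELECT DISTINCT a, b, c FROM x",
    }
    results = {}
    plans = {}
    for name, sql in queries.items():
        # Sort in Python because neither query guarantees an output order.
        results[name] = sorted(conn.execute(sql).fetchall())
        # EXPLAIN QUERY PLAN is SQLite-specific; the last column is the readable step.
        plans[name] = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    conn.close()
    return results, plans


# Made-up sample data containing duplicate rows.
sample_rows = [(1, 2, 3), (1, 2, 3), (1, 2, 4), (2, 2, 3), (2, 2, 3)]
results, plans = distinct_vs_group_by(sample_rows)

print("same rows:", results["group_by"] == results["distinct"])
for name, steps in plans.items():
    print(name, steps)
```

On SQL Server itself, the text plans quoted above look like the output of SET SHOWPLAN_TEXT ON, and on MySQL the usual starting point is EXPLAIN, so those are the tools to reach for when checking your own tables.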