Problem description
I was reading this page about APPLY:
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/07/07/using-cross-apply-to-optimize-joins-on-between-conditions.aspx
And I read this article about Logical Query Processing:
http://blog.sqlauthority.com/2009/04/06/sql-server-logical-query-processing-phases-order-of-statement-execution/
So I can understand how this query takes a long time.
SELECT s.StartedAt, s.EndedAt, c.AirTime
FROM dbo.Commercials s JOIN dbo.Calls c
ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt
WHERE c.AirTime BETWEEN '20080701' AND '20080701 03:00'
The join goes through all the rows, then the WHERE clause filters the results.
But why is this query lightning fast?
SELECT s.StartedAt, s.EndedAt, c.AirTime
FROM dbo.Commercials s JOIN dbo.Calls c
ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt
WHERE c.AirTime BETWEEN '20080701' AND '20080701 03:00'
AND s.StartedAt BETWEEN '20080630 23:45' AND '20080701 03:00'
I get that the WHERE clause is filtering the rows of both tables. But that filtering happens after the JOIN, not before it. If it somehow actually happens before the JOIN, then I definitely understand why it's so fast. But if I go by the LOE in the second link, this shouldn't be the case. Right?
Recommended answer
There is no definitive "before" and "after" in these queries. The RDBMS is allowed to decide when to run which part of the query, as long as the results the query produces do not change.
In the first case, there is nothing the query can do to pre-filter the rows of Commercials, because the WHERE clause constrains only the rows of Calls. The join conditions specify a range for c.AirTime in terms of the corresponding row of Commercials, so no pre-filtering of Commercials is possible: every row of Commercials has to be considered against the rows of Calls.
In the second case, however, the RDBMS can improve on the time by observing that you additionally constrain s.StartedAt, to which c.AirTime is joined, to the range from 23:45 on Jun-30, 2008 through 03:00 on Jul-1, 2008. This lets it pre-filter the rows of Commercials as well. It can also allow the optimizer to use an index, if one is defined on the Calls.AirTime column.
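One way to picture what the optimizer is free to do is to write the filtering out by hand. The sketch below filters each table in a derived table first and joins the reduced sets afterwards. For an inner join it returns the same rows as the second query in the question. It only illustrates the logical equivalence and makes no claim about the plan SQL Server actually chooses.

```sql
-- Logically equivalent to the second query in the question:
-- each table is filtered first, then the filtered sets are joined.
-- The optimizer may arrive at a plan like this on its own, so
-- rewriting the query this way is normally not required.
SELECT s.StartedAt, s.EndedAt, c.AirTime
FROM (
    SELECT StartedAt, EndedAt
    FROM dbo.Commercials
    WHERE StartedAt BETWEEN '20080630 23:45' AND '20080701 03:00'
) AS s
JOIN (
    SELECT AirTime
    FROM dbo.Calls
    WHERE AirTime BETWEEN '20080701' AND '20080701 03:00'
) AS c
    ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt;
```

Because the results are identical either way, the RDBMS can treat the original query and this form as interchangeable and pick whichever it estimates to be cheaper.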
The important observation here is that the RDBMS can do very clever things when optimizing your query. It arrives at the optimized strategy by applying multiple rules of logic, trying to push the constraints closer to the "source of rows" in a join. The best way to check what the optimizer does is to read the query plan.
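As a minimal sketch of how to look at the plan, assuming SQL Server and a client such as SQL Server Management Studio or sqlcmd (where GO separates batches), SET SHOWPLAN_TEXT returns the estimated plan as text instead of executing the statements. The two queries are the ones from the question.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- First query: only Calls is constrained by the WHERE clause.
SELECT s.StartedAt, s.EndedAt, c.AirTime
FROM dbo.Commercials s JOIN dbo.Calls c
    ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt
WHERE c.AirTime BETWEEN '20080701' AND '20080701 03:00';

-- Second query: Commercials is constrained as well.
SELECT s.StartedAt, s.EndedAt, c.AirTime
FROM dbo.Commercials s JOIN dbo.Calls c
    ON c.AirTime >= s.StartedAt AND c.AirTime < s.EndedAt
WHERE c.AirTime BETWEEN '20080701' AND '20080701 03:00'
    AND s.StartedAt BETWEEN '20080630 23:45' AND '20080701 03:00';
GO

-- Return to normal execution.
SET SHOWPLAN_TEXT OFF;
GO
```

Comparing the two plans shows where each predicate is applied, for example whether a table is read with a scan or with an index seek.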