问题描述
我有一个表值的内联 UDF.我想过滤该 UDF 的结果以获得一个特定值.当我使用常量参数指定过滤器时,一切都很好,性能几乎是瞬时的.当我使用可变参数指定过滤器时,它花费的时间要长得多,大约是 500 倍的逻辑读取和 20 倍的持续时间.
I have a table-valued, inline UDF. I want to filter the results of that UDF to get one particular value. When I specify the filter using a constant parameter, everything is great and performance is almost instantaneous. When I specify the filter using a variable parameter, it takes a significantly larger chunk of time, on the order of 500x more logical reads and 20x greater duration.
执行计划表明,在可变参数的情况下,过滤器直到进程的很晚才应用,导致多次索引扫描而不是在常量情况下执行的搜索.
The execution plan shows that in the variable parameter case the filter is not applied until very late in the process, causing multiple index scans rather than the seeks that are performed in the constant case.
我想我的问题是:为什么,因为我指定了一个对索引字段具有高度选择性的过滤器参数,当该参数位于变量中时,我的性能是否会陷入困境?我能对此做些什么吗?
I guess my questions are: Why, since I'm specifying a single filter parameter that is going to be highly selective against an indexed field, does my performance go into the weeds when that parameter is in a variable? Is there anything I can do about this?
和查询中的解析函数有关系吗?
Does it have something to do with the analytic function in the query?
这是我的疑问:
CREATE FUNCTION fn_test()
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
SELECT DISTINCT GCN_SEQNO, Drug_package_version_ID
FROM
(
SELECT COALESCE(ndctbla.GCN_SEQNO, ndctblb.GCN_SEQNO) AS GCN_SEQNO,
dpv.Drug_package_version_ID, ROW_NUMBER() OVER (PARTITION BY dpv.Drug_package_version_id ORDER BY
ndctbla.GCN_SEQNO DESC) AS Predicate
FROM dbo.Drug_Package_Version dpv
LEFT JOIN dbo.NDC ndctbla ON ndctbla.NDC = dpv.Sp_package_code
LEFT JOIN dbo.NDC ndctblb ON ndctblb.SPC_NDC = dpv.Sp_package_code
) iq
WHERE Predicate = 1
GO
GRANT SELECT ON fn_test TO public
GO
-- very fast
SELECT GCN_SEQNO
FROM dbo.fn_test()
WHERE Drug_package_version_id = 10000
GO
-- comparatively slow
DECLARE @dpvid int
SET @dpvid = 10000
SELECT GCN_SEQNO
FROM dbo.fn_test()
WHERE Drug_package_version_id = @dpvid
推荐答案
我得到的答复很好,我从中学到了东西,但我想我已经找到了让我满意的答案.
The responses I got were good, and I learned from them, but I think I've found an answer that satisfies me.
我确实认为是使用 PARTITION BY 子句导致了这里的问题.我使用自连接习语的变体重新制定了 UDF:
I do think it's the use of the PARTITION BY clause that is causing the problem here. I reformulated the UDF using a variant of the self-join idiom:
SELECT t1.A, t1.B, t1.C
FROM T t1
INNER JOIN
(
SELECT A, MAX(C) AS C
FROM T
GROUP BY A
) t2 ON t1.A = t2.A AND t1.C = t2.C
具有讽刺意味的是,这比使用特定于 SQL 2008 的查询性能更高,而且优化器在使用变量而不是常量连接此版本的查询时没有问题.在这一点上,我得出的结论是优化器不能处理更新的 SQL 扩展以及旧的东西.作为奖励,我现在可以在预升级的 SQL 2000 平台中使用 UDF.
Ironically, this is more performant than using the SQL 2008-specific query, and also the optimizer doesn't have a problem with joining this version of the query using variables rather than constants. At this point, I'm concluding that the optimizer just doesn't handle the more recent SQL extensions as well as the older stuff. As a bonus, I can make use of the UDF now, in my pre-upgraded SQL 2000 platforms.
感谢大家的帮助!
这篇关于为什么当我使用可变参数而不是常量参数时,我的内联表 UDF 会慢很多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!