问题描述
有没有一种简洁的方法可以从 sql server 表中检索随机记录?
Is there a succinct way to retrieve a random record from a sql server table?
我想随机化我的单元测试数据,所以我正在寻找一种从表中选择随机 ID 的简单方法.在英语中,选择将是从表中选择一个 id,其中 id 是表中最低 id 和表中最高 id 之间的随机数."
I would like to randomize my unit test data, so am looking for a simple way to select a random id from a table. In English, the select would be "Select one id from the table where the id is a random number between the lowest id in the table and the highest id in the table."
如果不运行查询,测试是否为空值,然后在为空值时重新运行,我无法找到一种方法.
I can't figure out a way to do it without have to run the query, test for a null value, then re-run if null.
想法?
推荐答案
是否有一种简洁的方法可以从 sql server 表中检索随机记录?
Is there a succinct way to retrieve a random record from a sql server table?
是的
SELECT TOP 1 * FROM table ORDER BY NEWID()
说明
为每一行生成一个 NEWID()
,然后按它对表格进行排序.返回第一条记录(即具有最低"GUID 的记录).
Explanation
A NEWID()
is generated for each row and the table is then sorted by it. The first record is returned (i.e. the record with the "lowest" GUID).
从第 4 版开始,GUID 生成为伪随机数:
GUIDs are generated as pseudo-random numbers since version four:
第 4 版 UUID 旨在从真正随机或伪随机数.
The version 4 UUID is meant for generating UUIDs from truly-random or pseudo-random numbers.
算法如下:
- 设置两个最高有效位(位 6 和 7)clock_seq_hi_and_reserved 分别为零和一.
- 设置四个最高有效位(位 12 到 15)time_hi_and_version 字段到 4 位版本号从第 4.1.3 节.
- 将所有其他位设置为随机(或伪随机)选择值.
—通用唯一标识符 (UUID) URN 命名空间 - RFC 4122
—A Universally Unique IDentifier (UUID) URN Namespace - RFC 4122
替代的 SELECT TOP 1 * FROM table ORDER BY RAND()
不会像人们想象的那样工作.RAND()
每次查询返回一个值,因此所有行将共享相同的值.
The alternative SELECT TOP 1 * FROM table ORDER BY RAND()
will not work as one would think. RAND()
returns one single value per query, thus all rows will share the same value.
虽然 GUID 值是伪随机的,但对于要求更高的应用程序,您将需要更好的 PRNG.
While GUID values are pseudo-random, you will need a better PRNG for the more demanding applications.
对于大约 1,000,000 行,典型性能不到 10 秒 —当然取决于系统.请注意,不可能命中索引,因此性能会相对有限.
Typical performance is less than 10 seconds for around 1,000,000 rows — of course depending on the system. Note that it's impossible to hit an index, thus performance will be relatively limited.
这篇关于来自数据库表的随机记录 (T-SQL)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!