问题描述
许多书籍和教程都说哈希表的大小必须是素数,才能在所有桶中均匀分布密钥.但是 Java 的 HashMap
总是使用 2 的幂的大小.不应该使用素数吗?哈希表大小是素数"还是2 的幂"哪个更好?
Many books and tutorials say that the size of a hash table must be a prime to evenly distribute the keys in all the buckets. But Java's HashMap
always uses a size that is a power of two. Shouldn't it be using a prime? What's better, a "prime" or a "power of two" as the hash table size?
推荐答案
使用 2 的幂可以有效地屏蔽哈希码的最高位.因此,在这种情况下,质量差的哈希函数可能会表现得特别糟糕.
Using a power of two effectively masks out top bits of the hash code. Thus a poor-quality hash function might perform particularly badly in this scenario.
Java 的 HashMap
通过不信任对象的 hashCode()
实现和 对其结果应用第二级散列:
Java's HashMap
mitigates this by mistrusting the object's hashCode()
implementation and applying a second level of hashing to its result:
对给定的 hashCode 应用补充散列函数,以防止质量差的散列函数.这很关键,因为 HashMap 使用长度为二的幂的哈希表,否则会遇到低位没有差异的 hashCode 的冲突.
Applies a supplemental hash function to a given hashCode, which defends against poor quality hash functions. This is critical because HashMap uses power-of-two length hash tables, that otherwise encounter collisions for hashCodes that do not differ in lower bits.
如果你有一个很好的散列函数,或者做一些类似于 HashMap
所做的事情,你是否使用素数等作为表大小都没有关系.
If you have a good hash function, or do something similar to what HashMap
does, it does not matter whether you use prime numbers etc as the table size.
另一方面,如果散列函数未知或质量差,那么使用质数将是更安全的选择.但是,它会使动态大小的表格难以实现,因为突然之间,您需要能够生成素数,而不是仅仅将大小乘以一个常数因子.
If, on the other hand, the hash function is of unknown or poor quality, then using a prime number would be a safer bet. It will, however, make dynamically-sized tables tricker to implement, since all of a sudden you need to be able to produce prime numbers instead of just multiplying the size by a constant factor.
这篇关于Java:一个“主要"的数或“二的幂"作为 HashMap 的大小?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!