问题描述
我了解 HashMap 如何工作的基础知识 - hm.put(obj) 根据 obj.hashCode 值找到正确的存储桶来放置对象.然后在该桶中,如果另一个对象 .equals(obj) 则替换它,如果没有将其添加到桶中.
I understand the basics of how a HashMap works - hm.put(obj) finds the correct bucket to place the object into, based on the obj.hashCode value. Then within that bucket if another object .equals(obj) then replace it, if not add it to the bucket.
但我不清楚 HashMap.put 和 HashMap.get 如何可以是常数时间 O(1).据我了解,存储桶的数量应基于哈希码,因此将 100 个对象放入哈希图中将(大致)创建 100 个存储桶(我确实了解哈希码有时会发生冲突,因此它可能小于 100 但不是经常).
But I am unclear on how HashMap.put and HashMap.get can be constant time O(1). As I understand it, the number of buckets should be based on the hashcodes, so putting 100 objects into a hashmap will (roughly) create 100 buckets (I do understand there are sometimes collisions in hashcodes, so it could be less than 100 but not often).
因此,随着添加到哈希图中的对象数量的增加,桶的数量也会增加 - 由于冲突很少,这并不意味着桶的数量几乎随着添加的对象数量线性增长,其中case HashMap.put/HashMap.get 将是 O(n),因为它必须在找到正确的存储桶之前搜索每个存储桶.
So as the number of objects added to a hashmap grows, so does the number of buckets - and since collisions are rare then doesn't that mean that the number of buckets grows almost linearly with the number of objects added, in which case HashMap.put/HashMap.get would be O(n) as it has to search over every bucket before it finds the right one.
我错过了什么?
推荐答案
这是我的两分钱,朋友.以您认为数组的方式来考虑 HashMap.事实上,它是一个数组.如果我给你索引 11,你不必遍历数组来查找索引 11 处的对象.你只需直接去那里.这就是 HashMap 的工作原理:诀窍是使索引与值相同——本质上.
Here is my two cent, friend. Think of a HashMap the way you think of an array. In fact, it is an array. If I give you index 11, you don't have to iterate through the array to find the object at index 11. You simply go there directly. That's how a HashMap works: the trick is to make the index the same as the value -- essentially.
这就是哈希码的用武之地.让我们看一下哈希函数是单位乘数(即 1)的简单情况.然后,如果你有 0 到 99 的值并且你想将它们存储在一个数组中,那么它们将分别存储在索引 0 到 99 处.因此 put 和 get 显然是 O(1),因为获取和放入数组是 O(1).现在让我们想象一个不那么简单的散列函数,比如 y = x+2.所以在这种情况下,值 0 到 99 将存储在索引 2 到 101 处.这里给定一个值,比如 11,您必须计算散列以找到它或放入它(散列为 11+2 =13).好吧,哈希函数正在做一些工作来计算给定值的正确索引(在我们的例子中 y = x+2= 11+2=13).但是,这项工作所付出的努力与您拥有多少数据点无关.如果我需要放置 999 或 123,则单个 put 或 get 的工作量仍然相同: y= x+2:我每次执行 put 或 get 时只需添加两个:这是恒定的工作.
So that is where a hashcode comes in. Let's look at the trivial case where your hash function is a unit multiplier (i.e. 1). Then if you have the values 0 to 99 and you want to store them in an array, then they will be stored at indices 0 to 99 respectively. So that a put and a get is clearly O(1) since getting and putting things in an array is O(1). Now let's imagine a less trivial hash function, say, y = x+2. So in this case the values 0 to 99 will be stored at indices 2 to 101. Here given a value, say 11, you must compute the hash to find it or put it (the hash being 11+2 =13). So okay, the hash function is doing some work to calculate the correct index given the value (in our case y = x+2= 11+2=13). But the amount of effort that goes into that work has nothing to do with how many data points you have. If I need to put 999 or 123, the amount of work for a single put or get is still the same: y= x+2: I only have to add two each time I do a put or a get: that's constant work.
您的困惑可能是您想一次放置 n 个值.那么在这种情况下,每个 put 仍然是常量 c
.但是 c
乘以 n 是 c*n
=O(n).因此, n 与 put 本身无关,而是您同时进行 n 个 put.我希望这个解释会有所帮助.
Your confusion may be that you want to put n values at once. Well in that case each put is still constant c
. But c
multiplied by n is c*n
=O(n). So the n has nothing to do with the put itself, but rather that you are making n puts all at once. I hope this explanation helps.
这篇关于Java HashMap 如何对“get"执行恒定时间查找 O(1)?操作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!