Question
I have read about generations and the Large Object Heap, but I still fail to understand the significance (or benefit) of having a separate Large Object Heap.
What would have gone wrong (in terms of performance or memory) if the CLR had simply relied on Generation 2 for storing large objects (given that the Gen0 and Gen1 budgets are too small to handle them)?
Answer
A garbage collection doesn't just get rid of unreferenced objects; it also compacts the heap. That's a very important optimization. It doesn't just make memory usage more efficient (no unused holes), it also makes the CPU cache much more effective. The cache is a really big deal on modern processors; they are easily an order of magnitude faster than the memory bus.
Compacting is done simply by copying bytes. That, however, takes time. The larger the object, the more likely it is that the cost of copying it outweighs the possible CPU-cache improvements.
So they ran a bunch of benchmarks to determine the break-even point, and arrived at 85,000 bytes as the cutoff where copying no longer improves performance. There is a special exception for arrays of double: they are considered 'large' when the array has more than 1000 elements. That is another optimization, for 32-bit code: the Large Object Heap allocator has the special property that it allocates memory at addresses aligned to 8, unlike the regular generational allocator, which only aligns to 4. That alignment is a big deal for double; reading or writing a misaligned double is very expensive. Oddly, the sparse Microsoft documentation never mentions arrays of long; not sure what's up with that.
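The threshold is easy to observe in practice: objects allocated on the LOH are reported as generation 2 right away, so `GC.GetGeneration` reveals which heap an allocation landed on. A minimal sketch (the exact sizes 80,000 and 90,000 are just arbitrary values on either side of the 85,000-byte cutoff):

```csharp
using System;

class LohThresholdDemo
{
    static void Main()
    {
        // Objects of 85,000 bytes or more go on the Large Object Heap.
        // LOH objects report as generation 2 immediately after allocation.
        var small = new byte[80_000];   // below the cutoff: regular gen-0 heap
        var large = new byte[90_000];   // above the cutoff: LOH

        Console.WriteLine(GC.GetGeneration(small)); // prints 0
        Console.WriteLine(GC.GetGeneration(large)); // prints 2

        // The double[] special case: on 32-bit .NET Framework an array with
        // more than 1000 elements (only ~8 KB) already goes to the LOH to get
        // 8-byte alignment. On 64-bit runtimes the heap is already 8-aligned,
        // so the normal 85,000-byte rule applies and this prints 0 there.
        var doubles = new double[1001];
        Console.WriteLine(GC.GetGeneration(doubles));
    }
}
```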
Fwiw, there's lots of programmer angst about the Large Object Heap not getting compacted. This invariably gets triggered when they write programs that consume more than half of the entire available address space, followed by using a tool like a memory profiler to find out why the program bombed even though there was still lots of unused virtual memory available. Such a tool shows the holes in the LOH: unused chunks of memory where a large object previously lived but got garbage collected. Such is the inevitable price of the LOH; a hole can only be reused by an allocation for an object that's equal or smaller in size. The real problem is assuming that a program should be allowed to consume all virtual memory at any time.
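The fragmentation pattern described above can be sketched as follows. This is a hypothetical illustration, not a reproduction of a real crash: it interleaves large allocations, drops every other one, and notes why the freed space can only be reused by same-size-or-smaller objects:

```csharp
using System;
using System.Collections.Generic;

class LohFragmentationSketch
{
    static void Main()
    {
        var keep = new List<byte[]>();
        var drop = new List<byte[]>();
        for (int i = 0; i < 10; i++)
        {
            keep.Add(new byte[100_000]); // survives, sits between the holes
            drop.Add(new byte[100_000]); // will become a hole
        }
        drop.Clear();
        GC.Collect(); // frees the dropped arrays, but (without LOH
                      // compaction) leaves 100,000-byte holes behind

        // A 100,000-byte allocation can reuse one of those holes...
        var fits = new byte[100_000];
        // ...but a 200,000-byte allocation cannot, even though 1,000,000
        // bytes of LOH space are nominally free; it needs fresh address
        // space. In a 32-bit process this is how you get an
        // OutOfMemoryException with plenty of free memory showing.
        var needsFreshSpace = new byte[200_000];

        Console.WriteLine("allocated");
        GC.KeepAlive(keep);
        GC.KeepAlive(fits);
        GC.KeepAlive(needsFreshSpace);
    }
}
```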
It is a problem that otherwise disappears completely just by running the code on a 64-bit operating system. A 64-bit process has 8 terabytes of virtual memory address space available, three orders of magnitude more than a 32-bit process. You just can't run out of holes.
Long story short: the LOH makes code run more efficiently, at the cost of using the available virtual memory address space less efficiently.
UPDATE: .NET 4.5.1 now supports compacting the LOH on demand, via the GCSettings.LargeObjectHeapCompactionMode property. Beware the consequences, please.
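The documented usage pattern is to set the mode and then force a blocking full collection; the setting applies to that one collection and then resets itself:

```csharp
using System;
using System.Runtime;

class LohCompactionDemo
{
    static void Main()
    {
        // Ask the GC to compact the LOH during the next blocking
        // generation-2 collection (available since .NET 4.5.1).
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(); // blocking full collection; compacts the LOH once

        // The property automatically reverts to Default afterwards,
        // so compaction is a one-shot request, not a standing policy.
        Console.WriteLine(GCSettings.LargeObjectHeapCompactionMode); // prints Default
    }
}
```

The "consequences" the answer warns about: compacting the LOH means copying those very large objects, which is exactly the pause-time cost the 85,000-byte cutoff was designed to avoid, so it should be done sparingly and at well-chosen moments.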