元组在 CPython 中是如何实现的?

How is tuple implemented in CPython?(元组在 CPython 中是如何实现的?)
本文介绍了元组在 CPython 中是如何实现的?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在尝试了解 CPython 是如何在幕后实现的.Python 是高级别的很好,但我不喜欢把它当作一个黑盒子.

I've been trying to learn how CPython is implemented under the scenes. It's great that Python is high level, but I don't like treating it like a black box.

考虑到这一点,如何实现元组?我看过 源代码 (tupleobject.c),但它超出了我的头脑.

With that in mind, how are tuples implemented? I've had a look at the source (tupleobject.c), but it's going over my head.

我看到 PyTuple_MAXSAVESIZE = 20PyTuple_MAXFREELIST = 2000,什么是保存和空闲列表"?(长度为 20/21 或 2000/2001 的元组之间会有性能差异吗?是什么强制执行最大元组长度?)

I see that PyTuple_MAXSAVESIZE = 20 and PyTuple_MAXFREELIST = 2000, what is saving and the "free list"? (Will there be a performance difference between tuples of length 20/21 or 2000/2001? What enforces the maximum tuple length?)

推荐答案

请注意,此答案中的所有内容均基于我从查看您链接的实现中收集到的内容.

As a caveat, everything in this answer is based on what I've gleaned from looking over the implementation you linked.

元组的标准实现似乎只是一个数组.但是,有很多优化可以加快速度.

It seems that the standard implementation of a tuple is simply as an array. However, there are a bunch of optimizations in place to speed things up.

首先,如果您尝试创建一个空元组,CPython 将返回一个代表空元组的规范对象.因此,它可以节省大量仅分配单个对象的分配.

First, if you try to make an empty tuple, CPython instead will hand back a canonical object representing the empty tuple. As a result, it can save on a bunch of allocations that are just allocating a single object.

接下来,为了避免分配一堆小对象,CPython 为许多小列表回收内存.有一个固定常量 (PyTuple_MAXSAVESIZE),这样所有小于此长度的元组都有资格回收它们的空间.每当一个长度小于该常数的对象被释放时,与它相关的内存就有可能不会被释放,而是被存储在一个空闲列表"中.(更多关于下一段的内容)基于它的大小.这样,如果您需要分配一个大小为 n 的元组,而其中一个已被分配且不再使用,CPython 可以回收旧数组.

Next, to avoid allocating a bunch of small objects, CPython recycles memory for many small lists. There is a fixed constant (PyTuple_MAXSAVESIZE) such that all tuples less than this length are eligible to have their space reclaimed. Whenever an object of length less than this constant is deallocated, there is a chance that the memory associated with it will not be freed and instead will be stored in a "free list" (more on that in the next paragraph) based on its size. That way, if you ever need to allocate a tuple of size n and one has previously been allocated and is no longer in use, CPython can just recycle the old array.

空闲列表本身被实现为一个大小为 PyTuple_MAXSAVESIZE 的数组,存储指向未使用元组的指针,其中数组的第 n 个元素指向 NULL(如果没有大小为 n 的额外元组可用)或者到一个大小为 n 的回收元组.如果有多个不同的大小为 n 的元组可以重用,则它们通过将每个元组的第零个入口点指向下一个可以重用的元组,以一种链表的形式链接在一起.(由于只分配了一个长度为零的元组,因此永远不会有读取不存在的第零元素的风险).通过这种方式,分配器可以存储一定数量的每个大小的元组以供重用.为了确保这不会使用太多内存,还有第二个常量 PyTuple_MAXFREELIST 用于控制任何存储桶中任何这些链表的最大长度.然后有一个长度为 PyTuple_MAXSAVESIZE 的辅助数组,它存储每个给定长度的元组的链表长度,这样就不会超过这个上限.

The free list itself is implemented as an array of size PyTuple_MAXSAVESIZE storing pointers to unused tuples, where the nth element of the array points either to NULL (if no extra tuples of size n are available) or to a reclaimed tuple of size n. If there are multiple different tuples of size n that could be reused, they are chained together in a sort of linked list by having each tuple's zeroth entry point to the next tuple that can be reused. (Since there is only one tuple of length zero ever allocated, there is never a risk of reading a nonexistent zeroth element). In this way, the allocator can store some number of tuples of each size for reuse. To ensure that this doesn't use too much memory, there is a second constant PyTuple_MAXFREELIST that controls the maximum length of any of these linked lists within any bucket. There is then a secondary array of length PyTuple_MAXSAVESIZE that stores the length of the linked lists for tuples of each given length so that this upper limit isn't exceeded.

总而言之,这是一个非常聪明的实现!

All in all, it's a very clever implementation!

这篇关于元组在 CPython 中是如何实现的?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!

本站部分内容来源互联网,如果有图片或者内容侵犯您的权益请联系我们删除!

相关文档推荐

Leetcode 234: Palindrome LinkedList(Leetcode 234:回文链接列表)
How do I read an Excel file directly from Dropbox#39;s API using pandas.read_excel()?(如何使用PANDAS.READ_EXCEL()直接从Dropbox的API读取Excel文件?)
subprocess.Popen tries to write to nonexistent pipe(子进程。打开尝试写入不存在的管道)
I want to realize Popen-code from Windows to Linux:(我想实现从Windows到Linux的POpen-code:)
Reading stdout from a subprocess in real time(实时读取子进程中的标准输出)
How to call type safely on a random file in Python?(如何在Python中安全地调用随机文件上的类型?)