Memory Management
*****************


Overview
========

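Memory management in Python involves a private heap containing all
Python objects and data structures. The management of this private
heap is ensured internally by the *Python memory manager*. The Python
memory manager has different components which deal with various
dynamic storage management aspects, like sharing, segmentation,
preallocation or caching.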

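At the lowest level, a raw memory allocator ensures that there is
enough room in the private heap for storing all Python-related data by
interacting with the memory manager of the operating system. On top of
the raw memory allocator, several object-specific allocators operate
on the same heap and implement distinct memory management policies
adapted to the peculiarities of every object type. For example,
integer objects are managed differently within the heap than strings,
tuples or dictionaries because integers imply different storage
requirements and speed/space tradeoffs. The Python memory manager thus
delegates some of the work to the object-specific allocators, but
ensures that the latter operate within the bounds of the private heap.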

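It is important to understand that the management of the Python heap
is performed by the interpreter itself and that the user has no
control over it, even if they regularly manipulate object pointers to
memory blocks inside that heap. The allocation of heap space for
Python objects and other internal buffers is performed on demand by
the Python memory manager through the Python/C API functions listed in
this document.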

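To avoid memory corruption, extension writers should never try to
operate on Python objects with the functions exported by the C
library: "malloc()", "calloc()", "realloc()" and "free()". Doing so
mixes calls between the C allocator and the Python memory manager,
with fatal consequences, because they implement different algorithms
and operate on different heaps. However, one may safely allocate and
release memory blocks with the C library allocator for individual
purposes, as shown in the following example: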

   PyObject *res;
   char *buf = (char *) malloc(BUFSIZ); /* for I/O */

   if (buf == NULL)
       return PyErr_NoMemory();
   /* ...Do some I/O operation involving buf... */
   res = PyBytes_FromString(buf);
   free(buf); /* malloc'ed */
   return res;

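In this example, the memory request for the I/O buffer is handled by
the C library allocator. The Python memory manager is involved only in
the allocation of the bytes object returned as a result.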

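In most situations, however, it is recommended to allocate memory from
the Python heap specifically because the latter is under control of
the Python memory manager. For example, this is required when the
interpreter is extended with new object types written in C. Another
reason for using the Python heap is the desire to *inform* the Python
memory manager about the memory needs of the extension module. Even
when the requested memory is used exclusively for internal, highly
specific purposes, delegating all memory requests to the Python memory
manager gives the interpreter a more accurate picture of its memory
footprint as a whole. Consequently, under certain circumstances, the
Python memory manager may or may not trigger appropriate actions, like
garbage collection, memory compaction or other preventive procedures.
Note that memory allocated with the C library allocator, as for the
I/O buffer in the previous example, is entirely outside the control of
the Python memory manager.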

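See also:

  The "PYTHONMALLOC" environment variable can be used to configure the
  memory allocators used by Python.

  The "PYTHONMALLOCSTATS" environment variable can be used to print
  statistics of the pymalloc memory allocator every time a new
  pymalloc object arena is created, and on shutdown.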


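Raw Memory Interface
====================

The following function sets are wrappers to the system allocator.
These functions are thread-safe; the *GIL* does not need to be held.

The default raw memory allocator uses the following functions:
"malloc()", "calloc()", "realloc()" and "free()"; it calls "malloc(1)"
(or "calloc(1, 1)") when zero bytes are requested.

New in version 3.4.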

void* PyMem_RawMalloc(size_t n)

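   Allocates *n* bytes and returns a pointer of type "void*" to the
   allocated memory, or "NULL" if the request fails.

   Requesting zero bytes returns a distinct non-"NULL" pointer if
   possible, as if "PyMem_RawMalloc(1)" had been called instead. The
   memory will not have been initialized in any way.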

void* PyMem_RawCalloc(size_t nelem, size_t elsize)

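   Allocates *nelem* elements, each of size *elsize* bytes, and
   returns a pointer of type "void*" to the allocated memory, or
   "NULL" if the request fails. The memory is initialized to zeros.

   Requesting zero elements or elements of size zero bytes returns a
   distinct non-"NULL" pointer if possible, as if "PyMem_RawCalloc(1,
   1)" had been called instead.

   New in version 3.5.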

void* PyMem_RawRealloc(void *p, size_t n)

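   Resizes the memory block pointed to by *p* to *n* bytes. The
   contents are unchanged up to the minimum of the old and the new
   sizes.

   If *p* is "NULL", the call is equivalent to "PyMem_RawMalloc(n)";
   else if *n* is equal to zero, the memory block is resized but is
   not freed, and the returned pointer is non-"NULL".

   Unless *p* is "NULL", it must have been returned by a previous call
   to "PyMem_RawMalloc()", "PyMem_RawRealloc()" or
   "PyMem_RawCalloc()".

   If the request fails, "PyMem_RawRealloc()" returns "NULL" and *p*
   remains a valid pointer to the previous memory area.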

void PyMem_RawFree(void *p)

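   Frees the memory block pointed to by *p*, which must have been
   returned by a previous call to "PyMem_RawMalloc()",
   "PyMem_RawRealloc()" or "PyMem_RawCalloc()". Otherwise, or if
   "PyMem_RawFree(p)" has been called before, undefined behavior
   occurs.

   If *p* is "NULL", no operation is performed.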


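Memory Interface
================

The following function sets, modeled after the ANSI C standard, but
specifying behavior when requesting zero bytes, are available for
allocating and releasing memory from the Python heap.

The default memory allocator uses the pymalloc memory allocator.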

Warning:

  The *GIL* must be held when using these functions.

Changed in version 3.6: The default allocator is now pymalloc instead
of system "malloc()".

void* PyMem_Malloc(size_t n)

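   Allocates *n* bytes and returns a pointer of type "void*" to the
   allocated memory, or "NULL" if the request fails.

   Requesting zero bytes returns a distinct non-"NULL" pointer if
   possible, as if "PyMem_Malloc(1)" had been called instead. The
   memory will not have been initialized in any way.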

void* PyMem_Calloc(size_t nelem, size_t elsize)

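   Allocates *nelem* elements, each of size *elsize* bytes, and
   returns a pointer of type "void*" to the allocated memory, or
   "NULL" if the request fails. The memory is initialized to zeros.

   Requesting zero elements or elements of size zero bytes returns a
   distinct non-"NULL" pointer if possible, as if "PyMem_Calloc(1, 1)"
   had been called instead.

   New in version 3.5.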

void* PyMem_Realloc(void *p, size_t n)

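   Resizes the memory block pointed to by *p* to *n* bytes. The
   contents are unchanged up to the minimum of the old and the new
   sizes.

   If *p* is "NULL", the call is equivalent to "PyMem_Malloc(n)"; else
   if *n* is equal to zero, the memory block is resized but is not
   freed, and the returned pointer is non-"NULL".

   Unless *p* is "NULL", it must have been returned by a previous call
   to "PyMem_Malloc()", "PyMem_Realloc()" or "PyMem_Calloc()".

   If the request fails, "PyMem_Realloc()" returns "NULL" and *p*
   remains a valid pointer to the previous memory area.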

void PyMem_Free(void *p)

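   Frees the memory block pointed to by *p*, which must have been
   returned by a previous call to "PyMem_Malloc()", "PyMem_Realloc()"
   or "PyMem_Calloc()". Otherwise, or if "PyMem_Free(p)" has been
   called before, undefined behavior occurs.

   If *p* is "NULL", no operation is performed.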

The following type-oriented macros are provided for convenience. Note
that *TYPE* refers to any C type.

TYPE* PyMem_New(TYPE, size_t n)

   Same as "PyMem_Malloc()", but allocates "(n * sizeof(TYPE))" bytes
   of memory. Returns a pointer cast to "TYPE*". The memory will not
   have been initialized in any way.

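TYPE* PyMem_Resize(void *p, TYPE, size_t n)

   Same as "PyMem_Realloc()", but the memory block is resized to "(n *
   sizeof(TYPE))" bytes. Returns a pointer cast to "TYPE*". On return,
   *p* will be a pointer to the new memory area, or "NULL" in the
   event of failure.

   This is a C preprocessor macro; *p* is always reassigned. Save the
   original value of *p* to avoid losing memory when handling errors.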

void PyMem_Del(void *p)

   Same as "PyMem_Free()".

In addition, the following macro sets are provided for calling the
Python memory allocator directly, without involving the C API
functions listed above. However, note that their use does not preserve
binary compatibility across Python versions and is therefore
deprecated in extension modules.

* "PyMem_MALLOC(size)"

* "PyMem_NEW(type, size)"

* "PyMem_REALLOC(ptr, size)"

* "PyMem_RESIZE(ptr, type, size)"

* "PyMem_FREE(ptr)"

* "PyMem_DEL(ptr)"


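Object allocators
=================

The following function sets, modeled after the ANSI C standard, but
specifying behavior when requesting zero bytes, are available for
allocating and releasing memory from the Python heap.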

The default object allocator uses the pymalloc memory allocator.

Warning:

  The *GIL* must be held when using these functions.

void* PyObject_Malloc(size_t n)

   Allocates *n* bytes and returns a pointer of type "void*" to the
   allocated memory, or "NULL" if the request fails.

   Requesting zero bytes returns a distinct non-"NULL" pointer if
   possible, as if "PyObject_Malloc(1)" had been called instead. The
   memory will not have been initialized in any way.

void* PyObject_Calloc(size_t nelem, size_t elsize)

   Allocates *nelem* elements, each of size *elsize* bytes, and
   returns a pointer of type "void*" to the allocated memory, or
   "NULL" if the request fails. The memory is initialized to zeros.

   Requesting zero elements or elements of size zero bytes returns a
   distinct non-"NULL" pointer if possible, as if "PyObject_Calloc(1,
   1)" had been called instead.

   New in version 3.5.

void* PyObject_Realloc(void *p, size_t n)

   Resizes the memory block pointed to by *p* to *n* bytes. The
   contents are unchanged up to the minimum of the old and the new
   sizes.

   If *p* is "NULL", the call is equivalent to "PyObject_Malloc(n)";
   else if *n* is equal to zero, the memory block is resized but is
   not freed, and the returned pointer is non-"NULL".

   Unless *p* is "NULL", it must have been returned by a previous call
   to "PyObject_Malloc()", "PyObject_Realloc()" or
   "PyObject_Calloc()".

   If the request fails, "PyObject_Realloc()" returns "NULL" and *p*
   remains a valid pointer to the previous memory area.

void PyObject_Free(void *p)

   Frees the memory block pointed to by *p*, which must have been
   returned by a previous call to "PyObject_Malloc()",
   "PyObject_Realloc()" or "PyObject_Calloc()".  Otherwise, or if
   "PyObject_Free(p)" has been called before, undefined behavior
   occurs.

   If *p* is "NULL", no operation is performed.


Default Memory Allocators
=========================

Default memory allocators:

+---------------------------------+----------------------+--------------------+-----------------------+----------------------+
| Configuration                   | Name                 | PyMem_RawMalloc    | PyMem_Malloc          | PyObject_Malloc      |
|=================================|======================|====================|=======================|======================|
| Release build                   | ""pymalloc""         | "malloc"           | "pymalloc"            | "pymalloc"           |
+---------------------------------+----------------------+--------------------+-----------------------+----------------------+
| Debug build                     | ""pymalloc_debug""   | "malloc" + debug   | "pymalloc" + debug    | "pymalloc" + debug   |
+---------------------------------+----------------------+--------------------+-----------------------+----------------------+
| Release build, without pymalloc | ""malloc""           | "malloc"           | "malloc"              | "malloc"             |
+---------------------------------+----------------------+--------------------+-----------------------+----------------------+
| Debug build, without pymalloc   | ""malloc_debug""     | "malloc" + debug   | "malloc" + debug      | "malloc" + debug     |
+---------------------------------+----------------------+--------------------+-----------------------+----------------------+

Legend:

* Name: value for "PYTHONMALLOC" environment variable

* "malloc": system allocators from the standard C library, C
  functions: "malloc()", "calloc()", "realloc()" and "free()"

* "pymalloc": pymalloc memory allocator

* "+ debug": with debug hooks installed by "PyMem_SetupDebugHooks()"


Customize Memory Allocators
===========================

New in version 3.4.

PyMemAllocatorEx

   Structure used to describe a memory block allocator. The structure
   has the following fields:

   +------------------------------------------------------------+-----------------------------------------+
   | Field                                                      | Meaning                                 |
   |============================================================|=========================================|
   | "void *ctx"                                                | user context passed as first argument   |
   +------------------------------------------------------------+-----------------------------------------+
   | "void* malloc(void *ctx, size_t size)"                     | allocate a memory block                 |
   +------------------------------------------------------------+-----------------------------------------+
   | "void* calloc(void *ctx, size_t nelem, size_t elsize)"     | allocate a memory block initialized     |
   |                                                            | with zeros                              |
   +------------------------------------------------------------+-----------------------------------------+
   | "void* realloc(void *ctx, void *ptr, size_t new_size)"     | allocate or resize a memory block       |
   +------------------------------------------------------------+-----------------------------------------+
   | "void free(void *ctx, void *ptr)"                          | free a memory block                     |
   +------------------------------------------------------------+-----------------------------------------+

   Changed in version 3.5: The "PyMemAllocator" structure was renamed
   to "PyMemAllocatorEx" and a new "calloc" field was added.

PyMemAllocatorDomain

   Enum used to identify an allocator domain. Domains:

   PYMEM_DOMAIN_RAW

      Functions:

      * "PyMem_RawMalloc()"

      * "PyMem_RawRealloc()"

      * "PyMem_RawCalloc()"

      * "PyMem_RawFree()"

   PYMEM_DOMAIN_MEM

      Functions:

      * "PyMem_Malloc()"

      * "PyMem_Realloc()"

      * "PyMem_Calloc()"

      * "PyMem_Free()"

   PYMEM_DOMAIN_OBJ

      Functions:

      * "PyObject_Malloc()"

      * "PyObject_Realloc()"

      * "PyObject_Calloc()"

      * "PyObject_Free()"

void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocatorEx *allocator)

   Get the memory block allocator of the specified domain.

void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocatorEx *allocator)

   Set the memory block allocator of the specified domain.

   The new allocator must return a distinct non-"NULL" pointer when
   requesting zero bytes.

   For the "PYMEM_DOMAIN_RAW" domain, the allocator must be thread-
   safe: the *GIL* is not held when the allocator is called.

   If the new allocator is not a hook (does not call the previous
   allocator), the "PyMem_SetupDebugHooks()" function must be called
   to reinstall the debug hooks on top of the new allocator.

void PyMem_SetupDebugHooks(void)

   Setup hooks to detect bugs in the Python memory allocator
   functions.

   Newly allocated memory is filled with the byte "0xCD"
   ("CLEANBYTE"), freed memory is filled with the byte "0xDD"
   ("DEADBYTE"). Memory blocks are surrounded by "forbidden bytes"
   ("FORBIDDENBYTE": byte "0xFD").

   Runtime checks:

   * Detect API violations, for example "PyObject_Free()" called on a
     buffer allocated by "PyMem_Malloc()"

   * Detect write before the start of the buffer (buffer underflow)

   * Detect write after the end of the buffer (buffer overflow)

   * Check that the *GIL* is held when allocator functions of the
     "PYMEM_DOMAIN_OBJ" (e.g. "PyObject_Malloc()") and
     "PYMEM_DOMAIN_MEM" (e.g. "PyMem_Malloc()") domains are called

   On error, the debug hooks use the "tracemalloc" module to get the
   traceback where a memory block was allocated. The traceback is only
   displayed if "tracemalloc" is tracing Python memory allocations and
   the memory block was traced.

   These hooks are installed by default if Python is compiled in debug
   mode. The "PYTHONMALLOC" environment variable can be used to
   install debug hooks on a Python compiled in release mode.

   Changed in version 3.6: This function now also works on Python
   compiled in release mode. On error, the debug hooks now use
   "tracemalloc" to get the traceback where a memory block was
   allocated. The debug hooks now also check if the GIL is held when
   functions of "PYMEM_DOMAIN_OBJ" and "PYMEM_DOMAIN_MEM" domains are
   called.

   Changed in version 3.8: Byte patterns "0xCB" ("CLEANBYTE"), "0xDB"
   ("DEADBYTE") and "0xFB" ("FORBIDDENBYTE") have been replaced with
   "0xCD", "0xDD" and "0xFD" to use the same values as the Windows CRT
   debug "malloc()" and "free()".


The pymalloc allocator
======================

Python has a *pymalloc* allocator optimized for small objects (smaller
than or equal to 512 bytes) with a short lifetime. It uses memory
mappings called "arenas" with a fixed size of 256 KiB. It falls back
to "PyMem_RawMalloc()" and "PyMem_RawRealloc()" for allocations larger
than 512 bytes.

*pymalloc* is the default allocator of the "PYMEM_DOMAIN_MEM" (e.g.
"PyMem_Malloc()") and "PYMEM_DOMAIN_OBJ" (e.g. "PyObject_Malloc()")
domains.

The arena allocator uses the following functions:

* "VirtualAlloc()" and "VirtualFree()" on Windows,

* "mmap()" and "munmap()" if available,

* "malloc()" and "free()" otherwise.


Customize pymalloc Arena Allocator
----------------------------------

New in version 3.4.

PyObjectArenaAllocator

   Structure used to describe an arena allocator. The structure has
   three fields:

   +----------------------------------------------------+-----------------------------------------+
   | Field                                              | Meaning                                 |
   |====================================================|=========================================|
   | "void *ctx"                                        | user context passed as first argument   |
   +----------------------------------------------------+-----------------------------------------+
   | "void* alloc(void *ctx, size_t size)"              | allocate an arena of size bytes         |
   +----------------------------------------------------+-----------------------------------------+
   | "void free(void *ctx, void *ptr, size_t size)"     | free an arena                           |
   +----------------------------------------------------+-----------------------------------------+

void PyObject_GetArenaAllocator(PyObjectArenaAllocator *allocator)

   Get the arena allocator.

void PyObject_SetArenaAllocator(PyObjectArenaAllocator *allocator)

   Set the arena allocator.


tracemalloc C API
=================

New in version 3.7.

int PyTraceMalloc_Track(unsigned int domain, uintptr_t ptr, size_t size)

   Track an allocated memory block in the "tracemalloc" module.

   Return "0" on success, return "-1" on error (failed to allocate
   memory to store the trace). Return "-2" if tracemalloc is disabled.

   If memory block is already tracked, update the existing trace.

int PyTraceMalloc_Untrack(unsigned int domain, uintptr_t ptr)

   Untrack an allocated memory block in the "tracemalloc" module. Do
   nothing if the block was not tracked.

   Return "-2" if tracemalloc is disabled, otherwise return "0".


Examples
========

Here is the example from section Overview, rewritten so that the I/O
buffer is allocated from the Python heap by using the first function
set:

   PyObject *res;
   char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */

   if (buf == NULL)
       return PyErr_NoMemory();
   /* ...Do some I/O operation involving buf... */
   res = PyBytes_FromString(buf);
   PyMem_Free(buf); /* allocated with PyMem_Malloc */
   return res;

The same code using the type-oriented function set:

   PyObject *res;
   char *buf = PyMem_New(char, BUFSIZ); /* for I/O */

   if (buf == NULL)
       return PyErr_NoMemory();
   /* ...Do some I/O operation involving buf... */
   res = PyBytes_FromString(buf);
   PyMem_Del(buf); /* allocated with PyMem_New */
   return res;

Note that in the two examples above, the buffer is always manipulated
via functions belonging to the same set. Indeed, it is required to use
the same memory API family for a given memory block, so that the risk
of mixing different allocators is reduced to a minimum. The following
code sequence contains two errors, one of which is labeled as *fatal*
because it mixes two different allocators operating on different
heaps.

   char *buf1 = PyMem_New(char, BUFSIZ);
   char *buf2 = (char *) malloc(BUFSIZ);
   char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
   ...
   PyMem_Del(buf3);  /* Wrong -- should be PyMem_Free() */
   free(buf2);       /* Right -- allocated via malloc() */
   free(buf1);       /* Fatal -- should be PyMem_Del()  */

In addition to the functions aimed at handling raw memory blocks from
the Python heap, objects in Python are allocated and released with
"PyObject_New()", "PyObject_NewVar()" and "PyObject_Del()".

These will be explained in the next chapter on defining and
implementing new object types in C.
