Defining extension modules

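A C extension for CPython is a shared library (for example, a .so file on Linux or a .pyd DLL on Windows) that is loadable into the Python process (for example, it is compiled with compatible compiler settings) and that exports an initialization function.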

To be importable by default (that is, by importlib.machinery.ExtensionFileLoader), the shared library must be available on sys.path, and its filename must consist of the module name followed by an extension listed in importlib.machinery.EXTENSION_SUFFIXES.
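As a small illustration of the naming rule above, the following sketch lists the filenames under which the default loader would recognize an extension module. The module name spam is a hypothetical placeholder, and the suffixes printed depend on the platform and Python build.

```python
import importlib.machinery

# Hypothetical module name, used only for illustration.
module_name = "spam"

# Each suffix in EXTENSION_SUFFIXES is a filename ending the default
# extension file loader accepts on this platform and build.
candidates = [
    module_name + suffix
    for suffix in importlib.machinery.EXTENSION_SUFFIXES
]

for filename in candidates:
    print(filename)
```

The output differs between systems, so it is not reproduced here.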

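Note

Building, packaging and distributing extension modules is best done with third-party tools, and is out of scope of this document. One suitable tool is Setuptools, whose documentation can be found at https://setuptools.pypa.io/en/latest/setuptools.html.

Normally, the initialization function returns a module definition initialized using PyModuleDef_Init(). This allows splitting the creation process into several phases: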

  • Before any substantial code is executed, Python can determine which capabilities the module supports, and it can adjust the environment or refuse loading an incompatible extension.

  • By default, Python itself creates the module object -- that is, it does the equivalent of object.__new__() for classes. It also sets initial attributes like __package__ and __loader__.

  • Afterwards, the module object is initialized using extension-specific code -- the equivalent of __init__() on classes.

This is called multi-phase initialization to distinguish it from the legacy (but still supported) single-phase initialization scheme, where the initialization function returns a fully constructed module. See the single-phase-initialization section below for details.
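The separation between creating and executing a module can be sketched with a pure-Python loader. This is only an analogy for the phases described above, not how CPython loads extension modules: DemoLoader, demo_phases and the answer attribute are hypothetical names introduced for the example.

```python
import importlib.abc
import importlib.machinery
import importlib.util


class DemoLoader(importlib.abc.Loader):
    """Hypothetical loader that keeps creation and execution separate."""

    def create_module(self, spec):
        # Returning None asks the import machinery to create the
        # module object itself (the "object.__new__()" step).
        return None

    def exec_module(self, module):
        # Module-specific initialization (the "__init__()" step).
        module.answer = 42


spec = importlib.machinery.ModuleSpec("demo_phases", DemoLoader())

# Creation: the module object exists and has its import-related
# attributes set, but no module-specific code has run yet.
module = importlib.util.module_from_spec(spec)
created_only = not hasattr(module, "answer")

# Execution: the loader now populates the module.
spec.loader.exec_module(module)

print(created_only, module.answer)  # prints: True 42
```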

Changed in version 3.5: Added support for multi-phase initialization (PEP 489).

Multiple module instances

By default, extension modules are not singletons. For example, if the sys.modules entry is removed and the module is re-imported, a new module object is created, and typically populated with fresh method and type objects. The old module is subject to normal garbage collection. This mirrors the behavior of pure-Python modules.
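Because the paragraph above says this mirrors pure-Python modules, the behavior can be observed without compiling anything. The sketch below writes a throwaway module (demo_reimport, a hypothetical name) to a temporary directory, imports it, removes the sys.modules entry, and imports it again.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical throwaway module with a single function.
    Path(tmp, "demo_reimport.py").write_text(
        "def greet():\n    return 'hello'\n"
    )
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        first = importlib.import_module("demo_reimport")
        # Dropping the cache entry forces a fresh import.
        del sys.modules["demo_reimport"]
        second = importlib.import_module("demo_reimport")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_reimport", None)

same_module = first is second
same_function = first.greet is second.greet
print(same_module, same_function)  # prints: False False
```

Both the module object and the function object are new after the second import, which is the default behavior expected of multi-phase extension modules as well.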

Additional module instances may be created in sub-interpreters or after Python runtime reinitialization (Py_Finalize() and Py_Initialize()). In these cases, sharing Python objects between module instances would likely cause crashes or undefined behavior.

To avoid such issues, each instance of an extension module should be isolated: changes to one instance should not implicitly affect the others, and all state owned by the module, including references to Python objects, should be specific to a particular module instance. See Isolating Extension Modules for more details and a practical guide.

A simpler way to avoid these issues is raising an error on repeated initialization.

All modules are expected to support sub-interpreters, or otherwise explicitly signal a lack of support. This is usually achieved by isolation or blocking repeated initialization, as above. A module may also be limited to the main interpreter using the Py_mod_multiple_interpreters slot.

Initialization function

The initialization function defined by an extension module has the following signature:

PyObject *PyInit_modulename(void)

The function is named PyInit_<name>, with <name> replaced by the name of the module. This form applies to modules with ASCII-only names.

When using multi-phase initialization, non-ASCII module names are allowed. In this case, the initialization function name is PyInitU_<name>, with <name> encoded using Python's punycode encoding with hyphens replaced by underscores. In Python:

def initfunc_name(name):
    try:
        suffix = b'_' + name.encode('ascii')
    except UnicodeEncodeError:
        suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
    return b'PyInit' + suffix
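To make the naming rule concrete, the sketch below repeats the function above and applies it to two illustrative module names. Both names (spam and café) are hypothetical examples chosen for this illustration.

```python
def initfunc_name(name):
    # ASCII-only names use the plain PyInit_ prefix; other names use
    # PyInitU_ with a punycode-encoded, underscore-separated suffix.
    try:
        suffix = b'_' + name.encode('ascii')
    except UnicodeEncodeError:
        suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
    return b'PyInit' + suffix


ascii_symbol = initfunc_name('spam')
unicode_symbol = initfunc_name('café')

print(ascii_symbol)    # b'PyInit_spam'
print(unicode_symbol)  # b'PyInitU_caf_dma'
```

The hyphen that punycode places between the ASCII part and the encoded part is replaced because a hyphen cannot appear in a C identifier.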

It is recommended to define the initialization function using a helper macro:

PyMODINIT_FUNC

Declare an extension module initialization function. This macro:

  • specifies the PyObject* return type,

  • adds any special linkage declarations required by the platform, and

  • for C++, declares the function as extern "C".

For example, a module called spam would be defined like this:

static struct PyModuleDef spam_module = {
    .m_base = PyModuleDef_HEAD_INIT,
    .m_name = "spam",
    ...
};

PyMODINIT_FUNC
PyInit_spam(void)
{
    return PyModuleDef_Init(&spam_module);
}

It is possible to export multiple modules from a single shared library by defining multiple initialization functions. However, importing them requires using symbolic links or a custom importer, because by default only the function corresponding to the filename is found. See the Multiple modules in one library section in PEP 489 for details.

The initialization function is typically the only non-static item defined in the module's C source.

Multi-phase initialization

Normally, the initialization function (PyInit_modulename) returns a PyModuleDef instance with non-NULL m_slots. Before it is returned, the PyModuleDef instance must be initialized using the following function:

PyObject *PyModuleDef_Init(PyModuleDef *def)
Part of the Stable ABI since version 3.5.

Ensure a module definition is a properly initialized Python object that correctly reports its type and a reference count.

Return def cast to PyObject*, or NULL if an error occurred.

Calling this function is required for multi-phase initialization. It should not be used in other contexts.

Note that Python assumes that PyModuleDef structures are statically allocated. This function may return either a new reference or a borrowed one; this reference must not be released.

Added in version 3.5.

Legacy single-phase initialization

Attention

Single-phase initialization is a legacy mechanism to initialize extension modules, with known drawbacks and design flaws. Extension module authors are encouraged to use multi-phase initialization instead.

In single-phase initialization, the initialization function (PyInit_modulename) should create, populate and return a module object. This is typically done using PyModule_Create() and functions like PyModule_AddObjectRef().

Single-phase initialization differs from the default in the following ways:

  • Single-phase modules are, or rather contain, “singletons”.

    When the module is first initialized, Python saves the contents of the module's __dict__ (that is, typically, the module's functions and types).

    For subsequent imports, Python does not call the initialization function again. Instead, it creates a new module object with a new __dict__, and copies the saved contents to it. For example, given a single-phase module _testsinglephase [1] that defines a function sum and an exception class error:

    >>> import sys
    >>> import _testsinglephase as one
    >>> del sys.modules['_testsinglephase']
    >>> import _testsinglephase as two
    >>> one is two
    False
    >>> one.__dict__ is two.__dict__
    False
    >>> one.sum is two.sum
    True
    >>> one.error is two.error
    True
    

    The exact behavior should be considered a CPython implementation detail.

  • To work around the fact that PyInit_modulename does not take a spec argument, some state of the import machinery is saved and applied to the first suitable module created during the PyInit_modulename call. Specifically, when a sub-module is imported, this mechanism prepends the parent package name to the name of the module.

    A single-phase PyInit_modulename function should create “its” module object as soon as possible, before any other module objects can be created.

  • Non-ASCII module names (PyInitU_modulename) are not supported.

  • Single-phase modules support module lookup functions like PyState_FindModule().