Introduction¶

The Application Programmer's Interface (API) to Python gives C and C++ programmers access to the Python interpreter at a variety of levels. The API is equally usable from C++, but for brevity it is generally referred to as the Python/C API. There are two fundamentally different reasons for using the Python/C API. The first is to write extension modules for specific purposes; these are C modules that extend the Python interpreter. The second is to use Python as a component in a larger application; this technique is generally referred to as embedding Python in an application.

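Writing an extension module is a relatively well-understood process, where a "cookbook" approach works well. There are also several tools that automate the process to some extent. People have embedded Python in other applications since its early existence, but embedding is less straightforward than writing an extension.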

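Many API functions are useful whether you are embedding or extending Python. Moreover, most applications that embed Python will also need to provide a custom extension, so it is a good idea to become familiar with writing an extension before attempting to embed Python in a real application.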

Coding standards¶

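If you are writing C code for inclusion in CPython, you must follow the guidelines and standards defined in PEP 7. These guidelines apply regardless of the version of Python you are contributing to. Following these conventions is not necessary for your own third party extension modules, unless you eventually expect to contribute them to Python.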

Include Files¶

All function, type and macro definitions needed to use the Python/C API are included in your code by the following lines:

#define PY_SSIZE_T_CLEAN
#include <Python.h>

This implies inclusion of the following standard headers (if available): <stdio.h>, <string.h>, <errno.h>, <limits.h>, <assert.h> and <stdlib.h>.

Note

Since Python may define some pre-processor definitions which affect the standard headers on some systems, you must include Python.h before any standard headers are included.

It is recommended to always define PY_SSIZE_T_CLEAN before including Python.h. See Parsing arguments and building values for a description of this macro.

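All user-visible names defined by Python.h (except those defined by the included standard headers) have one of the prefixes Py or _Py. Names beginning with _Py are for internal use by the Python implementation and must not be used by extension writers. Structure member names do not have a reserved prefix.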

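Note

User code should never define names that begin with Py or _Py. Doing so confuses the reader and jeopardizes the portability of the user code to future Python versions, which may define additional names beginning with one of these prefixes.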

The header files are typically installed with Python. On Unix, they are located in the directories prefix/include/pythonversion/ and exec_prefix/include/pythonversion/, where prefix and exec_prefix are defined by the corresponding parameters to Python's configure script and version is '%d.%d' % sys.version_info[:2]. On Windows, the headers are installed in prefix/include, where prefix is the installation directory specified to the installer.

To include the headers, place both directories (if different) on your compiler's search path for includes. Do not place the parent directories on the search path and then use #include <pythonX.Y/Python.h>; this will break on multi-platform builds, since the platform-independent headers under prefix include the platform-specific headers from exec_prefix.

C++ users should note that although the API is defined entirely using C, the header files properly declare the entry points to be extern "C". As a result, there is no need to do anything special to use the API from C++.

Useful macros¶

Several useful macros are defined in the Python header files. Many are defined closer to where they are useful (for example, Py_RETURN_NONE). Others of a more general utility are defined here. This is not necessarily a complete list.

PyMODINIT_FUNC¶

Declare the PyInit initialization function of an extension module. The function return type is PyObject*. The macro declares any special linkage declarations required by the platform, and for C++ declares the function as extern "C".

The initialization function must be named PyInit_name, where name is the name of the module, and should be the only non-static item defined in the module file. Example:

static struct PyModuleDef spam_module = {
    .m_base = PyModuleDef_HEAD_INIT,
    .m_name = "spam",
    ...
};

PyMODINIT_FUNC
PyInit_spam(void)
{
    return PyModuleDef_Init(&spam_module);
}

Py_ABS(x)¶

Return the absolute value of x.

If the result cannot be represented (for example, if x has INT_MIN value for int type), the behavior is undefined.

Added in version 3.3.

Py_ALIGNED(num)¶

Specify alignment to num bytes on compilers that support it.

Consider using the C11 standard _Alignas specifier over this macro.

Py_ARITHMETIC_RIGHT_SHIFT(type, integer, positions)¶

Similar to integer >> positions, but forces sign extension, as the C standard does not define whether a right-shift of a signed integer will perform sign extension or a zero-fill.

integer should be any signed integer type. positions is the number of positions to shift to the right.

Both integer and positions can be evaluated more than once; consequently, avoid directly passing a function call or some other operation with side-effects to this macro. Instead, store the result as a variable and then pass it.

type is unused and only kept for backwards compatibility. Historically, type was used to cast integer.

Changed in version 3.1: This macro is now valid for all signed integer types, not just those for which unsigned type is legal. As a result, type is no longer used.

Py_ALWAYS_INLINE¶

Ask the compiler to always inline a static inline function. The compiler can ignore it and decide to not inline the function.

It can be used to inline performance critical static inline functions when building Python in debug mode with function inlining disabled. For example, MSC disables function inlining when building in debug mode.

Blindly marking a static inline function with Py_ALWAYS_INLINE can result in worse performance (due to increased code size, for example). The compiler is usually better than the developer at this cost/benefit analysis.

If Python is built in debug mode (if the Py_DEBUG macro is defined), the Py_ALWAYS_INLINE macro does nothing.

It must be specified before the function return type. Usage:

static inline Py_ALWAYS_INLINE int random(void) { return 4; }

Added in version 3.11.

Py_CAN_START_THREADS¶

If this macro is defined, then the current system is able to start threads.

Currently, all systems supported by CPython (per PEP 11), with the exception of some WebAssembly platforms, support starting threads.

Added in version 3.13.

Py_CHARMASK(c)¶: The argument must be a character or an integer in the range [-128, 127] or [0, 255]. This macro returns c cast to an unsigned char.

Py_DEPRECATED(version)¶

Use this for deprecated declarations. The macro must be placed before the symbol name.

Example:

Py_DEPRECATED(3.8) PyAPI_FUNC(int) Py_OldFunction(void);

Changed in version 3.8: MSVC support was added.

Py_FORCE_EXPANSION(X)¶: This is equivalent to X, which is useful for token-pasting in macros, as macro expansions in X are forcefully evaluated by the preprocessor.

Py_GCC_ATTRIBUTE(name)¶

Use a GCC attribute name, hiding it from compilers that don't support GCC attributes (such as MSVC).

This expands to __attribute__((name)) on a GCC compiler, and expands to nothing on compilers that don't support GCC attributes.

Py_GETENV(s)¶: Like getenv(s), but returns NULL if -E was passed on the command line (see PyConfig.use_environment).

Py_LL(number)¶

Use number as a long long integer literal.

This usually expands to number followed by LL, but will expand to a compiler-specific suffix (such as I64) on older compilers.

In modern versions of Python, this macro is not very useful, as C99 and later require the LL suffix to be valid for an integer.

Py_LOCAL(type)¶: Declare a function returning the specified type using a fast-calling qualifier for functions that are local to the current file. Semantically, this is equivalent to static type.

Py_LOCAL_INLINE(type)¶: Equivalent to Py_LOCAL but additionally requests the function be inlined.

Py_LOCAL_SYMBOL¶

Macro used to declare a symbol as local to the shared library (hidden). On supported platforms, it ensures the symbol is not exported.

On compatible versions of GCC/Clang, it expands to __attribute__((visibility("hidden"))).

Py_MAX(x, y)¶: Return the maximum value between x and y.

Added in version 3.3.

Py_MEMBER_SIZE(type, member)¶: Return the size of the member member of the structure type, in bytes.

Added in version 3.6.

Py_MIN(x, y)¶: Return the minimum value between x and y.

Added in version 3.3.

Py_NO_INLINE¶

Disable inlining on a function. For example, it reduces the C stack consumption: useful on LTO+PGO builds which heavily inline code (see bpo-33720).

Usage:

Py_NO_INLINE static int random(void) { return 4; }

Added in version 3.11.

Py_SAFE_DOWNCAST(value, larger, smaller)¶

Cast value to type smaller from type larger, validating that no information was lost.

On release builds of Python, this is roughly equivalent to (smaller) value (in C++, static_cast<smaller>(value) will be used instead).

On debug builds (implying that Py_DEBUG is defined), this asserts that no information was lost with the cast from larger to smaller.

value, larger, and smaller may all be evaluated more than once in the expression; consequently, do not pass an expression with side-effects directly to this macro.

Py_STRINGIFY(x)¶: Convert x to a C string. For example, Py_STRINGIFY(123) returns "123".

Added in version 3.4.

Py_ULL(number)¶

Similar to Py_LL, but number will be an unsigned long long literal instead. This is done by appending U to the result of Py_LL.

In modern versions of Python, this macro is not very useful, as C99 and later require the ULL/LLU suffixes to be valid for an integer.

Py_UNREACHABLE()¶

Use this when you have a code path that cannot be reached by design. For example, in the default: clause in a switch statement for which all possible values are covered in case statements. Use this in places where you might be tempted to put an assert(0) or abort() call.

In release mode, the macro helps the compiler to optimize the code, and avoids a warning about unreachable code. For example, the macro is implemented with __builtin_unreachable() on GCC in release mode.

A use for Py_UNREACHABLE() is following a call to a function that never returns but that is not declared _Py_NO_RETURN.

If a code path is very unlikely but can still be reached in exceptional cases, this macro must not be used. Examples are low-memory conditions or a system call returning a value outside the expected range. In such cases it is better to report the error to the caller. If the error cannot be reported to the caller, Py_FatalError() can be used.

Added in version 3.7.

Py_UNUSED(arg)¶: Use this for unused arguments in a function definition to silence compiler warnings. Example: int func(int a, int Py_UNUSED(b)) { return a; }.

Added in version 3.4.

Py_BUILD_ASSERT(cond)¶

Asserts a compile-time condition cond, as a statement. The build will fail if the condition is false or cannot be evaluated at compile time.

For example:

Py_BUILD_ASSERT(sizeof(PyTime_t) == sizeof(int64_t));

Added in version 3.3.

Py_BUILD_ASSERT_EXPR(cond)¶

Asserts a compile-time condition cond, as an expression that evaluates to 0. The build will fail if the condition is false or cannot be evaluated at compile time.

For example:

#define foo_to_char(foo) \
    ((char *)(foo) + Py_BUILD_ASSERT_EXPR(offsetof(struct foo, string) == 0))

Added in version 3.3.

PyDoc_STRVAR(name, str)¶

Creates a variable with name name that can be used in docstrings. If Python is built without docstrings, the value will be empty.

Use PyDoc_STRVAR for docstrings to support building Python without docstrings, as specified in PEP 7.

Example:

PyDoc_STRVAR(pop_doc, "Remove and return the rightmost element.");

static PyMethodDef deque_methods[] = {
    // ...
    {"pop", (PyCFunction)deque_pop, METH_NOARGS, pop_doc},
    // ...
};

PyDoc_STR(str)¶

Creates a docstring for the given input string or an empty string if docstrings are disabled.

Use PyDoc_STR in specifying docstrings to support building Python without docstrings, as specified in PEP 7.

Example:

static PyMethodDef pysqlite_row_methods[] = {
    {"keys", (PyCFunction)pysqlite_row_keys, METH_NOARGS,
        PyDoc_STR("Returns the keys of the row.")},
    {NULL, NULL}
};

PyDoc_VAR(name)¶

Declares a static character array variable with the given name name.

For example:

PyDoc_VAR(python_doc) = PyDoc_STR("A genus of constricting snakes in the Pythonidae family native "
                                  "to the tropics and subtropics of the Eastern Hemisphere.");

Py_ARRAY_LENGTH(array)¶

Compute the length of a statically allocated C array at compile time.

The array argument must be a C array with a size known at compile time. Passing an array with an unknown size, such as a heap-allocated array, will result in a compilation error on some compilers, or otherwise produce incorrect results.

This is roughly equivalent to:

sizeof(array) / sizeof((array)[0])

Py_EXPORTED_SYMBOL¶: Macro used to declare a symbol (function or data) as exported. On Windows, this expands to __declspec(dllexport). On compatible versions of GCC/Clang, it expands to __attribute__((visibility("default"))). This macro is for defining the C API itself; extension modules should not use it.

Py_IMPORTED_SYMBOL¶: Macro used to declare a symbol as imported. On Windows, this expands to __declspec(dllimport). This macro is for defining the C API itself; extension modules should not use it.

PyAPI_FUNC(type)¶: Macro used by CPython to declare a function as part of the C API. Its expansion depends on the platform and build configuration. This macro is intended for defining CPython's C API itself; extension modules should not use it for their own symbols.

PyAPI_DATA(type)¶: Macro used by CPython to declare a public global variable as part of the C API. Its expansion depends on the platform and build configuration. This macro is intended for defining CPython's C API itself; extension modules should not use it for their own symbols.

Py_VA_COPY¶

This is a soft deprecated alias to the C99-standard va_copy macro.

Historically, this would use a compiler-specific method to copy a va_list.

Changed in version 3.6: This is now an alias to va_copy.

Objects, Types and Reference Counts¶

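Most Python/C API functions have one or more arguments as well as a return value of type PyObject*. This type is a pointer to an opaque data type representing an arbitrary Python object. The Python language treats all Python object types the same way in most situations (e.g., assignments, scope rules, and argument passing). Almost all Python objects live on the heap: you never declare an automatic or static variable of type PyObject; only pointer variables of type PyObject* can be declared. The sole exception is type objects; since these must never be deallocated, they are typically static PyTypeObject objects.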

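All Python objects (even Python integers) have a type and a reference count. An object's type determines what kind of object it is (e.g., an integer, a list, or a user-defined function; there are many more, as explained in The standard type hierarchy). For each of the well-known types there is a macro to check whether an object is of that type; for instance, PyList_Check(a) is true if (and only if) the object pointed to by a is a Python list.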

Reference Counts¶

The reference count is important because today's computers have a finite (and often severely limited) memory size; it counts how many different places there are that have a strong reference to an object. Such a place could be another object, or a global (or static) C variable, or a local variable in some C function. When the last strong reference to an object is released (i.e. its reference count becomes zero), the object is deallocated. If it contains references to other objects, those references are released. Those other objects may be deallocated in turn, if there are no more references to them, and so on. (There's an obvious problem with objects that reference each other here; for now, the solution is "don't do that.")

Reference counts are always manipulated explicitly. The normal way is to use the macro Py_INCREF() to take a new reference to an object (i.e. increment its reference count by one), and Py_DECREF() to release that reference (i.e. decrement the reference count by one). The Py_DECREF() macro is considerably more complex than the incref one, since it must check whether the reference count becomes zero and then cause the object's deallocator to be called. The deallocator is a function pointer contained in the object's type structure. The type-specific deallocator takes care of releasing references for other objects contained in the object if this is a compound object type, such as a list, as well as performing any additional finalization that's needed. There's no chance that the reference count can overflow; at least as many bits are used to hold the reference count as there are distinct memory locations in virtual memory (assuming sizeof(Py_ssize_t) >= sizeof(void*)). Thus, the reference count increment is a simple operation.

It is not necessary to hold a strong reference (i.e. increment the reference count) for every local variable that contains a pointer to an object. In theory, the object's reference count goes up by one when the variable is made to point to it and it goes down by one when the variable goes out of scope. However, these two cancel each other out, so at the end the reference count hasn't changed. The only real reason to use the reference count is to prevent the object from being deallocated as long as our variable is pointing to it. If we know that there is at least one other reference to the object that lives at least as long as our variable, there is no need to take a new strong reference (i.e. increment the reference count) temporarily. An important situation where this arises is in objects that are passed as arguments to C functions in an extension module that are called from Python; the call mechanism guarantees to hold a reference to every argument for the duration of the call.

However, a common pitfall is to extract an object from a list and hold on to it for a while without taking a new reference. Some other operation might conceivably remove the object from the list, releasing that reference, and possibly deallocating it. The real danger is that innocent-looking operations may invoke arbitrary Python code which could do this; there is a code path which allows control to flow back to the user from a Py_DECREF(), so almost any operation is potentially dangerous.

A safe approach is to always use the generic operations (functions whose name begins with PyObject_, PyNumber_, PySequence_ or PyMapping_). These operations always create a new strong reference (i.e. increment the reference count) of the object they return. This leaves the caller with the responsibility to call Py_DECREF() when they are done with the result; this soon becomes second nature.

Reference Count Details¶

The reference count behavior of functions in the Python/C API is best explained in terms of ownership of references. Ownership pertains to references, never to objects (objects are not owned: they are always shared). "Owning a reference" means being responsible for calling Py_DECREF on it when the reference is no longer needed. Ownership can also be transferred, meaning that the code that receives ownership of the reference then becomes responsible for eventually releasing it by calling Py_DECREF() or Py_XDECREF() when it's no longer needed---or passing on this responsibility (usually to its caller). When a function passes ownership of a reference on to its caller, the caller is said to receive a new reference. When no ownership is transferred, the caller is said to borrow the reference. Nothing needs to be done for a borrowed reference.

Conversely, when a calling function passes in a reference to an object, there are two possibilities: the function steals a reference to the object, or it does not.

Stealing a reference means that when you pass a reference to a function, that function assumes that it now owns that reference. Since the new owner can use Py_DECREF() at its discretion, you (the caller) must not use that reference after the call.

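Few functions steal references; the two notable exceptions are PyList_SetItem() and PyTuple_SetItem(), which steal a reference to the item (but not to the tuple or list into which the item is put!). These functions were designed to steal a reference because of a common idiom for populating a tuple or list with newly created objects. For example, the code to create the tuple (1, 2, "three") could look like this (forgetting about error handling for the moment; a better way to code this is shown below):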

PyObject *t;

t = PyTuple_New(3);
PyTuple_SetItem(t, 0, PyLong_FromLong(1L));
PyTuple_SetItem(t, 1, PyLong_FromLong(2L));
PyTuple_SetItem(t, 2, PyUnicode_FromString("three"));

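Here, PyLong_FromLong() returns a new reference which is immediately stolen by PyTuple_SetItem(). If you want to keep using an object after its reference has been stolen, use Py_INCREF() to take another reference before calling the reference-stealing function.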

Incidentally, PyTuple_SetItem() is the only way to set tuple items; PySequence_SetItem() and PyObject_SetItem() refuse to do this since tuples are an immutable data type. You should only use PyTuple_SetItem() for tuples that you are creating yourself.

Equivalent code for populating a list can be written using PyList_New() and PyList_SetItem().

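In practice, however, you will rarely use these ways of creating and populating a tuple or list. There is a generic function, Py_BuildValue(), that can create most common objects from C values, directed by a format string. For example, the above two blocks of code could be replaced by the following (which also takes care of the error checking):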

PyObject *tuple, *list;

tuple = Py_BuildValue("(iis)", 1, 2, "three");
list = Py_BuildValue("[iis]", 1, 2, "three");

It is much more common to use PyObject_SetItem() and friends with items whose references you are only borrowing, like arguments that were passed in to the function you are writing. In that case, their behaviour regarding references is much saner, since you don't have to take a new reference just so you can give that reference away ("have it be stolen"). For example, this function sets all items of a list (actually, any mutable sequence) to a given item:

int
set_all(PyObject *target, PyObject *item)
{
    Py_ssize_t i, n;

    n = PyObject_Length(target);
    if (n < 0)
        return -1;
    for (i = 0; i < n; i++) {
        PyObject *index = PyLong_FromSsize_t(i);
        if (!index)
            return -1;
        if (PyObject_SetItem(target, index, item) < 0) {
            Py_DECREF(index);
            return -1;
        }
        Py_DECREF(index);
    }
    return 0;
}

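The situation is slightly different for function return values. While passing a reference to most functions does not change your ownership responsibilities for that reference, many functions that return a reference to an object give you ownership of the reference. The reason is simple: in many cases, the returned object is created on the fly, and the reference you get is the only reference to the object. Therefore, the generic functions that return object references, like PyObject_GetItem() and PySequence_GetItem(), always return a new reference (the caller becomes the owner of the reference).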

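It is important to realize that whether you own a reference returned by a function depends only on which function you call; the plumage (the type of the object passed as an argument to the function) does not enter into it! Thus, if you extract an item from a list using PyList_GetItem(), you do not own the reference. If you obtain the same item from the same list using PySequence_GetItem() (which happens to take exactly the same arguments), you do own a reference to the returned object.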

Here is an example of how you could write a function that computes the sum of the items in a list of integers; once using PyList_GetItem(), and once using PySequence_GetItem().

long
sum_list(PyObject *list)
{
    Py_ssize_t i, n;
    long total = 0, value;
    PyObject *item;

    n = PyList_Size(list);
    if (n < 0)
        return -1; /* Not a list */
    for (i = 0; i < n; i++) {
        item = PyList_GetItem(list, i); /* Can't fail */
        if (!PyLong_Check(item)) continue; /* Skip non-integers */
        value = PyLong_AsLong(item);
        if (value == -1 && PyErr_Occurred())
            /* Integer too big to fit in a C long, bail out */
            return -1;
        total += value;
    }
    return total;
}

long
sum_sequence(PyObject *sequence)
{
    Py_ssize_t i, n;
    long total = 0, value;
    PyObject *item;
    n = PySequence_Length(sequence);
    if (n < 0)
        return -1; /* Has no length */
    for (i = 0; i < n; i++) {
        item = PySequence_GetItem(sequence, i);
        if (item == NULL)
            return -1; /* Not a sequence, or other failure */
        if (PyLong_Check(item)) {
            value = PyLong_AsLong(item);
            Py_DECREF(item);
            if (value == -1 && PyErr_Occurred())
                /* Integer too big to fit in a C long, bail out */
                return -1;
            total += value;
        }
        else {
            Py_DECREF(item); /* Discard reference ownership */
        }
    }
    return total;
}

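Types¶

There are few other data types that play a significant role in the Python/C API; most are simple C types such as int, long, double and char*. A few structure types are used to describe static tables that list the functions exported by a module or the data attributes of a new object type, and another is used to describe the value of a complex number. These will be discussed together with the functions that use them.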

type Py_ssize_t¶: Part of the Stable ABI.
A signed integral type such that sizeof(Py_ssize_t) == sizeof(size_t). C99 doesn't define such a thing directly (size_t is an unsigned integral type). See PEP 353 for details. PY_SSIZE_T_MAX is the largest positive value of type Py_ssize_t.

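Exceptions¶

The Python programmer only needs to deal with exceptions if specific error handling is required; unhandled exceptions are automatically propagated to the caller, then to the caller's caller, and so on, until they reach the top-level interpreter, where they are reported to the user accompanied by a stack traceback.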

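For C programmers, however, error checking always has to be explicit. All functions in the Python/C API can raise exceptions, unless an explicit claim is made otherwise in a function's documentation. In general, when a function encounters an error, it sets an exception, discards any object references that it owns, and returns an error indicator. If not documented otherwise, this indicator is either NULL or -1, depending on the function's return type. A few functions return a Boolean true/false result, with false indicating an error. Very few functions return no explicit error indicator or have an ambiguous return value, and require explicit testing for errors with PyErr_Occurred(). These exceptions are always explicitly documented.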

Exception state is maintained in per-thread storage (this is equivalent to using global storage in an unthreaded application). A thread can be in one of two states: an exception has occurred, or not. The function PyErr_Occurred() can be used to check for this: it returns a borrowed reference to the exception type object when an exception has occurred, and NULL otherwise. There are a number of functions to set the exception state: PyErr_SetString() is the most common (though not the most general) function to set the exception state, and PyErr_Clear() clears the exception state.

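The full exception state consists of three objects (all of which can be NULL): the exception type, the corresponding exception value, and the traceback. These have the same meanings as the Python result of sys.exc_info(); however, they are not the same. The Python objects represent the last exception being handled by a Python try ... except statement, while the C-level exception state only exists while an exception is being passed on between C functions until it reaches the Python bytecode interpreter's main loop, which takes care of transferring it to sys.exc_info() and friends.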

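Note that starting with Python 1.5, the preferred, thread-safe way to access the exception state from Python code is to call the function sys.exc_info(), which returns the per-thread exception state for Python code. Also, the semantics of both ways to access the exception state have changed so that a function which catches an exception will save and restore its thread's exception state so as to preserve the exception state of its caller. This prevents common bugs in exception handling code caused by an innocent-looking function overwriting the exception being handled; it also reduces the often unwanted lifetime extension for objects that are referenced by the stack frames in the traceback.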

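As a general principle, a function that calls another function to perform some task should check whether the called function raised an exception, and if so, pass the exception state on to its caller. It should discard any object references that it owns and return an error indicator, but it should not set another exception; that would overwrite the exception that was just raised and lose important information about the exact cause of the error.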

A simple example of detecting exceptions and passing them on is shown in the sum_sequence() example above. It so happens that this example doesn't need to clean up any owned references when it detects an error. The following example function shows some error cleanup. First, to remind you why you like Python, we show the equivalent Python code:

def incr_item(dict, key):
    try:
        item = dict[key]
    except KeyError:
        item = 0
    dict[key] = item + 1

Here is the corresponding C code, in all its glory:

int
incr_item(PyObject *dict, PyObject *key)
{
    /* Objects all initialized to NULL for Py_XDECREF */
    PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
    int rv = -1; /* Return value initialized to -1 (failure) */

    item = PyObject_GetItem(dict, key);
    if (item == NULL) {
        /* Handle KeyError only: */
        if (!PyErr_ExceptionMatches(PyExc_KeyError))
            goto error;

        /* Clear the error and use zero: */
        PyErr_Clear();
        item = PyLong_FromLong(0L);
        if (item == NULL)
            goto error;
    }
    const_one = PyLong_FromLong(1L);
    if (const_one == NULL)
        goto error;

    incremented_item = PyNumber_Add(item, const_one);
    if (incremented_item == NULL)
        goto error;

    if (PyObject_SetItem(dict, key, incremented_item) < 0)
        goto error;
    rv = 0; /* Success */
    /* Continue with cleanup code */

 error:
    /* Cleanup code, shared by success and failure path */

    /* Use Py_XDECREF() to ignore NULL references */
    Py_XDECREF(item);
    Py_XDECREF(const_one);
    Py_XDECREF(incremented_item);

    return rv; /* -1 for error, 0 for success */
}

This example represents an endorsed use of the goto statement in C! It illustrates the use of PyErr_ExceptionMatches() and PyErr_Clear() to handle specific exceptions, and the use of Py_XDECREF() to dispose of owned references that may be NULL (note the 'X' in the name; Py_DECREF() would crash when confronted with a NULL reference). For this to work, the variables used to hold owned references must be initialized to NULL; likewise, the proposed return value is initialized to -1 (failure) and only set to 0 (success) after the final call has succeeded.

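Embedding Python¶

The one important task that only embedders (as opposed to extension writers) of the Python interpreter have to worry about is the initialization, and possibly the finalization, of the Python interpreter. Most functionality of the interpreter can only be used after the interpreter has been initialized.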

The basic initialization function is Py_Initialize(). This initializes the table of loaded modules, and creates the fundamental modules builtins, __main__, and sys. It also initializes the module search path (sys.path).

Py_Initialize() does not set the "script argument list" (sys.argv). If this variable is needed by Python code that will be executed later, PyConfig.argv and PyConfig.parse_argv must be set: see Python Initialization Configuration.

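On most systems (in particular, on Unix and Windows, although the details are slightly different), Py_Initialize() calculates the module search path based upon its best guess for the location of the standard Python interpreter executable, assuming that the Python library is found in a fixed location relative to the Python interpreter executable. In particular, it looks for a directory named lib/pythonX.Y relative to the parent directory where the executable named python is found on the shell command search path (the environment variable PATH).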

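For instance, if the Python executable is found in /usr/local/bin/python, it will assume that the libraries are in /usr/local/lib/pythonX.Y. (In fact, this particular path is also the "fallback" location, used when no executable file named python is found along PATH.) The user can override this behavior by setting the environment variable PYTHONHOME, or insert additional directories in front of the standard path by setting PYTHONPATH.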

The embedding application can steer the search by setting PyConfig.program_name before calling Py_InitializeFromConfig(). Note that PYTHONHOME still overrides this and PYTHONPATH is still inserted in front of the standard path. An application that requires total control has to provide its own implementation of Py_GetPath(), Py_GetPrefix(), Py_GetExecPrefix(), and Py_GetProgramFullPath() (all defined in Modules/getpath.c).

Sometimes, it is desirable to "uninitialize" Python. For instance, the application may want to start over (make another call to Py_Initialize()), or it may simply be done with its use of Python and want to free memory allocated by Python. This can be accomplished by calling Py_FinalizeEx(). The function Py_IsInitialized() returns true if Python is currently in the initialized state. More information about these functions is given in a later chapter. Notice that Py_FinalizeEx() does not free all memory allocated by the Python interpreter; for example, memory allocated by extension modules currently cannot be released.

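Debugging Builds¶

Python can be built with several macros to enable extra checks of the interpreter and extension modules. These checks tend to add a large amount of overhead to the runtime, so they are not enabled by default.

A full list of the various types of debugging builds is in the file Misc/SpecialBuilds.txt in the Python source distribution. Builds are available that support tracing of reference counts, debugging the memory allocator, or low-level profiling of the main interpreter loop. Only the most frequently used builds are described in the remainder of this section.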

Py_DEBUG¶

Compiling the interpreter with the Py_DEBUG macro defined produces what is generally meant by a debug build of Python. Py_DEBUG is enabled in the Unix build by adding --with-pydebug to the ./configure command. It is also implied by the presence of the not-Python-specific _DEBUG macro. When Py_DEBUG is enabled in the Unix build, compiler optimization is disabled.

In addition to the reference count debugging described below, extra checks are performed, see Python Debug Build.

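Defining Py_TRACE_REFS enables reference tracing (see the configure --with-trace-refs option). When defined, a circular doubly linked list of active objects is maintained by adding two extra fields to every PyObject. Total allocations are tracked as well. Upon exit, all existing references are printed. (In interactive mode this happens after every statement run by the interpreter.)

Please refer to Misc/SpecialBuilds.txt in the Python source distribution for more detailed information.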

Recommended third party tools¶

The following third party tools offer both simpler and more sophisticated approaches to creating C, C++ and Rust extensions for Python:

Cython
cffi
HPy
nanobind (C++)
Numba
pybind11 (C++)
PyO3 (Rust)
SWIG

Using tools such as these can help avoid writing code that is tightly bound to a particular version of CPython, avoid reference counting errors, and focus more on your own code than on using the CPython API. In general, new versions of Python can be supported by updating the tool, and your code will often use newer and more efficient APIs automatically. Some tools also support compiling for other implementations of Python from a single set of sources.

These projects are not supported by the same people who maintain Python, and issues need to be raised with the projects directly. Remember to check that the project is still maintained and supported, as the list above may become outdated.

See also

Python Packaging User Guide: Binary Extensions: The Python Packaging User Guide not only covers several available tools that simplify the creation of binary extensions, but also discusses the various reasons why creating an extension module may be desirable in the first place.
