2. 拡張の型の定義: チュートリアル

Python では、組み込みの str 型や list 型のような Python コードから操作できる新しい型を C 拡張モジュールの作者が定義できます。 全ての拡張の型のコードはあるパターンに従うのですが、書き始める前に理解しておくべき細かいことがあります。 このドキュメントはその話題についてのやさしい入門です。

2.1. 基本的なこと

The CPython runtime sees all Python objects as variables of type PyObject*, which serves as a "base type" for all Python objects. The PyObject structure itself only contains the object's reference count and a pointer to the object's "type object". This is where the action is; the type object determines which (C) functions get called by the interpreter when, for instance, an attribute gets looked up on an object, a method called, or it is multiplied by another object. These C functions are called "type methods".

それなので、新しい拡張の型を定義したいときは、新しい型オブジェクトを作成すればよいわけです。

This sort of thing can only be explained by example, so here's a minimal, but complete, module that defines a new type named Custom inside a C extension module custom:

注釈

ここで紹介している例は、 静的な 拡張の型を定義する伝統的な実装方法です。 これはほとんどの場面で十分なものなのです。 C API では、 PyType_FromSpec() 関数を使い、ヒープ上に配置された拡張の型も定義できますが、これについてはこのチュートリアルでは扱いません。

#define PY_SSIZE_T_CLEAN
#include <Python.h>

typedef struct {
    PyObject_HEAD
    /* Type-specific fields go here. */
} CustomObject;

static PyTypeObject CustomType = {
    .ob_base = PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom.Custom",
    .tp_doc = PyDoc_STR("Custom objects"),
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT,
    .tp_new = PyType_GenericNew,
};

static PyModuleDef custommodule = {
    .m_base = PyModuleDef_HEAD_INIT,
    .m_name = "custom",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

PyMODINIT_FUNC
PyInit_custom(void)
{
    PyObject *m;
    if (PyType_Ready(&CustomType) < 0)
        return NULL;

    m = PyModule_Create(&custommodule);
    if (m == NULL)
        return NULL;

    Py_INCREF(&CustomType);
    if (PyModule_AddObject(m, "Custom", (PyObject *) &CustomType) < 0) {
        Py_DECREF(&CustomType);
        Py_DECREF(m);
        return NULL;
    }

    return m;
}

一度に把握するにはちょっと量が多いですが、前の章よりはとっつきやすくなっていることと思います。このファイルでは、3つの要素が定義されています:

  1. What a Custom object contains: this is the CustomObject struct, which is allocated once for each Custom instance.

  2. How the Custom type behaves: this is the CustomType struct, which defines a set of flags and function pointers that the interpreter inspects when specific operations are requested.

  3. How to initialize the custom module: this is the PyInit_custom function and the associated custommodule struct.

まず最初はこれです:

typedef struct {
    PyObject_HEAD
} CustomObject;

This is what a Custom object will contain. PyObject_HEAD is mandatory at the start of each object struct and defines a field called ob_base of type PyObject, containing a pointer to a type object and a reference count (these can be accessed using the macros Py_TYPE and Py_REFCNT respectively). The reason for the macro is to abstract away the layout and to enable additional fields in debug builds.

注釈

上の例では PyObject_HEAD マクロの後にセミコロンはありません。 うっかりセミコロンを追加しないように気を付けてください: これを警告するコンパイラもあります。

もちろん、一般的にはオブジェクトは標準的な PyObject_HEAD ボイラープレートの他にもデータを保持しています; 例えば、これは Python 標準の浮動小数点数の定義です:

typedef struct {
    PyObject_HEAD
    double ob_fval;
} PyFloatObject;

2つ目は型オブジェクトの定義です。

static PyTypeObject CustomType = {
    .ob_base = PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom.Custom",
    .tp_doc = PyDoc_STR("Custom objects"),
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT,
    .tp_new = PyType_GenericNew,
};

注釈

上にあるように C99 スタイルの指示付き初期化子を使って、 PyTypeObject の特に関心の無いフィールドまで全て並べたり、フィールドを宣言する順序に気を使ったりせずに済ませるのをお薦めします。

object.h にある実際の PyTypeObject の定義には上の定義にあるよりももっと多くの フィールド があります。 ここに出てきていないフィールドは C コンパイラによってゼロで埋められるので、必要でない限り明示的には値の指定をしないのが一般的な作法になっています。

一度に1つずつフィールドを取り上げていきましょう:

.ob_base = PyVarObject_HEAD_INIT(NULL, 0)

この行は、上で触れた ob_base フィールドの初期化に必須のボイラープレートです。

.tp_name = "custom.Custom",

実装している型の名前です。 これは、オブジェクトのデフォルトの文字列表現やエラーメッセージに現れます。例えば次の通りです:

>>> "" + custom.Custom()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "custom.Custom") to str

Note that the name is a dotted name that includes both the module name and the name of the type within the module. The module in this case is custom and the type is Custom, so we set the type name to custom.Custom. Using the real dotted import path is important to make your type compatible with the pydoc and pickle modules.

.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,

This is so that Python knows how much memory to allocate when creating new Custom instances. tp_itemsize is only used for variable-sized objects and should otherwise be zero.

注釈

If you want your type to be subclassable from Python, and your type has the same tp_basicsize as its base type, you may have problems with multiple inheritance. A Python subclass of your type will have to list your type first in its __bases__, or else it will not be able to call your type's __new__() method without getting an error. You can avoid this problem by ensuring that your type has a larger value for tp_basicsize than its base type does. Most of the time, this will be true anyway, because either your base type will be object, or else you will be adding data members to your base type, and therefore increasing its size.

We set the class flags to Py_TPFLAGS_DEFAULT.

.tp_flags = Py_TPFLAGS_DEFAULT,

すべての型はフラグにこの定数を含めておく必要があります。これは最低でも Python 3.3 までに定義されているすべてのメンバを許可します。それ以上のメンバが必要なら、対応するフラグの OR をとる必要があります。

この型の docstring は tp_doc に入れます。

.tp_doc = PyDoc_STR("Custom objects"),

To enable object creation, we have to provide a tp_new handler. This is the equivalent of the Python method __new__(), but has to be specified explicitly. In this case, we can just use the default implementation provided by the API function PyType_GenericNew().

.tp_new = PyType_GenericNew,

Everything else in the file should be familiar, except for some code in PyInit_custom():

if (PyType_Ready(&CustomType) < 0)
    return;

This initializes the Custom type, filling in a number of members to the appropriate default values, including ob_type that we initially set to NULL.

Py_INCREF(&CustomType);
if (PyModule_AddObject(m, "Custom", (PyObject *) &CustomType) < 0) {
    Py_DECREF(&CustomType);
    Py_DECREF(m);
    return NULL;
}

This adds the type to the module dictionary. This allows us to create Custom instances by calling the Custom class:

>>> import custom
>>> mycustom = custom.Custom()

That's it! All that remains is to build it; put the above code in a file called custom.c,

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "custom"
version = "1"

in a file called pyproject.toml, and

from setuptools import Extension, setup
setup(ext_modules=[Extension("custom", ["custom.c"])])

そして、シェルから以下のように入力します

$ python -m pip install .

in a shell should produce a file custom.so in a subdirectory and install it; now fire up Python --- you should be able to import custom and play around with Custom objects.

そんなにむずかしくありません、よね?

もちろん、現在の Custom 型は面白みに欠けています。何もデータを持っていないし、何もできません。継承してサブクラスを作ることさえできないのです。

2.2. 基本のサンプルにデータとメソッドを追加する

Let's extend the basic example to add some data and methods. Let's also make the type usable as a base class. We'll create a new module, custom2 that adds these capabilities:

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stddef.h> /* for offsetof() */

typedef struct {
    PyObject_HEAD
    PyObject *first; /* first name */
    PyObject *last;  /* last name */
    int number;
} CustomObject;

static void
Custom_dealloc(CustomObject *self)
{
    Py_XDECREF(self->first);
    Py_XDECREF(self->last);
    Py_TYPE(self)->tp_free((PyObject *) self);
}

static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    CustomObject *self;
    self = (CustomObject *) type->tp_alloc(type, 0);
    if (self != NULL) {
        self->first = PyUnicode_FromString("");
        if (self->first == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->last = PyUnicode_FromString("");
        if (self->last == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->number = 0;
    }
    return (PyObject *) self;
}

static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"first", "last", "number", NULL};
    PyObject *first = NULL, *last = NULL;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                     &first, &last,
                                     &self->number))
        return -1;

    if (first) {
        Py_XSETREF(self->first, Py_NewRef(first));
    }
    if (last) {
        Py_XSETREF(self->last, Py_NewRef(last));
    }
    return 0;
}

static PyMemberDef Custom_members[] = {
    {"first", Py_T_OBJECT_EX, offsetof(CustomObject, first), 0,
     "first name"},
    {"last", Py_T_OBJECT_EX, offsetof(CustomObject, last), 0,
     "last name"},
    {"number", Py_T_INT, offsetof(CustomObject, number), 0,
     "custom number"},
    {NULL}  /* Sentinel */
};

static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
    if (self->first == NULL) {
        PyErr_SetString(PyExc_AttributeError, "first");
        return NULL;
    }
    if (self->last == NULL) {
        PyErr_SetString(PyExc_AttributeError, "last");
        return NULL;
    }
    return PyUnicode_FromFormat("%S %S", self->first, self->last);
}

static PyMethodDef Custom_methods[] = {
    {"name", (PyCFunction) Custom_name, METH_NOARGS,
     "Return the name, combining the first and last name"
    },
    {NULL}  /* Sentinel */
};

static PyTypeObject CustomType = {
    .ob_base = PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom2.Custom",
    .tp_doc = PyDoc_STR("Custom objects"),
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
    .tp_new = Custom_new,
    .tp_init = (initproc) Custom_init,
    .tp_dealloc = (destructor) Custom_dealloc,
    .tp_members = Custom_members,
    .tp_methods = Custom_methods,
};

static PyModuleDef custommodule = {
    .m_base =PyModuleDef_HEAD_INIT,
    .m_name = "custom2",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

PyMODINIT_FUNC
PyInit_custom2(void)
{
    PyObject *m;
    if (PyType_Ready(&CustomType) < 0)
        return NULL;

    m = PyModule_Create(&custommodule);
    if (m == NULL)
        return NULL;

    if (PyModule_AddObjectRef(m, "Custom", (PyObject *) &CustomType) < 0) {
        Py_DECREF(m);
        return NULL;
    }

    return m;
}

このバージョンでは、いくつもの変更をおこないます。

The Custom type now has three data attributes in its C struct, first, last, and number. The first and last variables are Python strings containing first and last names. The number attribute is a C integer.

これにしたがうと、オブジェクトの構造体は次のようになります:

typedef struct {
    PyObject_HEAD
    PyObject *first; /* first name */
    PyObject *last;  /* last name */
    int number;
} CustomObject;

いまや管理すべきデータができたので、オブジェクトの割り当てと解放に際してはより慎重になる必要があります。最低限、オブジェクトの解放メソッドが必要です:

static void
Custom_dealloc(CustomObject *self)
{
    Py_XDECREF(self->first);
    Py_XDECREF(self->last);
    Py_TYPE(self)->tp_free((PyObject *) self);
}

この関数は tp_dealloc メンバに代入されます。

.tp_dealloc = (destructor) Custom_dealloc,

This method first clears the reference counts of the two Python attributes. Py_XDECREF() correctly handles the case where its argument is NULL (which might happen here if tp_new failed midway). It then calls the tp_free member of the object's type (computed by Py_TYPE(self)) to free the object's memory. Note that the object's type might not be CustomType, because the object may be an instance of a subclass.

注釈

上の destructor への明示的な型変換は必要です。なぜなら、 Custom_deallocCustomObject * 引数をとると定義しましたが、 tp_dealloc 関数のポインタは PyObject * 引数を受け取ることになっているからです。もし明示的に型変換をしなければ、コンパイラが警告を発するでしょう。これは、Cにおけるオブジェクト指向のポリモーフィズムです!

ファーストネームとラストネームを空文字列に初期化しておきたいので、tp_new の実装を追加することにしましょう:

static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    CustomObject *self;
    self = (CustomObject *) type->tp_alloc(type, 0);
    if (self != NULL) {
        self->first = PyUnicode_FromString("");
        if (self->first == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->last = PyUnicode_FromString("");
        if (self->last == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->number = 0;
    }
    return (PyObject *) self;
}

そしてこれを tp_new メンバとしてインストールします:

.tp_new = Custom_new,

The tp_new handler is responsible for creating (as opposed to initializing) objects of the type. It is exposed in Python as the __new__() method. It is not required to define a tp_new member, and indeed many extension types will simply reuse PyType_GenericNew() as done in the first version of the Custom type above. In this case, we use the tp_new handler to initialize the first and last attributes to non-NULL default values.

tp_new is passed the type being instantiated (not necessarily CustomType, if a subclass is instantiated) and any arguments passed when the type was called, and is expected to return the instance created. tp_new handlers always accept positional and keyword arguments, but they often ignore the arguments, leaving the argument handling to initializer (a.k.a. tp_init in C or __init__ in Python) methods.

注釈

tp_new は明示的に tp_init を呼び出してはいけません、これはインタープリタが自分で行うためです。

この tp_new の実装は、tp_alloc スロットを呼び出してメモリを割り当てます:

self = (CustomObject *) type->tp_alloc(type, 0);

Since memory allocation may fail, we must check the tp_alloc result against NULL before proceeding.

注釈

We didn't fill the tp_alloc slot ourselves. Rather PyType_Ready() fills it for us by inheriting it from our base class, which is object by default. Most types use the default allocation strategy.

注釈

If you are creating a co-operative tp_new (one that calls a base type's tp_new or __new__()), you must not try to determine what method to call using method resolution order at runtime. Always statically determine what type you are going to call, and call its tp_new directly, or via type->tp_base->tp_new. If you do not do this, Python subclasses of your type that also inherit from other Python-defined classes may not work correctly. (Specifically, you may not be able to create instances of such subclasses without getting a TypeError.)

We also define an initialization function which accepts arguments to provide initial values for our instance:

static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"first", "last", "number", NULL};
    PyObject *first = NULL, *last = NULL, *tmp;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                     &first, &last,
                                     &self->number))
        return -1;

    if (first) {
        tmp = self->first;
        Py_INCREF(first);
        self->first = first;
        Py_XDECREF(tmp);
    }
    if (last) {
        tmp = self->last;
        Py_INCREF(last);
        self->last = last;
        Py_XDECREF(tmp);
    }
    return 0;
}

これは tp_init メンバに代入されます。

.tp_init = (initproc) Custom_init,

The tp_init slot is exposed in Python as the __init__() method. It is used to initialize an object after it's created. Initializers always accept positional and keyword arguments, and they should return either 0 on success or -1 on error.

Unlike the tp_new handler, there is no guarantee that tp_init is called at all (for example, the pickle module by default doesn't call __init__() on unpickled instances). It can also be called multiple times. Anyone can call the __init__() method on our objects. For this reason, we have to be extra careful when assigning the new attribute values. We might be tempted, for example to assign the first member like this:

if (first) {
    Py_XDECREF(self->first);
    Py_INCREF(first);
    self->first = first;
}

But this would be risky. Our type doesn't restrict the type of the first member, so it could be any kind of object. It could have a destructor that causes code to be executed that tries to access the first member; or that destructor could release the Global interpreter Lock and let arbitrary code run in other threads that accesses and modifies our object.

To be paranoid and protect ourselves against this possibility, we almost always reassign members before decrementing their reference counts. When don't we have to do this?

  • その参照カウントが 1 より大きいと確信できる場合

  • when we know that deallocation of the object [1] will neither release the GIL nor cause any calls back into our type's code;

  • when decrementing a reference count in a tp_dealloc handler on a type which doesn't support cyclic garbage collection [2].

ここではインスタンス変数を属性として見えるようにしたいのですが、これにはいくつもの方法があります。もっとも簡単な方法は、メンバの定義を与えることです:

static PyMemberDef Custom_members[] = {
    {"first", Py_T_OBJECT_EX, offsetof(CustomObject, first), 0,
     "first name"},
    {"last", Py_T_OBJECT_EX, offsetof(CustomObject, last), 0,
     "last name"},
    {"number", Py_T_INT, offsetof(CustomObject, number), 0,
     "custom number"},
    {NULL}  /* Sentinel */
};

そして、この定義を tp_members スロットに入れましょう:

.tp_members = Custom_members,

Each member definition has a member name, type, offset, access flags and documentation string. See the 総称的な属性を管理する section below for details.

A disadvantage of this approach is that it doesn't provide a way to restrict the types of objects that can be assigned to the Python attributes. We expect the first and last names to be strings, but any Python objects can be assigned. Further, the attributes can be deleted, setting the C pointers to NULL. Even though we can make sure the members are initialized to non-NULL values, the members can be set to NULL if the attributes are deleted.

We define a single method, Custom.name(), that outputs the objects name as the concatenation of the first and last names.

static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
    if (self->first == NULL) {
        PyErr_SetString(PyExc_AttributeError, "first");
        return NULL;
    }
    if (self->last == NULL) {
        PyErr_SetString(PyExc_AttributeError, "last");
        return NULL;
    }
    return PyUnicode_FromFormat("%S %S", self->first, self->last);
}

The method is implemented as a C function that takes a Custom (or Custom subclass) instance as the first argument. Methods always take an instance as the first argument. Methods often take positional and keyword arguments as well, but in this case we don't take any and don't need to accept a positional argument tuple or keyword argument dictionary. This method is equivalent to the Python method:

def name(self):
    return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our first and last members are NULL. This is because they can be deleted, in which case they are set to NULL. It would be better to prevent deletion of these attributes and to restrict the attribute values to be strings. We'll see how to do that in the next section.

さて、メソッドを定義したので、ここでメソッド定義用の配列を作成する必要があります:

static PyMethodDef Custom_methods[] = {
    {"name", (PyCFunction) Custom_name, METH_NOARGS,
     "Return the name, combining the first and last name"
    },
    {NULL}  /* Sentinel */
};

(note that we used the METH_NOARGS flag to indicate that the method is expecting no arguments other than self)

and assign it to the tp_methods slot:

.tp_methods = Custom_methods,

Finally, we'll make our type usable as a base class for subclassing. We've written our methods carefully so far so that they don't make any assumptions about the type of the object being created or used, so all we need to do is to add the Py_TPFLAGS_BASETYPE to our class flag definition:

.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,

We rename PyInit_custom() to PyInit_custom2(), update the module name in the PyModuleDef struct, and update the full class name in the PyTypeObject struct.

Finally, we update our setup.py file to include the new module,

from setuptools import Extension, setup
setup(ext_modules=[
    Extension("custom", ["custom.c"]),
    Extension("custom2", ["custom2.c"]),
])

and then we re-install so that we can import custom2:

$ python -m pip install .

2.3. データ属性をこまかく制御する

In this section, we'll provide finer control over how the first and last attributes are set in the Custom example. In the previous version of our module, the instance variables first and last could be set to non-string values or even deleted. We want to make sure that these attributes always contain strings.

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stddef.h> /* for offsetof() */

typedef struct {
    PyObject_HEAD
    PyObject *first; /* first name */
    PyObject *last;  /* last name */
    int number;
} CustomObject;

static void
Custom_dealloc(CustomObject *self)
{
    Py_XDECREF(self->first);
    Py_XDECREF(self->last);
    Py_TYPE(self)->tp_free((PyObject *) self);
}

static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    CustomObject *self;
    self = (CustomObject *) type->tp_alloc(type, 0);
    if (self != NULL) {
        self->first = PyUnicode_FromString("");
        if (self->first == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->last = PyUnicode_FromString("");
        if (self->last == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->number = 0;
    }
    return (PyObject *) self;
}

static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"first", "last", "number", NULL};
    PyObject *first = NULL, *last = NULL;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                     &first, &last,
                                     &self->number))
        return -1;

    if (first) {
        Py_SETREF(self->first, Py_NewRef(first));
    }
    if (last) {
        Py_SETREF(self->last, Py_NewRef(last));
    }
    return 0;
}

static PyMemberDef Custom_members[] = {
    {"number", Py_T_INT, offsetof(CustomObject, number), 0,
     "custom number"},
    {NULL}  /* Sentinel */
};

static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
    return Py_NewRef(self->first);
}

static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
        return -1;
    }
    if (!PyUnicode_Check(value)) {
        PyErr_SetString(PyExc_TypeError,
                        "The first attribute value must be a string");
        return -1;
    }
    Py_SETREF(self->first, Py_NewRef(value));
    return 0;
}

static PyObject *
Custom_getlast(CustomObject *self, void *closure)
{
    return Py_NewRef(self->last);
}

static int
Custom_setlast(CustomObject *self, PyObject *value, void *closure)
{
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute");
        return -1;
    }
    if (!PyUnicode_Check(value)) {
        PyErr_SetString(PyExc_TypeError,
                        "The last attribute value must be a string");
        return -1;
    }
    Py_SETREF(self->last, Py_NewRef(value));
    return 0;
}

static PyGetSetDef Custom_getsetters[] = {
    {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
     "first name", NULL},
    {"last", (getter) Custom_getlast, (setter) Custom_setlast,
     "last name", NULL},
    {NULL}  /* Sentinel */
};

static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
    return PyUnicode_FromFormat("%S %S", self->first, self->last);
}

static PyMethodDef Custom_methods[] = {
    {"name", (PyCFunction) Custom_name, METH_NOARGS,
     "Return the name, combining the first and last name"
    },
    {NULL}  /* Sentinel */
};

static PyTypeObject CustomType = {
    .ob_base = PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom3.Custom",
    .tp_doc = PyDoc_STR("Custom objects"),
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
    .tp_new = Custom_new,
    .tp_init = (initproc) Custom_init,
    .tp_dealloc = (destructor) Custom_dealloc,
    .tp_members = Custom_members,
    .tp_methods = Custom_methods,
    .tp_getset = Custom_getsetters,
};

static PyModuleDef custommodule = {
    .m_base = PyModuleDef_HEAD_INIT,
    .m_name = "custom3",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

PyMODINIT_FUNC
PyInit_custom3(void)
{
    PyObject *m;
    if (PyType_Ready(&CustomType) < 0)
        return NULL;

    m = PyModule_Create(&custommodule);
    if (m == NULL)
        return NULL;

    if (PyModule_AddObjectRef(m, "Custom", (PyObject *) &CustomType) < 0) {
        Py_DECREF(m);
        return NULL;
    }

    return m;
}

To provide greater control, over the first and last attributes, we'll use custom getter and setter functions. Here are the functions for getting and setting the first attribute:

static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
    Py_INCREF(self->first);
    return self->first;
}

static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
    PyObject *tmp;
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
        return -1;
    }
    if (!PyUnicode_Check(value)) {
        PyErr_SetString(PyExc_TypeError,
                        "The first attribute value must be a string");
        return -1;
    }
    tmp = self->first;
    Py_INCREF(value);
    self->first = value;
    Py_DECREF(tmp);
    return 0;
}

The getter function is passed a Custom object and a "closure", which is a void pointer. In this case, the closure is ignored. (The closure supports an advanced usage in which definition data is passed to the getter and setter. This could, for example, be used to allow a single set of getter and setter functions that decide the attribute to get or set based on data in the closure.)

The setter function is passed the Custom object, the new value, and the closure. The new value may be NULL, in which case the attribute is being deleted. In our setter, we raise an error if the attribute is deleted or if its new value is not a string.

ここでは PyGetSetDef 構造体の配列をつくります:

static PyGetSetDef Custom_getsetters[] = {
    {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
     "first name", NULL},
    {"last", (getter) Custom_getlast, (setter) Custom_setlast,
     "last name", NULL},
    {NULL}  /* Sentinel */
};

そしてこれを tp_getset スロットに登録します:

.tp_getset = Custom_getsetters,

The last item in a PyGetSetDef structure is the "closure" mentioned above. In this case, we aren't using a closure, so we just pass NULL.

また、メンバ定義からはこれらの属性を除いておきましょう:

static PyMemberDef Custom_members[] = {
    {"number", Py_T_INT, offsetof(CustomObject, number), 0,
     "custom number"},
    {NULL}  /* Sentinel */
};

また、ここでは tp_init ハンドラも渡されるものとして文字列のみを許可するように修正する必要があります [3]:

static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"first", "last", "number", NULL};
    PyObject *first = NULL, *last = NULL, *tmp;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                     &first, &last,
                                     &self->number))
        return -1;

    if (first) {
        tmp = self->first;
        Py_INCREF(first);
        self->first = first;
        Py_DECREF(tmp);
    }
    if (last) {
        tmp = self->last;
        Py_INCREF(last);
        self->last = last;
        Py_DECREF(tmp);
    }
    return 0;
}

With these changes, we can assure that the first and last members are never NULL so we can remove checks for NULL values in almost all cases. This means that most of the Py_XDECREF() calls can be converted to Py_DECREF() calls. The only place we can't change these calls is in the tp_dealloc implementation, where there is the possibility that the initialization of these members failed in tp_new.

さて、先ほどもしたように、このモジュール初期化関数と初期化関数内にあるモジュール名を変更しましょう。そして setup.py ファイルに追加の定義をくわえます。

2.4. 循環ガベージコレクションをサポートする

Python は 循環ガベージコレクタ (GC) 機能 をもっており、これは不要なオブジェクトを、たとえ参照カウントがゼロでなくても発見することができます。そのような状況はオブジェクトの参照が循環しているときに起こりえます。たとえば以下の例を考えてください:

>>> l = []
>>> l.append(l)
>>> del l

この例では、自分自身をふくむリストを作りました。たとえこのリストを 削除しても、それは自分自身への参照をまだ持ちつづけますから、参照カウントはゼロにはなりません。嬉しいことに Python には循環ガベージコレクタは最終的にはこのリストが不要であることを検出し、解放できます。

In the second version of the Custom example, we allowed any kind of object to be stored in the first or last attributes [4]. Besides, in the second and third versions, we allowed subclassing Custom, and subclasses may add arbitrary attributes. For any of those two reasons, Custom objects can participate in cycles:

>>> import custom3
>>> class Derived(custom3.Custom): pass
...
>>> n = Derived()
>>> n.some_attribute = n

To allow a Custom instance participating in a reference cycle to be properly detected and collected by the cyclic GC, our Custom type needs to fill two additional slots and to enable a flag that enables these slots:

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stddef.h> /* for offsetof() */

typedef struct {
    PyObject_HEAD
    PyObject *first; /* first name */
    PyObject *last;  /* last name */
    int number;
} CustomObject;

static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
    Py_VISIT(self->first);
    Py_VISIT(self->last);
    return 0;
}

static int
Custom_clear(CustomObject *self)
{
    Py_CLEAR(self->first);
    Py_CLEAR(self->last);
    return 0;
}

static void
Custom_dealloc(CustomObject *self)
{
    PyObject_GC_UnTrack(self);
    Custom_clear(self);
    Py_TYPE(self)->tp_free((PyObject *) self);
}

static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    CustomObject *self;
    self = (CustomObject *) type->tp_alloc(type, 0);
    if (self != NULL) {
        self->first = PyUnicode_FromString("");
        if (self->first == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->last = PyUnicode_FromString("");
        if (self->last == NULL) {
            Py_DECREF(self);
            return NULL;
        }
        self->number = 0;
    }
    return (PyObject *) self;
}

static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"first", "last", "number", NULL};
    PyObject *first = NULL, *last = NULL;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                     &first, &last,
                                     &self->number))
        return -1;

    if (first) {
        Py_SETREF(self->first, Py_NewRef(first));
    }
    if (last) {
        Py_SETREF(self->last, Py_NewRef(last));
    }
    return 0;
}

static PyMemberDef Custom_members[] = {
    {"number", Py_T_INT, offsetof(CustomObject, number), 0,
     "custom number"},
    {NULL}  /* Sentinel */
};

static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
    return Py_NewRef(self->first);
}

static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
        return -1;
    }
    if (!PyUnicode_Check(value)) {
        PyErr_SetString(PyExc_TypeError,
                        "The first attribute value must be a string");
        return -1;
    }
    Py_XSETREF(self->first, Py_NewRef(value));
    return 0;
}

static PyObject *
Custom_getlast(CustomObject *self, void *closure)
{
    return Py_NewRef(self->last);
}

static int
Custom_setlast(CustomObject *self, PyObject *value, void *closure)
{
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute");
        return -1;
    }
    if (!PyUnicode_Check(value)) {
        PyErr_SetString(PyExc_TypeError,
                        "The last attribute value must be a string");
        return -1;
    }
    Py_XSETREF(self->last, Py_NewRef(value));
    return 0;
}

static PyGetSetDef Custom_getsetters[] = {
    {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
     "first name", NULL},
    {"last", (getter) Custom_getlast, (setter) Custom_setlast,
     "last name", NULL},
    {NULL}  /* Sentinel */
};

static PyObject *
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
{
    return PyUnicode_FromFormat("%S %S", self->first, self->last);
}

static PyMethodDef Custom_methods[] = {
    {"name", (PyCFunction) Custom_name, METH_NOARGS,
     "Return the name, combining the first and last name"
    },
    {NULL}  /* Sentinel */
};

static PyTypeObject CustomType = {
    .ob_base = PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom4.Custom",
    .tp_doc = PyDoc_STR("Custom objects"),
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,
    .tp_new = Custom_new,
    .tp_init = (initproc) Custom_init,
    .tp_dealloc = (destructor) Custom_dealloc,
    .tp_traverse = (traverseproc) Custom_traverse,
    .tp_clear = (inquiry) Custom_clear,
    .tp_members = Custom_members,
    .tp_methods = Custom_methods,
    .tp_getset = Custom_getsetters,
};

static PyModuleDef custommodule = {
    .m_base = PyModuleDef_HEAD_INIT,
    .m_name = "custom4",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

PyMODINIT_FUNC
PyInit_custom4(void)
{
    PyObject *m;
    if (PyType_Ready(&CustomType) < 0)
        return NULL;

    m = PyModule_Create(&custommodule);
    if (m == NULL)
        return NULL;

    if (PyModule_AddObjectRef(m, "Custom", (PyObject *) &CustomType) < 0) {
        Py_DECREF(m);
        return NULL;
    }

    return m;
}

First, the traversal method lets the cyclic GC know about subobjects that could participate in cycles:

static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
    int vret;
    if (self->first) {
        vret = visit(self->first, arg);
        if (vret != 0)
            return vret;
    }
    if (self->last) {
        vret = visit(self->last, arg);
        if (vret != 0)
            return vret;
    }
    return 0;
}

For each subobject that can participate in cycles, we need to call the visit() function, which is passed to the traversal method. The visit() function takes as arguments the subobject and the extra argument arg passed to the traversal method. It returns an integer value that must be returned if it is non-zero.

Python provides a Py_VISIT() macro that automates calling visit functions. With Py_VISIT(), we can minimize the amount of boilerplate in Custom_traverse:

static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
    Py_VISIT(self->first);
    Py_VISIT(self->last);
    return 0;
}

注釈

The tp_traverse implementation must name its arguments exactly visit and arg in order to use Py_VISIT().

Second, we need to provide a method for clearing any subobjects that can participate in cycles:

static int
Custom_clear(CustomObject *self)
{
    Py_CLEAR(self->first);
    Py_CLEAR(self->last);
    return 0;
}

Notice the use of the Py_CLEAR() macro. It is the recommended and safe way to clear data attributes of arbitrary types while decrementing their reference counts. If you were to call Py_XDECREF() instead on the attribute before setting it to NULL, there is a possibility that the attribute's destructor would call back into code that reads the attribute again (especially if there is a reference cycle).

注釈

You could emulate Py_CLEAR() by writing:

PyObject *tmp;
tmp = self->first;
self->first = NULL;
Py_XDECREF(tmp);

Nevertheless, it is much easier and less error-prone to always use Py_CLEAR() when deleting an attribute. Don't try to micro-optimize at the expense of robustness!

The deallocator Custom_dealloc may call arbitrary code when clearing attributes. It means the circular GC can be triggered inside the function. Since the GC assumes reference count is not zero, we need to untrack the object from the GC by calling PyObject_GC_UnTrack() before clearing members. Here is our reimplemented deallocator using PyObject_GC_UnTrack() and Custom_clear:

static void
Custom_dealloc(CustomObject *self)
{
    PyObject_GC_UnTrack(self);
    Custom_clear(self);
    Py_TYPE(self)->tp_free((PyObject *) self);
}

Finally, we add the Py_TPFLAGS_HAVE_GC flag to the class flags:

.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,

これで完了です。 tp_alloc スロットまたは tp_free ハンドラが書かれていれば、それらを循環ガベージコレクションに使えるよう修正すればよいのです。ほとんどの拡張機能は自動的に提供されるバージョンを使うでしょう。

2.5. 他の型のサブクラスを作る

既存の型を継承した新しい拡張型を作成することができます。組み込み型から継承するのは特に簡単です。必要な PyTypeObject を簡単に利用できるからです。それに比べて、 PyTypeObject 構造体を拡張モジュール間で共有するのは難しいです。

In this example we will create a SubList type that inherits from the built-in list type. The new type will be completely compatible with regular lists, but will have an additional increment() method that increases an internal counter:

>>> import sublist
>>> s = sublist.SubList(range(3))
>>> s.extend(s)
>>> print(len(s))
6
>>> print(s.increment())
1
>>> print(s.increment())
2
#define PY_SSIZE_T_CLEAN
#include <Python.h>

typedef struct {
    PyListObject list;
    int state;
} SubListObject;

static PyObject *
SubList_increment(SubListObject *self, PyObject *unused)
{
    self->state++;
    return PyLong_FromLong(self->state);
}

static PyMethodDef SubList_methods[] = {
    {"increment", (PyCFunction) SubList_increment, METH_NOARGS,
     PyDoc_STR("increment state counter")},
    {NULL},
};

static int
SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
{
    if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
        return -1;
    self->state = 0;
    return 0;
}

static PyTypeObject SubListType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "sublist.SubList",
    .tp_doc = PyDoc_STR("SubList objects"),
    .tp_basicsize = sizeof(SubListObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
    .tp_init = (initproc) SubList_init,
    .tp_methods = SubList_methods,
};

static PyModuleDef sublistmodule = {
    PyModuleDef_HEAD_INIT,
    .m_name = "sublist",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

PyMODINIT_FUNC
PyInit_sublist(void)
{
    PyObject *m;
    SubListType.tp_base = &PyList_Type;
    if (PyType_Ready(&SubListType) < 0)
        return NULL;

    m = PyModule_Create(&sublistmodule);
    if (m == NULL)
        return NULL;

    Py_INCREF(&SubListType);
    if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
        Py_DECREF(&SubListType);
        Py_DECREF(m);
        return NULL;
    }

    return m;
}

As you can see, the source code closely resembles the Custom examples in previous sections. We will break down the main differences between them.

typedef struct {
    PyListObject list;
    int state;
} SubListObject;

The primary difference for derived type objects is that the base type's object structure must be the first value. The base type will already include the PyObject_HEAD() at the beginning of its structure.

When a Python object is a SubList instance, its PyObject * pointer can be safely cast to both PyListObject * and SubListObject *:

static int
SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
{
    if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
        return -1;
    self->state = 0;
    return 0;
}

We see above how to call through to the __init__() method of the base type.

This pattern is important when writing a type with custom tp_new and tp_dealloc members. The tp_new handler should not actually create the memory for the object with its tp_alloc, but let the base class handle it by calling its own tp_new.

The PyTypeObject struct supports a tp_base specifying the type's concrete base class. Due to cross-platform compiler issues, you can't fill that field directly with a reference to PyList_Type; it should be done later in the module initialization function:

PyMODINIT_FUNC
PyInit_sublist(void)
{
    PyObject* m;
    SubListType.tp_base = &PyList_Type;
    if (PyType_Ready(&SubListType) < 0)
        return NULL;

    m = PyModule_Create(&sublistmodule);
    if (m == NULL)
        return NULL;

    Py_INCREF(&SubListType);
    if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
        Py_DECREF(&SubListType);
        Py_DECREF(m);
        return NULL;
    }

    return m;
}

PyType_Read() を呼ぶ前に、型の構造体の tp_base スロットは埋められていなければなりません。既存の型を継承する際には、 tp_alloc スロットを PyType_GenericNew() で埋める必要はありません。 -- 基底型のアロケーション関数が継承されます。

After that, calling PyType_Ready() and adding the type object to the module is the same as with the basic Custom examples.

脚注