3. 数据类型

3.1. 对象、值与类型

对象 是 Python 中对数据的抽象。Python 程序中的所有数据都是由对象或对象间关系来表示的。(从某种意义上说,按照冯·诺依曼的 “存储程序计算机” 模型,代码本身也是由对象来表示的。)

每个对象都有各自的编号、类型和值。一个对象被创建后,它的 编号 就绝不会改变;你可以将其理解为该对象在内存中的地址。 ‘is’ 运算符可以比较两个对象的编号是否相同;id() 函数能返回一个代表其编号的整型数。

CPython implementation detail: 在 CPython 中,id(x) 就是存放 x 的内存的地址。

对象的类型决定该对象所支持的操作 (例如 “对象是否有长度属性?”) 并且定义了该类型的对象可能的取值。type() 函数能返回一个对象的类型 (类型本身也是对象)。与编号一样,一个对象的 类型 也是不可改变的。[1]

有些对象的 可以改变。值可以改变的对象被称为 可变的;值不可以改变的对象就被称为 不可变的。(一个不可变容器对象如果包含对可变对象的引用,当后者的值改变时,前者的值也会改变;但是该容器仍属于不可变对象,因为它所包含的对象集是不会改变的。因此,不可变并不严格等同于值不能改变,实际含义要更微妙。) 一个对象的可变性是由其类型决定的;例如,数字、字符串和元组是不可变的,而字典和列表是可变的。

对象绝不会被显式地销毁;然而,当无法访问时它们可能会被作为垃圾回收。允许具体的实现推迟垃圾回收或完全省略此机制 — 如何实现垃圾回收是实现的质量问题,只要可访问的对象不会被回收即可。

CPython implementation detail: CPython 目前使用带有 (可选) 延迟检测循环链接垃圾的引用计数方案,会在对象不可访问时立即回收其中的大部分,但不保证回收包含循环引用的垃圾。请查看 gc 模块的文档了解如何控制循环垃圾的收集相关信息。其他实现会有不同的行为方式,CPython 现有方式也可能改变。不要依赖不可访问对象的立即终结机制 (所以你应当总是显式地关闭文件)。

注意:使用实现的跟踪或调试功能可能令正常情况下会被回收的对象继续存活。还要注意通过 ‘tryexcept’ 语句捕捉异常也可能令对象保持存活。

有些对象包含对 “外部” 资源的引用,例如打开文件或窗口。当对象被作为垃圾回收时这些资源也应该会被释放,但由于垃圾回收并不确保发生,这些对象还提供了明确地释放外部资源的操作,通常为一个 close() 方法。强烈推荐在程序中显式关闭此类对象。’tryfinally’ 语句和 ‘with’ 语句提供了进行此种操作的更便捷方式。

有些对象包含对其他对象的引用;它们被称为 容器。容器的例子有元组、列表和字典等。这些引用是容器对象值的组成部分。在多数情况下,当谈论一个容器的值时,我们是指所包含对象的值而不是其编号;但是,当我们谈论一个容器的可变性时,则仅指其直接包含的对象的编号。因此,如果一个不可变容器 (例如元组) 包含对一个可变对象的引用,则当该可变对象被改变时容器的值也会改变。

类型会影响对象行为的几乎所有方面。甚至对象编号的重要性也在某种程度上受到影响: 对于不可变类型,会得出新值的运算实际上会返回对相同类型和取值的任一现有对象的引用,而对于可变类型来说这是不允许的。例如在 a = 1; b = 1 之后,ab 可能会也可能不会指向同一个值为一的对象,这取决于具体实现,但是在 c = []; d = [] 之后,cd 保证会指向两个不同、单独的新建空列表。(请注意 c = d = [] 则是将同一个对象赋值给 cd。)

3.2. 标准类型等级结构

以下是 Python 内置类型的列表。扩展模块 (具体实现会以 C, Java 或其他语言编写) 可以定义更多的类型。未来版本的 Python 可能会加入更多的类型 (例如有理数、高效存储的整型数组等等),不过新增类型往往都是通过标准库来提供的。

以下部分类型的描述中包含有 ‘特殊属性列表’ 段落。这些属性提供对具体实现的访问而非通常使用。它们的定义在未来可能会改变。

None

此类型只有一种取值。是一个具有此值的单独对象。此对象通过内置名称 None 访问。在许多情况下它被用来表示空值,例如未显式指明返回值的函数将返回 None。它的逻辑值为假。

NotImplemented

此类型只有一种取值。是一个具有此值的单独对象。此对象通过内置名称 NotImplemented 访问。数值方法和丰富比较方法如未实现指定运算符表示的运算则应返回此值。(解释器会根据指定运算符继续尝试反向运算或其他回退操作)。它的逻辑值为真。

详情参见 Implementing the arithmetic operations

Ellipsis

此类型只有一种取值。是一个具有此值的单独对象。此对象通过字面值 ... 或内置名称 Ellipsis 访问。它的逻辑值为真。

numbers.Number

此类对象由数字字面值创建,并会被作为算术运算符和算术内置函数的返回结果。数字对象是不可变的;一旦创建其值就不再改变。Python 中的数字当然非常类似数学中的数字,但也受限于计算机中的数字表示方法。

Python 区分整型数、浮点型数和复数:

numbers.Integral

此类对象表示数学中整数集合的成员 (包括正数和负数)。

整型数可细分为两种类型:

整型 (int)

此类对象表示任意大小的数字,仅受限于可用的内存 (包括虚拟内存)。在变换和掩码运算中会以二进制表示,负数会以 2 的补码表示,看起来像是符号位向左延伸补满空位。
布尔型 (bool)

此类对象表示逻辑值 False 和 True。代表 FalseTrue 值的两个对象是唯二的布尔对象。布尔类型是整型的子类型,两个布尔值在各种场合的行为分别类似于数值 0 和 1,例外情况只有在转换为字符串时分别返回字符串 "False""True"

整型数表示规则的目的是在涉及负整型数的变换和掩码运算时提供最为合理的解释。

numbers.Real (float)

此类对象表示机器级的双精度浮点数。其所接受的取值范围和溢出处理将受制于底层的机器架构 (以及 C 或 Java 实现)。Python 不支持单精度浮点数;支持后者通常的理由是节省处理器和内存消耗,但这会妨碍在 Python 中使用对象,因此没有理由包含两种浮点数而令该语言变得复杂。

numbers.Complex (complex)

此类对象以一对机器级的双精度浮点数来表示复数值。有关浮点数的附带规则对其同样有效。一个复数值 z 的实部和虚部可通过只读属性 z.realz.imag 来获取。

序列

此类对象表示以非负整数作为索引的有限有序集。内置函数 len() 可返回一个序列的条目数量。当一个序列的长度为 n 时,索引集包含数字 0, 1, …, n-1。序列 a 的条目 i 可通过 a[i] 选择。

序列还支持切片: a[i:j] 选择索引号为 k 的所有条目,i <= k < j。当用作表达式时,序列的切片就是一个与序列类型相同的新序列。新序列的索引还是从 0 开始。

有些序列还支持带有第三个 “step” 形参的 “扩展切片”: a[i:j:k] 选择 a 中索引号为 x 的所有条目,x = i + n*k, n >= 0i <= x < j

序列可根据其可变性来加以区分:

不可变序列

不可变序列类型的对象一旦创建就不能再改变。(如果对象包含对其他对象的引用,其中的可变对象就是可以改变的;但是,一个不可变对象所直接引用的对象集是不能改变的。)

以下类型属于不可变对象:

字符串

字符串是由 Unicode 码位值组成的序列。范围在 U+0000 - U+10FFFF 之内的所有码位值都可在字符串中使用。Python 没有 char 类型;而是将字符串中的每个码位表示为一个长度为 1 的字符串对象。内置函数 ord() 可将一个码位由字符串形式转换成一个范围在 0 - 10FFFF 之内的整型数;chr() 可将一个范围在 0 - 10FFFF 之内的整型数转换为长度为 1 的对应字符串对象。str.encode() 可以使用指定的文本编码将 str 转换为 bytes,而 bytes.decode() 则可以实现反向的解码。

元组

一个元组中的条目可以是任意 Python 对象。包含两个或以上条目的元组由逗号分隔的表达式构成。只有一个条目的元组 (‘单项元组’) 可通过在表达式后加一个逗号来构成 (一个表达式本身不能创建为元组,因为圆括号要用来设置表达式分组)。一个空元组可通过一对内容为空的圆括号创建。

字节串

字节串对象是不可变的数组。其中每个条目都是一个 8 位字节,以取值范围 0 <= x < 256 的整型数表示。字节串字面值 (例如 b'abc') 和内置的 bytes() 构造器可被用来创建字节串对象。字节串对象还可以通过 decode() 方法解码为字符串。

可变序列

可变序列在被创建后仍可被改变。下标和切片标注可被用作赋值和 del (删除) 语句的目标。

目前有两种内生可变序列类型:

列表

列表中的条目可以是任意 Python 对象。列表由用方括号括起并由逗号分隔的多个表达式构成。(注意创建长度为 0 或 1 的列表无需使用特殊规则。)

字节数组

字节数组对象属于可变数组。可以通过内置的 bytearray() 构造器来创建。除了是可变的 (因而也是不可哈希的),在其他方面字节数组提供的接口和功能都于不可变的 bytes 对象一致。

扩展模块 array 提供了一个额外的可变序列类型示例,collections 模块也是如此。

集合类型

此类对象表示由不重复且不可变对象组成的无序且有限的集合。因此它们不能通过下标来索引。但是它们可被迭代,也可用内置函数 len() 返回集合中的条目数。集合常见的用处是快速成员检测,去除序列中的重复项,以及进行交、并、差和对称差等数学运算。

对于集合元素所采用的不可变规则与字典的键相同。注意数字类型遵循正常的数字比较规则: 如果两个数字相等 (例如 11.0),则同一集合中只能包含其中一个。

目前有两种内生集合类型:

集合

此类对象表示可变集合。它们可通过内置的 set() 构造器创建,并且创建之后可以通过方法进行修改,例如 add()

冻结集合

此类对象表示不可变集合。它们可通过内置的 frozenset() 构造器创建。由于 frozenset 对象不可变且 可哈希,它可以被用作另一个集合的元素或是字典的键。

映射

此类对象表示由任意索引集合所索引的对象的集合。通过下标 a[k] 可在映射 a 中选择索引为 k 的条目;这可以在表达式中使用,也可作为赋值或 del 语句的目标。内置函数 len() 可返回一个映射中的条目数。

目前只有一种内生映射类型:

字典

此类对象表示由几乎任意值作为索引的有限个对象的集合。不可作为键的值类型只有包含列表或字典或其他可变类型,通过值而非对象编号进行比较的值,其原因在于高效的字典实现需要使用键的哈希值以保持一致性。用作键的数字类型遵循正常的数字比较规则: 如果两个数字相等 (例如 11.0) 则它们均可来用来索引同一个字典条目。

字典是可变的;它们可通过 {...} 标注来创建 (参见 Dictionary displays 小节)。

扩展模块 dbm.ndbmdbm.gnu 提供了额外的映射类型示例,collections 模块也是如此。

可调用类型

此类型可以被应用于函数调用操作 (参见 调用 小节):

用户定义函数

用户定义函数对象可通过函数定义来创建 (参见 Function definitions 小节)。它被调用时应附带一个参数列表,其中包含的条目应与函数所定义的形参列表一致。

特殊属性:

属性 意义  
__doc__ 该函数的文档字符串,没有则为 None;不会被子类所继承 可写
__name__ 该函数的名称 可写
__qualname__

该函数的 限定名称

3.3 新版功能.

可写
__module__ 该函数所属模块的名称,没有则为 None 可写
__defaults__ 由具有默认值的参数的默认参数值组成的元组,如无任何参数具有默认值则为 None 可写
__code__ 表示编译后的函数体的代码对象。 可写
__globals__ 对存放该函数中全局变量的字典的引用 — 函数所属模块的全局命名空间。 只读
__dict__ 命名空间支持的函数属性。 可写
__closure__ None or a tuple of cells that contain bindings for the function’s free variables. 只读
__annotations__ 包含参数标注的字典。字典的键是参数名,如存在返回标注则为 'return' 可写
__kwdefaults__ 仅包含关键字参数默认值的字典。 可写

大部分标有 “Writable” 的属性均会检查赋值的类型。

函数对象也支持获取和设置任意属性,例如这可以被用来给函数附加元数据。使用正规的属性点号标注获取和设置此类属性。注意当前实现仅支持用户定义函数属性。未来可能会增加支持内置函数属性。

有关函数定义的额外信息可以从其代码对象中提取;参见下文对内部类型的描述。

实例方法

实例方法用于结合类、类实例和任何可调用对象 (通常为用户定义函数)。

特殊的只读属性: __self__ 为类实例对象本身,__func__ 为函数对象;__doc__ 为方法的文档 (与 __func__.__doc__ 作用相同);__name__ 为方法名称 (与 __func__.__name__ 作用相同);__module__ 为方法所属模块的名称,没有则为 None

方法还支持获取 (但不能设置) 下层函数对象的任意函数属性。

用户定义方法对象可在获取一个类的属性时被创建 (也可能通过该类的一个实例),如果该属性为用户定义函数对象或类方法对象。

当通过从类实例获取一个用户定义函数对象的方式创建一个实例方法对象时,类实例对象的 __self__ 属性即为该实例,并会绑定方法对象。该新建方法的 __func__ 属性就是原来的函数对象。

当通过从类或实例获取另一个方法对象的方式创建一个用户定义方法对象时,其行为将等同于一个函数对象,例外的只有新实例的 __func__ 属性将不是原来的方法对象,而是其 __func__ 属性。

当通过从类或实例获取一个类方法对象的方式创建一个实例对象时,实例对象的 __self__ 属性为该类本身,其 __func__ 属性为类方法对应的下层函数对象。

当一个实例方法对象被调用时,会调用对应的下层函数 (__func__),并将类实例 (__self__) 插入参数列表的开头。例如,当 C 是一个包含了 f() 函数定义的类,而 xC 的一个实例,则调用 x.f(1) 就等同于调用 C.f(x, 1)

当一个实例方法对象是衍生自一个类方法对象时,保存在 __self__ 中的 “类实例” 实际上会是该类本身,因此无论是调用 x.f(1) 还是 C.f(1) 都等同于调用 f(C,1),其中 f 为对应的下层函数。

请注意从函数对象到实例方法对象的变换会在每一次从实例获取属性时发生。在某些情况下,一种高效的优化方式是将属性赋值给一个本地变量并调用该本地变量。还要注意这样的变换只发生于用户定义函数;其他可调用对象 (以及所有不可调用对象) 在被获取时都不会发生变换。还有一个需要关注的要点是作为一个类实例属性的用户定义函数不会被转换为绑定方法;这样的变换 仅当 函数是类属性时才会发生。

生成器函数

使用了 yield 语句 (参见 The yield statement 一节) 的函数或方法就被称为 生成器函数。这样的函数在被调用时总是返回一个迭代器对象并可被用来执行函数体: 调用迭代器的 iterator.__next__() 方法将使得函数继续执行直到其用 yield 语句返回一个值。当函数执行到 return 语句或是最后一条语句时,将会引发 StopIteration 异常,迭代器也会到达要返回的值集合的末尾。

协程函数

使用 async def 来定义的函数或方法就被称为 协程函数。这样的函数在被调用时会返回一个 协程 对象。它可能包含 await 表达式以及 async withasync for 语句。详情可参见 协程对象 一节。

异步生成器函数

使用 async def 来定义并包含 yield 语句的函数或方法就被称为 异步生成器函数。这样的函数在被调用时会返回一个异步迭代器对象,该对象可在 async for 语句中用来执行函数体。

调用异步迭代器的 aiterator.__anext__() 方法将会返回一个 可等待对象,此对象会在被等待时执行直到使用 yield 表达式输出一个值。当函数执行时到空的 return 语句或是最后一条语句时,将会引发 StopAsyncIteration 异常,异步迭代器也会到达要输出的值集合的末尾。

内置函数

内置函数对象是对于 C 函数的外部封装。内置函数的例子包括 len()math.sin() (math 是一个标准内置模块)。内置函数参数的数量和类型由 C 函数决定。特殊的只读属性: __doc__ 是函数的文档字符串,如果没有则为 None; __name__ 是函数的名称; __self__ 设定为 None (参见下一条目); __module__ 是函数所属模块的名称,如果没有则为 None

内置方法

此类型实际上是内置函数的另一种形式,只不过还包含了一个传入 C 函数的对象作为隐式的额外参数。内置方法的一个例子是 alist.append(),其中 alist 为一个列表对象。在此示例中,特殊的只读属性 __self__ 会被设为 alist 所标记的对象。

类是可调用的。此种对象通常是作为“工厂”来创建自身的实例,类也可以有重载 __new__() 的变体类型。调用的参数会传给 __new__(),而且通常也会传给 __init__() 来初始化新的实例。
类实例
任意类的实例通过在所属类中定义 __call__() 方法即能成为可调用的对象。
模块

模块是 Python 代码的基本组织单位,模块是由 导入系统 通过使用用 import 语句 (参见 import) 或调用 importlib.import_module() 以及内置的 __import__() 等函数来创建的。每个模块对象都有一个独立的命名空间,它是通过一个字典对象实现的 (就是由在模块中定义的函数的 __globals__ 属性所引用的字典)。属性引用会被转化为在此字典中查找,例如 m.x 等同于 m.__dict__["x"]。一个模块对象不包含用来初始化该模块的代码对象 (因为这种对象在初始化完成后就不再需要)。

属性赋值会更新模块的命名空间字典,例如 m.x = 1 等同于 m.__dict__["x"] = 1

预定义的 (可写) 属性: __name__ 为模块的名称; __doc__ 为模块的文档字符串,如果没有则为 None; __annotations__ (可选) 为一个包含 变量标注 的字典,它是在模块体执行时获取的; __file__ 是模块对应的被加载文件的路径名,如果它是加载自一个文件的话。某些类型的模块可能没有 __file__ 属性,例如 C 模块是静态链接到解释器内部的; 对于从一个共享库动态加载的扩展模块来说该属性为该共享库文件的路径名。

特殊的只读属性: __dict__ 为以字典对象表示的模块命名空间。

CPython implementation detail: 由于 CPython 清理模块字典的设定,当模块离开作用域时模块字典将会被清理,即使该字典还有活动的引用。想避免此问题,可复制该字典或保持模块状态以直接使用其字典。

自定义类

自定义类这种类型一般通过类定义来创建 (参见 Class definitions 一节)。每个类都有通过一个字典对象实现的独立命名空间。类属性引用会被转化为在此字典中查找,例如 C.x 会被转化为 C.__dict__["x"] (不过也存在一些钩子对象以允许其他定位属性的方式)。当未在其中发现某个属性名称时,会继续在基类中查找。这种基类查找使用 C3 方法解析顺序,即使存在 ‘钻石形’ 继承结构即有多条继承路径连到一个共同祖先也能保持正确的行为。有关 Python 使用的 C3 MRO 的详情可查看配合 2.3 版发布的文档 https://www.python.org/download/releases/2.3/mro/.

当一个类属性引用 (假设类名为 C) 会产生一个类方法对象时,它将转化为一个 __self__ 属性为 C 的实例方法对象。当其会产生一个静态方法对象时,它将转化为该静态方法对象所封装的对象。从类的 __dict__ 所包含内容以外获取属性的其他方式请参看 Implementing Descriptors 一节。

类属性赋值会更新类的字典,但不会更新基类的字典。

类对象可被调用 (见上文) 以产生一个类实例 (见下文)。

特殊属性: __name__ 为类的名称; __module__ 为类所在模块的名称; __dict__ 为包含类命名空间的字典; __bases__ 为包含基类的元组,按其在基类列表中的出现顺序排列; __doc__ 为类的文档字符串,如果没有则为 None; __annotations__ (可选) 为一个包含 变量标注 的字典,它是在类体执行时获取的。

类实例

类实例可通过调用类对象来创建 (见上文)。每个类实例都有通过一个字典对象实现的独立命名空间,属性引用会首先在此字典中查找。当未在其中发现某个属性,而实例对应的类中有该属性时,会继续在类属性中查找。如果找到的类属性为一个用户定义函数对象,它会被转化为实例方法对象,其 __self__ 属性即该实例。静态方法和类方法对象也会被转化;参见上文 “Classes” 一节。要了解其他通过类实例来获取相应类属性的方式可参见 Implementing Descriptors 一节,这样得到的属性可能与实际存放于类的 __dict__ 中的对象不同。如果未找到类属性,而对象对应的类具有 __getattr__() 方法,则会调用该方法来满足查找要求。

属性赋值和删除会更新实例的字典,但不会更新对应类的字典。如果类具有 __setattr__()__delattr__() 方法,则将调用方法而不再直接更新实例的字典。

如果类实例具有某些特殊名称的方法,就可以伪装为数字、序列或映射。参见 Special method names 一节。

特殊属性: __dict__ 为属性字典; __class__ 为实例对应的类。

I/O 对象 (或称文件对象)

A file object represents an open file. Various shortcuts are available to create file objects: the open() built-in function, and also os.popen(), os.fdopen(), and the makefile() method of socket objects (and perhaps by other functions or methods provided by extension modules).

The objects sys.stdin, sys.stdout and sys.stderr are initialized to file objects corresponding to the interpreter’s standard input, output and error streams; they are all open in text mode and therefore follow the interface defined by the io.TextIOBase abstract class.

Internal types

A few types used internally by the interpreter are exposed to the user. Their definitions may change with future versions of the interpreter, but they are mentioned here for completeness.

Code objects

Code objects represent byte-compiled executable Python code, or bytecode. The difference between a code object and a function object is that the function object contains an explicit reference to the function’s globals (the module in which it was defined), while a code object contains no context; also the default argument values are stored in the function object, not in the code object (because they represent values calculated at run-time). Unlike function objects, code objects are immutable and contain no references (directly or indirectly) to mutable objects.

Special read-only attributes: co_name gives the function name; co_argcount is the number of positional arguments (including arguments with default values); co_nlocals is the number of local variables used by the function (including arguments); co_varnames is a tuple containing the names of the local variables (starting with the argument names); co_cellvars is a tuple containing the names of local variables that are referenced by nested functions; co_freevars is a tuple containing the names of free variables; co_code is a string representing the sequence of bytecode instructions; co_consts is a tuple containing the literals used by the bytecode; co_names is a tuple containing the names used by the bytecode; co_filename is the filename from which the code was compiled; co_firstlineno is the first line number of the function; co_lnotab is a string encoding the mapping from bytecode offsets to line numbers (for details see the source code of the interpreter); co_stacksize is the required stack size (including local variables); co_flags is an integer encoding a number of flags for the interpreter.

The following flag bits are defined for co_flags: bit 0x04 is set if the function uses the *arguments syntax to accept an arbitrary number of positional arguments; bit 0x08 is set if the function uses the **keywords syntax to accept arbitrary keyword arguments; bit 0x20 is set if the function is a generator.

Future feature declarations (from __future__ import division) also use bits in co_flags to indicate whether a code object was compiled with a particular feature enabled: bit 0x2000 is set if the function was compiled with future division enabled; bits 0x10 and 0x1000 were used in earlier versions of Python.

Other bits in co_flags are reserved for internal use.

If a code object represents a function, the first item in co_consts is the documentation string of the function, or None if undefined.

Frame objects

Frame objects represent execution frames. They may occur in traceback objects (see below).

Special read-only attributes: f_back is to the previous stack frame (towards the caller), or None if this is the bottom stack frame; f_code is the code object being executed in this frame; f_locals is the dictionary used to look up local variables; f_globals is used for global variables; f_builtins is used for built-in (intrinsic) names; f_lasti gives the precise instruction (this is an index into the bytecode string of the code object).

Special writable attributes: f_trace, if not None, is a function called at the start of each source code line (this is used by the debugger); f_lineno is the current line number of the frame — writing to this from within a trace function jumps to the given line (only for the bottom-most frame). A debugger can implement a Jump command (aka Set Next Statement) by writing to f_lineno.

Frame objects support one method:

frame.clear()

This method clears all references to local variables held by the frame. Also, if the frame belonged to a generator, the generator is finalized. This helps break reference cycles involving frame objects (for example when catching an exception and storing its traceback for later use).

RuntimeError is raised if the frame is currently executing.

3.4 新版功能.

Traceback objects

Traceback objects represent a stack trace of an exception. A traceback object is created when an exception occurs. When the search for an exception handler unwinds the execution stack, at each unwound level a traceback object is inserted in front of the current traceback. When an exception handler is entered, the stack trace is made available to the program. (See section The try statement.) It is accessible as the third item of the tuple returned by sys.exc_info(). When the program contains no suitable handler, the stack trace is written (nicely formatted) to the standard error stream; if the interpreter is interactive, it is also made available to the user as sys.last_traceback.

Special read-only attributes: tb_next is the next level in the stack trace (towards the frame where the exception occurred), or None if there is no next level; tb_frame points to the execution frame of the current level; tb_lineno gives the line number where the exception occurred; tb_lasti indicates the precise instruction. The line number and last instruction in the traceback may differ from the line number of its frame object if the exception occurred in a try statement with no matching except clause or with a finally clause.

Slice objects

Slice objects are used to represent slices for __getitem__() methods. They are also created by the built-in slice() function.

Special read-only attributes: start is the lower bound; stop is the upper bound; step is the step value; each is None if omitted. These attributes can have any type.

Slice objects support one method:

slice.indices(self, length)

This method takes a single integer argument length and computes information about the slice that the slice object would describe if applied to a sequence of length items. It returns a tuple of three integers; respectively these are the start and stop indices and the step or stride length of the slice. Missing or out-of-bounds indices are handled in a manner consistent with regular slices.

Static method objects
Static method objects provide a way of defeating the transformation of function objects to method objects described above. A static method object is a wrapper around any other object, usually a user-defined method object. When a static method object is retrieved from a class or a class instance, the object actually returned is the wrapped object, which is not subject to any further transformation. Static method objects are not themselves callable, although the objects they wrap usually are. Static method objects are created by the built-in staticmethod() constructor.
Class method objects
A class method object, like a static method object, is a wrapper around another object that alters the way in which that object is retrieved from classes and class instances. The behaviour of class method objects upon such retrieval is described above, under “User-defined methods”. Class method objects are created by the built-in classmethod() constructor.

3.3. Special method names

A class can implement certain operations that are invoked by special syntax (such as arithmetic operations or subscripting and slicing) by defining methods with special names. This is Python’s approach to operator overloading, allowing classes to define their own behavior with respect to language operators. For instance, if a class defines a method named __getitem__(), and x is an instance of this class, then x[i] is roughly equivalent to type(x).__getitem__(x, i). Except where mentioned, attempts to execute an operation raise an exception when no appropriate method is defined (typically AttributeError or TypeError).

Setting a special method to None indicates that the corresponding operation is not available. For example, if a class sets __iter__() to None, the class is not iterable, so calling iter() on its instances will raise a TypeError (without falling back to __getitem__()). [2]

When implementing a class that emulates any built-in type, it is important that the emulation only be implemented to the degree that it makes sense for the object being modelled. For example, some sequences may work well with retrieval of individual elements, but extracting a slice may not make sense. (One example of this is the NodeList interface in the W3C’s Document Object Model.)

3.3.1. Basic customization

object.__new__(cls[, ...])

Called to create a new instance of class cls. __new__() is a static method (special-cased so you need not declare it as such) that takes the class of which an instance was requested as its first argument. The remaining arguments are those passed to the object constructor expression (the call to the class). The return value of __new__() should be the new object instance (usually an instance of cls).

Typical implementations create a new instance of the class by invoking the superclass’s __new__() method using super().__new__(cls[, ...]) with appropriate arguments and then modifying the newly-created instance as necessary before returning it.

If __new__() returns an instance of cls, then the new instance’s __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to __new__().

If __new__() does not return an instance of cls, then the new instance’s __init__() method will not be invoked.

__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It is also commonly overridden in custom metaclasses in order to customize class creation.

object.__init__(self[, ...])

Called after the instance has been created (by __new__()), but before it is returned to the caller. The arguments are those passed to the class constructor expression. If a base class has an __init__() method, the derived class’s __init__() method, if any, must explicitly call it to ensure proper initialization of the base class part of the instance; for example: super().__init__([args...]).

Because __new__() and __init__() work together in constructing objects (__new__() to create it, and __init__() to customize it), no non-None value may be returned by __init__(); doing so will cause a TypeError to be raised at runtime.

object.__del__(self)

Called when the instance is about to be destroyed. This is also called a finalizer or (improperly) a destructor. If a base class has a __del__() method, the derived class’s __del__() method, if any, must explicitly call it to ensure proper deletion of the base class part of the instance.

It is possible (though not recommended!) for the __del__() method to postpone destruction of the instance by creating a new reference to it. This is called object resurrection. It is implementation-dependent whether __del__() is called a second time when a resurrected object is about to be destroyed; the current CPython implementation only calls it once.

It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.

注解

del x doesn’t directly call x.__del__() — the former decrements the reference count for x by one, and the latter is only called when x’s reference count reaches zero.

CPython implementation detail: It is possible for a reference cycle to prevent the reference count of an object from going to zero. In this case, the cycle will be later detected and deleted by the cyclic garbage collector. A common cause of reference cycles is when an exception has been caught in a local variable. The frame’s locals then reference the exception, which references its own traceback, which references the locals of all frames caught in the traceback.

参见

Documentation for the gc module.

警告

Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead. In particular:

  • __del__() can be invoked when arbitrary code is being executed, including from any arbitrary thread. If __del__() needs to take a lock or invoke any other blocking resource, it may deadlock as the resource may already be taken by the code that gets interrupted to execute __del__().
  • __del__() can be executed during interpreter shutdown. As a consequence, the global variables it needs to access (including other modules) may already have been deleted or set to None. Python guarantees that globals whose name begins with a single underscore are deleted from their module before other globals are deleted; if no other references to such globals exist, this may help in assuring that imported modules are still available at the time when the __del__() method is called.
object.__repr__(self)

Called by the repr() built-in function to compute the “official” string representation of an object. If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description...> should be returned. The return value must be a string object. If a class defines __repr__() but not __str__(), then __repr__() is also used when an “informal” string representation of instances of that class is required.

This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.

object.__str__(self)

Called by str(object) and the built-in functions format() and print() to compute the “informal” or nicely printable string representation of an object. The return value must be a string object.

This method differs from object.__repr__() in that there is no expectation that __str__() return a valid Python expression: a more convenient or concise representation can be used.

The default implementation defined by the built-in type object calls object.__repr__().

object.__bytes__(self)

Called by bytes to compute a byte-string representation of an object. This should return a bytes object.

object.__format__(self, format_spec)

Called by the format() built-in function, and by extension, evaluation of formatted string literals and the str.format() method, to produce a “formatted” string representation of an object. The format_spec argument is a string that contains a description of the formatting options desired. The interpretation of the format_spec argument is up to the type implementing __format__(), however most classes will either delegate formatting to one of the built-in types, or use a similar formatting option syntax.

See Format Specification Mini-Language for a description of the standard formatting syntax.

The return value must be a string object.

在 3.4 版更改: The __format__ method of object itself raises a TypeError if passed any non-empty string.

object.__lt__(self, other)
object.__le__(self, other)
object.__eq__(self, other)
object.__ne__(self, other)
object.__gt__(self, other)
object.__ge__(self, other)

These are the so-called “rich comparison” methods. The correspondence between operator symbols and method names is as follows: x<y calls x.__lt__(y), x<=y calls x.__le__(y), x==y calls x.__eq__(y), x!=y calls x.__ne__(y), x>y calls x.__gt__(y), and x>=y calls x.__ge__(y).

A rich comparison method may return the singleton NotImplemented if it does not implement the operation for a given pair of arguments. By convention, False and True are returned for a successful comparison. However, these methods can return any value, so if the comparison operator is used in a Boolean context (e.g., in the condition of an if statement), Python will call bool() on the value to determine if the result is true or false.

By default, __ne__() delegates to __eq__() and inverts the result unless it is NotImplemented. There are no other implied relationships among the comparison operators, for example, the truth of (x<y or x==y) does not imply x<=y. To automatically generate ordering operations from a single root operation, see functools.total_ordering().

See the paragraph on __hash__() for some important notes on creating hashable objects which support custom comparison operations and are usable as dictionary keys.

There are no swapped-argument versions of these methods (to be used when the left argument does not support the operation but the right argument does); rather, __lt__() and __gt__() are each other’s reflection, __le__() and __ge__() are each other’s reflection, and __eq__() and __ne__() are their own reflection. If the operands are of different types, and right operand’s type is a direct or indirect subclass of the left operand’s type, the reflected method of the right operand has priority, otherwise the left operand’s method has priority. Virtual subclassing is not considered.

object.__hash__(self)

Called by built-in function hash() and for operations on members of hashed collections including set, frozenset, and dict. __hash__() should return an integer. The only required property is that objects which compare equal have the same hash value; it is advised to mix together the hash values of the components of the object that also play a part in comparison of objects by packing them into a tuple and hashing the tuple. Example:

def __hash__(self):
    return hash((self.name, self.nick, self.color))

注解

hash() truncates the value returned from an object’s custom __hash__() method to the size of a Py_ssize_t. This is typically 8 bytes on 64-bit builds and 4 bytes on 32-bit builds. If an object’s __hash__() must interoperate on builds of different bit sizes, be sure to check the width on all supported builds. An easy way to do this is with python -c "import sys; print(sys.hash_info.width)".

If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections. If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).

User-defined classes have __eq__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns an appropriate value such that x == y implies both that x is y and hash(x) == hash(y).

A class that overrides __eq__() and does not define __hash__() will have its __hash__() implicitly set to None. When the __hash__() method of a class is None, instances of the class will raise an appropriate TypeError when a program attempts to retrieve their hash value, and will also be correctly identified as unhashable when checking isinstance(obj, collections.Hashable).

If a class that overrides __eq__() needs to retain the implementation of __hash__() from a parent class, the interpreter must be told this explicitly by setting __hash__ = <ParentClass>.__hash__.

If a class that does not override __eq__() wishes to suppress hash support, it should include __hash__ = None in the class definition. A class which defines its own __hash__() that explicitly raises a TypeError would be incorrectly identified as hashable by an isinstance(obj, collections.Hashable) call.

注解

By default, the __hash__() values of str, bytes and datetime objects are “salted” with an unpredictable random value. Although they remain constant within an individual Python process, they are not predictable between repeated invocations of Python.

This is intended to provide protection against a denial-of-service caused by carefully-chosen inputs that exploit the worst case performance of a dict insertion, O(n^2) complexity. See http://www.ocert.org/advisories/ocert-2011-003.html for details.

Changing hash values affects the iteration order of dicts, sets and other mappings. Python has never made guarantees about this ordering (and it typically varies between 32-bit and 64-bit builds).

See also PYTHONHASHSEED.

在 3.3 版更改: Hash randomization is enabled by default.

object.__bool__(self)

Called to implement truth value testing and the built-in operation bool(); should return False or True. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.

3.3.2. Customizing attribute access

The following methods can be defined to customize the meaning of attribute access (use of, assignment to, or deletion of x.name) for class instances.

object.__getattr__(self, name)

Called when the default attribute access fails with an AttributeError (either __getattribute__() raises an AttributeError because name is not an instance attribute or an attribute in the class tree for self; or __get__() of a name property raises AttributeError). This method should either return the (computed) attribute value or raise an AttributeError exception.

Note that if the attribute is found through the normal mechanism, __getattr__() is not called. (This is an intentional asymmetry between __getattr__() and __setattr__().) This is done both for efficiency reasons and because otherwise __getattr__() would have no way to access other attributes of the instance. Note that at least for instance variables, you can fake total control by not inserting any values in the instance attribute dictionary (but instead inserting them in another object). See the __getattribute__() method below for a way to actually get total control over attribute access.

object.__getattribute__(self, name)

Called unconditionally to implement attribute accesses for instances of the class. If the class also defines __getattr__(), the latter will not be called unless __getattribute__() either calls it explicitly or raises an AttributeError. This method should return the (computed) attribute value or raise an AttributeError exception. In order to avoid infinite recursion in this method, its implementation should always call the base class method with the same name to access any attributes it needs, for example, object.__getattribute__(self, name).

注解

This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in functions. See Special method lookup.

object.__setattr__(self, name, value)

Called when an attribute assignment is attempted. This is called instead of the normal mechanism (i.e. store the value in the instance dictionary). name is the attribute name, value is the value to be assigned to it.

If __setattr__() wants to assign to an instance attribute, it should call the base class method with the same name, for example, object.__setattr__(self, name, value).

object.__delattr__(self, name)

Like __setattr__() but for attribute deletion instead of assignment. This should only be implemented if del obj.name is meaningful for the object.

object.__dir__(self)

Called when dir() is called on the object. A sequence must be returned. dir() converts the returned sequence to a list and sorts it.

3.3.2.1. Customizing module attribute access

For a more fine grained customization of the module behavior (setting attributes, properties, etc.), one can set the __class__ attribute of a module object to a subclass of types.ModuleType. For example:

import sys
from types import ModuleType

class VerboseModule(ModuleType):
    def __repr__(self):
        return f'Verbose {self.__name__}'

    def __setattr__(self, attr, value):
        print(f'Setting {attr}...')
        setattr(self, attr, value)

sys.modules[__name__].__class__ = VerboseModule

注解

Setting module __class__ only affects lookups made using the attribute access syntax – directly accessing the module globals (whether by code within the module, or via a reference to the module’s globals dictionary) is unaffected.

在 3.5 版更改: __class__ module attribute is now writable.

3.3.2.2. Implementing Descriptors

The following methods only apply when an instance of the class containing the method (a so-called descriptor class) appears in an owner class (the descriptor must be in either the owner’s class dictionary or in the class dictionary for one of its parents). In the examples below, “the attribute” refers to the attribute whose name is the key of the property in the owner class’ __dict__.

object.__get__(self, instance, owner)

Called to get the attribute of the owner class (class attribute access) or of an instance of that class (instance attribute access). owner is always the owner class, while instance is the instance that the attribute was accessed through, or None when the attribute is accessed through the owner. This method should return the (computed) attribute value or raise an AttributeError exception.

object.__set__(self, instance, value)

Called to set the attribute on an instance instance of the owner class to a new value, value.

object.__delete__(self, instance)

Called to delete the attribute on an instance instance of the owner class.

object.__set_name__(self, owner, name)

Called at the time the owning class owner is created. The descriptor has been assigned to name.

3.6 新版功能.

The attribute __objclass__ is interpreted by the inspect module as specifying the class where this object was defined (setting this appropriately can assist in runtime introspection of dynamic class attributes). For callables, it may indicate that an instance of the given type (or a subclass) is expected or required as the first positional argument (for example, CPython sets this attribute for unbound methods that are implemented in C).

3.3.2.3. Invoking Descriptors

In general, a descriptor is an object attribute with “binding behavior”, one whose attribute access has been overridden by methods in the descriptor protocol: __get__(), __set__(), and __delete__(). If any of those methods are defined for an object, it is said to be a descriptor.

The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses.

However, if the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods were defined and how they were called.

The starting point for descriptor invocation is a binding, a.x. How the arguments are assembled depends on a:

Direct Call
The simplest and least common call is when user code directly invokes a descriptor method: x.__get__(a).
Instance Binding
If binding to an object instance, a.x is transformed into the call: type(a).__dict__['x'].__get__(a, type(a)).
Class Binding
If binding to a class, A.x is transformed into the call: A.__dict__['x'].__get__(None, A).
Super Binding
If a is an instance of super, then the binding super(B, obj).m() searches obj.__class__.__mro__ for the base class A immediately preceding B and then invokes the descriptor with the call: A.__dict__['m'].__get__(obj, obj.__class__).

For instance bindings, the precedence of descriptor invocation depends on the which descriptor methods are defined. A descriptor can define any combination of __get__(), __set__() and __delete__(). If it does not define __get__(), then accessing the attribute will return the descriptor object itself unless there is a value in the object’s instance dictionary. If the descriptor defines __set__() and/or __delete__(), it is a data descriptor; if it defines neither, it is a non-data descriptor. Normally, data descriptors define both __get__() and __set__(), while non-data descriptors have just the __get__() method. Data descriptors with __set__() and __get__() defined always override a redefinition in an instance dictionary. In contrast, non-data descriptors can be overridden by instances.

Python methods (including staticmethod() and classmethod()) are implemented as non-data descriptors. Accordingly, instances can redefine and override methods. This allows individual instances to acquire behaviors that differ from other instances of the same class.

The property() function is implemented as a data descriptor. Accordingly, instances cannot override the behavior of a property.

3.3.2.4. __slots__

__slots__ allow us to explicitly declare data members (like properties) and deny the creation of __dict__ and __weakref__ (unless explicitly declared in __slots__ or available in a parent.)

The space saved over using __dict__ can be significant.

object.__slots__

This class variable can be assigned a string, iterable, or sequence of strings with variable names used by instances. __slots__ reserves space for the declared variables and prevents the automatic creation of __dict__ and __weakref__ for each instance.

3.3.2.4.1. Notes on using __slots__
  • When inheriting from a class without __slots__, the __dict__ and __weakref__ attribute of the instances will always be accessible.
  • Without a __dict__ variable, instances cannot be assigned new variables not listed in the __slots__ definition. Attempts to assign to an unlisted variable name raises AttributeError. If dynamic assignment of new variables is desired, then add '__dict__' to the sequence of strings in the __slots__ declaration.
  • Without a __weakref__ variable for each instance, classes defining __slots__ do not support weak references to its instances. If weak reference support is needed, then add '__weakref__' to the sequence of strings in the __slots__ declaration.
  • __slots__ are implemented at the class level by creating descriptors (Implementing Descriptors) for each variable name. As a result, class attributes cannot be used to set default values for instance variables defined by __slots__; otherwise, the class attribute would overwrite the descriptor assignment.
  • The action of a __slots__ declaration is not limited to the class where it is defined. __slots__ declared in parents are available in child classes. However, child subclasses will get a __dict__ and __weakref__ unless they also define __slots__ (which should only contain names of any additional slots).
  • If a class defines a slot also defined in a base class, the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class). This renders the meaning of the program undefined. In the future, a check may be added to prevent this.
  • Nonempty __slots__ does not work for classes derived from “variable-length” built-in types such as int, bytes and tuple.
  • Any non-string iterable may be assigned to __slots__. Mappings may also be used; however, in the future, special meaning may be assigned to the values corresponding to each key.
  • __class__ assignment works only if both classes have the same __slots__.
  • Multiple inheritance with multiple slotted parent classes can be used, but only one parent is allowed to have attributes created by slots (the other bases must have empty slot layouts) - violations raise TypeError.

3.3.3. Customizing class creation

Whenever a class inherits from another class, __init_subclass__ is called on that class. This way, it is possible to write classes which change the behavior of subclasses. This is closely related to class decorators, but where class decorators only affect the specific class they’re applied to, __init_subclass__ solely applies to future subclasses of the class defining the method.

classmethod object.__init_subclass__(cls)

This method is called whenever the containing class is subclassed. cls is then the new subclass. If defined as a normal instance method, this method is implicitly converted to a class method.

Keyword arguments which are given to a new class are passed to the parent’s class __init_subclass__. For compatibility with other classes using __init_subclass__, one should take out the needed keyword arguments and pass the others over to the base class, as in:

class Philosopher:
    def __init_subclass__(cls, default_name, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.default_name = default_name

class AustralianPhilosopher(Philosopher, default_name="Bruce"):
    pass

The default implementation object.__init_subclass__ does nothing, but raises an error if it is called with any arguments.

注解

The metaclass hint metaclass is consumed by the rest of the type machinery, and is never passed to __init_subclass__ implementations. The actual metaclass (rather than the explicit hint) can be accessed as type(cls).

3.6 新版功能.

3.3.3.1. Metaclasses

By default, classes are constructed using type(). The class body is executed in a new namespace and the class name is bound locally to the result of type(name, bases, namespace).

The class creation process can be customized by passing the metaclass keyword argument in the class definition line, or by inheriting from an existing class that included such an argument. In the following example, both MyClass and MySubclass are instances of Meta:

class Meta(type):
    pass

class MyClass(metaclass=Meta):
    pass

class MySubclass(MyClass):
    pass

Any other keyword arguments that are specified in the class definition are passed through to all metaclass operations described below.

When a class definition is executed, the following steps occur:

  • the appropriate metaclass is determined
  • the class namespace is prepared
  • the class body is executed
  • the class object is created

3.3.3.2. Determining the appropriate metaclass

The appropriate metaclass for a class definition is determined as follows:

  • if no bases and no explicit metaclass are given, then type() is used
  • if an explicit metaclass is given and it is not an instance of type(), then it is used directly as the metaclass
  • if an instance of type() is given as the explicit metaclass, or bases are defined, then the most derived metaclass is used

The most derived metaclass is selected from the explicitly specified metaclass (if any) and the metaclasses (i.e. type(cls)) of all specified base classes. The most derived metaclass is one which is a subtype of all of these candidate metaclasses. If none of the candidate metaclasses meets that criterion, then the class definition will fail with TypeError.

3.3.3.3. Preparing the class namespace

Once the appropriate metaclass has been identified, then the class namespace is prepared. If the metaclass has a __prepare__ attribute, it is called as namespace = metaclass.__prepare__(name, bases, **kwds) (where the additional keyword arguments, if any, come from the class definition).

If the metaclass has no __prepare__ attribute, then the class namespace is initialised as an empty ordered mapping.

参见

PEP 3115 - Metaclasses in Python 3000
Introduced the __prepare__ namespace hook

3.3.3.4. Executing the class body

The class body is executed (approximately) as exec(body, globals(), namespace). The key difference from a normal call to exec() is that lexical scoping allows the class body (including any methods) to reference names from the current and outer scopes when the class definition occurs inside a function.

However, even when the class definition occurs inside the function, methods defined inside the class still cannot see names defined at the class scope. Class variables must be accessed through the first parameter of instance or class methods, or through the implicit lexically scoped __class__ reference described in the next section.

3.3.3.5. Creating the class object

Once the class namespace has been populated by executing the class body, the class object is created by calling metaclass(name, bases, namespace, **kwds) (the additional keywords passed here are the same as those passed to __prepare__).

This class object is the one that will be referenced by the zero-argument form of super(). __class__ is an implicit closure reference created by the compiler if any methods in a class body refer to either __class__ or super. This allows the zero argument form of super() to correctly identify the class being defined based on lexical scoping, while the class or instance that was used to make the current call is identified based on the first argument passed to the method.

CPython implementation detail: In CPython 3.6 and later, the __class__ cell is passed to the metaclass as a __classcell__ entry in the class namespace. If present, this must be propagated up to the type.__new__ call in order for the class to be initialised correctly. Failing to do so will result in a DeprecationWarning in Python 3.6, and a RuntimeError in Python 3.8.

When using the default metaclass type, or any metaclass that ultimately calls type.__new__, the following additional customisation steps are invoked after creating the class object:

  • first, type.__new__ collects all of the descriptors in the class namespace that define a __set_name__() method;
  • second, all of these __set_name__ methods are called with the class being defined and the assigned name of that particular descriptor; and
  • finally, the __init_subclass__() hook is called on the immediate parent of the new class in its method resolution order.

After the class object is created, it is passed to the class decorators included in the class definition (if any) and the resulting object is bound in the local namespace as the defined class.

When a new class is created by type.__new__, the object provided as the namespace parameter is copied to a new ordered mapping and the original object is discarded. The new copy is wrapped in a read-only proxy, which becomes the __dict__ attribute of the class object.

参见

PEP 3135 - New super
Describes the implicit __class__ closure reference

3.3.3.6. Metaclass example

The potential uses for metaclasses are boundless. Some ideas that have been explored include enum, logging, interface checking, automatic delegation, automatic property creation, proxies, frameworks, and automatic resource locking/synchronization.

Here is an example of a metaclass that uses an collections.OrderedDict to remember the order that class variables are defined:

class OrderedClass(type):

    @classmethod
    def __prepare__(metacls, name, bases, **kwds):
        return collections.OrderedDict()

    def __new__(cls, name, bases, namespace, **kwds):
        result = type.__new__(cls, name, bases, dict(namespace))
        result.members = tuple(namespace)
        return result

class A(metaclass=OrderedClass):
    def one(self): pass
    def two(self): pass
    def three(self): pass
    def four(self): pass

>>> A.members
('__module__', 'one', 'two', 'three', 'four')

When the class definition for A gets executed, the process begins with calling the metaclass’s __prepare__() method which returns an empty collections.OrderedDict. That mapping records the methods and attributes of A as they are defined within the body of the class statement. Once those definitions are executed, the ordered dictionary is fully populated and the metaclass’s __new__() method gets invoked. That method builds the new type and it saves the ordered dictionary keys in an attribute called members.

3.3.4. Customizing instance and subclass checks

The following methods are used to override the default behavior of the isinstance() and issubclass() built-in functions.

In particular, the metaclass abc.ABCMeta implements these methods in order to allow the addition of Abstract Base Classes (ABCs) as “virtual base classes” to any class or type (including built-in types), including other ABCs.

class.__instancecheck__(self, instance)

Return true if instance should be considered a (direct or indirect) instance of class. If defined, called to implement isinstance(instance, class).

class.__subclasscheck__(self, subclass)

Return true if subclass should be considered a (direct or indirect) subclass of class. If defined, called to implement issubclass(subclass, class).

Note that these methods are looked up on the type (metaclass) of a class. They cannot be defined as class methods in the actual class. This is consistent with the lookup of special methods that are called on instances, only in this case the instance is itself a class.

参见

PEP 3119 - Introducing Abstract Base Classes
Includes the specification for customizing isinstance() and issubclass() behavior through __instancecheck__() and __subclasscheck__(), with motivation for this functionality in the context of adding Abstract Base Classes (see the abc module) to the language.

3.3.5. Emulating callable objects

object.__call__(self[, args...])

Called when the instance is “called” as a function; if this method is defined, x(arg1, arg2, ...) is a shorthand for x.__call__(arg1, arg2, ...).

3.3.6. Emulating container types

The following methods can be defined to implement container objects. Containers usually are sequences (such as lists or tuples) or mappings (like dictionaries), but can represent other containers as well. The first set of methods is used either to emulate a sequence or to emulate a mapping; the difference is that for a sequence, the allowable keys should be the integers k for which 0 <= k < N where N is the length of the sequence, or slice objects, which define a range of items. It is also recommended that mappings provide the methods keys(), values(), items(), get(), clear(), setdefault(), pop(), popitem(), copy(), and update() behaving similar to those for Python’s standard dictionary objects. The collections module provides a MutableMapping abstract base class to help create those methods from a base set of __getitem__(), __setitem__(), __delitem__(), and keys(). Mutable sequences should provide methods append(), count(), index(), extend(), insert(), pop(), remove(), reverse() and sort(), like Python standard list objects. Finally, sequence types should implement addition (meaning concatenation) and multiplication (meaning repetition) by defining the methods __add__(), __radd__(), __iadd__(), __mul__(), __rmul__() and __imul__() described below; they should not define other numerical operators. It is recommended that both mappings and sequences implement the __contains__() method to allow efficient use of the in operator; for mappings, in should search the mapping’s keys; for sequences, it should search through the values. It is further recommended that both mappings and sequences implement the __iter__() method to allow efficient iteration through the container; for mappings, __iter__() should be the same as keys(); for sequences, it should iterate through the values.

object.__len__(self)

Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn’t define a __bool__() method and whose __len__() method returns zero is considered to be false in a Boolean context.

CPython implementation detail: In CPython, the length is required to be at most sys.maxsize. If the length is larger than sys.maxsize some features (such as len()) may raise OverflowError. To prevent raising OverflowError by truth value testing, an object must define a __bool__() method.

object.__length_hint__(self)

Called to implement operator.length_hint(). Should return an estimated length for the object (which may be greater or less than the actual length). The length must be an integer >= 0. This method is purely an optimization and is never required for correctness.

3.4 新版功能.

注解

Slicing is done exclusively with the following three methods. A call like

a[1:2] = b

is translated to

a[slice(1, 2, None)] = b

and so forth. Missing slice items are always filled in with None.

object.__getitem__(self, key)

Called to implement evaluation of self[key]. For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.

注解

for loops expect that an IndexError will be raised for illegal indexes to allow proper detection of the end of the sequence.

object.__missing__(self, key)

Called by dict.__getitem__() to implement self[key] for dict subclasses when key is not in the dictionary.

object.__setitem__(self, key, value)

Called to implement assignment to self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support changes to the values for keys, or if new keys can be added, or for sequences if elements can be replaced. The same exceptions should be raised for improper key values as for the __getitem__() method.

object.__delitem__(self, key)

Called to implement deletion of self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support removal of keys, or for sequences if elements can be removed from the sequence. The same exceptions should be raised for improper key values as for the __getitem__() method.

object.__iter__(self)

This method is called when an iterator is required for a container. This method should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container.

Iterator objects also need to implement this method; they are required to return themselves. For more information on iterator objects, see Iterator Types.

object.__reversed__(self)

Called (if present) by the reversed() built-in to implement reverse iteration. It should return a new iterator object that iterates over all the objects in the container in reverse order.

If the __reversed__() method is not provided, the reversed() built-in will fall back to using the sequence protocol (__len__() and __getitem__()). Objects that support the sequence protocol should only provide __reversed__() if they can provide an implementation that is more efficient than the one provided by reversed().

The membership test operators (in and not in) are normally implemented as an iteration through a sequence. However, container objects can supply the following special method with a more efficient implementation, which also does not require the object be a sequence.

object.__contains__(self, item)

Called to implement membership test operators. Should return true if item is in self, false otherwise. For mapping objects, this should consider the keys of the mapping rather than the values or the key-item pairs.

For objects that don’t define __contains__(), the membership test first tries iteration via __iter__(), then the old sequence iteration protocol via __getitem__(), see this section in the language reference.

3.3.7. Emulating numeric types

The following methods can be defined to emulate numeric objects. Methods corresponding to operations that are not supported by the particular kind of number implemented (e.g., bitwise operations for non-integral numbers) should be left undefined.

object.__add__(self, other)
object.__sub__(self, other)
object.__mul__(self, other)
object.__matmul__(self, other)
object.__truediv__(self, other)
object.__floordiv__(self, other)
object.__mod__(self, other)
object.__divmod__(self, other)
object.__pow__(self, other[, modulo])
object.__lshift__(self, other)
object.__rshift__(self, other)
object.__and__(self, other)
object.__xor__(self, other)
object.__or__(self, other)

These methods are called to implement the binary arithmetic operations (+, -, *, @, /, //, %, divmod(), pow(), **, <<, >>, &, ^, |). For instance, to evaluate the expression x + y, where x is an instance of a class that has an __add__() method, x.__add__(y) is called. The __divmod__() method should be the equivalent to using __floordiv__() and __mod__(); it should not be related to __truediv__(). Note that __pow__() should be defined to accept an optional third argument if the ternary version of the built-in pow() function is to be supported.

If one of those methods does not support the operation with the supplied arguments, it should return NotImplemented.

object.__radd__(self, other)
object.__rsub__(self, other)
object.__rmul__(self, other)
object.__rmatmul__(self, other)
object.__rtruediv__(self, other)
object.__rfloordiv__(self, other)
object.__rmod__(self, other)
object.__rdivmod__(self, other)
object.__rpow__(self, other)
object.__rlshift__(self, other)
object.__rrshift__(self, other)
object.__rand__(self, other)
object.__rxor__(self, other)
object.__ror__(self, other)

These methods are called to implement the binary arithmetic operations (+, -, *, @, /, //, %, divmod(), pow(), **, <<, >>, &, ^, |) with reflected (swapped) operands. These functions are only called if the left operand does not support the corresponding operation [3] and the operands are of different types. [4] For instance, to evaluate the expression x - y, where y is an instance of a class that has an __rsub__() method, y.__rsub__(x) is called if x.__sub__(y) returns NotImplemented.

Note that ternary pow() will not try calling __rpow__() (the coercion rules would become too complicated).

注解

If the right operand’s type is a subclass of the left operand’s type and that subclass provides the reflected method for the operation, this method will be called before the left operand’s non-reflected method. This behavior allows subclasses to override their ancestors’ operations.

object.__iadd__(self, other)
object.__isub__(self, other)
object.__imul__(self, other)
object.__imatmul__(self, other)
object.__itruediv__(self, other)
object.__ifloordiv__(self, other)
object.__imod__(self, other)
object.__ipow__(self, other[, modulo])
object.__ilshift__(self, other)
object.__irshift__(self, other)
object.__iand__(self, other)
object.__ixor__(self, other)
object.__ior__(self, other)

These methods are called to implement the augmented arithmetic assignments (+=, -=, *=, @=, /=, //=, %=, **=, <<=, >>=, &=, ^=, |=). These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self). If a specific method is not defined, the augmented assignment falls back to the normal methods. For instance, if x is an instance of a class with an __iadd__() method, x += y is equivalent to x = x.__iadd__(y) . Otherwise, x.__add__(y) and y.__radd__(x) are considered, as with the evaluation of x + y. In certain situations, augmented assignment can result in unexpected errors (see Why does a_tuple[i] += [‘item’] raise an exception when the addition works?), but this behavior is in fact part of the data model.

object.__neg__(self)
object.__pos__(self)
object.__abs__(self)
object.__invert__(self)

Called to implement the unary arithmetic operations (-, +, abs() and ~).

object.__complex__(self)
object.__int__(self)
object.__float__(self)

Called to implement the built-in functions complex(), int() and float(). Should return a value of the appropriate type.

object.__index__(self)

Called to implement operator.index(), and whenever Python needs to losslessly convert the numeric object to an integer object (such as in slicing, or in the built-in bin(), hex() and oct() functions). Presence of this method indicates that the numeric object is an integer type. Must return an integer.

注解

In order to have a coherent integer type class, when __index__() is defined __int__() should also be defined, and both should return the same value.

object.__round__(self[, ndigits])
object.__trunc__(self)
object.__floor__(self)
object.__ceil__(self)

Called to implement the built-in function round() and math functions trunc(), floor() and ceil(). Unless ndigits is passed to __round__() all these methods should return the value of the object truncated to an Integral (typically an int).

If __int__() is not defined then the built-in function int() falls back to __trunc__().

3.3.8. With Statement Context Managers

A context manager is an object that defines the runtime context to be established when executing a with statement. The context manager handles the entry into, and the exit from, the desired runtime context for the execution of the block of code. Context managers are normally invoked using the with statement (described in section The with statement), but can also be used by directly invoking their methods.

Typical uses of context managers include saving and restoring various kinds of global state, locking and unlocking resources, closing opened files, etc.

For more information on context managers, see Context Manager Types.

object.__enter__(self)

Enter the runtime context related to this object. The with statement will bind this method’s return value to the target(s) specified in the as clause of the statement, if any.

object.__exit__(self, exc_type, exc_value, traceback)

Exit the runtime context related to this object. The parameters describe the exception that caused the context to be exited. If the context was exited without an exception, all three arguments will be None.

If an exception is supplied, and the method wishes to suppress the exception (i.e., prevent it from being propagated), it should return a true value. Otherwise, the exception will be processed normally upon exit from this method.

Note that __exit__() methods should not reraise the passed-in exception; this is the caller’s responsibility.

参见

PEP 343 - The “with” statement
The specification, background, and examples for the Python with statement.

3.3.9. Special method lookup

For custom classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary. That behaviour is the reason why the following code raises an exception:

>>> class C:
...     pass
...
>>> c = C()
>>> c.__len__ = lambda: 5
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'C' has no len()

The rationale behind this behaviour lies with a number of special methods such as __hash__() and __repr__() that are implemented by all objects, including type objects. If the implicit lookup of these methods used the conventional lookup process, they would fail when invoked on the type object itself:

>>> 1 .__hash__() == hash(1)
True
>>> int.__hash__() == hash(int)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor '__hash__' of 'int' object needs an argument

Incorrectly attempting to invoke an unbound method of a class in this way is sometimes referred to as ‘metaclass confusion’, and is avoided by bypassing the instance when looking up special methods:

>>> type(1).__hash__(1) == hash(1)
True
>>> type(int).__hash__(int) == hash(int)
True

In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass:

>>> class Meta(type):
...     def __getattribute__(*args):
...         print("Metaclass getattribute invoked")
...         return type.__getattribute__(*args)
...
>>> class C(object, metaclass=Meta):
...     def __len__(self):
...         return 10
...     def __getattribute__(*args):
...         print("Class getattribute invoked")
...         return object.__getattribute__(*args)
...
>>> c = C()
>>> c.__len__()                 # Explicit lookup via instance
Class getattribute invoked
10
>>> type(c).__len__(c)          # Explicit lookup via type
Metaclass getattribute invoked
10
>>> len(c)                      # Implicit lookup
10

Bypassing the __getattribute__() machinery in this fashion provides significant scope for speed optimisations within the interpreter, at the cost of some flexibility in the handling of special methods (the special method must be set on the class object itself in order to be consistently invoked by the interpreter).

3.4. 协程

3.4.1. Awaitable Objects

An awaitable object generally implements an __await__() method. Coroutine objects returned from async def functions are awaitable.

注解

The generator iterator objects returned from generators decorated with types.coroutine() or asyncio.coroutine() are also awaitable, but they do not implement __await__().

object.__await__(self)

Must return an iterator. Should be used to implement awaitable objects. For instance, asyncio.Future implements this method to be compatible with the await expression.

3.5 新版功能.

参见

PEP 492 for additional information about awaitable objects.

3.4.2. 协程对象

Coroutine objects are awaitable objects. A coroutine’s execution can be controlled by calling __await__() and iterating over the result. When the coroutine has finished executing and returns, the iterator raises StopIteration, and the exception’s value attribute holds the return value. If the coroutine raises an exception, it is propagated by the iterator. Coroutines should not directly raise unhandled StopIteration exceptions.

Coroutines also have the methods listed below, which are analogous to those of generators (see 生成器-迭代器的方法). However, unlike generators, coroutines do not directly support iteration.

在 3.5.2 版更改: It is a RuntimeError to await on a coroutine more than once.

coroutine.send(value)

Starts or resumes execution of the coroutine. If value is None, this is equivalent to advancing the iterator returned by __await__(). If value is not None, this method delegates to the send() method of the iterator that caused the coroutine to suspend. The result (return value, StopIteration, or other exception) is the same as when iterating over the __await__() return value, described above.

coroutine.throw(type[, value[, traceback]])

Raises the specified exception in the coroutine. This method delegates to the throw() method of the iterator that caused the coroutine to suspend, if it has such a method. Otherwise, the exception is raised at the suspension point. The result (return value, StopIteration, or other exception) is the same as when iterating over the __await__() return value, described above. If the exception is not caught in the coroutine, it propagates back to the caller.

coroutine.close()

Causes the coroutine to clean itself up and exit. If the coroutine is suspended, this method first delegates to the close() method of the iterator that caused the coroutine to suspend, if it has such a method. Then it raises GeneratorExit at the suspension point, causing the coroutine to immediately clean itself up. Finally, the coroutine is marked as having finished executing, even if it was never started.

Coroutine objects are automatically closed using the above process when they are about to be destroyed.

3.4.3. Asynchronous Iterators

An asynchronous iterable is able to call asynchronous code in its __aiter__ implementation, and an asynchronous iterator can call asynchronous code in its __anext__ method.

Asynchronous iterators can be used in an async for statement.

object.__aiter__(self)

Must return an asynchronous iterator object.

object.__anext__(self)

Must return an awaitable resulting in a next value of the iterator. Should raise a StopAsyncIteration error when the iteration is over.

An example of an asynchronous iterable object:

class Reader:
    async def readline(self):
        ...

    def __aiter__(self):
        return self

    async def __anext__(self):
        val = await self.readline()
        if val == b'':
            raise StopAsyncIteration
        return val

3.5 新版功能.

注解

在 3.5.2 版更改: Starting with CPython 3.5.2, __aiter__ can directly return asynchronous iterators. Returning an awaitable object will result in a PendingDeprecationWarning.

The recommended way of writing backwards compatible code in CPython 3.5.x is to continue returning awaitables from __aiter__. If you want to avoid the PendingDeprecationWarning and keep the code backwards compatible, the following decorator can be used:

import functools
import sys

if sys.version_info < (3, 5, 2):
    def aiter_compat(func):
        @functools.wraps(func)
        async def wrapper(self):
            return func(self)
        return wrapper
else:
    def aiter_compat(func):
        return func

Example:

class AsyncIterator:

    @aiter_compat
    def __aiter__(self):
        return self

    async def __anext__(self):
        ...

Starting with CPython 3.6, the PendingDeprecationWarning will be replaced with the DeprecationWarning. In CPython 3.7, returning an awaitable from __aiter__ will result in a RuntimeError.

3.4.4. Asynchronous Context Managers

An asynchronous context manager is a context manager that is able to suspend execution in its __aenter__ and __aexit__ methods.

Asynchronous context managers can be used in an async with statement.

object.__aenter__(self)

This method is semantically similar to the __enter__(), with only difference that it must return an awaitable.

object.__aexit__(self, exc_type, exc_value, traceback)

This method is semantically similar to the __exit__(), with only difference that it must return an awaitable.

An example of an asynchronous context manager class:

class AsyncContextManager:
    async def __aenter__(self):
        await log('entering context')

    async def __aexit__(self, exc_type, exc, tb):
        await log('exiting context')

3.5 新版功能.

脚注

[1]It is possible in some cases to change an object’s type, under certain controlled conditions. It generally isn’t a good idea though, since it can lead to some very strange behaviour if it is handled incorrectly.
[2]The __hash__(), __iter__(), __reversed__(), and __contains__() methods have special handling for this; others will still raise a TypeError, but may do so by relying on the behavior that None is not callable.
[3]“Does not support” here means that the class has no such method, or the method returns NotImplemented. Do not set the method to None if you want to force fallback to the right operand’s reflected method—that will instead have the opposite effect of explicitly blocking such fallback.
[4]For operands of the same type, it is assumed that if the non-reflected method (such as __add__()) fails the operation is not supported, which is why the reflected method is not called.