What's New In Python 3.3¶
This article explains the new features in Python 3.3, compared to 3.2. Python 3.3 was released on September 29, 2012. For full details, see the changelog.
See also
PEP 398 - Python 3.3 Release Schedule
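Summary -- Release highlights¶
New syntax features:
New yield from expression for generator delegation (PEP 380).
The u'unicode' syntax is accepted again for str objects (PEP 414).
New library modules:
faulthandler (helps debugging low-level crashes)
ipaddress (high-level objects representing IP addresses and masks)
lzma (compress data using the XZ / LZMA algorithm)
unittest.mock (replace parts of your system under test with mock objects)
New built-in features:
Reworked I/O exception hierarchy.
Implementation improvements:
Rewritten import machinery based on importlib.
More compact unicode strings.
More compact attribute dictionaries.
Significantly improved library modules:
C accelerator for the decimal module.
Better unicode handling in the email module (provisional).
Security improvements:
Hash randomization is switched on by default.
Please read on for a comprehensive list of user-facing changes.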
PEP 405: 虚拟环境¶
Virtual environments help create separate Python setups while sharing a
system-wide base install, for ease of maintenance. Virtual environments
have their own set of private site packages (i.e. locally-installed
libraries), and are optionally segregated from the system-wide site
packages. Their concept and implementation are inspired by the popular
virtualenv third-party package, but benefit from tighter integration
with the interpreter core.
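This PEP adds the venv module for programmatic access, and the pyvenv script for command-line access and administration. The Python interpreter checks for a pyvenv.cfg file, whose existence signals the base of a virtual environment's directory tree.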
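As a minimal sketch of the programmatic side, the following creates a throwaway environment with venv.create() and checks for the marker file described above. The temporary directory and the "demo-env" name are assumptions made for illustration.

```python
import os
import tempfile
import venv

# Build the environment in a scratch directory ("demo-env" is a made-up name).
env_dir = os.path.join(tempfile.mkdtemp(), "demo-env")
venv.create(env_dir)

# pyvenv.cfg at the root is what marks the directory tree as a virtual environment.
cfg_path = os.path.join(env_dir, "pyvenv.cfg")
print(os.path.exists(cfg_path))
```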
See also
- PEP 405 - Python Virtual Environments
PEP written by Carl Meyer; implementation by Carl Meyer and Vinay Sajip.
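PEP 420: Implicit Namespace Packages¶
Native support for package directories that don't require __init__.py marker files and can automatically span multiple path segments (inspired by various third party approaches to namespace packages, as described in PEP 420).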
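The behaviour can be sketched with two unrelated directories that each contribute one module to the same package. The names demo_ns_pkg, alpha and beta are hypothetical, and the directories are temporary ones created only for this illustration.

```python
import os
import sys
import tempfile

# Two separate roots, each holding a portion of the same package.
roots = [tempfile.mkdtemp(), tempfile.mkdtemp()]
for root, module in zip(roots, ("alpha", "beta")):
    pkg_dir = os.path.join(root, "demo_ns_pkg")
    os.mkdir(pkg_dir)  # note: no __init__.py is written
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write("NAME = %r\n" % module)

sys.path.extend(roots)

import demo_ns_pkg.alpha
import demo_ns_pkg.beta

# Both portions are importable, and the package path spans both roots.
print(demo_ns_pkg.alpha.NAME, demo_ns_pkg.beta.NAME)
print(len(demo_ns_pkg.__path__))
```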
See also
- PEP 420 - Implicit Namespace Packages
PEP written by Eric V. Smith; implementation by Eric V. Smith and Barry Warsaw.
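PEP 3118: New memoryview implementation and buffer protocol documentation¶
The implementation of PEP 3118 has been significantly improved.
The new memoryview implementation comprehensively fixes all ownership and lifetime issues of dynamically allocated fields in the Py_buffer struct that led to multiple crash reports. Additionally, several functions that crashed or returned incorrect results for non-contiguous or multi-dimensional input have been fixed.
The memoryview object now has a PEP-3118 compliant getbufferproc() that checks the consumer's request type. Many new features have been added, most of them work in full generality for non-contiguous arrays and arrays with suboffsets.
The documentation has been updated, clearly spelling out responsibilities for both exporters and consumers. Buffer request flags are grouped into basic and compound flags. The memory layout of non-contiguous and multi-dimensional NumPy-style arrays is explained.
Features¶
All native single character format specifiers in struct module syntax (optionally prefixed with '@') are now supported.
With some restrictions, the cast() method allows changing of format and shape of C-contiguous arrays.
Multi-dimensional list representations are supported for any array type.
Multi-dimensional comparisons are supported for any array type.
One-dimensional memoryviews of hashable (read-only) types with formats B, b or c are now hashable. (Contributed by Antoine Pitrou in bpo-13411.)
Arbitrary slicing of any 1-D array type is supported. For example, it is now possible to reverse a memoryview in O(1) by using a negative step.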
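A short sketch of several of the features listed above: cast() to a two-dimensional shape, the nested list representation, reversal through a negative step, and hashing of a read-only byte view. The sample data is arbitrary.

```python
import array
import struct

# cast(): view 24 packed bytes as a 2x3 array of C ints
packed = struct.pack("6i", *range(6))
grid = memoryview(packed).cast("i", shape=[2, 3])
print(grid.tolist())                 # nested list mirrors the shape

# A negative step yields a reversed view without copying the data
numbers = memoryview(array.array("i", [1, 2, 3, 4]))
backwards = numbers[::-1]
print(backwards.tolist())

# Read-only views with format 'B' hash like the bytes they expose,
# and indexing them returns integers
view = memoryview(b"abc")
print(hash(view) == hash(b"abc"), view[0])
```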
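API changes¶
The maximum number of dimensions is officially limited to 64.
The representation of empty shape, strides and suboffsets is now an empty tuple instead of None.
Accessing a memoryview element with format 'B' (unsigned bytes) now returns an integer (in accordance with the struct module syntax). For returning a bytes object the view must be cast to 'c' first.
memoryview comparisons now use the logical structure of the operands and compare all array elements by value. All format strings in struct module syntax are supported. Views with unrecognised format strings are still permitted, but will always compare as unequal, regardless of view contents.
For further changes see Build and C API Changes and Porting C code.
(Contributed by Stefan Krah in bpo-10181.)
See also
PEP 3118 - Revising the Buffer Protocol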
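PEP 393: Flexible String Representation¶
The Unicode string type is changed to support multiple internal representations, depending on the character with the largest Unicode ordinal (1, 2, or 4 bytes) in the represented string. This allows a space-efficient representation in common cases, but gives access to full UCS-4 on all systems. For compatibility with existing APIs, several representations may exist in parallel; over time, this compatibility should be phased out.
On the Python side, there should be no downside to this change.
On the C API side, PEP 393 is fully backward compatible. The legacy API should remain available at least five years. Applications using the legacy API will not fully benefit from the memory reduction, or, worse, may use a bit more memory, because Python may have to maintain two versions of each string (in the legacy format and in the new efficient storage).
Functionality¶
Changes introduced by PEP 393 are the following:
Python now always supports the full range of Unicode code points, including non-BMP ones (i.e. from U+0000 to U+10FFFF). The distinction between narrow and wide builds no longer exists and Python now behaves like a wide build, even under Windows.
With the death of narrow builds, the problems specific to narrow builds have also been fixed, for example:
len() now always returns 1 for non-BMP characters, so len('\U0010FFFF') == 1;
surrogate pairs are not recombined in string literals, so '\uDBFF\uDFFF' != '\U0010FFFF';
indexing or slicing non-BMP characters returns the expected value, so '\U0010FFFF'[0] now returns '\U0010FFFF' and not '\uDBFF';
all other functions in the standard library now correctly handle non-BMP code points.
The value of sys.maxunicode is now always 1114111 (0x10FFFF in hexadecimal). The PyUnicode_GetMax() function still returns either 0xFFFF or 0x10FFFF for backward compatibility, and it should not be used with the new Unicode API (see bpo-13054).
The ./configure flag --with-wide-unicode has been removed.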
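Performance and resource usage¶
The storage of Unicode strings now depends on the highest code point in the string:
pure ASCII and Latin1 strings (U+0000-U+00FF) use 1 byte per code point;
BMP strings (U+0000-U+FFFF) use 2 bytes per code point;
non-BMP strings (U+10000-U+10FFFF) use 4 bytes per code point.
The net effect is that for most applications, memory usage of string storage should decrease significantly, especially compared to former wide unicode builds, as, in many cases, strings will be pure ASCII even in international contexts (because many strings store non-human language data, such as XML fragments, HTTP headers, JSON-encoded data, etc.). We also hope that it will, for the same reasons, increase CPU cache efficiency on non-trivial applications. The memory usage of Python 3.3 is two to three times smaller than Python 3.2, and a little bit better than Python 2.7, on a Django benchmark (see the PEP for details).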
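The three storage widths can be observed, roughly, with sys.getsizeof(). This is only a sketch: the exact byte counts are a CPython implementation detail and include a fixed per-object overhead, so the code compares sizes rather than asserting specific numbers.

```python
import sys

n = 1000
latin1 = "a" * n             # highest code point within U+0000-U+00FF
bmp = "\u20ac" * n           # highest code point within U+0000-U+FFFF
astral = "\U0001F600" * n    # highest code point above U+FFFF

sizes = [sys.getsizeof(s) for s in (latin1, bmp, astral)]
print(sizes)                 # grows with the width needed per code point

# A non-BMP character counts as a single code point
print(len("\U0010FFFF"), hex(sys.maxunicode))
```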
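See also
- PEP 393 - Flexible String Representation
PEP written by Martin von Löwis; implementation by Torsten Becker and Martin von Löwis.
PEP 397: Python Launcher for Windows¶
The Python 3.3 Windows installer now includes a py launcher application that can be used to launch Python applications in a version independent fashion.
This launcher is invoked implicitly when double-clicking *.py files. If only a single Python version is installed on the system, that version will be used to run the file. If multiple versions are installed, the most recent version is used by default, but this can be overridden by including a Unix-style "shebang line" in the Python script.
The launcher can also be used explicitly from the command line as the py application. Running py follows the same version selection rules as implicitly launching scripts, but a more specific version can be selected by passing appropriate arguments (such as -3 to request Python 3 when Python 2 is also installed, or -2.6 to specifically request an earlier Python version when a more recent version is installed).
In addition to the launcher, the Windows installer now includes an option to add the newly installed Python to the system PATH. (Contributed by Brian Curtin in bpo-3561.)
See also
- PEP 397 - Python Launcher for Windows
PEP written by Mark Hammond and Martin v. Löwis; implementation by Vinay Sajip.
Launcher documentation: Python Launcher for Windows
Installer PATH modification: Finding the Python executable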
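PEP 3151: Reworking the OS and IO exception hierarchy¶
The hierarchy of exceptions raised by operating system errors is now both simplified and finer-grained.
You don't have to worry anymore about choosing the appropriate exception type between OSError, IOError, EnvironmentError, WindowsError, mmap.error, socket.error or select.error. All these exception types are now only one: OSError. The other names are kept as aliases for compatibility reasons.
Also, it is now easier to catch a specific error condition. Instead of inspecting the errno attribute (or args[0]) for a particular constant from the errno module, you can catch the adequate OSError subclass. The available subclasses are the following:
BlockingIOError
ChildProcessError
ConnectionError
FileExistsError
FileNotFoundError
InterruptedError
IsADirectoryError
NotADirectoryError
PermissionError
ProcessLookupError
TimeoutError
And the ConnectionError itself has finer-grained subclasses:
BrokenPipeError
ConnectionAbortedError
ConnectionRefusedError
ConnectionResetError
Thanks to the new exceptions, common usages of the errno can now be avoided. For example, the following code written for Python 3.2:
from errno import ENOENT, EACCES, EPERM

try:
    with open("document.txt") as f:
        content = f.read()
except IOError as err:
    if err.errno == ENOENT:
        print("document.txt file is missing")
    elif err.errno in (EACCES, EPERM):
        print("You are not allowed to read document.txt")
    else:
        raise
can now be written without the errno import and without manual inspection of exception attributes:
try:
    with open("document.txt") as f:
        content = f.read()
except FileNotFoundError:
    print("document.txt file is missing")
except PermissionError:
    print("You are not allowed to read document.txt")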
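The aliasing described above can be checked directly. The sketch below also shows that code catching the OSError base class still receives the specific subclass, with errno preserved; the file path is a hypothetical one that is guaranteed not to exist.

```python
import errno
import os
import tempfile

# The legacy names are aliases of OSError, not separate classes
print(IOError is OSError, EnvironmentError is OSError)

missing = os.path.join(tempfile.mkdtemp(), "no-such-file.txt")  # hypothetical path
try:
    open(missing)
except OSError as err:
    # Catching the base class works; the instance is the specific subclass
    caught = err

print(type(caught).__name__, caught.errno == errno.ENOENT)
print(issubclass(BrokenPipeError, ConnectionError))
```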
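See also
- PEP 3151 - Reworking the OS and IO Exception Hierarchy
PEP written and implemented by Antoine Pitrou.
PEP 380: Syntax for Delegating to a Subgenerator¶
PEP 380 adds the yield from expression, allowing a generator to delegate part of its operations to another generator. This allows a section of code containing yield to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator.
While designed primarily for use in delegating to a subgenerator, the yield from expression actually allows delegation to arbitrary subiterators.
For simple iterators, yield from iterable is essentially just a shortened form of for item in iterable: yield item:
>>> def g(x):
...     yield from range(x, 0, -1)
...     yield from range(x)
...
>>> list(g(5))
[5, 4, 3, 2, 1, 0, 1, 2, 3, 4]
However, unlike an ordinary loop, yield from allows subgenerators to receive sent and thrown values directly from the calling scope, and return a final value to the outer generator:
>>> def accumulate():
...     tally = 0
...     while 1:
...         next = yield
...         if next is None:
...             return tally
...         tally += next
...
>>> def gather_tallies(tallies):
...     while 1:
...         tally = yield from accumulate()
...         tallies.append(tally)
...
>>> tallies = []
>>> acc = gather_tallies(tallies)
>>> next(acc)  # Ensure the accumulator is ready to accept values
>>> for i in range(4):
...     acc.send(i)
...
>>> acc.send(None)  # Finish the first tally
>>> for i in range(5):
...     acc.send(i)
...
>>> acc.send(None)  # Finish the second tally
>>> tallies
[6, 10]
The main principle driving this change is to allow even generators that are designed to be used with the send and throw methods to be split into multiple subgenerators as easily as a single large function can be split into multiple subfunctions.
See also
- PEP 380 - Syntax for Delegating to a Subgenerator
PEP written by Greg Ewing; implementation by Greg Ewing, integrated into 3.3 by Renaud Blanch, Ryan Kelly and Nick Coghlan; documentation by Zbigniew Jędrzejewski-Szmek and Nick Coghlan.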
PEP 409: Suppressing exception context¶
PEP 409 introduces new syntax that allows the display of the chained exception context to be disabled. This allows cleaner error messages in applications that convert between exception types:
>>> class D:
...     def __init__(self, extra):
...         self._extra_attributes = extra
...     def __getattr__(self, attr):
...         try:
...             return self._extra_attributes[attr]
...         except KeyError:
...             raise AttributeError(attr) from None
...
>>> D({}).x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in __getattr__
AttributeError: x
Without the from None suffix to suppress the cause, the original exception would be displayed by default:
>>> class C:
...     def __init__(self, extra):
...         self._extra_attributes = extra
...     def __getattr__(self, attr):
...         try:
...             return self._extra_attributes[attr]
...         except KeyError:
...             raise AttributeError(attr)
...
>>> C({}).x
Traceback (most recent call last):
  File "<stdin>", line 6, in __getattr__
KeyError: 'x'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in __getattr__
AttributeError: x
No debugging capability is lost, as the original exception context remains available if needed (for example, if an intervening library has incorrectly suppressed valuable underlying details):
>>> try:
...     D({}).x
... except AttributeError as exc:
...     print(repr(exc.__context__))
...
KeyError('x',)
See also
- PEP 409 - Suppressing exception context
PEP written by Ethan Furman; implemented by Ethan Furman and Nick Coghlan.
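PEP 414: Explicit Unicode literals¶
To ease the transition from Python 2 for Unicode aware Python applications that make heavy use of Unicode literals, Python 3.3 once again supports the "u" prefix for string literals. This prefix has no semantic significance in Python 3, it is provided solely to reduce the number of purely mechanical changes in migrating to Python 3, making it easier for developers to focus on the more significant semantic changes (such as the stricter default separation of binary and text data).
See also
- PEP 414 - Explicit Unicode literals
PEP written by Armin Ronacher.
PEP 3155: Qualified name for classes and functions¶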
Functions and class objects have a new __qualname__ attribute representing
the "path" from the module top-level to their definition. For global functions
and classes, this is the same as __name__. For other functions and classes,
it provides better information about where they were actually defined, and
how they might be accessible from the global scope.
Example with (non-bound) methods:
>>> class C:
...     def meth(self):
...         pass
>>> C.meth.__name__
'meth'
>>> C.meth.__qualname__
'C.meth'
Example with nested classes:
>>> class C:
...     class D:
...         def meth(self):
...             pass
...
>>> C.D.__name__
'D'
>>> C.D.__qualname__
'C.D'
>>> C.D.meth.__name__
'meth'
>>> C.D.meth.__qualname__
'C.D.meth'
Example with nested functions:
>>> def outer():
...     def inner():
...         pass
...     return inner
...
>>> outer().__name__
'inner'
>>> outer().__qualname__
'outer.<locals>.inner'
The string representation of those objects is also changed to include the new, more precise information:
>>> str(C.D)
"<class '__main__.C.D'>"
>>> str(C.D.meth)
'<function C.D.meth at 0x7f46b9fe31e0>'
See also
- PEP 3155 - Qualified name for classes and functions
PEP written and implemented by Antoine Pitrou.
PEP 412: Key-Sharing Dictionary¶
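Dictionaries used for the storage of objects' attributes are now able to share part of their internal storage between each other (namely, the part which stores the keys and their respective hashes). This reduces the memory consumption of programs creating many instances of non-builtin types.
See also
- PEP 412 - Key-Sharing Dictionary
PEP written and implemented by Mark Shannon.
PEP 362: Function Signature Object¶
A new function inspect.signature() makes introspection of Python callables easy and straightforward. A broad range of callables is supported: Python functions, decorated or not, classes, and functools.partial() objects. New classes inspect.Signature, inspect.Parameter and inspect.BoundArguments hold information about the call signatures, such as annotations, default values, parameters kinds, and bound arguments, which considerably simplifies writing decorators and any code that validates or amends calling signatures or arguments.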
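A brief sketch of the introspection API described above. The connect() function is a hypothetical example; only inspect.signature(), Signature.parameters and Signature.bind() are taken from the text.

```python
import inspect

def connect(host, port=8080, *, timeout=None):   # hypothetical function
    pass

sig = inspect.signature(connect)
print(sig)

# Each Parameter carries its kind and default value
for name, param in sig.parameters.items():
    print(name, param.kind, param.default is inspect.Parameter.empty)

# bind() maps call arguments to parameter names without calling the function
bound = sig.bind("localhost", timeout=5)
print(dict(bound.arguments))
```

A decorator could call sig.bind(*args, **kwargs) in the same way to validate arguments before forwarding them.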
See also
- PEP 362 - Function Signature Object
PEP written by Brett Cannon, Yury Selivanov, Larry Hastings, Jiwon Seo; implemented by Yury Selivanov.
PEP 421: Adding sys.implementation¶
A new attribute on the sys module exposes details specific to the
implementation of the currently running interpreter. The initial set of
attributes on sys.implementation are name, version,
hexversion, and cache_tag.
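The intention is for sys.implementation to consolidate into one namespace the implementation-specific data used by the standard library. This allows different Python implementations to share a single standard library code base much more easily. In its initial state, sys.implementation holds only a small portion of the implementation-specific data. Over time that ratio will shift in order to make the standard library more portable.
One example of improved standard library portability is cache_tag. As of Python 3.3, sys.implementation.cache_tag is used by importlib to support PEP 3147 compliance. Any Python implementation that uses importlib for its built-in import system may use cache_tag to control the caching behavior for modules.
SimpleNamespace¶
The implementation of sys.implementation also introduces a new type to Python: types.SimpleNamespace. In contrast to a mapping-based namespace, like dict, SimpleNamespace is attribute-based, like object. However, unlike object, SimpleNamespace instances are writable. This means that you can add, remove, and modify the namespace through normal attribute access.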
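For illustration, the snippet below exercises the three operations mentioned (add, remove, modify) on a SimpleNamespace and reads two of the sys.implementation attributes. The attribute names on the namespace are made up.

```python
import sys
import types

ns = types.SimpleNamespace(name="demo", level=1)   # hypothetical attributes
ns.level += 1        # modify
ns.extra = True      # add
del ns.name          # remove
print(ns)

# sys.implementation exposes implementation details as attributes
print(sys.implementation.name, sys.implementation.cache_tag)
```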
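See also
- PEP 421 - Adding sys.implementation
PEP written and implemented by Eric Snow.
Using importlib as the Implementation of Import¶
bpo-2377 - Replace __import__ w/ importlib.__import__
bpo-13959 - Re-implement parts of imp in pure Python
bpo-14605 - Make import machinery explicit
bpo-14646 - Require loaders set __loader__ and __package__
The __import__() function is now powered by importlib.__import__(). This work leads to the completion of "phase 2" of PEP 302. There are multiple benefits to this change. First, it has allowed for more of the machinery powering import to be exposed instead of being implicit and hidden within the C code. It also provides a single implementation for all Python VMs supporting Python 3.3 to use, helping to end any VM-specific deviations in import semantics. And finally it eases the maintenance of import, allowing for future growth to occur.
For the common user, there should be no visible change in semantics. For those whose code currently manipulates import or calls import programmatically, the code changes that might possibly be required are covered in the Porting Python code section of this document.
New APIs¶
One of the large benefits of this work is the exposure of what goes into making the import statement work. That means the various importers that were once implicit are now fully exposed as part of the importlib package.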
The abstract base classes defined in importlib.abc have been expanded
to properly delineate between meta path finders
and path entry finders by introducing
importlib.abc.MetaPathFinder and
importlib.abc.PathEntryFinder, respectively. The old ABC of
importlib.abc.Finder is now only provided for backwards-compatibility
and does not enforce any method requirements.
In terms of finders, importlib.machinery.FileFinder exposes the
mechanism used to search for source and bytecode files of a module. Previously
this class was an implicit member of sys.path_hooks.
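In terms of loaders, the new abstract base class importlib.abc.FileLoader helps write a loader that uses the file system as the storage mechanism for a module's code. The loader for source files (importlib.machinery.SourceFileLoader), sourceless bytecode files (importlib.machinery.SourcelessFileLoader), and extension modules (importlib.machinery.ExtensionFileLoader) are now available for direct use.
ImportError now has name and path attributes which are set when there is relevant data to provide. The message for failed imports will also provide the full name of the module now instead of just the tail end of the module's name.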
The importlib.invalidate_caches() function will now call the method with
the same name on all finders cached in sys.path_importer_cache to help
clean up any stored state as necessary.
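The new ImportError attributes can be inspected as follows. The module name is a deliberately nonexistent, hypothetical one; the call to importlib.invalidate_caches() at the end simply shows where that function would be used after, say, installing files while the program is running.

```python
import importlib

try:
    importlib.import_module("no_such_module_for_demo")   # hypothetical name
except ImportError as err:
    failed_name = err.name     # full name of the module that could not be found
    failed_path = err.path     # stays None when no file was involved

print(failed_name, failed_path)

# Ask all cached finders to drop any stale state
importlib.invalidate_caches()
```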
Visible Changes¶
For potential required changes to code, see the Porting Python code section.
Beyond the expanse of what importlib now exposes, there are other
visible changes to import. The biggest is that sys.meta_path and
sys.path_hooks now store all of the meta path finders and path entry
hooks used by import. Previously the finders were implicit and hidden within
the C code of import instead of being directly exposed. This means that one can
now easily remove or change the order of the various finders to fit one's needs.
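Another change is that all modules have a __loader__ attribute, storing the loader used to create the module. PEP 302 has been updated to make this attribute mandatory for loaders to implement, so in the future once 3rd-party loaders have been updated people will be able to rely on the existence of the attribute. Until such time, though, import is setting the module post-load.
Loaders are also now expected to set the __package__ attribute from PEP 366. Once again, import itself is already setting this on all loaders from importlib and import itself is setting the attribute post-load.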
None is now inserted into sys.path_importer_cache when no finder
can be found on sys.path_hooks. Since imp.NullImporter is not
directly exposed on sys.path_hooks it could no longer be relied upon to
always be available to use as a value representing no finder found.
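All other changes relating to semantic changes should be taken into consideration when updating code for Python 3.3, and thus should be read about in the Porting Python code section of this document.
(Implementation by Brett Cannon)
Other Language Changes¶
Some smaller changes made to the core Python language are:
Added support for Unicode name aliases and named sequences. Both unicodedata.lookup() and '\N{...}' now resolve name aliases, and unicodedata.lookup() resolves named sequences too.
(Contributed by Ezio Melotti in bpo-12753.)
Unicode database updated to UCD version 6.1.0
Equality comparisons on range() objects now return a result reflecting the equality of the underlying sequences generated by those range objects. (bpo-13201)
The count(), find(), rfind(), index() and rindex() methods of bytes and bytearray objects now accept an integer between 0 and 255 as their first argument.
(Contributed by Petri Lehtinen in bpo-12170.)
The rjust(), ljust(), and center() methods of bytes and bytearray now accept a bytearray for the fill argument. (Contributed by Petri Lehtinen in bpo-12380.)
New methods have been added to list and bytearray: copy() and clear() (bpo-10516). Consequently, MutableSequence now also defines a clear() method (bpo-11388).
Raw bytes literals can now be written rb"..." as well as br"...".
(Contributed by Antoine Pitrou in bpo-13748.)
dict.setdefault() now does only one lookup for the given key, making it atomic when used with built-in types.
(Contributed by Filip Gruszczyński in bpo-13521.)
The error messages produced when a function call does not match the function signature have been significantly improved.
(Contributed by Benjamin Peterson.)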
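A few of the smaller changes above, shown side by side as a quick sketch with arbitrary sample values:

```python
# range objects compare by the sequence they generate
same_sequence = range(0, 3, 2) == range(0, 4, 2)   # both yield 0, 2
print(same_sequence)

# bytes search methods accept an integer in range(256)
data = b"abcabc"
print(data.find(98), data.count(0x61))             # 98 == ord("b")

# list.copy() and list.clear()
items = [1, 2, 3]
snapshot = items.copy()
items.clear()
print(items, snapshot)

# rb"..." and br"..." spell the same raw bytes literal
print(rb"\n" == br"\n" == b"\\n")
```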
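A Finer-Grained Import Lock¶
Previous versions of CPython have always relied on a global import lock. This led to unexpected annoyances, such as deadlocks when importing a module would trigger code execution in a different thread as a side-effect. Clumsy workarounds were sometimes employed, such as the PyImport_ImportModuleNoBlock() C API function.
In Python 3.3, importing a module takes a per-module lock. This correctly serializes importation of a given module from multiple threads (preventing the exposure of incompletely initialized modules), while eliminating the aforementioned annoyances.
(Contributed by Antoine Pitrou in bpo-9260.)
Builtin functions and types¶
open() gets a new opener parameter: the underlying file descriptor for the file object is then obtained by calling opener with (file, flags). It can be used to use custom flags like os.O_CLOEXEC for example. The 'x' mode was added: open for exclusive creation, failing if the file already exists.
print(): added the flush keyword argument. If the flush keyword argument is true, the stream is forcibly flushed.
hash(): hash randomization is enabled by default, see object.__hash__() and PYTHONHASHSEED.
The str type gets a new casefold() method: return a casefolded copy of the string, casefolded strings may be used for caseless matching. For example, 'ß'.casefold() returns 'ss'.
The sequence documentation has been substantially rewritten to better explain the binary/text sequence distinction and to provide specific documentation sections for the individual builtin sequence types (bpo-4966).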
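The following sketch combines the 'x' mode, the opener parameter, print(flush=True) and str.casefold(). The file name and the trivial opener function are assumptions for illustration; a real opener would typically OR extra flags into the os.open() call.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "fresh.txt")   # hypothetical file name

with open(path, "x") as f:               # succeeds: the file does not exist yet
    print("created", file=f, flush=True)

try:
    open(path, "x")                      # fails: the file exists now
except FileExistsError:
    second_attempt = "refused"

def opener(file, flags):
    # open() calls this to obtain the file descriptor
    return os.open(file, flags)

with open(path, opener=opener) as f:
    content = f.read()

print(second_attempt, content.strip(), "Straße".casefold())
```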
New Modules¶
faulthandler¶
This new debug module faulthandler contains functions to dump Python tracebacks explicitly, on a fault (a crash like a segmentation fault), after a timeout, or on a user signal. Call faulthandler.enable() to install fault handlers for the SIGSEGV, SIGFPE, SIGABRT, SIGBUS, and SIGILL signals. You can also enable them at startup by setting the PYTHONFAULTHANDLER environment variable or by using the -X faulthandler command line option.
Example of a segmentation fault on Linux:
$ python -q -X faulthandler
>>> import ctypes
>>> ctypes.string_at(0)
Fatal Python error: Segmentation fault
Current thread 0x00007fb899f39700:
  File "/home/python/cpython/Lib/ctypes/__init__.py", line 486 in string_at
  File "<stdin>", line 1 in <module>
Segmentation fault
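ipaddress¶
The new ipaddress module provides tools for creating and manipulating objects representing IPv4 and IPv6 addresses, networks and interfaces (i.e. an IP address associated with a specific IP subnet).
(Contributed by Google and Peter Moody in PEP 3144.)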
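A small sketch of the three kinds of objects the module provides; the addresses are arbitrary private-range examples.

```python
import ipaddress

addr = ipaddress.ip_address("192.168.0.10")
net = ipaddress.ip_network("192.168.0.0/24")
iface = ipaddress.ip_interface("192.168.0.10/24")

print(addr in net)               # membership test against a network
print(net.num_addresses)         # number of addresses in the /24
print(iface.network == net)      # an interface knows the network it belongs to
print(ipaddress.ip_address("::1").version)
```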
lzma¶
The newly-added lzma module provides data compression and decompression
using the LZMA algorithm, including support for the .xz and .lzma
file formats.
(Contributed by Nadeem Vawda and Per Øyvind Karlsen in bpo-6715.)
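A round-trip sketch covering both container formats mentioned above. The payload is arbitrary, and an in-memory BytesIO stands in for a real .lzma file.

```python
import io
import lzma

payload = b"spam and eggs " * 100

# One-shot API: the default container is .xz
packed = lzma.compress(payload)
print(len(packed) < len(payload), lzma.decompress(packed) == payload)

# File-style API writing the legacy .lzma container to a file object
buffer = io.BytesIO()
with lzma.open(buffer, "wb", format=lzma.FORMAT_ALONE) as f:
    f.write(payload)
legacy = lzma.decompress(buffer.getvalue())   # format is detected automatically
```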
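Improved Modules¶
abc¶
Improved support for abstract base classes containing descriptors composed with abstract methods. The recommended approach to declaring abstract descriptors is now to provide __isabstractmethod__ as a dynamically updated property. The built-in descriptors have been updated accordingly.
abc.abstractproperty has been deprecated, use property with abc.abstractmethod() instead.
abc.abstractclassmethod has been deprecated, use classmethod with abc.abstractmethod() instead.
abc.abstractstaticmethod has been deprecated, use staticmethod with abc.abstractmethod() instead.
(Contributed by Darren Dale in bpo-11610.)
abc.ABCMeta.register() now returns the registered subclass, which means it can now be used as a class decorator (bpo-10868).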
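The recommended spelling of an abstract property, and register() used as a class decorator, might look like this. Shape, Square and Blob are hypothetical classes.

```python
import abc

class Shape(metaclass=abc.ABCMeta):
    @property
    @abc.abstractmethod          # replaces the deprecated abc.abstractproperty
    def area(self):
        ...

class Square(Shape):
    def __init__(self, side):
        self.side = side

    @property
    def area(self):
        return self.side ** 2

try:
    Shape()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

@Shape.register                  # register() returns the class it was given
class Blob:
    pass

print(Square(3).area, abstract_instantiable, issubclass(Blob, Shape))
```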
array¶
The array module supports the long long type using q and
Q type codes.
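(Contributed by Oren Tirosh and Hirokazu Yamamoto in bpo-1172711.)
base64¶
ASCII-only Unicode strings are now accepted by the decoding functions of the base64 modern interface. For example, base64.b64decode('YWJj') returns b'abc'. (Contributed by Catalin Iacob in bpo-13641.)
binascii¶
In addition to the binary objects they normally accept, the a2b_ functions now all also accept ASCII-only strings as input. (Contributed by Antoine Pitrou in bpo-13637.)
bz2¶
The bz2 module has been rewritten. In the process, several new features have been added:
New bz2.open() function: open a bzip2-compressed file in binary or text mode.
bz2.BZ2File can now read from and write to arbitrary file-like objects, by means of its constructor's fileobj argument.
(Contributed by Nadeem Vawda in bpo-5863.)
bz2.BZ2File and bz2.decompress() can now decompress multi-stream inputs (such as those produced by the pbzip2 tool). bz2.BZ2File can now also be used to create this type of file, using the 'a' (append) mode.
(Contributed by Nir Aides in bpo-1625.)
bz2.BZ2File now implements all of the io.BufferedIOBase API, except for the detach() and truncate() methods.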
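Two of the additions can be sketched with in-memory data: decompressing a multi-stream input, and using bz2.open() in text mode on a file-like object. The sample strings are arbitrary.

```python
import bz2
import io

# Two independently compressed streams, concatenated as pbzip2 would produce
multi = bz2.compress(b"first ") + bz2.compress(b"second")
joined = bz2.decompress(multi)
print(joined)

# bz2.open() in text mode, backed by an in-memory file object
raw = io.BytesIO()
with bz2.open(raw, "wt", encoding="utf-8") as f:
    f.write("héllo\n")
with bz2.open(io.BytesIO(raw.getvalue()), "rt", encoding="utf-8") as f:
    text = f.read()
print(repr(text))
```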
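codecs¶
The mbcs codec has been rewritten to handle correctly replace and ignore error handlers on all Windows versions. The mbcs codec now supports all error handlers, instead of only replace to encode and ignore to decode.
A new Windows-only codec has been added: cp65001 (bpo-13216). It is the Windows code page 65001 (Windows UTF-8, CP_UTF8). For example, it is used by sys.stdout if the console output code page is set to cp65001 (e.g., using the chcp 65001 command).
Multibyte CJK decoders now resynchronize faster. They only ignore the first byte of an invalid byte sequence. For example, b'\xff\n'.decode('gb2312', 'replace') now returns a \n after the replacement character.
Incremental CJK codec encoders are no longer reset at each call to their encode() methods. For example:
>>> import codecs
>>> encoder = codecs.getincrementalencoder('hz')('strict')
>>> b''.join(encoder.encode(x) for x in '\u52ff\u65bd\u65bc\u4eba\u3002 Bye.')
b'~{NpJ)l6HK!#~} Bye.'
This example gives b'~{Np~}~{J)~}~{l6~}~{HK~}~{!#~} Bye.' with older Python versions.
The unicode_internal codec has been deprecated.
collections¶
Addition of a new ChainMap class to allow treating a number of mappings as a single unit. (Written by Raymond Hettinger for bpo-11089, made public in bpo-11297.)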
The abstract base classes have been moved to a new collections.abc
module, to better differentiate between the abstract and the concrete
collections classes. Aliases for ABCs are still present in the
collections module to preserve existing imports. (bpo-11085)
The Counter class now supports the unary + and -
operators, as well as the in-place operators +=, -=, |=, and
&=. (Contributed by Raymond Hettinger in bpo-13121.)
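The three collections changes can be sketched together; the dictionaries and counts are made-up sample data.

```python
from collections import ChainMap, Counter
import collections.abc

defaults = {"color": "red", "user": "guest"}
overrides = {"user": "admin"}
settings = ChainMap(overrides, defaults)     # lookups try overrides first
print(settings["user"], settings["color"])

# The ABCs are reachable through the new collections.abc module
print(isinstance(settings, collections.abc.Mapping))

counts = Counter(a=2, b=-1)
positive = +counts       # unary plus drops zero and negative counts
negated = -counts        # unary minus keeps what was negative, negated
counts += Counter(b=3)   # in-place addition
print(positive, negated, counts)
```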
contextlib¶
ExitStack now provides a solid foundation for
programmatic manipulation of context managers and similar cleanup
functionality. Unlike the previous contextlib.nested API (which was
deprecated and removed), the new API is designed to work correctly
regardless of whether context managers acquire their resources in
their __init__ method (for example, file objects) or in their
__enter__ method (for example, synchronisation objects from the
threading module).
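As an illustrative sketch, ExitStack can manage a number of context managers that is only known at run time, plus an arbitrary cleanup callback. The file names are hypothetical and live in a temporary directory.

```python
import os
import tempfile
from contextlib import ExitStack

directory = tempfile.mkdtemp()
paths = [os.path.join(directory, "part%d.txt" % i) for i in range(3)]

events = []
with ExitStack() as stack:
    # Each file is registered for closing as soon as it is opened
    files = [stack.enter_context(open(p, "w")) for p in paths]
    stack.callback(events.append, "cleanup ran")   # plain function as cleanup
    for f in files:
        f.write("data")

print(all(f.closed for f in files), events)
```

If opening the second or third file fails, the files already entered are still closed when the with block unwinds.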
crypt¶
Addition of salt and modular crypt format (hashing method) and the mksalt()
function to the crypt module.
curses¶
If the curses module is linked to the ncursesw library, use Unicode functions when Unicode strings or characters are passed (e.g. waddwstr()), and bytes functions otherwise (e.g. waddstr()).
Use the locale encoding instead of utf-8 to encode Unicode strings.
curses.window has a new curses.window.encoding attribute.
The curses.window class has a new get_wch() method to get a wide character.
The curses module has a new unget_wch() function to push a wide character so the next get_wch() will return it.
(Contributed by Iñigo Serna in bpo-6755.)
datetime¶
Equality comparisons between naive and aware datetime instances now return False instead of raising TypeError (bpo-15006).
New datetime.datetime.timestamp() method: Return POSIX timestamp corresponding to the datetime instance.
The datetime.datetime.strftime() method supports formatting years older than 1000.
The datetime.datetime.astimezone() method can now be called without arguments to convert datetime instance to the system timezone.
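A short sketch of three of these changes, using the Python 3.3 release date as sample data:

```python
from datetime import datetime, timezone

naive = datetime(2012, 9, 29, 12, 0)
aware = datetime(2012, 9, 29, 12, 0, tzinfo=timezone.utc)

print(naive == aware)            # False instead of a TypeError

stamp = aware.timestamp()        # POSIX timestamp as a float
print(stamp)

local = aware.astimezone()       # no argument: convert to the system zone
print(local.tzinfo is not None, local == aware)
```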
decimal¶
- bpo-7652 - integrate fast native decimal arithmetic.
C-module and libmpdec written by Stefan Krah.
The new C version of the decimal module integrates the high speed libmpdec library for arbitrary precision correctly-rounded decimal floating point arithmetic. libmpdec conforms to IBM's General Decimal Arithmetic Specification.
Performance gains range from 10x for database applications to 100x for numerically intensive applications. These numbers are expected gains for standard precisions used in decimal floating point arithmetic. Since the precision is user configurable, the exact figures may vary. For example, in integer bignum arithmetic the differences can be significantly higher.
The following table is meant as an illustration. Benchmarks are available at http://www.bytereef.org/mpdecimal/quickstart.html.
           decimal.py   _decimal   speedup
pi         42.02s       0.345s     120x
telco      172.19s      5.68s      30x
psycopg    3.57s        0.29s      12x
Features¶
The FloatOperation signal optionally enables stricter semantics for mixing floats and Decimals.
If Python is compiled without threads, the C version automatically disables the expensive thread local context machinery. In this case, the variable HAVE_THREADS is set to False.
API changes¶
The C module has the following context limits, depending on the machine architecture:
            32-bit        64-bit
MAX_PREC    425000000     999999999999999999
MAX_EMAX    425000000     999999999999999999
MIN_EMIN    -425000000    -999999999999999999
In the context templates (DefaultContext, BasicContext and ExtendedContext) the magnitude of Emax and Emin has changed to 999999.
The Decimal constructor in decimal.py does not observe the context limits and converts values with arbitrary exponents or precision exactly. Since the C version has internal limits, the following scheme is used: If possible, values are converted exactly, otherwise InvalidOperation is raised and the result is NaN. In the latter case it is always possible to use create_decimal() in order to obtain a rounded or inexact value.
The power function in decimal.py is always correctly-rounded. In the C version, it is defined in terms of the correctly-rounded exp() and ln() functions, but the final result is only "almost always correctly rounded".
In the C version, the context dictionary containing the signals is a MutableMapping. For speed reasons, flags and traps always refer to the same MutableMapping that the context was initialized with. If a new signal dictionary is assigned, flags and traps are updated with the new values, but they do not reference the RHS dictionary.
Pickling a Context produces a different output in order to have a common interchange format for the Python and C versions.
The order of arguments in the Context constructor has been changed to match the order displayed by repr().
The watchexp parameter in the quantize() method is deprecated.
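The stricter float semantics offered by the FloatOperation signal can be sketched as below. Trapping the signal inside a localcontext() block is a choice made for this example so that the change stays local.

```python
from decimal import Decimal, FloatOperation, localcontext

with localcontext() as ctx:
    ctx.traps[FloatOperation] = True
    try:
        Decimal(1.1)                    # implicit conversion from float
        construction = "allowed"
    except FloatOperation:
        construction = "trapped"
    try:
        Decimal("1.5") < 1.25           # ordering comparison with a float
        comparison = "allowed"
    except FloatOperation:
        comparison = "trapped"
    exact = Decimal.from_float(1.5)     # explicit conversion stays silent

print(construction, comparison, exact)
```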
email¶
Policy Framework¶
The email package now has a policy framework. A
Policy is an object with several methods and properties
that control how the email package behaves. The primary policy for Python 3.3
is the Compat32 policy, which provides backward
compatibility with the email package in Python 3.2. A policy can be
specified when an email message is parsed by a parser, or when a
Message object is created, or when an email is
serialized using a generator. Unless overridden, a policy passed
to a parser is inherited by all the Message object and sub-objects
created by the parser. By default a generator will use the policy of
the Message object it is serializing. The default policy is
compat32.
The minimum set of controls implemented by all policy objects are:
max_line_length
The maximum length, excluding the linesep character(s), individual lines may have when a Message is serialized. Defaults to 78.
linesep
The character used to separate individual lines when a Message is serialized. Defaults to \n.
cte_type
7bit or 8bit. 8bit applies only to a Bytes generator, and means that non-ASCII may be used where allowed by the protocol (or where it exists in the original input).
raise_on_defect
Causes a parser to raise an error when defects are encountered instead of adding them to the Message object's defects list.
A new policy instance, with new settings, is created using the
clone() method of policy objects. clone takes
any of the above controls as keyword arguments. Any control not specified in
the call retains its default value. Thus you can create a policy that uses
\r\n linesep characters like this:
mypolicy = compat32.clone(linesep='\r\n')
Policies can be used to make the generation of messages in the format needed by
your application simpler. Instead of having to remember to specify
linesep='\r\n' in all the places you call a generator, you can specify
it once, when you set the policy used by the parser or the Message,
whichever your program uses to create Message objects. On the other hand,
if you need to generate messages in multiple forms, you can still specify the
parameters in the appropriate generator call. Or you can have custom
policy instances for your different cases, and pass those in when you create
the generator.
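As a rough sketch of the "specify it once" approach, the cloned policy below is attached to a Message when it is created, and serialization then picks it up without any further arguments. The subject and body are placeholder values.

```python
from email.message import Message
from email.policy import compat32

crlf_policy = compat32.clone(linesep="\r\n")

msg = Message(policy=crlf_policy)
msg["Subject"] = "demo"
msg.set_payload("body")

# Serialization uses the message's own policy by default
wire = msg.as_string()
print(repr(wire))
```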
Provisional Policy with New Header API¶
While the policy framework is worthwhile all by itself, the main motivation for introducing it is to allow the creation of new policies that implement new features for the email package in a way that maintains backward compatibility for those who do not use the new policies. Because the new policies introduce a new API, we are releasing them in Python 3.3 as a provisional policy. Backwards incompatible changes (up to and including removal of the code) may occur if deemed necessary by the core developers.
The new policies are instances of EmailPolicy,
and add the following additional controls:
refold_source
Controls whether or not headers parsed by a parser are refolded by the generator. It can be none, long, or all. The default is long, which means that source headers with a line longer than max_line_length get refolded. none means no lines get refolded, and all means that all lines get refolded.
header_factory
A callable that takes a name and value and produces a custom header object.
The header_factory is the key to the new features provided by the new
policies. When one of the new policies is used, any header retrieved from
a Message object is an object produced by the header_factory, and any
time you set a header on a Message it becomes an object produced by
header_factory. All such header objects have a name attribute equal
to the header name. Address and Date headers have additional attributes
that give you access to the parsed data of the header. This means you can now
do things like this:
>>> m = Message(policy=SMTP)
>>> m['To'] = 'Éric <foo@example.com>'
>>> m['to']
'Éric <foo@example.com>'
>>> m['to'].addresses
(Address(display_name='Éric', username='foo', domain='example.com'),)
>>> m['to'].addresses[0].username
'foo'
>>> m['to'].addresses[0].display_name
'Éric'
>>> m['Date'] = email.utils.localtime()
>>> m['Date'].datetime
datetime.datetime(2012, 5, 25, 21, 39, 24, 465484, tzinfo=datetime.timezone(datetime.timedelta(-1, 72000), 'EDT'))
>>> m['Date']
'Fri, 25 May 2012 21:44:27 -0400'
>>> print(m)
To: =?utf-8?q?=C3=89ric?= <foo@example.com>
Date: Fri, 25 May 2012 21:44:27 -0400
You will note that the unicode display name is automatically encoded as
utf-8 when the message is serialized, but that when the header is accessed
directly, you get the unicode version. This eliminates any need to deal with
the email.header decode_header() or
make_header() functions.
You can also create addresses from parts:
>>> m['cc'] = [Group('pals', [Address('Bob', 'bob', 'example.com'),
... Address('Sally', 'sally', 'example.com')]),
... Address('Bonzo', addr_spec='bonz@laugh.com')]
>>> print(m)
To: =?utf-8?q?=C3=89ric?= <foo@example.com>
Date: Fri, 25 May 2012 21:44:27 -0400
cc: pals: Bob <bob@example.com>, Sally <sally@example.com>;, Bonzo <bonz@laugh.com>
Decoding to unicode is done automatically:
>>> m2 = message_from_string(str(m))
>>> m2['to']
'Éric <foo@example.com>'
When you parse a message, you can use the addresses and groups
attributes of the header objects to access the groups and individual
addresses:
>>> m2['cc'].addresses
(Address(display_name='Bob', username='bob', domain='example.com'), Address(display_name='Sally', username='sally', domain='example.com'), Address(display_name='Bonzo', username='bonz', domain='laugh.com'))
>>> m2['cc'].groups
(Group(display_name='pals', addresses=(Address(display_name='Bob', username='bob', domain='example.com'), Address(display_name='Sally', username='sally', domain='example.com')), Group(display_name=None, addresses=(Address(display_name='Bonzo', username='bonz', domain='laugh.com'),))
In summary, if you use one of the new policies, header manipulation works the way it ought to: your application works with unicode strings, and the email package transparently encodes and decodes the unicode to and from the RFC standard Content Transfer Encodings.
Other API Changes¶
New BytesHeaderParser, added to the parser
module to complement HeaderParser and complete the Bytes
API.
New utility functions:
format_datetime(): given a datetime, produce a string formatted for use in an email header.
parsedate_to_datetime(): given a date string from an email header, convert it into an aware datetime, or a naive datetime if the offset is -0000.
localtime(): With no argument, returns the current local time as an aware datetime using the local timezone. Given an aware datetime, converts it into an aware datetime using the local timezone.
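A round trip through the new utility functions might look like this; the date reuses the one from the header examples earlier in this section, expressed in UTC.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, localtime, parsedate_to_datetime

stamp = datetime(2012, 5, 25, 21, 44, 27, tzinfo=timezone.utc)
header_value = format_datetime(stamp)
print(header_value)

parsed = parsedate_to_datetime(header_value)
print(parsed == stamp)

# An offset of -0000 produces a naive datetime
naive = parsedate_to_datetime("Fri, 25 May 2012 21:44:27 -0000")
print(naive.tzinfo)

now = localtime()
print(now.tzinfo is not None)
```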
ftplib¶
ftplib.FTP now accepts a source_address keyword argument to specify the (host, port) to use as the source address in the bind call when creating the outgoing socket. (Contributed by Giampaolo Rodolà in bpo-8594.)
The FTP_TLS class now provides a new ccc() function to revert control channel back to plaintext. This can be useful to take advantage of firewalls that know how to handle NAT with non-secure FTP without opening fixed ports. (Contributed by Giampaolo Rodolà in bpo-12139.)
Added ftplib.FTP.mlsd() method which provides a parsable directory listing format and deprecates ftplib.FTP.nlst() and ftplib.FTP.dir(). (Contributed by Giampaolo Rodolà in bpo-11072.)
functools¶
The functools.lru_cache() decorator now accepts a typed keyword
argument (that defaults to False) to ensure that it caches values of
different types that compare equal in separate cache slots. (Contributed
by Raymond Hettinger in bpo-13227.)
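The effect of typed=True can be seen with two arguments that compare equal but have different types. The describe() function is a hypothetical example.

```python
from functools import lru_cache

@lru_cache(maxsize=None, typed=True)
def describe(value):
    return type(value).__name__

# 3 == 3.0, yet each call gets its own cache entry
as_int, as_float = describe(3), describe(3.0)
print(as_int, as_float)
print(describe.cache_info().currsize)
```

With the default typed=False, the two calls could share a single entry, so the second call might return the result cached for the first.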
gc¶
It is now possible to register callbacks invoked by the garbage collector
before and after collection using the new callbacks list.
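A minimal sketch of registering a callback: it is called with a phase string ("start" or "stop") and a dict of details, of which only the "generation" key is used here. The callback is removed again afterwards so that it does not linger.

```python
import gc

phases = []

def record(phase, info):
    # phase is "start" or "stop"; info carries details such as the generation
    phases.append((phase, info["generation"]))

gc.callbacks.append(record)
try:
    gc.collect()                 # a full collection of generation 2
finally:
    gc.callbacks.remove(record)

print(phases)
```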
hmac¶
A new compare_digest() function has been added to prevent side
channel attacks on digests through timing analysis. (Contributed by Nick
Coghlan and Christian Heimes in bpo-15061.)
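A typical use is verifying a received digest against the expected one. The key and message below are placeholders, and is_authentic() is a hypothetical helper.

```python
import hashlib
import hmac

key = b"secret-key"              # placeholder key and message
message = b"amount=100"
expected = hmac.new(key, message, hashlib.sha256).hexdigest()

def is_authentic(received_digest):
    # Unlike ==, the comparison time does not depend on where the inputs differ
    return hmac.compare_digest(expected, received_digest)

print(is_authentic(expected), is_authentic("0" * len(expected)))
```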
http¶
http.server.BaseHTTPRequestHandler now buffers the headers and writes
them all at once when end_headers() is
called. A new method flush_headers()
can be used to directly manage when the accumulated headers are sent.
(Contributed by Andrew Schaaf in bpo-3709.)
http.server now produces valid HTML 4.01 strict output.
(Contributed by Ezio Melotti in bpo-13295.)
http.client.HTTPResponse now has a
readinto() method, which means it can be used
as an io.RawIOBase class. (Contributed by John Kuhn in
bpo-13464.)
html¶
html.parser.HTMLParser is now able to parse broken markup without
raising errors, therefore the strict argument of the constructor and the
HTMLParseError exception are now deprecated.
The ability to parse broken markup is the result of a number of bug fixes that
are also available on the latest bug fix releases of Python 2.7/3.2.
(Contributed by Ezio Melotti in bpo-15114, and bpo-14538,
bpo-13993, bpo-13960, bpo-13358, bpo-1745761,
bpo-755670, bpo-13357, bpo-12629, bpo-1200313,
bpo-670664, bpo-13273, bpo-12888, bpo-7311.)
A new html5 dictionary that maps HTML5 named character
references to the equivalent Unicode character(s) (e.g. html5['gt;'] ==
'>') has been added to the html.entities module. The dictionary is
now also used by HTMLParser. (Contributed by Ezio
Melotti in bpo-11113 and bpo-15156.)
imaplib¶
The IMAP4_SSL constructor now accepts an SSLContext
parameter to control parameters of the secure channel.
(Contributed by Sijin Joseph in bpo-8808.)
inspect¶
A new getclosurevars() function has been added. This function
reports the current binding of all names referenced from the function body and
where those names were resolved, making it easier to verify correct internal
state when testing code that relies on stateful closures.
(Contributed by Meador Inge and Nick Coghlan in bpo-13062.)
A new getgeneratorlocals() function has been added. This
function reports the current binding of local variables in the generator's
stack frame, making it easier to verify correct internal state when testing
generators.
(Contributed by Meador Inge in bpo-15153.)
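A minimal sketch of both introspection helpers follows. The functions make_counter() and countdown() are hypothetical examples written for this illustration, not part of inspect.

```python
import inspect


def make_counter(start):
    """Return a closure that keeps a running count."""
    count = start

    def increment(step=1):
        nonlocal count
        count += step
        return count

    return increment


counter = make_counter(10)
counter()  # count is now 11

# Names the closure refers to in its enclosing scope, with current values.
closure_state = inspect.getclosurevars(counter).nonlocals


def countdown(n):
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
next(gen)  # the generator is now suspended at the first yield

# Local variables of the suspended generator frame.
generator_state = inspect.getgeneratorlocals(gen)

print(closure_state, generator_state)
```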
io¶
The open() function has a new 'x' mode that can be used to
exclusively create a new file, and raise a FileExistsError if the file
already exists. It is based on the C11 'x' mode to fopen().
(Contributed by David Townshend in bpo-12760.)
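The exclusive-creation behaviour can be sketched as follows, using a temporary directory so that the example leaves nothing behind. The file name is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")

    # The first call creates the file because it does not exist yet.
    with open(path, "x") as f:
        f.write("first version\n")

    # The second call refuses to overwrite the existing file.
    try:
        with open(path, "x") as f:
            f.write("second version\n")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    with open(path) as f:
        contents = f.read()

print(second_create_failed, repr(contents))
```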
The constructor of the TextIOWrapper class has a new
write_through optional argument. If write_through is True, calls to
write() are guaranteed not to be buffered: any data
written on the TextIOWrapper object is immediately passed to its
underlying binary buffer.
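A minimal sketch of write_through, assuming an in-memory io.BytesIO object stands in for the underlying binary buffer: the encoded bytes are visible in it right after write(), without an explicit flush().

```python
import io

raw = io.BytesIO()  # stands in for the underlying binary buffer
text = io.TextIOWrapper(raw, encoding="utf-8", write_through=True)

text.write("héllo")

# No flush() was called, yet the encoded bytes already reached the buffer.
bytes_after_write = raw.getvalue()
print(bytes_after_write)
```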
itertools¶
accumulate() now takes an optional func argument for
providing a user-supplied binary function.
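For illustration, the sketch below contrasts the default running sum with two user-supplied binary functions. The input list is arbitrary.

```python
from itertools import accumulate
import operator

values = [3, 1, 4, 1, 5]

running_sum = list(accumulate(values))                    # default: addition
running_max = list(accumulate(values, max))               # largest value so far
running_product = list(accumulate(values, operator.mul))  # cumulative product

print(running_sum, running_max, running_product)
```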
logging¶
The basicConfig() function now supports an optional handlers
argument taking an iterable of handlers to be added to the root logger.
A class level attribute append_nul has
been added to SysLogHandler to allow control of the
appending of the NUL (\000) byte to syslog records, since for some
daemons it is required while for others it is passed through to the log.
math¶
The math module has a new function, log2(), which returns the base-2 logarithm of x.
(Written by Mark Dickinson in bpo-11888.)
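A brief usage sketch, with arbitrarily chosen inputs:

```python
import math

# Powers of two give exact results.
exact = math.log2(1024)

# Number of bits needed to distinguish 1000 different values.
bits_needed = math.ceil(math.log2(1000))

print(exact, bits_needed)
```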
mmap¶
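The read() method is now more compatible with other file-like objects: if the argument is omitted or specified as None, it returns the bytes from the current file position to the end of the mapping. (Contributed by Petri Lehtinen in bpo-12021.)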
multiprocessing¶
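The new multiprocessing.connection.wait() function allows polling multiple objects (such as connections, sockets and pipes) with a timeout. (Contributed by Richard Oudkerk in bpo-12328.)
multiprocessing.Connection objects can now be transferred over multiprocessing connections. (Contributed by Richard Oudkerk in bpo-4892.)
multiprocessing.Process now accepts a daemon keyword argument to override the default behavior of inheriting the daemon flag from the parent process (bpo-6064).
The new attribute multiprocessing.Process.sentinel allows a program to wait on multiple Process objects at one time using the appropriate OS primitives (for example, select on posix systems).
The new methods multiprocessing.pool.Pool.starmap() and starmap_async() provide itertools.starmap() equivalents to the existing multiprocessing.pool.Pool.map() and map_async() functions. (Contributed by Hynek Schlawack in bpo-12708.)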
nntplib¶
The nntplib.NNTP class now supports the context management protocol to
unconditionally consume socket.error exceptions and to close the NNTP
connection when done:
>>> from nntplib import NNTP
>>> with NNTP('news.gmane.org') as n:
... n.group('gmane.comp.python.committers')
...
('211 1755 1 1755 gmane.comp.python.committers', 1755, 1, 1755, 'gmane.comp.python.committers')
>>>
(Contributed by Giampaolo Rodolà in bpo-9795.)
os¶
The os module has a new pipe2() function that makes it possible to create a pipe with O_CLOEXEC or O_NONBLOCK flags set atomically. This is especially useful to avoid race conditions in multi-threaded programs.
The os module has a new sendfile() function which provides an efficient "zero-copy" way of copying data from one file (or socket) descriptor to another. The phrase "zero-copy" refers to the fact that all of the copying of data between the two descriptors is done entirely by the kernel, with no copying of data into userspace buffers. sendfile() can be used to efficiently copy data from a file on disk to a network socket, e.g. for downloading a file.
(Patch submitted by Ross Lagerwall and Giampaolo Rodolà in bpo-10882.)
To avoid race conditions like symlink attacks and issues with temporary files and directories, it is more reliable (and also faster) to manipulate file descriptors instead of file names. Python 3.3 enhances existing functions and introduces new functions to work on file descriptors (bpo-4761, bpo-10755 and bpo-14626).
The os module has a new fwalk() function similar to walk() except that it also yields file descriptors referring to the directories visited. This is especially useful to avoid symlink races.
The following functions get new optional dir_fd (paths relative to directory descriptors) and/or follow_symlinks (not following symlinks) parameters: access(), chflags(), chmod(), chown(), link(), lstat(), mkdir(), mkfifo(), mknod(), open(), readlink(), remove(), rename(), replace(), rmdir(), stat(), symlink(), unlink(), utime(). Platform support for using these parameters can be checked via the sets os.supports_dir_fd and os.supports_follow_symlinks.
The following functions now support a file descriptor for their path argument: chdir(), chmod(), chown(), execve(), listdir(), pathconf(), exists(), stat(), statvfs(), utime(). Platform support for this can be checked via the os.supports_fd set.
access() accepts an effective_ids keyword argument to turn on using the effective uid/gid rather than the real uid/gid in the access check. Platform support for this can be checked via the supports_effective_ids set.
The os module has two new functions: getpriority() and setpriority(). They can be used to get or set process niceness/priority in a fashion similar to os.nice() but extended to all processes instead of just the current one.
(Patch submitted by Giampaolo Rodolà in bpo-10784.)
The new os.replace() function allows cross-platform renaming of a file, overwriting the destination. os.rename() overwrites an existing destination file under POSIX, but raises an error under Windows. (Contributed by Antoine Pitrou in bpo-8828.)
The stat family of functions (stat(), fstat(), and lstat()) now supports reading a file's timestamps with nanosecond precision. Symmetrically, utime() can now write file timestamps with nanosecond precision. (Contributed by Larry Hastings in bpo-14127.)
The new os.get_terminal_size() function queries the size of the terminal attached to a file descriptor. See also shutil.get_terminal_size(). (Contributed by Zbigniew Jędrzejewski-Szmek in bpo-13609.)
New functions to support Linux extended attributes (bpo-12720): getxattr(), listxattr(), removexattr(), setxattr().
New interface to the scheduler. These functions control how a process is allocated CPU time by the operating system. New functions: sched_get_priority_max(), sched_get_priority_min(), sched_getaffinity(), sched_getparam(), sched_getscheduler(), sched_rr_get_interval(), sched_setaffinity(), sched_setparam(), sched_setscheduler(), sched_yield().
New functions to control the file system:
posix_fadvise(): Announces an intention to access data in a specific pattern, thus allowing the kernel to make optimizations.
posix_fallocate(): Ensures that enough disk space is allocated for a file.
sync(): Force write of everything to disk.
Additional new posix functions:
lockf(): Apply, test or remove a POSIX lock on an open file descriptor.
pread(): Read from a file descriptor at an offset, the file offset remains unchanged.
pwrite(): Write to a file descriptor from an offset, leaving the file offset unchanged.
readv(): Read from a file descriptor into a number of writable buffers.
truncate(): Truncate the file corresponding to path, so that it is at most length bytes in size.
waitid(): Wait for the completion of one or more child processes.
writev(): Write the contents of buffers to a file descriptor, where buffers is an arbitrary sequence of buffers.
getgrouplist() (bpo-9344): Return list of group ids that specified user belongs to.
times() and uname(): Return type changed from a tuple to a tuple-like object with named attributes.
Some platforms now support additional constants for the lseek() function, such as os.SEEK_HOLE and os.SEEK_DATA.
New constants RTLD_LAZY, RTLD_NOW, RTLD_GLOBAL, RTLD_LOCAL, RTLD_NODELETE, RTLD_NOLOAD, and RTLD_DEEPBIND are available on platforms that support them. These are for use with the sys.setdlopenflags() function, and supersede the similar constants defined in ctypes and DLFCN. (Contributed by Victor Stinner in bpo-13226.)
os.symlink() now accepts (and ignores) the target_is_directory keyword argument on non-Windows platforms, to ease cross-platform support.
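Two of the portable additions to os can be sketched together: os.replace() for overwrite-on-rename, and the integer-nanosecond timestamp interface (the ns argument of os.utime() and the st_mtime_ns field of the stat result). File names and timestamp values are arbitrary, and whole seconds are used so the result does not depend on filesystem timestamp resolution.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "new.txt")
    dst = os.path.join(tmp, "current.txt")
    with open(dst, "w") as f:
        f.write("old")
    with open(src, "w") as f:
        f.write("new")

    # Overwrites an existing destination on both POSIX and Windows.
    os.replace(src, dst)
    with open(dst) as f:
        replaced_contents = f.read()
    source_still_exists = os.path.exists(src)

    # Write and read back timestamps as integer nanoseconds: (atime, mtime).
    os.utime(dst, ns=(1 * 10**9, 2 * 10**9))
    mtime_ns = os.stat(dst).st_mtime_ns

print(replaced_contents, source_still_exists, mtime_ns)
```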
pdb¶
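Tab-completion is now available not only for command names, but also for their arguments. For example, for the break command, function and file names are completed.
(Contributed by Georg Brandl in bpo-14210)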
pickle¶
pickle.Pickler objects now have an optional dispatch_table attribute allowing per-pickler reduction functions to be set.
(Contributed by Richard Oudkerk in bpo-14166.)
pydoc¶
The Tk GUI and the serve() function have been removed from the pydoc module: pydoc -g and serve() were deprecated in Python 3.2.
re¶
str regular expressions now support \u and \U escapes.
(Contributed by Serhiy Storchaka in bpo-3665.)
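A short sketch of the new escape: the pattern is a raw string, so the \u escape is interpreted by the regular expression engine rather than by the Python string literal. The sample text is arbitrary.

```python
import re

# Raw string: the backslash-u escape reaches the re module unchanged.
pattern = re.compile(r"caf\u00e9")
match = pattern.search("un café noir")

print(match.group() if match else None)
```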
sched¶
run() now accepts a blocking parameter which when set to false makes the method execute the scheduled events due to expire soonest (if any) and then return immediately. This is useful in case you want to use the scheduler in non-blocking applications. (Contributed by Giampaolo Rodolà in bpo-13449.)
The scheduler class can now be safely used in multi-threaded environments. (Contributed by Josiah Carlson and Giampaolo Rodolà in bpo-8684.)
The timefunc and delayfunc parameters of the scheduler class constructor are now optional and default to time.time() and time.sleep() respectively. (Contributed by Chris Clark in bpo-13245.)
The argument parameter of enter() and enterabs() is now optional. (Contributed by Chris Clark in bpo-13245.)
enter() and enterabs() now accept a kwargs parameter. (Contributed by Chris Clark in bpo-13245.)
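The sched changes listed above can be sketched together: a scheduler built with default time functions, an event that passes keyword arguments, and a non-blocking run() that executes only what is already due. The record() callback and its labels are hypothetical names used for this illustration.

```python
import sched

calls = []


def record(label, repeat=1):
    """Hypothetical callback that remembers how it was called."""
    calls.append(label * repeat)


scheduler = sched.scheduler()  # timefunc and delayfunc fall back to defaults

# Due immediately; uses both positional and keyword arguments.
scheduler.enter(0, 1, record, argument=("a",), kwargs={"repeat": 2})
# Due in an hour; must not run during this example.
scheduler.enter(3600, 1, record, argument=("late",))

# Runs the events that are already due, then returns instead of sleeping.
next_delay = scheduler.run(blocking=False)

print(calls, len(scheduler.queue))
```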
select¶
Solaris and derivative platforms have a new class select.devpoll
for high performance asynchronous sockets via /dev/poll.
(Contributed by Jesús Cea Avión in bpo-6397.)
shlex¶
The previously undocumented helper function quote from the
pipes module has been moved to the shlex module and
documented. quote() properly escapes all characters in a string
that might be otherwise given special meaning by the shell.
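As a minimal sketch, the example below quotes a hostile file name before embedding it in a command line, then checks with shlex.split() that it survives as a single argument. The file name is made up for the illustration.

```python
import shlex

filename = "my file; rm -rf ~.txt"  # contains a space and a shell metacharacter

# quote() wraps the value so the shell sees one literal argument.
command = "ls -l " + shlex.quote(filename)

# Splitting with shell-like rules recovers the original name unchanged.
parsed = shlex.split(command)
print(command)
```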
shutil¶
New functions:
disk_usage(): provides total, used and free disk space statistics. (Contributed by Giampaolo Rodolà in bpo-12442.)
chown(): allows one to change the user and/or group of the given path, specifying user/group names as well as numeric ids. (Contributed by Sandro Tosi in bpo-12191.)
shutil.get_terminal_size(): returns the size of the terminal window to which the interpreter is attached. (Contributed by Zbigniew Jędrzejewski-Szmek in bpo-13609.)
copy2() and copystat() now preserve file timestamps with nanosecond precision on platforms that support it. They also preserve file "extended attributes" on Linux. (Contributed by Larry Hastings in bpo-14127 and bpo-15238.)
Several functions now take an optional symlinks argument: when that parameter is true, symlinks aren't dereferenced and the operation instead acts on the symlink itself (or creates one, if relevant). (Contributed by Hynek Schlawack in bpo-12715.)
When copying files to a different file system, move() now handles symlinks the way the posix mv command does, recreating the symlink rather than copying the target file contents. (Contributed by Jonathan Niehof in bpo-9993.) move() now also returns the dst argument as its result.
rmtree() is now resistant to symlink attacks on platforms which support the new dir_fd parameter in os.open() and os.unlink(). (Contributed by Martin von Löwis and Hynek Schlawack in bpo-4489.)
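A brief sketch of two of the new shutil functions. The path "." and the fallback size are arbitrary choices; the fallback is what get_terminal_size() returns when no terminal is attached.

```python
import shutil

# Named tuple with total, used and free space in bytes.
usage = shutil.disk_usage(".")

# Falls back to the given size when the terminal size cannot be queried.
columns, lines = shutil.get_terminal_size(fallback=(80, 24))

print(usage.total, usage.used, usage.free, columns, lines)
```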
signal¶
The signal module has new functions:
pthread_sigmask(): fetch and/or change the signal mask of the calling thread (Contributed by Jean-Paul Calderone in bpo-8407);
pthread_kill(): send a signal to a thread;
sigpending(): examine pending signals;
sigwait(): wait for a signal;
sigwaitinfo(): wait for a signal, returning detailed information about it;
sigtimedwait(): like sigwaitinfo() but with a timeout.
The signal handler writes the signal number as a single byte instead of a nul byte into the wakeup file descriptor. So it is possible to wait for more than one signal and know which signals were raised.
signal.signal() and signal.siginterrupt() raise an OSError, instead of a RuntimeError: OSError has an errno attribute.
smtpd¶
The smtpd module now supports RFC 5321 (extended SMTP) and RFC 1870
(size extension). Per the standard, these extensions are enabled if and only
if the client initiates the session with an EHLO command.
(Initial EHLO support by Alberto Trevino. Size extension by Juhana
Jauhiainen. Substantial additional work on the patch contributed by Michele
Orrù and Dan Boswell. bpo-8739)
smtplib¶
The SMTP, SMTP_SSL, and
LMTP classes now accept a source_address keyword argument
to specify the (host, port) to use as the source address in the bind call
when creating the outgoing socket. (Contributed by Paulo Scardine in
bpo-11281.)
SMTP now supports the context management protocol, allowing an
SMTP instance to be used in a with statement. (Contributed
by Giampaolo Rodolà in bpo-11289.)
The SMTP_SSL constructor and the starttls()
method now accept an SSLContext parameter to control parameters of the secure
channel. (Contributed by Kasun Herath in bpo-8809.)
socket¶
The socket class now exposes additional methods to process ancillary data when supported by the underlying platform. (Contributed by David Watson in bpo-6560, based on an earlier patch by Heiko Wundram)
The socket class now supports the PF_CAN protocol family (https://en.wikipedia.org/wiki/Socketcan), on Linux (https://lwn.net/Articles/253425).
(Contributed by Matthias Fuchs, updated by Tiago Gonçalves in bpo-10141.)
The socket class now supports the PF_RDS protocol family (https://en.wikipedia.org/wiki/Reliable_Datagram_Sockets and https://oss.oracle.com/projects/rds/).
The socket class now supports the PF_SYSTEM protocol family on OS X. (Contributed by Michael Goderbauer in bpo-13777.)
New function sethostname() allows the hostname to be set on unix systems if the calling process has sufficient privileges. (Contributed by Ross Lagerwall in bpo-10866.)
socketserver¶
BaseServer now has an overridable method
service_actions() that is called by the
serve_forever() method in the service loop.
ForkingMixIn now uses this to clean up zombie
child processes. (Contributed by Justin Warkentin in bpo-11109.)
sqlite3¶
The new sqlite3.Connection method set_trace_callback() can be used to capture a trace of all sql commands processed by sqlite. (Contributed by Torsten Landschoff in bpo-11688.)
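A minimal sketch of tracing, assuming an in-memory database and a plain list as the collector. The table name is arbitrary, and the exact set of traced statements may vary between sqlite versions.

```python
import sqlite3

statements = []

conn = sqlite3.connect(":memory:")
# The callback receives each SQL statement as a string when it is executed.
conn.set_trace_callback(statements.append)

conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")

conn.set_trace_callback(None)  # passing None disables tracing again
conn.close()

for statement in statements:
    print(statement)
```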
ssl¶
The ssl module has two new random generation functions:
RAND_bytes(): generate cryptographically strong pseudo-random bytes.
RAND_pseudo_bytes(): generate pseudo-random bytes.
(Contributed by Victor Stinner in bpo-12049.)
The ssl module now exposes a finer-grained exception hierarchy in order to make it easier to inspect the various kinds of errors. (Contributed by Antoine Pitrou in bpo-11183.)
load_cert_chain() now accepts a password argument to be used if the private key is encrypted. (Contributed by Adam Simpkins in bpo-12803.)
Diffie-Hellman key exchange, both regular and Elliptic Curve-based, is now supported through the load_dh_params() and set_ecdh_curve() methods. (Contributed by Antoine Pitrou in bpo-13626 and bpo-13627.)
SSL sockets have a new get_channel_binding() method allowing the implementation of certain authentication mechanisms such as SCRAM-SHA-1-PLUS. (Contributed by Jacek Konieczny in bpo-12551.)
You can query the SSL compression algorithm used by an SSL socket, thanks to its new compression() method. The new attribute OP_NO_COMPRESSION can be used to disable compression. (Contributed by Antoine Pitrou in bpo-13634.)
Support has been added for the Next Protocol Negotiation extension using the ssl.SSLContext.set_npn_protocols() method. (Contributed by Colin Marc in bpo-14204.)
SSL errors can now be introspected more easily thanks to library and reason attributes. (Contributed by Antoine Pitrou in bpo-14837.)
The get_server_certificate() function now supports IPv6. (Contributed by Charles-François Natali in bpo-11811.)
New attribute OP_CIPHER_SERVER_PREFERENCE allows setting SSLv3 server sockets to use the server's cipher ordering preference rather than the client's (bpo-13635).
stat¶
The undocumented tarfile.filemode function has been moved to
stat.filemode(). It can be used to convert a file's mode to a string of
the form '-rwxrwxrwx'.
(Contributed by Giampaolo Rodolà in bpo-14807.)
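For illustration, the sketch below converts two hand-built mode values. In real code the mode would normally come from os.stat(path).st_mode.

```python
import stat

# Regular file with permissions 755.
regular = stat.filemode(0o100755)

# Directory with permissions 750.
directory = stat.filemode(stat.S_IFDIR | 0o750)

print(regular, directory)
```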
struct¶
The struct module now supports ssize_t and size_t via the
new codes n and N, respectively. (Contributed by Antoine Pitrou
in bpo-3163.)
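A short round-trip sketch of the new codes. They are only available in native mode (no byte-order prefix, or '@'), and the packed size depends on the platform, so the example checks values rather than byte strings.

```python
import struct

# 'n' is ssize_t (signed), 'N' is size_t (unsigned); native mode only.
packed = struct.pack("nN", -1, 1)
signed_value, unsigned_value = struct.unpack("nN", packed)

# Both types have the same width on a given platform.
size_matches = struct.calcsize("n") == struct.calcsize("N")

print(signed_value, unsigned_value, struct.calcsize("n"))
```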
subprocess¶
Command strings can now be bytes objects on posix platforms. (Contributed by Victor Stinner in bpo-8513.)
A new constant DEVNULL allows suppressing output in a
platform-independent fashion. (Contributed by Ross Lagerwall in
bpo-5870.)
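A minimal sketch of DEVNULL, assuming sys.executable points at a usable interpreter: the child process prints a line, but its output is discarded without opening os.devnull by hand.

```python
import subprocess
import sys

# Child output goes to the platform's null device instead of our stdout.
exit_code = subprocess.call(
    [sys.executable, "-c", "print('discarded')"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)

print(exit_code)
```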
sys¶
The sys module has a new thread_info named
tuple holding information about the thread implementation
(bpo-11223).
tarfile¶
tarfile now supports lzma encoding via the lzma module.
(Contributed by Lars Gustäbel in bpo-5689.)
tempfile¶
tempfile.SpooledTemporaryFile's
truncate() method now accepts
a size parameter. (Contributed by Ryan Kelly in bpo-9957.)
textwrap¶
The textwrap module has a new indent() that makes
it straightforward to add a common prefix to selected lines in a block
of text (bpo-13857).
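The sketch below shows the default behaviour, which skips lines consisting solely of whitespace, and a predicate that selects every line. The sample text and the "> " prefix are arbitrary.

```python
import textwrap

text = "first\n\nsecond\n"

# By default, whitespace-only lines are left without the prefix.
quoted = textwrap.indent(text, "> ")

# A predicate decides per line whether the prefix is added.
all_lines = textwrap.indent(text, "> ", predicate=lambda line: True)

print(quoted)
print(all_lines)
```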
threading¶
threading.Condition, threading.Semaphore,
threading.BoundedSemaphore, threading.Event, and
threading.Timer, all of which used to be factory functions returning a
class instance, are now classes and may be subclassed. (Contributed by Éric
Araujo in bpo-10968.)
The threading.Thread constructor now accepts a daemon keyword
argument to override the default behavior of inheriting the daemon flag
value from the parent thread (bpo-6064).
The formerly private function _thread.get_ident is now available as the
public function threading.get_ident(). This eliminates several cases of
direct access to the _thread module in the stdlib. Third party code that
used _thread.get_ident should likewise be changed to use the new public
interface.
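Both additions can be sketched in a few lines: the daemon keyword argument of the Thread constructor and the public threading.get_ident(). The worker() function and the seen dictionary are illustrative names.

```python
import threading

seen = {}


def worker():
    # Record this thread's identifier and daemon flag.
    seen["ident"] = threading.get_ident()
    seen["daemon"] = threading.current_thread().daemon


# daemon=True overrides the flag inherited from the creating thread.
t = threading.Thread(target=worker, daemon=True)
t.start()
t.join()

print(seen["daemon"], seen["ident"] == t.ident)
```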
time¶
PEP 418 added new functions to the time module:
get_clock_info(): Get information on a clock.
monotonic(): Monotonic clock (cannot go backward), not affected by system clock updates.
perf_counter(): Performance counter with the highest available resolution to measure a short duration.
process_time(): Sum of the system and user CPU time of the current process.
Other new functions:
clock_getres(), clock_gettime() and clock_settime() functions with CLOCK_xxx constants. (Contributed by Victor Stinner in bpo-10278.)
To improve cross platform consistency, sleep() now raises a
ValueError when passed a negative sleep value. Previously this was an
error on posix, but produced an infinite sleep on Windows.
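A brief sketch of the new clocks and of the sleep() change. The workload being timed is arbitrary, and no particular elapsed time is assumed.

```python
import time

# perf_counter() is intended for measuring short durations.
start = time.perf_counter()
total = sum(range(100000))
elapsed = time.perf_counter() - start

# monotonic() never goes backward, even if the system clock is adjusted.
t0 = time.monotonic()
t1 = time.monotonic()

# Metadata about a clock: implementation, resolution, monotonic, adjustable.
info = time.get_clock_info("monotonic")

# Negative sleep values are rejected on every platform.
try:
    time.sleep(-1)
    negative_sleep_rejected = False
except ValueError:
    negative_sleep_rejected = True

print(elapsed, info.resolution, negative_sleep_rejected)
```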
types¶
Add a new types.MappingProxyType class: Read-only proxy of a mapping.
(bpo-14386)
The new functions types.new_class() and types.prepare_class() provide support
for PEP 3115 compliant dynamic type creation. (bpo-14588)
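The following sketch shows a read-only view over a dictionary and a class created dynamically with types.new_class(). The names settings and Point are hypothetical.

```python
import types

settings = {"debug": False}
view = types.MappingProxyType(settings)

# The proxy reflects changes made to the underlying mapping...
settings["debug"] = True

# ...but does not allow writes through the proxy itself.
try:
    view["debug"] = False
    write_allowed = True
except TypeError:
    write_allowed = False

# exec_body receives the class namespace and populates it.
Point = types.new_class("Point", (), {}, lambda ns: ns.update(x=0, y=0))

print(view["debug"], write_allowed, Point.__name__, Point.x)
```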
unittest¶
assertRaises(), assertRaisesRegex(), assertWarns(), and
assertWarnsRegex() now accept a keyword argument msg when used as
context managers. (Contributed by Ezio Melotti and Winston Ewert in
bpo-10775.)
unittest.TestCase.run() now returns the TestResult
object.
urllib¶
The Request class now accepts a method argument
used by get_method() to determine what HTTP method
should be used. For example, this will send a 'HEAD' request:
>>> from urllib.request import Request, urlopen
>>> urlopen(Request('https://www.python.org', method='HEAD'))
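The effect of the method argument can also be checked without any network access, as in this small sketch, by asking the request object which HTTP method it would use.

```python
from urllib.request import Request

head = Request("https://www.python.org", method="HEAD")
default = Request("https://www.python.org")  # no data, no method: GET

# get_method() reports the HTTP method that would be sent.
methods = (head.get_method(), default.get_method())
print(methods)
```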
webbrowser¶
The webbrowser module supports more "browsers": Google Chrome (named
chrome, chromium, chrome-browser or
chromium-browser depending on the version and operating system),
and the generic launchers xdg-open, from the FreeDesktop.org
project, and gvfs-open, which is the default URI handler for GNOME
3. (The former contributed by Arnaud Calmettes in bpo-13620, the latter
by Matthias Klose in bpo-14493.)
xml.etree.ElementTree¶
The xml.etree.ElementTree module now imports its C accelerator by
default; there is no longer a need to explicitly import
xml.etree.cElementTree (this module stays for backwards compatibility,
but is now deprecated). In addition, the iter family of methods of
Element has been optimized (rewritten in C).
The module's documentation has also been greatly improved with added examples
and a more detailed reference.
zlib¶
New attribute zlib.Decompress.eof makes it possible to distinguish
between a properly-formed compressed stream and an incomplete or truncated one.
(Contributed by Nadeem Vawda in bpo-12646.)
New attribute zlib.ZLIB_RUNTIME_VERSION reports the version string of
the underlying zlib library that is loaded at runtime. (Contributed by
Torsten Landschoff in bpo-12306.)
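A minimal sketch of the eof attribute: the same compressed payload is fed to two decompression objects, once complete and once with its last bytes cut off. The payload and the number of bytes removed are arbitrary.

```python
import zlib

payload = zlib.compress(b"hello world " * 100)

complete = zlib.decompressobj()
complete.decompress(payload)

truncated = zlib.decompressobj()
truncated.decompress(payload[:-5])  # the end of the stream is missing

# eof is True only once the end of the compressed stream has been reached.
print(complete.eof, truncated.eof, zlib.ZLIB_RUNTIME_VERSION)
```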
Optimizations¶
Major performance enhancements have been added:
Thanks to PEP 393, some operations on Unicode strings have been optimized:
the memory footprint is divided by 2 to 4 depending on the text
encoding an ASCII string to UTF-8 no longer needs to encode characters; the UTF-8 representation is shared with the ASCII representation
the UTF-8 encoder has been optimized
repeating a single ASCII letter and getting a substring of an ASCII string are 4 times faster
UTF-8 is now 2x to 4x faster. UTF-16 encoding is now up to 10x faster.
(Contributed by Serhiy Storchaka in bpo-14624, bpo-14738 and bpo-15026.)
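Build and C API Changes¶
Changes to Python's build process and to the C API include:
New PEP 3118 related function:
PEP 393 added new Unicode types, macros and functions
High-level API
Low-level API:
PyArg_ParseTuple now accepts a bytearray for the c format (bpo-12380).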
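Deprecated¶
Unsupported Operating Systems¶
OS/2 and VMS are no longer supported due to the lack of a maintainer.
Windows 2000 and Windows platforms which set COMSPEC to command.com are no longer supported due to the maintenance burden.
OSF support, which was deprecated in 3.2, has been completely removed.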
Deprecated Python modules, functions and methods¶
Passing a non-empty string to object.__format__() is deprecated, and will produce a TypeError in Python 3.4 (bpo-9856).
The unicode_internal codec has been deprecated because of PEP 393. Use UTF-8, UTF-16 (utf-16-le or utf-16-be), or UTF-32 (utf-32-le or utf-32-be) instead.
ftplib.FTP.nlst() and ftplib.FTP.dir(): use ftplib.FTP.mlsd()
platform.popen(): use the subprocess module. Check especially the "Replacing Older Functions with the subprocess Module" section (bpo-11377).
bpo-13374: The Windows bytes API has been deprecated in the os module. Use Unicode filenames, instead of bytes filenames, to not depend on the ANSI code page anymore and to support any filename.
bpo-13988: The xml.etree.cElementTree module is deprecated. The accelerator is used automatically whenever available.
The behaviour of time.clock() depends on the platform: use the new time.perf_counter() or time.process_time() function instead, depending on your requirements, to have a well defined behaviour.
The os.stat_float_times() function is deprecated.
abc module:
abc.abstractproperty has been deprecated, use property with abc.abstractmethod() instead.
abc.abstractclassmethod has been deprecated, use classmethod with abc.abstractmethod() instead.
abc.abstractstaticmethod has been deprecated, use staticmethod with abc.abstractmethod() instead.
importlib package:
importlib.abc.SourceLoader.path_mtime() is now deprecated in favour of importlib.abc.SourceLoader.path_stats(), as bytecode files now store both the modification time and size of the source file the bytecode file was compiled from.
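The replacement spelling for the deprecated abc helpers can be sketched as follows: the built-in decorator is stacked on top of abc.abstractmethod(), which must be the innermost decorator. The Shape and Square classes are hypothetical.

```python
from abc import ABCMeta, abstractmethod


class Shape(metaclass=ABCMeta):
    @property
    @abstractmethod  # replaces abc.abstractproperty
    def area(self):
        ...

    @classmethod
    @abstractmethod  # replaces abc.abstractclassmethod
    def unit(cls):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    @property
    def area(self):
        return self.side ** 2

    @classmethod
    def unit(cls):
        return cls(1)


# The abstract base cannot be instantiated; the concrete subclass can.
try:
    Shape()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

unit_area = Square.unit().area
print(abstract_instantiable, unit_area)
```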
Deprecated functions and types of the C API¶
Py_UNICODE has been deprecated by PEP 393 and will be removed in Python 4. All functions using this type are deprecated:
Unicode functions and methods using Py_UNICODE and Py_UNICODE* types:
PyUnicode_FromUnicode: use PyUnicode_FromWideChar() or PyUnicode_FromKindAndData()
PyUnicode_AS_UNICODE, PyUnicode_AsUnicode(), PyUnicode_AsUnicodeAndSize(): use PyUnicode_AsWideCharString()
PyUnicode_AS_DATA: use PyUnicode_DATA with PyUnicode_READ and PyUnicode_WRITE
PyUnicode_GET_SIZE, PyUnicode_GetSize(): use PyUnicode_GET_LENGTH or PyUnicode_GetLength()
PyUnicode_GET_DATA_SIZE: use PyUnicode_GET_LENGTH(str) * PyUnicode_KIND(str) (only works on ready strings)
PyUnicode_AsUnicodeCopy(): use PyUnicode_AsUCS4Copy() or PyUnicode_AsWideCharString()
PyUnicode_GetMax()
Functions and macros manipulating Py_UNICODE* strings:
Py_UNICODE_strlen: use PyUnicode_GetLength() or PyUnicode_GET_LENGTH
Py_UNICODE_strcat: use PyUnicode_CopyCharacters() or PyUnicode_FromFormat()
Py_UNICODE_strcpy, Py_UNICODE_strncpy, Py_UNICODE_COPY: use PyUnicode_CopyCharacters() or PyUnicode_Substring()
Py_UNICODE_strcmp: use PyUnicode_Compare()
Py_UNICODE_strncmp: use PyUnicode_Tailmatch()
Py_UNICODE_strchr, Py_UNICODE_strrchr: use PyUnicode_FindChar()
Py_UNICODE_FILL: use PyUnicode_Fill()
Py_UNICODE_MATCH
Encoders:
PyUnicode_Encode(): use PyUnicode_AsEncodedObject()
PyUnicode_EncodeUTF8(): use PyUnicode_AsUTF8() or PyUnicode_AsUTF8String()
PyUnicode_EncodeUnicodeEscape(): use PyUnicode_AsUnicodeEscapeString()
PyUnicode_EncodeRawUnicodeEscape(): use PyUnicode_AsRawUnicodeEscapeString()
PyUnicode_EncodeMBCS(): use PyUnicode_AsMBCSString() or PyUnicode_EncodeCodePage() (with CP_ACP code_page)
PyUnicode_EncodeDecimal(), PyUnicode_TransformDecimalToASCII()
Deprecated features¶
The array module's 'u' format code is now deprecated and will be removed in Python 4 together with the rest of the (Py_UNICODE) API.
Porting to Python 3.3¶
This section lists previously described changes and other bugfixes that may require changes to your code.
Porting Python code¶
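Hash randomization is enabled by default. Set the PYTHONHASHSEED environment variable to 0 to disable hash randomization. See also the object.__hash__() method.
bpo-12326: On Linux, sys.platform doesn't contain the major version anymore. It is now always 'linux', instead of 'linux2' or 'linux3' depending on the Linux version used to build Python. Replace sys.platform == 'linux2' with sys.platform.startswith('linux'), or directly sys.platform == 'linux' if you don't need to support older Python versions.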
bpo-13847, bpo-14180: time and datetime: OverflowError is now raised instead of ValueError if a timestamp is out of range. OSError is now raised if the C functions gmtime() or localtime() failed.
The default finders used by import now utilize a cache of what is contained within a specific directory. If you create a Python source file or sourceless bytecode file, make sure to call importlib.invalidate_caches() to clear out the cache for the finders to notice the new file.
ImportError now uses the full name of the module that was attempted to be imported. Doctests that check ImportErrors' message will need to be updated to use the full name of the module instead of just the tail of the name.
The level argument to __import__() now defaults to 0 instead of -1 and no longer supports negative values. It was an oversight when PEP 328 was implemented that the default value remained -1. If you need to continue to perform a relative import followed by an absolute import, then perform the relative import using a level of 1, followed by another import using a level of 0. It is preferred, though, that you use importlib.import_module() rather than call __import__() directly.
__import__() no longer allows one to use a level value other than 0 for top-level modules. E.g. __import__('sys', level=1) is now an error.
Because sys.meta_path and sys.path_hooks now have finders on them by default, you will most likely want to use list.insert() instead of list.append() to add to those lists.
Because None is now inserted into sys.path_importer_cache, if you are clearing out entries in the dictionary of paths that do not have a finder, you will need to remove keys paired with values of None and imp.NullImporter to be backwards-compatible. This will lead to extra overhead on older versions of Python that re-insert None into sys.path_importer_cache where it represents the use of implicit finders, but semantically it should not change anything.
importlib.abc.Finder no longer specifies a find_module() abstract method that must be implemented. If you were relying on subclasses to implement that method, make sure to check for the method's existence first. You will probably want to check for find_loader() first, though, in the case of working with path entry finders.
pkgutil has been converted to use importlib internally. This eliminates many edge cases where the old behaviour of the PEP 302 import emulation failed to match the behaviour of the real import system. The import emulation itself is still present, but is now deprecated. The pkgutil.iter_importers() and pkgutil.walk_packages() functions special case the standard import hooks so they are still supported even though they do not provide the non-standard iter_modules() method.
A longstanding RFC-compliance bug (bpo-1079) in the parsing done by email.header.decode_header() has been fixed. Code that uses the standard idiom to convert encoded headers into unicode (str(make_header(decode_header(h)))) will see no change, but code that looks at the individual tuples returned by decode_header will see that whitespace that precedes or follows ASCII sections is now included in the ASCII section. Code that builds headers using make_header should also continue to work without change, since make_header continues to add whitespace between ASCII and non-ASCII sections if it is not already present in the input strings.
email.utils.formataddr() now does the correct content transfer encoding when passed non-ASCII display names. Any code that depended on the previous buggy behavior that preserved the non-ASCII unicode in the formatted output string will need to be changed (bpo-1690608).
poplib.POP3.quit() may now raise protocol errors like all other poplib methods. Code that assumes quit does not raise poplib.error_proto errors may need to be changed if errors on quit are encountered by a particular application (bpo-11291).
The strict argument to email.parser.Parser, deprecated since Python 2.4, has finally been removed.
The deprecated method unittest.TestCase.assertSameElements has been removed.
The deprecated variable time.accept2dyear has been removed.
The deprecated Context._clamp attribute has been removed from the decimal module. It was previously replaced by the public attribute clamp. (See bpo-8540.)
The undocumented internal helper class SSLFakeFile has been removed from smtplib, since its functionality has long been provided directly by socket.socket.makefile().
Passing a negative value to time.sleep() on Windows now raises an error instead of sleeping forever. It has always raised an error on posix.
The ast.__version__ constant has been removed. If you need to make decisions affected by the AST version, use sys.version_info to make the decision.
Code that used to work around the fact that the threading module used factory functions by subclassing the private classes will need to change to subclass the now-public classes.
The undocumented debugging machinery in the threading module has been removed, simplifying the code. This should have no effect on production code, but is mentioned here in case any application debug frameworks were interacting with it (bpo-13550).
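Two of the porting notes above can be sketched in code: the version-independent Linux check, and the call to importlib.invalidate_caches() after creating a module file at runtime. The module name whatsnew_demo_mod and its contents are made up for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Works whether sys.platform is 'linux', 'linux2' or 'linux3'.
is_linux = sys.platform.startswith("linux")

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        # Create a new source file while the interpreter is running.
        with open(os.path.join(tmp, "whatsnew_demo_mod.py"), "w") as f:
            f.write("ANSWER = 42\n")

        # Make sure the finders' directory caches notice the new file.
        importlib.invalidate_caches()

        module = importlib.import_module("whatsnew_demo_mod")
        answer = module.ANSWER
    finally:
        # Undo the changes to the import state.
        sys.path.remove(tmp)
        sys.modules.pop("whatsnew_demo_mod", None)

print(is_linux, answer)
```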
Porting C code¶
In the course of changes to the buffer API the undocumented smalltable member of the Py_buffer structure has been removed and the layout of the PyMemoryViewObject has changed.
All extensions relying on the relevant parts in memoryobject.h or object.h must be rebuilt.
Due to PEP 393, the Py_UNICODE type and all functions using this type are deprecated (but will stay available for at least five years). If you were using low-level Unicode APIs to construct and access unicode objects and you want to benefit from the memory footprint reduction provided by PEP 393, you have to convert your code to the new Unicode API.
However, if you have only been using high-level functions such as PyUnicode_Concat(), PyUnicode_Join() or PyUnicode_FromFormat(), your code will automatically take advantage of the new unicode representations.
PyImport_GetMagicNumber() now returns -1 upon failure.
As a negative value for the level argument to __import__() is no longer valid, the same now holds for PyImport_ImportModuleLevel(). This also means that the value of level used by PyImport_ImportModuleEx() is now 0 instead of -1.
Building C extensions¶
The range of possible file names for C extensions has been narrowed. Very rarely used spellings have been suppressed: under POSIX, files named xxxmodule.so, xxxmodule.abi3.so and xxxmodule.cpython-*.so are no longer recognized as implementing the xxx module. If you had been generating such files, you have to switch to the other spellings (i.e., remove the module string from the file names).
(Implemented in bpo-14040.)