What's New in Python 2.5¶

Author: A.M. Kuchling

This article explains the new features in Python 2.5. The final release was originally scheduled for August 2006 (PEP 356 describes the planned release schedule); Python 2.5 was actually released on September 19, 2006.

The changes in Python 2.5 are an interesting mix of language and library improvements. The library enhancements will be more important to Python's user community, I think, because several widely useful packages were added. New modules include ElementTree for XML processing (xml.etree.ElementTree), the SQLite database module (sqlite3), and the ctypes module for calling C functions.

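The language changes are of middling significance. Some pleasant new features were added, but most of them aren't features that you'll use every day. Conditional expressions were finally added to the language using a novel syntax; see section PEP 308: Conditional Expressions. The new 'with' statement will make writing cleanup code easier (section PEP 343: The 'with' statement). Values can now be passed into generators (section PEP 342: New Generator Features). Imports are now visible as either absolute or relative (section PEP 328: Absolute and Relative Imports). Some corner cases of exception handling are handled better (section PEP 341: Unified try/except/finally). All these improvements are worthwhile, but they're improvements to one specific language feature or another; none of them are broad modifications to Python's semantics.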

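As well as the language and library additions, other improvements and bugfixes were made throughout the source tree. A search through the SVN change logs finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and 2.5. (Both figures are likely to be underestimates.)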

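This article doesn't try to be a complete specification of the new features; instead changes are briefly introduced using helpful examples. For full details, you should always refer to the documentation for Python 2.5 at https://docs.python.org. If you want to understand the complete implementation and design rationale, refer to the PEP for a particular new feature. Whenever possible, "What's New in Python" links to the bug or patch item for each change.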

Comments, suggestions, and error reports for this document are welcome; please e-mail them to the author or open a bug in the Python bug tracker.

PEP 308: Conditional Expressions¶

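For a long time, people have been requesting a way to write conditional expressions, which are expressions that return value A or value B depending on whether a Boolean value is true or false. A conditional expression lets you write a single assignment statement that has the same effect as the following: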

if condition:
    x = true_value
else:
    x = false_value

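There have been endless tedious discussions of syntax on both python-dev and comp.lang.python. A vote was even held that found the majority of voters wanted conditional expressions in some form, but there was no syntax that was preferred by a clear majority. Candidates included C's cond ? true_v : false_v, if cond then true_v else false_v, and 16 other variations.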

Guido van Rossum eventually chose a surprising syntax:

x = true_value if condition else false_value

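Evaluation is still lazy as in existing Boolean expressions, so the order of evaluation jumps around a bit. The condition expression in the middle is evaluated first, and the true_value expression is evaluated only if the condition was true. Similarly, the false_value expression is only evaluated when the condition is false.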
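
The evaluation order described above can be checked with a small sketch. The helper record() and the list calls are made up for illustration; each operand notes that it ran, so the log shows which expressions were actually evaluated.

```python
calls = []

def record(label, value):
    """Note that this operand was evaluated, then return its value."""
    calls.append(label)
    return value

# Condition is true: only the condition and true_value are evaluated.
x = (record('true_value', 'A') if record('condition', True)
     else record('false_value', 'B'))
first_run = list(calls)

del calls[:]

# Condition is false: only the condition and false_value are evaluated.
y = (record('true_value', 'A') if record('condition', False)
     else record('false_value', 'B'))
second_run = list(calls)

print(first_run)    # ['condition', 'true_value']
print(second_run)   # ['condition', 'false_value']
```

In both runs the condition appears first in the log even though it is written in the middle of the expression.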

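This syntax may seem strange and backwards; why does the condition go in the middle of the expression, and not in the front as in C's c ? x : y? The decision was checked by applying the new syntax to the modules in the standard library and seeing how the resulting code read. In many cases where a conditional expression is used, one value seems to be the "common case" and one value is an "exceptional case", used only on rarer occasions when the condition isn't met. The conditional syntax makes this pattern a bit more obvious: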

contents = ((doc + '\n') if doc else '')

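I read the above statement as meaning "here contents is usually assigned a value of doc+'\n'; sometimes doc is empty, in which special case an empty string is returned." I doubt I will use conditional expressions very often where there isn't a clear common and uncommon case.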

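There was some discussion of whether the language should require surrounding conditional expressions with parentheses. The decision was made to not require parentheses in the Python language's grammar, but as a matter of style I think you should always use them. Consider these two statements: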

# First version -- no parens
level = 1 if logging else 0

# Second version -- with parens
level = (1 if logging else 0)

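In the first version, I think a reader's eye might group the statement into "level = 1", "if logging", "else 0", and think that the condition decides whether the assignment to level is performed. The second version reads better, in my opinion, because it makes it clear that the assignment is always performed and the choice is being made between two values.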

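Another reason for including the parentheses: a few odd combinations of list comprehensions and lambdas could look like incorrect conditional expressions. See PEP 308 for some examples. If you put parentheses around your conditional expressions, you won't run into this case.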

See also

PEP 308 - Conditional Expressions: PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas Wouters.

PEP 309: Partial Function Application¶

The functools module is intended to contain tools for functional-style programming.

One useful tool in this module is the partial() function. For programs written in a functional style, you'll sometimes want to construct variants of existing functions that have some of the parameters filled in. Consider a Python function f(a, b, c); you could create a new function g(b, c) that was equivalent to f(1, b, c). This is called "partial function application".

partial() takes the arguments (function, arg1, arg2, ... kwarg1=value1, kwarg2=value2). The resulting object is callable, so you can just call it to invoke function with the filled-in arguments.

Here's a small but realistic example:

import functools

def log (message, subsystem):
    "Write the contents of 'message' to the specified subsystem."
    print '%s: %s' % (subsystem, message)
    ...

server_log = functools.partial(log, subsystem='server')
server_log('Unable to open socket')
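
The f(a, b, c) and g(b, c) case mentioned earlier can be sketched the same way. The function bodies here are placeholders chosen only so that the results are easy to read.

```python
import functools

def f(a, b, c):
    return (a, b, c)

# g(b, c) behaves like f(1, b, c): the first positional argument is filled in.
g = functools.partial(f, 1)

# Keyword arguments can be filled in as well.
h = functools.partial(f, c=30)

print(g(2, 3))       # (1, 2, 3)
print(h(10, 20))     # (10, 20, 30)

# The partial object remembers what it wraps.
print(g.func is f)   # True
print(g.args)        # (1,)
print(h.keywords)    # {'c': 30}
```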

Here's another example, from a program that uses PyGTK. Here a context-sensitive pop-up menu is being constructed dynamically. The callback provided for the menu option is a partially applied version of the open_item() method, where the first argument has been provided.

...
class Application:
    def open_item(self, path):
       ...
    def init (self):
        open_func = functools.partial(self.open_item, item_path)
        popup_menu.append( ("Open", open_func, 1) )

Another function in the functools module is the update_wrapper(wrapper, wrapped) function that helps you write well-behaved decorators. update_wrapper() copies the name, module, and docstring attribute to a wrapper function so that tracebacks inside the wrapped function are easier to understand. For example, you might write:

def my_decorator(f):
    def wrapper(*args, **kwds):
        print 'Calling decorated function'
        return f(*args, **kwds)
    functools.update_wrapper(wrapper, f)
    return wrapper

wraps() is a decorator that can be used inside your own decorators to copy the wrapped function's information. An alternate version of the previous example would be:

def my_decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        print 'Calling decorated function'
        return f(*args, **kwds)
    return wrapper
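
To see what the copied attributes buy you, the following sketch compares a decorator that omits wraps() with one that uses it. The decorator and function names are invented for the comparison.

```python
import functools

def plain_decorator(f):
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

def wrapping_decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@plain_decorator
def add(a, b):
    "Return a + b."
    return a + b

@wrapping_decorator
def multiply(a, b):
    "Return a * b."
    return a * b

print(add.__name__)        # wrapper
print(add.__doc__)         # None
print(multiply.__name__)   # multiply
print(multiply.__doc__)    # Return a * b.
```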

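See also

PEP 309 - Partial Function Application: PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick Coghlan, with adaptations by Raymond Hettinger. (Translator's note: many readers will find this PEP's title odd, given that functools is about supporting functional-style programming. The title reflects the history: a patch for partial application eventually grew into a home for functional-style tools. More precisely, the proposal was "let's use functools as the place for higher-order functions such as partial, and let's start with partial".)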

PEP 314: Metadata for Python Software Packages v1.1¶

Some simple dependency support was added to Distutils. The setup() function now has requires, provides, and obsoletes keyword parameters. When you build a source distribution using the sdist command, the dependency information will be recorded in the PKG-INFO file.

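Another new keyword parameter is download_url, which should be set to a URL for the package's source code. This means it's now possible to look up an entry in the package index, determine the dependencies for a package, and download the required packages.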

from distutils.core import setup

VERSION = '1.0'
setup(name='PyPackage',
      version=VERSION,
      requires=['numarray', 'zlib (>=1.1.4)'],
      obsoletes=['OldPackage'],
      download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                    % VERSION),
     )

Another new enhancement to the Python Package Index at https://pypi.org is storing source and binary archives for a package. The new upload Distutils command will upload a package to the repository.

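Before a package can be uploaded, you must be able to build a distribution using the sdist Distutils command. Once that works, you can run python setup.py upload to add your package to the PyPI archive. Optionally you can GPG-sign the package by supplying the --sign and --identity options.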

Package uploading was implemented by Martin von Löwis and Richard Jones.

See also

PEP 314 - Metadata for Python Software Packages v1.1: PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake; implemented by Richard Jones and Fred Drake.

PEP 328: Absolute and Relative Imports¶

The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now be used to enclose the names imported from a module using the from ... import ... statement, making it easier to import many different names.

The more complicated part has been implemented in Python 2.5: importing a module can be specified to use absolute or package-relative imports. The plan is to move toward making absolute imports the default in future versions of Python.

Let's say you have a package directory like this:

pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py

This defines a package named pkg containing the pkg.main and pkg.string submodules.

Consider the code in the main.py module. What happens if it executes the statement import string? In Python 2.4 and earlier, it will first look in the package's directory to perform a relative import, find pkg/string.py, import the contents of that file as the pkg.string module, and bind that module to the name string in the pkg.main module's namespace.

That's fine if pkg.string was what you wanted. But what if you wanted Python's standard string module? There's no clean way to ignore pkg.string and look for the standard module; generally you had to look at the contents of sys.modules, which is slightly unclean. Holger Krekel's py.std package provides a tidier way to perform imports from the standard library, import py; py.std.string.join(), but that package isn't available on all Python installations.

Reading code which relies on relative imports is also less clear, because a reader may be confused about which module, string or pkg.string, is intended to be used. Python users soon learned not to duplicate the names of standard library modules in the names of their packages' submodules, but you can't protect against having your submodule's name being used for a new module added in a future version of Python.

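In Python 2.5, you can switch import's behaviour to absolute imports using a from __future__ import absolute_import directive. This absolute-import behaviour will become the default in a future version (probably Python 2.7). (Translator's note: as you know, it did not become the default in Python 2.7. To ease the move to Python 3.x, you should use from __future__ import absolute_import in 2.5 through 2.7.) Once absolute imports are the default, import string will always find the standard library's version. It's suggested that users should begin using absolute imports as much as possible, so it's preferable to begin writing from pkg import string in your code.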

Relative imports are still possible by adding a leading period to the module name when using the from ... import form:

# Import names from pkg.string
from .string import name1, name2
# Import pkg.string
from . import string

This imports the string module relative to the current package, so in pkg.main this will import name1 and name2 from pkg.string. Additional leading periods perform the relative import starting from the parent of the current package. For example, code in the A.B.C module can do:

from . import D                 # Imports A.B.D
from .. import E                # Imports A.E
from ..F import G               # Imports A.F.G

This leading-period syntax cannot be used with the import modname form of the import statement, only the from ... import form.
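
The difference between the two kinds of import can be tried out with a throwaway package. This is only a sketch: the package name demo_pkg, its contents, and the write() helper are invented here, and the files are created in a temporary directory so nothing on your system is touched.

```python
import os
import sys
import tempfile

# Build a small package on disk (every name below is hypothetical).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)

def write(name, source):
    f = open(os.path.join(pkg_dir, name), 'w')
    try:
        f.write(source)
    finally:
        f.close()

write('__init__.py', '')
write('string.py', 'name1 = "local name1"\n')
write('main.py',
      'from __future__ import absolute_import\n'
      'import string as std_string         # absolute: standard library\n'
      'from . import string as pkg_string  # relative: demo_pkg.string\n'
      'from .string import name1\n')

sys.path.insert(0, root)
import demo_pkg.main as main

print(main.std_string.__name__)   # string
print(main.pkg_string.__name__)   # demo_pkg.string
print(main.name1)                 # local name1
```

With the __future__ directive in place, the plain import string inside main.py reaches the standard library even though a sibling string.py exists.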

See also

PEP 328 - Imports: Multi-Line and Absolute/Relative: PEP written by Aahz; implemented by Thomas Wouters.
https://pylib.readthedocs.io/: The py library by Holger Krekel, which contains the py.std package.

PEP 338: Executing Modules as Scripts¶

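The -m switch added in Python 2.4 to execute a module as a script gained a few more abilities. Instead of being implemented in C code inside the Python interpreter, the switch now uses an implementation in a new module, runpy.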

The runpy module implements a more sophisticated import mechanism so that it's now possible to run modules in a package such as pychecker.checker. The module also supports alternative import mechanisms such as the zipimport module. This means you can add a .zip archive's path to sys.path and then use the -m switch to execute code from the archive.
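
The same machinery is available from Python code through runpy.run_module(). The sketch below assumes a hypothetical module hello_mod stored in a freshly created .zip archive; the archive name and module contents are invented for the example.

```python
import os
import runpy
import sys
import tempfile
import zipfile

# Create a zip archive containing one small module (names are hypothetical).
archive = os.path.join(tempfile.mkdtemp(), 'demo_archive.zip')
zf = zipfile.ZipFile(archive, 'w')
zf.writestr('hello_mod.py',
            'greeting = "hello from the archive"\n'
            'ran_as_main = (__name__ == "__main__")\n')
zf.close()

# Put the archive on sys.path, then run the module the way -m would.
sys.path.insert(0, archive)
result = runpy.run_module('hello_mod', run_name='__main__')

print(result['greeting'])      # hello from the archive
print(result['ran_as_main'])   # True
```

run_module() returns the globals dictionary of the executed module, which is what the last two lines inspect.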

See also

PEP 338 - Executing modules as scripts: PEP written and implemented by Nick Coghlan.

PEP 341: Unified try/except/finally¶

Until Python 2.5, the try statement came in two flavours. You could use a finally block to ensure that code is always executed, or one or more except blocks to catch specific exceptions. You couldn't combine both except blocks and a finally block, because generating the right bytecode for the combined version was complicated and it wasn't clear what the semantics of the combined statement should be.

Guido van Rossum spent some time working with Java, which does support the equivalent of combining except blocks and a finally block, and this clarified what the statement should mean. In Python 2.5, you can now write:

try:
    block-1 ...
except Exception1:
    handler-1 ...
except Exception2:
    handler-2 ...
else:
    else-block
finally:
    final-block

The code in block-1 is executed. If the code raises an exception, the various except blocks are tested: if the exception is of class Exception1, handler-1 is executed; otherwise if it's of class Exception2, handler-2 is executed, and so forth. If no exception is raised, the else-block is executed.

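No matter what happened previously, the final-block is executed once the code block is complete and any raised exceptions handled. Even if there's an error in an exception handler or the else-block and a new exception is raised, the code in the final-block is still run.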
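
A runnable version of the skeleton above makes the control flow visible. The run() helper and the three small actions are invented for this sketch; run() records which clauses execute for each action.

```python
def run(action):
    """Return the list of clauses visited while executing action()."""
    trace = []
    try:
        trace.append('try')
        action()
    except ZeroDivisionError:
        trace.append('except ZeroDivisionError')
    except KeyError:
        trace.append('except KeyError')
    else:
        trace.append('else')
    finally:
        trace.append('finally')
    return trace

def divide_by_zero():
    return 1 / 0

def missing_key():
    return {}['absent']

def no_error():
    return None

print(run(divide_by_zero))  # ['try', 'except ZeroDivisionError', 'finally']
print(run(missing_key))     # ['try', 'except KeyError', 'finally']
print(run(no_error))        # ['try', 'else', 'finally']
```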

See also

PEP 341 - Unifying try-except and try-finally: PEP written by Georg Brandl; implementation by Thomas Lee.

PEP 342: New Generator Features¶

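Python 2.5 adds a simple way to pass values into a generator. As introduced in Python 2.3, generators only produce output; once a generator's code was invoked to create an iterator, there was no way to pass any new information into the function when its execution is resumed. Hackish solutions to this include making the generator's code look at a global variable, or passing in some mutable object that callers then modify.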

To refresh your memory of basic generators, here's a simple example:

def counter (maximum):
    i = 0
    while i < maximum:
        yield i
        i += 1

When you call counter(10), the result is an iterator that returns the values from 0 up to 9. On encountering the yield statement, the iterator returns the provided value and suspends the function's execution, preserving the local variables. Execution resumes on the following call to the iterator's next() method, picking up after the yield statement.

In Python 2.3, yield was a statement; it didn't return any value. In 2.5, yield is now an expression, returning a value that can be assigned to a variable or otherwise operated on:

val = (yield i)

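I recommend that you always put parentheses around a yield expression when you're doing something with the returned value, as in the above example. The parentheses aren't always necessary, but it's easier to always add them instead of having to remember when they're needed.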

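(PEP 342 explains the exact rules, which are that a yield-expression must always be parenthesized except when it occurs at the top-level expression on the right-hand side of an assignment. This means you can write val = yield i but have to use parentheses when there's an operation, as in val = (yield i) + 12.)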

Values are sent into a generator by calling its send(value) method. The generator's code is then resumed and the yield expression returns the specified value. If the regular next() method is called, the yield returns None.

Here's the previous example, modified to allow changing the value of the internal counter:

def counter (maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1

And here's an example of changing the counter:

>>> it = counter(10)
>>> print it.next()
0
>>> print it.next()
1
>>> print it.send(8)
8
>>> print it.next()
9
>>> print it.next()
Traceback (most recent call last):
  File "t.py", line 15, in ?
    print it.next()
StopIteration

yield will usually return None, so you should always check for this case. Don't just use its value in expressions unless you're sure that the send() method will be the only method used to resume your generator function.

In addition to send(), there are two other new methods on generators:

throw(type, value=None, traceback=None) is used to raise an exception inside the generator; the exception is raised by the yield expression where the generator's execution is paused.
close() raises a new GeneratorExit exception inside the generator to terminate the iteration. On receiving this exception, the generator's code must either raise GeneratorExit or StopIteration. Catching the GeneratorExit exception and yielding another value is illegal and will trigger a RuntimeError; if the function raises some other exception, that exception is propagated to the caller. close() will also be called by Python's garbage collector when the generator is garbage-collected.

If you need to run cleanup code when a GeneratorExit occurs, I suggest using a try: ... finally: suite instead of catching GeneratorExit.
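
The following sketch exercises throw() and close() together with the try: ... finally: advice. The reader() generator and the module-level log list are made up for the illustration; send(None) is used to start the generator because it has the same effect as the first next() call.

```python
log = []

def reader(lines):
    """Yield lines one at a time, recording clean-up in the log list."""
    try:
        for line in lines:
            try:
                yield line
            except ValueError:
                # An exception injected with throw() surfaces at the yield.
                log.append('skipped after ValueError')
    finally:
        # Runs on normal exhaustion, on close(), and on uncaught exceptions.
        log.append('cleaned up')

gen = reader(['a', 'b', 'c'])
first = gen.send(None)           # start the generator
second = gen.throw(ValueError)   # handled inside; the generator moves on
gen.close()                      # raises GeneratorExit at the paused yield

print(first)    # a
print(second)   # b
print(log)      # ['skipped after ValueError', 'cleaned up']
```

Note that the generator never mentions GeneratorExit; the finally clause is enough for the clean-up to run when close() is called.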

The cumulative effect of these changes is to turn generators from one-way producers of information into both producers and consumers.

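Generators also become coroutines, a more generalized form of subroutines. Subroutines are entered at one point and exited at another point (the top of the function, and a return statement), but coroutines can be entered, exited, and resumed at many different points (the yield statements). We'll have to figure out patterns for using coroutines effectively in Python.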
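
One possible pattern, offered only as a sketch, is a coroutine that consumes values through send() and produces a running result at each step. The averager() function is a hypothetical example, not something defined by PEP 342.

```python
def averager():
    """Coroutine: receive numbers via send() and yield the running average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = (yield average)
        if value is None:
            continue          # resumed without a value; nothing to add
        total += value
        count += 1
        average = total / count

avg = averager()
avg.send(None)                # advance to the first yield
results = [avg.send(v) for v in (10, 20, 30)]
avg.close()

print(results)   # [10.0, 15.0, 20.0]
```

The local variables total and count survive between calls, so the generator acts as both consumer and producer without any global state.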

The addition of the close() method has one side effect that isn't obvious. close() is called when a generator is garbage-collected, so this means the generator's code gets one last chance to run before the generator is destroyed. This last chance means that try...finally statements in generators can now be guaranteed to work; the finally clause will now always get a chance to run. The syntactic restriction that you couldn't mix yield statements with a try...finally suite has therefore been removed. This seems like a minor bit of language trivia, but using generators and try...finally is actually necessary in order to implement the with statement described by PEP 343. I'll look at this new statement in the following section.

Another even more esoteric effect of this change: previously, the gi_frame attribute of a generator was always a frame object. It's now possible for gi_frame to be None once the generator has been exhausted.

See also

PEP 342 - Coroutines via Enhanced Generators

PEP written by Guido van Rossum and Phillip J. Eby; implemented by Phillip J. Eby. Includes examples of some fancier uses of generators as coroutines.

Earlier versions of these features were proposed in PEP 288 by Raymond Hettinger and PEP 325 by Samuele Pedroni.

https://en.wikipedia.org/wiki/Coroutine

The Wikipedia entry for coroutines.

https://web.archive.org/web/20160321211320/http://www.sidhe.org/~dan/blog/archives/000178.html

An explanation of coroutines from a Perl point of view, written by Dan Sugalski.

PEP 343: The 'with' statement¶

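The 'with' statement clarifies code that previously would use try...finally blocks to ensure that clean-up code is executed. In this section, I'll discuss the statement as it will commonly be used. In the next section, I'll examine the implementation details and show how to write objects for use with this statement.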

The 'with' statement is a new control-flow structure whose basic structure is:

with expression [as variable]:
    with-block

The expression is evaluated, and it should result in an object that supports the context management protocol (that is, has __enter__() and __exit__() methods).

The object's __enter__() is called before with-block is executed and therefore can run set-up code. It also may return a value that is bound to the name variable, if given. (Note carefully that variable is not assigned the result of expression.)

After execution of the with-block is finished, the object's __exit__() method is called, even if the block raised an exception, and can therefore run clean-up code.

To enable the statement in Python 2.5, you need to add the following directive to your module:

from __future__ import with_statement

The statement will always be enabled in Python 2.6.
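
The point about what gets bound to the variable is easy to miss, so here is a minimal sketch. The Managed class and the events list are hypothetical and exist only to record when each method runs.

```python
# On Python 2.5 itself, add "from __future__ import with_statement"
# as the first line of the module.

events = []

class Managed(object):
    """Hypothetical context manager that records when each hook runs."""
    def __enter__(self):
        events.append('enter')
        return 'value from __enter__'
    def __exit__(self, exc_type, exc_value, tb):
        events.append('exit')
        # Returning None (a false value) lets any exception propagate.

manager = Managed()
with manager as bound:
    events.append('block')

print(bound is manager)  # False: bound is what __enter__() returned
print(bound)             # value from __enter__
print(events)            # ['enter', 'block', 'exit']
```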

Some standard Python objects now support the context management protocol and can be used with the 'with' statement. File objects are one example:

with open('/etc/passwd', 'r') as f:
    for line in f:
        print line
        ... more processing code ...

After this statement has executed, the file object in f will have been automatically closed, even if the for loop raised an exception part-way through the block.

Note

In this case, f is the same object created by open(), because __enter__() returns self.

The threading module's locks and condition variables also support the 'with' statement:

lock = threading.Lock()
with lock:
    # Critical section of code
    ...

The lock is acquired before the block is executed and always released once the block is complete.
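
A quick way to convince yourself of this is to query the lock's state with its locked() method. The variable names below are chosen for the sketch.

```python
# On Python 2.5 itself, add "from __future__ import with_statement"
# as the first line of the module.
import threading

lock = threading.Lock()

with lock:
    held_inside = lock.locked()

held_after = lock.locked()

# The lock is released even when the block raises.
try:
    with lock:
        raise RuntimeError('failure inside the critical section')
except RuntimeError:
    pass
held_after_error = lock.locked()

print(held_inside)       # True
print(held_after)        # False
print(held_after_error)  # False
```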

The new localcontext() function in the decimal module makes it easy to save and restore the current decimal context, which encapsulates the desired precision and rounding characteristics for computations:

from decimal import Decimal, Context, localcontext

# Displays with default precision of 28 digits
v = Decimal('578')
print v.sqrt()

with localcontext(Context(prec=16)):
    # All code in this block uses a precision of 16 digits.
    # The original context is restored on exiting the block.
    print v.sqrt()

Writing Context Managers¶

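Under the hood, the 'with' statement is fairly complicated. Most people will only use 'with' in company with existing objects and don't need to know these details, so you can skip the rest of this section if you like. Authors of new objects will need to understand the details of the underlying implementation and should keep reading.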

A high-level explanation of the context management protocol is:

The expression is evaluated and should result in an object called a "context manager". The context manager must have __enter__() and __exit__() methods.
The context manager's __enter__() method is called. The value returned is assigned to VAR. If no 'as VAR' clause is present, the value is simply discarded.
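The code in BLOCK is executed.
If BLOCK raises an exception, the context manager's __exit__() method is called with three arguments, the exception details (type, value, traceback, the same values returned by sys.exc_info(), which can also be None if no exception occurred). The method's return value controls whether an exception is re-raised: any false value re-raises the exception, and True will result in suppressing it. You'll only rarely want to suppress the exception, because if you do, the author of the code containing the 'with' statement will never realize anything went wrong.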
If BLOCK didn't raise an exception, the __exit__() method is still called, but type, value, and traceback are all None.
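
The role of the __exit__() return value in the steps above can be sketched with a hypothetical Suppress class, which swallows one exception type and records what it was given.

```python
# On Python 2.5 itself, add "from __future__ import with_statement"
# as the first line of the module.

class Suppress(object):
    """Hypothetical manager: swallow one exception type, record what it saw."""
    def __init__(self, exc_class):
        self.exc_class = exc_class
        self.seen = None
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, tb):
        self.seen = exc_type
        # True suppresses the exception; a false value re-raises it.
        return exc_type is not None and issubclass(exc_type, self.exc_class)

with Suppress(KeyError) as quiet:
    {}['missing']             # KeyError is suppressed

with Suppress(KeyError) as clean:
    pass                      # no exception: __exit__ gets None, None, None

propagated = False
try:
    with Suppress(KeyError):
        int('not a number')   # ValueError is not suppressed
except ValueError:
    propagated = True

print(quiet.seen is KeyError)  # True
print(clean.seen)              # None
print(propagated)              # True
```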

Let's think through an example. I won't present detailed code but will only sketch the methods necessary for a database that supports transactions.

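(For people unfamiliar with database terminology: a set of changes to the database are grouped into a transaction. Transactions can be either committed, meaning that all the changes are written into the database, or rolled back, meaning that the changes are all discarded and the database is unchanged. See any database textbook for more information.)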

Let's assume there's an object representing a database connection. Our goal will be to let the user write code like this:

db_connection = DatabaseConnection()
with db_connection as cursor:
    cursor.execute('insert into ...')
    cursor.execute('delete from ...')
    # ... more operations ...

The transaction should be committed if the code in the block runs flawlessly or rolled back if there's an exception. Here's the basic interface for DatabaseConnection that I'll assume:

class DatabaseConnection:
    # Database interface
    def cursor (self):
        "Returns a cursor object and starts a new transaction"
    def commit (self):
        "Commits current transaction"
    def rollback (self):
        "Rolls back current transaction"

The __enter__() method is pretty easy, having only to start a new transaction. For this application the resulting cursor object would be a useful result, so the method will return it. The user can then add as cursor to their 'with' statement to bind the cursor to a variable name.

class DatabaseConnection:
    ...
    def __enter__ (self):
        # Code to start a new transaction
        cursor = self.cursor()
        return cursor

The __exit__() method is the most complicated because it's where most of the work has to be done. The method has to check if an exception occurred. If there was no exception, the transaction is committed. The transaction is rolled back if there was an exception.

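In the code below, execution will just fall off the end of the function, returning the default value of None. None is false, so the exception will be re-raised automatically. If you wished, you could be more explicit and add a return statement at the marked location.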

class DatabaseConnection:
    ...
    def __exit__ (self, type, value, tb):
        if tb is None:
            # No exception, so commit
            self.commit()
        else:
            # Exception occurred, so rollback.
            self.rollback()
            # return False
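
Putting the pieces together, here is one way the sketched interface might be made runnable. Backing the class with an in-memory SQLite database through the sqlite3 module is an assumption of this example, as are the table name items and the count_rows() helper; the text above does not prescribe any particular database.

```python
# On Python 2.5 itself, add "from __future__ import with_statement"
# as the first line of the module.
import sqlite3

class DatabaseConnection(object):
    """Sketch of the interface above, backed by in-memory SQLite."""
    def __init__(self):
        self._conn = sqlite3.connect(':memory:')
    def cursor(self):
        return self._conn.cursor()
    def commit(self):
        self._conn.commit()
    def rollback(self):
        self._conn.rollback()
    def __enter__(self):
        return self.cursor()
    def __exit__(self, type, value, tb):
        if tb is None:
            self.commit()
        else:
            self.rollback()
        # Falling off the end returns None, so exceptions propagate.

def count_rows(db):
    return db.cursor().execute('select count(*) from items').fetchone()[0]

db = DatabaseConnection()
with db as cursor:
    cursor.execute('create table items (name text)')
    cursor.execute("insert into items values ('kept')")

try:
    with db as cursor:
        cursor.execute("insert into items values ('discarded')")
        raise RuntimeError('abort the transaction')
except RuntimeError:
    pass

rows = count_rows(db)
print(rows)   # 1
```

Only the row from the first block survives, because the second block's insert is rolled back when the exception reaches __exit__().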

The contextlib module¶

The new contextlib module provides some functions and a decorator that are useful when writing objects for use with the 'with' statement.

The decorator is called contextmanager(), and lets you write a single generator function instead of defining a new class. The generator should yield exactly one value. The code up to the yield will be executed as the __enter__() method, and the value yielded will be the method's return value that will get bound to the variable in the 'with' statement's as clause, if any. The code after the yield will be executed in the __exit__() method. Any exception raised in the block will be raised by the yield statement.

Using this decorator, our database example from the previous section could be written as:

from contextlib import contextmanager

@contextmanager
def db_transaction (connection):
    cursor = connection.cursor()
    try:
        yield cursor
    except:
        connection.rollback()
        raise
    else:
        connection.commit()

db = DatabaseConnection()
with db_transaction(db) as cursor:
    ...
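
For a version that runs without a database, the sketch below uses a hypothetical tracked() manager that records each phase in a list, including the case where the block raises and the exception surfaces at the yield.

```python
# On Python 2.5 itself, add "from __future__ import with_statement"
# as the first line of the module.
from contextlib import contextmanager

steps = []

@contextmanager
def tracked(name):
    steps.append('enter ' + name)          # runs as __enter__()
    try:
        yield name.upper()                 # value bound by the as clause
    except ValueError:
        steps.append('error in ' + name)   # block's exception surfaces here
        raise
    else:
        steps.append('exit ' + name)       # runs as __exit__() on success

with tracked('first') as label:
    steps.append('body ' + label)

try:
    with tracked('second'):
        raise ValueError('boom')
except ValueError:
    pass

print(steps)
# ['enter first', 'body FIRST', 'exit first', 'enter second', 'error in second']
```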

The contextlib module also has a nested(mgr1, mgr2, ...) function that combines a number of context managers so you don't need to write nested 'with' statements. In this example, the single 'with' statement both starts a database transaction and acquires a thread lock:

lock = threading.Lock()
with nested (db_transaction(db), lock) as (cursor, locked):
    ...

Finally, the closing(object) function returns object so that it can be bound to a variable, and calls object.close at the end of the block.

import urllib, sys
from contextlib import closing

with closing(urllib.urlopen('http://www.yahoo.com')) as f:
    for line in f:
        sys.stdout.write(line)
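
The same behaviour can be observed without network access. In this sketch, Resource is a hypothetical class that has a close() method but no __enter__() or __exit__().

```python
# On Python 2.5 itself, add "from __future__ import with_statement"
# as the first line of the module.
from contextlib import closing

class Resource(object):
    """Hypothetical object that has close() but no context manager methods."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

with closing(Resource()) as res:
    open_inside = not res.closed

print(open_inside)   # True
print(res.closed)    # True
```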

See also

PEP 343 - The 'with' statement: PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland, Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a 'with' statement, which can be helpful in learning how the statement works.

The documentation for the contextlib module.

PEP 352: Exceptions as New-Style Classes¶

Exception classes can now be new-style classes, not just classic classes, and the built-in Exception class and all the standard built-in exceptions (NameError, ValueError, etc.) are now new-style classes.

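The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the inheritance relationships are as follows (Translator's note: in 2.6, GeneratorExit additionally became a direct child of BaseException rather than Exception, and this carries over to 3.x):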

BaseException       # New in Python 2.5
|- KeyboardInterrupt
|- SystemExit
|- Exception
   |- (all other current built-in exceptions)

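This rearrangement was done because people often want to catch all exceptions that indicate program errors. KeyboardInterrupt and SystemExit aren't errors, though, and usually represent an explicit action such as the user hitting Control-C or code calling sys.exit(). A bare except: will catch all exceptions, so you commonly need to list KeyboardInterrupt and SystemExit in order to re-raise them. The usual pattern is: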

try:
    ...
except (KeyboardInterrupt, SystemExit):
    raise
except:
    # Log error...
    # Continue running program...

In Python 2.5, you can now write except Exception to achieve the same result, catching all the exceptions that usually indicate errors but leaving KeyboardInterrupt and SystemExit alone. As in previous versions, a bare except: still catches all exceptions.
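
A short sketch shows the effect of the new hierarchy. The classify() helper and the three small actions are invented for the example; the outer except BaseException exists only so the demonstration can report what slipped past except Exception.

```python
import sys

def classify(action):
    """Report which handler, if any, catches what action() raises."""
    try:
        try:
            action()
        except Exception:
            return 'caught by except Exception'
    except BaseException:
        return 'passed through except Exception'
    return 'no exception'

def bad_lookup():
    return {}['missing']

def exit_request():
    sys.exit(1)

def interrupt():
    raise KeyboardInterrupt

print(classify(bad_lookup))    # caught by except Exception
print(classify(exit_request))  # passed through except Exception
print(classify(interrupt))     # passed through except Exception
```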

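The goal for Python 3.0 is to require any class raised as an exception to derive from BaseException or some descendant of BaseException, and future releases in the Python 2.x series may begin to enforce this constraint. Therefore, I suggest you begin making all your exception classes derive from Exception now. It's been suggested that the bare except: form should be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this or not. (Translator's note: in 2.7, raise requires derivation from BaseException only for new-style classes; classic classes are still accepted without restriction, so little has changed in that respect. Warnings about this are available through -3 and similar options. Python 3.0 finally enforces that every exception derives from BaseException. Also, the bare except: form still exists as of 3.4.)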

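Raising of strings as exceptions, as in the statement raise "Error occurred", is deprecated in Python 2.5 and will trigger a warning. The aim is to be able to remove the string-exception feature in a few releases. (Translator's note: although the What's New documents for 2.6 and later do not say so explicitly, in 2.7 this is an error rather than a warning.)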

See also

PEP 352 - Required Superclass for Exceptions: PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon.

PEP 353: Using ssize_t as the index type¶

A wide-ranging change to Python's C API, using a new Py_ssize_t type definition instead of int, will permit the interpreter to handle more data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit platforms.

Various pieces of the Python interpreter used C's int type to store sizes or counts; for example, the number of items in a list or tuple were stored in an int. The C compilers for most 64-bit platforms still define int as a 32-bit type, so that meant that lists could only hold up to 2**31 - 1 = 2147483647 items. (There are actually a few different programming models that 64-bit C compilers can use -- see https://unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the most commonly available model leaves int as 32 bits.)

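A limit of 2147483647 items doesn't really matter on a 32-bit platform because you'll run out of memory before hitting the length limit. Each list item requires space for a pointer, which is 4 bytes, plus space for a PyObject representing the item. 2147483647*4 is already more bytes than a 32-bit address space can contain. (Translator's note: I could not follow this passage; there may be a misunderstanding somewhere. Four bytes would give 2147483647 (2^31 - 1 as a signed value), and no reason for multiplying by 4 is explained here. Why multiply? And aren't element indexing and the storage each item needs unrelated to the argument?)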

It's possible to address that much memory on a 64-bit platform, however. The pointers for a list that size would only require 16 GiB of space, so it's not unreasonable that Python programmers might construct lists that large. Therefore, the Python interpreter had to be changed to use some type other than int, and this will be a 64-bit type on 64-bit platforms. The change will cause incompatibilities on 64-bit machines, so it was deemed worth making the transition now, while the number of 64-bit users is still relatively small. (In 5 or 10 years, we may all be on 64-bit machines, and the transition would be more painful then.)
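
The arithmetic behind these figures is easy to reproduce. The sketch below assumes 4-byte pointers on a 32-bit platform and 8-byte pointers on a 64-bit one, and counts only the pointers stored in the list, not the objects they refer to.

```python
max_items = 2**31 - 1          # largest value of a signed 32-bit int

# 32-bit platform: 4-byte pointers alone exceed the address space.
pointer_bytes_32 = max_items * 4
address_space_32 = 2**32
print(pointer_bytes_32 > address_space_32)   # True

# 64-bit platform: 8-byte pointers for the same list need about 16 GiB.
pointer_bytes_64 = max_items * 8
gib = 2**30
print(pointer_bytes_64 / float(gib))         # just under 16.0
```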

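This change most strongly affects authors of C extension modules. Python strings and container types such as lists and tuples now use Py_ssize_t to store their size. Functions such as PyList_Size() now return Py_ssize_t. Code in extension modules may therefore need to have some variables changed to Py_ssize_t.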

The PyArg_ParseTuple() and Py_BuildValue() functions have a new conversion code, n, for Py_ssize_t. PyArg_ParseTuple()'s s# and t# still output int by default, but you can define the macro PY_SSIZE_T_CLEAN before including Python.h to make them return Py_ssize_t.

PEP 353 has a section on conversion guidelines that extension authors should read to learn about supporting 64-bit platforms.

See also

PEP 353 - Using ssize_t as the index type: PEP written and implemented by Martin von Löwis.

PEP 357: The '__index__' method¶

The NumPy developers had a problem that could only be solved by adding a new special method, __index__(). When using slice notation, as in [start:stop:step], the values of the start, stop, and step indexes must all be either integers or long integers. NumPy defines a variety of specialized integer types corresponding to unsigned and signed integers of 8, 16, 32, and 64 bits, but there was no way to signal that these types could be used as slice indexes.

Slicing can't just use the existing __int__() method because that method is also used to implement coercion to integers. If slicing used __int__(), floating-point numbers would also become legal slice indexes and that's clearly an undesirable behaviour.

Instead, a new special method called __index__() was added. It takes no arguments and returns an integer giving the slice index to use. For example:

class C:
    def __index__ (self):
        return self.value

The return value must be either a Python integer or long integer. The interpreter will check that the type returned is correct, and raises a TypeError if this requirement isn't met.

A corresponding nb_index slot was added to the C-level PyNumberMethods structure to let C extensions implement this protocol. PyNumber_Index(obj) can be used in extension code to call the __index__() function and retrieve its result.
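
At the Python level, the protocol can be tried with a small class. Position is a hypothetical integer-like type written for this sketch; it is not a NumPy type.

```python
class Position(object):
    """Hypothetical integer-like type that is usable as an index."""
    def __init__(self, value):
        self.value = value
    def __index__(self):
        return self.value

letters = ['a', 'b', 'c', 'd', 'e']

item = letters[Position(1)]
piece = letters[Position(1):Position(4)]

# Floats define __int__() but not __index__(), so they are rejected.
try:
    letters[1.0:4.0]
    float_allowed = True
except TypeError:
    float_allowed = False

print(item)            # b
print(piece)           # ['b', 'c', 'd']
print(float_allowed)   # False
```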

See also

PEP 357 - Allowing Any Object to be Used for Slicing: PEP written and implemented by Travis Oliphant.

Other Language Changes¶

Here are all of the changes that Python 2.5 makes to the core Python language.

The dict type has a new hook for letting subclasses provide a default value when a key isn't contained in the dictionary. When a key isn't found, the dictionary's __missing__(key) method will be called. This hook is used to implement the new defaultdict class in the collections module. The following example defines a dictionary that returns zero for any missing key:
```
class zerodict (dict):
    def __missing__ (self, key):
        return 0

d = zerodict({1:1, 2:2})
print d[1], d[2]   # Prints 1, 2
print d[3], d[4]   # Prints 0, 0
```
Both 8-bit and Unicode strings have new partition(sep) and rpartition(sep) methods that simplify a common use case.

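The find(S) method is often used to get an index which is then used to slice the string and obtain the pieces that are before and after the separator. partition(sep) condenses this pattern into a single method call that returns a 3-tuple containing the substring before the separator, the separator itself, and the substring after the separator. If the separator isn't found, the first element of the tuple is the entire string and the other two elements are empty. rpartition(sep) also returns a 3-tuple but starts searching from the end of the string; the r stands for 'reverse'.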

Some examples:
```
>>> ('http://www.python.org').partition('://')
('http', '://', 'www.python.org')
>>> ('file:/usr/share/doc/index.html').partition('://')
('file:/usr/share/doc/index.html', '', '')
>>> (u'Subject: a quick question').partition(':')
(u'Subject', u':', u' a quick question')
>>> 'www.python.org'.rpartition('.')
('www.python', '.', 'org')
>>> 'www.python.org'.rpartition(':')
('', '', 'www.python.org')
```
(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)
The startswith() and endswith() methods of string types now accept tuples of strings to check for.
```
def is_image_file (filename):
    return filename.endswith(('.gif', '.jpg', '.tiff'))
```
(Implemented by Georg Brandl following a suggestion by Tom Lynn.)
The min() and max() built-in functions gained a key keyword parameter analogous to the key argument for sort(). This parameter supplies a function that takes a single argument and is called for every value in the list; min()/max() will return the element with the smallest/largest return value from this function. For example, to find the longest string in a list, you can do:
```
L = ['medium', 'longest', 'short']
# Prints 'longest'
print max(L, key=len)
# Prints 'short', because lexicographically 'short' has the largest value
print max(L)
```
(Contributed by Steven Bethard and Raymond Hettinger.)
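Two new built-in functions, any() and all(), evaluate whether an iterator contains any true or false values. any() returns True if any value returned by the iterator is true; otherwise it will return False. all() returns True only if all of the values returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and implemented by Raymond Hettinger.)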
The result of a class's __hash__() method can now be either a long integer or a regular integer. If a long integer is returned, the hash of that value is taken. In earlier versions the hash value was required to be a regular integer, but in 2.5 the id() built-in was changed to always return non-negative numbers, and users often seem to use id(self) in __hash__() methods (though this is discouraged).
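ASCII is now the default encoding for modules (Translator's note: from Python 3 onward the default is UTF-8; see PEP 3120). It's now a syntax error if a module contains 8-bit characters but doesn't have an encoding declaration. In Python 2.4 this triggered a warning, not a syntax error. See PEP 263 for how to declare a module's encoding; for example, you might add a line like this near the top of the source file: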
```
# -*- coding: latin1 -*-
```
A new warning, UnicodeWarning, is triggered when you attempt to compare a Unicode string and an 8-bit string that can't be converted to Unicode using the default ASCII encoding. The result of the comparison is false:
```
>>> chr(128) == unichr(128)   # Can't convert chr(128) to Unicode
__main__:1: UnicodeWarning: Unicode equal comparison failed
  to convert both arguments to Unicode - interpreting them
  as being unequal
False
>>> chr(127) == unichr(127)   # chr(127) can be converted
True
```
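Previously this would raise a UnicodeDecodeError exception, but in 2.5 this could result in puzzling problems when accessing a dictionary. If you looked up unichr(128) and chr(128) was being used as a key, you'd get a UnicodeDecodeError exception. Other changes in 2.5 resulted in this exception being raised instead of suppressed by the code in dictobject.c that implements dictionaries. (Translator's note: the original says "this exception", which reads as though UnicodeDecodeError is raised. The behaviour of 2.5 itself has not been verified, but in 2.7 the lookup should produce a KeyError together with a UnicodeWarning. The What's New documents for 2.6 and later do not mention this, so the behaviour may have changed along the way.)

Raising an exception for such a comparison is strictly correct, but the change might have broken code, so UnicodeWarning was introduced instead.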

(Implemented by Marc-André Lemburg.)
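One error that Python programmers sometimes make is forgetting to include an __init__.py module in a package directory. Debugging this mistake can be confusing, and usually requires running Python with the -v switch to log all the paths searched. In Python 2.5, a new ImportWarning warning is triggered when a package directory is found to lack an __init__.py. This warning is silently ignored by default; provide the -Wd option when running the Python executable to display the warning message. (Implemented by Thomas Wouters.)
The list of base classes in a class definition can now be empty. (Translator's note: since such a class does not derive from object, it ends up being an old-style class.) As an example, this is now legal: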
```
class C():
    pass
```
(Implemented by Brett Cannon.)

Interactive Interpreter Changes¶

In the interactive interpreter, quit and exit have long been strings so that new users get a somewhat helpful message when they try to quit:

>>> quit
'Use Ctrl-D (i.e. EOF) to exit.'

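In Python 2.5, quit and exit are now objects that still produce string representations of themselves, but are also callable. Newbies who try quit() or exit() will now exit the interpreter as they expect. (Implemented by Georg Brandl.)

The Python executable now accepts the standard long options --help and --version; on Windows, it also accepts the /? option for displaying a help message. (Implemented by Georg Brandl.)

Optimizations¶

Several of the optimizations were developed at the NeedForSpeed sprint, an event held in Reykjavik, Iceland, from May 21 to 28, 2006. The sprint focused on speed enhancements to the CPython implementation and was funded by EWT LLC with local support from CCP Games. The optimizations added at this sprint are specially marked in the following list.

When they were introduced in Python 2.4, the built-in set and frozenset types were built on top of Python's dictionary type. In 2.5 the internal data structure has been customized for implementing sets, and as a result sets use a third less memory and are somewhat faster. (Implemented by Raymond Hettinger.)
The speed of some Unicode operations, such as finding substrings, string splitting, and character map encoding and decoding, has been improved. (Substring search and splitting improvements were added by Fredrik Lundh and Andrew Dalke at the NeedForSpeed sprint. Character maps were improved by Walter Dörwald and Martin von Löwis.)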
The long(str, base) function is now faster on long digit strings because fewer intermediate results are calculated. The peak is for strings of around 800-1000 digits where the function is 6 times faster. (Contributed by Alan McIntyre and committed at the NeedForSpeed sprint.)
It's now illegal to mix iterating over a file with for line in file and calling the file object's read()/readline()/readlines() methods. Iteration uses an internal buffer and the read*() methods don't use that buffer. Instead they would return the data following the buffer, causing the data to appear out of order. Mixing iteration and these methods will now trigger a ValueError from the read*() method. (Implemented by Thomas Wouters.)
The struct module now compiles structure format strings into an internal representation and caches this representation, yielding a 20% speedup. (Contributed by Bob Ippolito at the NeedForSpeed sprint.)
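The re module got a 1 or 2% speedup by switching to Python's allocator functions instead of the system's malloc() and free(). (Contributed by Jack Diederich at the NeedForSpeed sprint.)
The code generator's peephole optimizer now performs simple constant folding in expressions. If you write something like a = 2+3, the code generator will do the arithmetic and produce code corresponding to a = 5. (Proposed and implemented by Raymond Hettinger.)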
Function calls are now faster because code objects now keep the most recently finished frame (a "zombie frame") in an internal field of the code object, reusing it the next time the code object is invoked. (Original patch by Michael Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed sprint.) Frame objects are also slightly smaller, which may improve cache locality and reduce memory usage a bit. (Contributed by Neal Norwitz.)
Python's built-in exceptions are now new-style classes, a change that speeds up instantiation considerably. Exception handling in Python 2.5 is therefore about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and Sean Reifschneider at the NeedForSpeed sprint.)
Importing now caches the paths tried, recording whether they exist or not so that the interpreter makes fewer open() and stat() calls on startup. (Contributed by Martin von Löwis and Georg Brandl.)
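One way to observe the constant folding mentioned in the list above is to compile a small piece of source and look at the constants stored in the resulting code object. This is only a sketch: co_consts is a CPython implementation detail, and the file name "<example>" is a placeholder.

```python
# Compile a tiny module and inspect the constants stored in its code object.
code = compile("a = 2+3", "<example>", "exec")

# If the addition was folded at compile time, the result 5 is stored
# as a constant of the code object.
folded = 5 in code.co_consts

# Running the code still binds a to 5 either way.
namespace = {}
exec(code, namespace)
result = namespace["a"]
```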

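New, Improved, and Removed Modules¶

The standard library received many enhancements and bug fixes in Python 2.5. Here's a partial list of the most notable changes, sorted alphabetically by module name. Consult the Misc/NEWS file in the source tree for a more complete list of changes, or look through the SVN logs for all the details.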

The audioop module now supports the a-LAW encoding, and the code for u-LAW encoding has been improved. (Contributed by Lars Immisch.)
The codecs module gained support for incremental codecs. The codecs.lookup() function now returns a CodecInfo instance instead of a tuple. CodecInfo instances behave like a 4-tuple to preserve backward compatibility but also have the attributes encode, decode, incrementalencoder, incrementaldecoder, streamwriter, and streamreader. Incremental codecs can receive input and produce output in multiple chunks; the output is the same as if the entire input was fed to the non-incremental codec. See the codecs module documentation for details. (Designed and implemented by Walter Dörwald.)
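A minimal sketch of an incremental decoder at work, assuming UTF-8 input that arrives in two chunks split in the middle of a multi-byte character (the sample text and variable names are invented, and the byte handling is written for a current Python interpreter):

```python
import codecs

# The UTF-8 encoding of u"\u20ac" (the euro sign) is three bytes long.
encoded = u"price: \u20ac5".encode("utf-8")

# Split the byte string in the middle of the euro sign.
first, second = encoded[:8], encoded[8:]

# CodecInfo exposes the incremental decoder factory as an attribute.
decoder = codecs.lookup("utf-8").incrementaldecoder()

# The incomplete trailing byte of the first chunk is buffered, not rejected.
parts = [decoder.decode(first), decoder.decode(second, final=True)]
text = u"".join(parts)
```

Decoding the two chunks separately with the non-incremental codec would fail, because neither chunk is valid UTF-8 on its own.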
The collections module gained a new type, defaultdict, that subclasses the standard dict type. The new type mostly behaves like a dictionary but constructs a default value when a key isn't present, automatically adding it to the dictionary for the requested key value.

The first argument to defaultdict's constructor is a factory function that gets called whenever a key is requested but not found. This factory function receives no arguments, so you can use built-in type constructors such as list() or int(). For example, you can make an index of words based on their initial letter like this:
```
from collections import defaultdict

words = """Nel mezzo del cammin di nostra vita
mi ritrovai per una selva oscura
che la diritta via era smarrita""".lower().split()

index = defaultdict(list)

for w in words:
    init_letter = w[0]
    index[init_letter].append(w)
```
Printing index results in the following output:
```
defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
        'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
        'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
        'p': ['per'], 's': ['selva', 'smarrita'],
        'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})
```
(Contributed by Guido van Rossum.)
The deque double-ended queue type supplied by the collections module now has a remove(value) method that removes the first occurrence of value in the queue, raising ValueError if the value isn't found. (Contributed by Raymond Hettinger.)
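A short sketch of remove() on a deque, using invented sample values:

```python
from collections import deque

d = deque(["a", "b", "c", "b"])
d.remove("b")            # removes only the first "b"
remaining = list(d)

# Removing a value that is not present raises ValueError.
try:
    d.remove("z")
except ValueError:
    missing_raised = True
else:
    missing_raised = False
```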
New module: The contextlib module contains helper functions for use with the new 'with' statement. See the section on the contextlib module for more about this module.
New module: The cProfile module is a C implementation of the existing profile module that has much lower overhead. The module's interface is the same as profile: you run cProfile.run('main()') to profile a function, can save profile data to a file, etc. It's not yet known if the Hotshot profiler, which is also written in C but doesn't match the profile module's interface, will continue to be maintained in future versions of Python. (Contributed by Armin Rigo.)

Also, the pstats module for analyzing the data measured by the profiler now supports directing the output to any file object by supplying a stream argument to the Stats constructor. (Contributed by Skip Montanaro.)
The csv module, which parses files in comma-separated value format, received several enhancements and a number of bugfixes. You can now set the maximum size in bytes of a field by calling the csv.field_size_limit(new_limit) function; omitting the new_limit argument will return the currently set limit. The reader class now has a line_num attribute that counts the number of physical lines read from the source; records can span multiple physical lines, so line_num is not the same as the number of records read.

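The CSV parser is now stricter about multi-line quoted fields. Previously, if a line ended within a quoted field without a terminating newline character, a newline would be inserted into the returned field. This behavior caused problems when reading files that contained carriage return characters within fields, so the code was changed to return the field without inserting newlines. As a consequence, if newlines embedded within fields are important, the input should be split into lines in a manner that preserves the newline characters. (Translator's note: this explanation, including the wording in the reference manual, is hard to follow. When input is read line by line, what counts as the end of a physical line depends on the newline mode (NL, CR NL, or CR); the point is that this mode-dependent behaviour, in the CR case, has gone away.)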

(Contributed by Skip Montanaro and Andrew McNamara.)
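The difference between line_num and the number of records can be sketched as follows. The input lines are invented for illustration; the second record contains a quoted field that spans two physical lines.

```python
import csv

# Three physical lines, but only two records: the quoted field in the
# second record contains an embedded newline.
lines = ['id,comment\r\n', '1,"first line\n', 'second line"\r\n']

reader = csv.reader(lines)
# Pair each record with the number of physical lines read so far.
seen = [(reader.line_num, row) for row in reader]

# Called without an argument, field_size_limit() reports the current limit.
limit = csv.field_size_limit()
```

After the second record has been read, line_num is 3 even though only two records were produced.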
The datetime class in the datetime module now has a strptime(string, format) method for parsing date strings, contributed by Josh Spoerri. It uses the same format characters as time.strptime() and time.strftime():
```
import datetime as dt

ts = dt.datetime.strptime('10:13:15 2006-03-07',
                          '%H:%M:%S %Y-%m-%d')
```
The difflib.SequenceMatcher.get_matching_blocks() method in the difflib module now guarantees to return a minimal list of blocks describing matching subsequences. Previously, the algorithm would occasionally break a block of matching elements into two list entries. (Enhancement by Tim Peters.)
The doctest module gained a SKIP option that keeps an example from being executed at all. This is intended for code snippets that are usage examples for the reader and aren't actually test cases.

An encoding parameter was added to the testfile() function and the DocFileSuite class to specify the file's encoding. This makes it easier to use non-ASCII characters in tests contained within a docstring. (Contributed by Bjorn Tillenius.)
The email package has been updated to version 4.0. (Contributed by Barry Warsaw.)
The fileinput module was made more flexible. Unicode filenames are now supported, and a mode parameter that defaults to "r" was added to the input() function to allow opening files in binary or universal newlines mode. Another new parameter, openhook, lets you use a function other than open() to open the input files. Once you're iterating over the set of files, the FileInput object's new fileno() returns the file descriptor for the currently opened file. (Contributed by Georg Brandl.)
In the gc module, the new get_count() function returns a 3-tuple containing the current collection counts for the three GC generations. This is accounting information for the garbage collector; when these counts reach a specified threshold, a garbage collection sweep will be made. The existing gc.collect() function now takes an optional generation argument of 0, 1, or 2 to specify which generation to collect. (Contributed by Barry Warsaw.)
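A brief sketch of the two gc calls described above; the actual counts depend on the interpreter's state, so no specific values are shown:

```python
import gc

# Current collection counts for the three generations.
counts = gc.get_count()

# Collect only the youngest generation; the return value is the number
# of unreachable objects found.
unreachable = gc.collect(0)
```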

The nsmallest() and nlargest() functions in the heapq module now support a key keyword parameter similar to the one provided by the min()/max() functions and the sort() methods. For example:

>>> import heapq
>>> L = ["short", 'medium', 'longest', 'longer still']
>>> heapq.nsmallest(2, L)  # Return two lowest elements, lexicographically
['longer still', 'longest']
>>> heapq.nsmallest(2, L, key=len)   # Return two shortest elements
['short', 'medium']

(Contributed by Raymond Hettinger.)

The itertools.islice() function now accepts None for the start and step arguments. This makes it more compatible with the attributes of slice objects, so that you can now write the following:
```
s = slice(5)     # Create slice object
itertools.islice(iterable, s.start, s.stop, s.step)
```
(Contributed by Raymond Hettinger.)
The format() function in the locale module has been modified and two new functions were added, format_string() and currency().

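The format() function's val parameter could previously be a string as long as no more than one %char specifier appeared; now the parameter must be exactly one %char specifier with no surrounding text. An optional monetary parameter was also added which, if True, will use the locale's rules for formatting currency in placing a separator between groups of three digits.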

To format strings with multiple %char specifiers, use the new format_string() function that works like format() but also supports mixing %char specifiers with arbitrary text.

A new currency() function was also added that formats a number according to the current locale's settings.

(Contributed by Georg Brandl.)
The mailbox module underwent a massive rewrite to add the capability to modify mailboxes in addition to reading them. A new set of classes that include mbox, MH, and Maildir are used to read mailboxes, and have an add(message) method to add messages, remove(key) to remove messages, and lock()/unlock() to lock/unlock the mailbox. The following example converts a maildir-format mailbox into an mbox-format one:
```
import mailbox

# 'factory=None' uses email.Message.Message as the class representing
# individual messages.
src = mailbox.Maildir('maildir', factory=None)
dest = mailbox.mbox('/tmp/mbox')

for msg in src:
    dest.add(msg)
```
(Contributed by Gregory K. Johnson. Funding was provided by Google's 2005 Summer of Code.)
New module: the msilib module allows creating Microsoft Installer .msi files and CAB files. Some support for reading the .msi database is also included. (Contributed by Martin von Löwis.)
The nis module now supports accessing domains other than the system default domain by supplying a domain argument to the nis.match() and nis.maps() functions. (Contributed by Ben Bell.)
The operator module's itemgetter() and attrgetter() functions now support multiple fields. A call such as operator.attrgetter('a', 'b') will return a function that retrieves the a and b attributes. Combining this new feature with the sort() method's key parameter lets you easily sort lists using multiple fields. (Contributed by Raymond Hettinger.)
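As an illustration of multi-field getters, here is a sketch that sorts some invented records by last name and then first name, and fetches two attributes from a hypothetical Point class:

```python
from operator import attrgetter, itemgetter

# Hypothetical (last name, first name, age) records.
records = [("smith", "bob", 41), ("jones", "amy", 35), ("smith", "al", 29)]

# itemgetter(0, 1) returns a tuple of fields 0 and 1, which sorts by
# last name first and first name second.
by_name = sorted(records, key=itemgetter(0, 1))

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

# attrgetter with several names returns a tuple of attribute values.
coords = attrgetter("x", "y")(Point(3, 4))
```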
The optparse module was updated to version 1.5.1 of the Optik library. The OptionParser class gained an epilog attribute, a string that will be printed after the help message, and a destroy() method to break reference cycles created by the object. (Contributed by Greg Ward.)
The os module underwent several changes. The stat_float_times variable now defaults to true, meaning that os.stat() will now return time values as floats. (This doesn't necessarily mean that os.stat() will return times that are precise to fractions of a second; not all systems support such precision.)

Constants named os.SEEK_SET, os.SEEK_CUR, and os.SEEK_END have been added; these are the parameters to the os.lseek() function. Two new constants for locking are os.O_SHLOCK and os.O_EXLOCK.

Two new functions, wait3() and wait4(), were added. They're similar to the waitpid() function, which waits for a child process to exit and returns a tuple of the process ID and its exit status, but wait3() and wait4() return additional information. wait3() doesn't take a process ID as input, so it waits for any child process to exit and returns a 3-tuple of process-id, exit-status, resource-usage as returned from the resource.getrusage() function. wait4(pid) does take a process ID. (Contributed by Chad J. Schroeder.)

On FreeBSD, the os.stat() function now returns times with nanosecond resolution, and the returned object now has st_gen and st_birthtime. The st_flags attribute is also available, if the platform supports it. (Contributed by Antti Louko and Diego Pettenò.)
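The Python debugger provided by the pdb module can now store lists of commands to execute when a breakpoint is reached and execution stops. Once breakpoint #1 has been created, enter commands 1 and enter a series of commands to be executed, finishing the list with end. The command list can include commands that resume execution, such as continue or next. (Contributed by Grégoire Dooms.)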
The pickle and cPickle modules no longer accept a return value of None from the __reduce__() method; the method must return a tuple of arguments instead. The ability to return None was deprecated in Python 2.4, so this completes the removal of the feature.
The pkgutil module, containing various utility functions for finding packages, was enhanced to support PEP 302's import hooks and now also works for packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.)
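The pybench benchmark suite by Marc-André Lemburg is now included in the Tools/pybench directory. The pybench suite is an improvement on the commonly used pystone.py program because pybench provides a more detailed measurement of the interpreter's speed. It times particular operations such as function calls, tuple slicing, method lookups, and numeric operations, instead of performing many different operations and reducing the result to a single number as pystone.py does. (Translator's note: as with every mention of Tools, in principle the only way to obtain everything under Tools is the source distribution. What gets installed depends on the platform and, on Linux, on the distributor. Windows is the most striking case: only a very small part of Tools is installed. pybench, too, is included only in the source distribution and is not installed by the official Windows installer. Note that pystone.py lives in Lib/test.)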
The pyexpat module now uses version 2.0 of the Expat parser. (Contributed by Trent Mick.)
The Queue class provided by the Queue module gained two new methods. join() blocks until all items in the queue have been retrieved and all processing work on the items has been completed. Worker threads call the other new method, task_done(), to signal that processing for an item has been completed. (Contributed by Raymond Hettinger.)
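The interplay of join() and task_done() can be sketched with two worker threads. The worker function and the doubling "work" are invented for illustration, and the import is written so the sketch also runs on interpreters where the module is named queue.

```python
import threading
try:
    import queue             # module name on Python 3
except ImportError:
    import Queue as queue    # module name on Python 2

q = queue.Queue()
results = []

def worker():
    while True:
        item = q.get()
        results.append(item * 2)   # stand-in for real processing
        q.task_done()              # signal that this item is finished

for _ in range(2):
    t = threading.Thread(target=worker)
    t.daemon = True                # do not keep the interpreter alive
    t.start()

for n in range(5):
    q.put(n)

q.join()                           # blocks until every item is task_done()
```

Without the task_done() calls, join() would block forever, because the queue has no other way to know that processing has finished.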
The old regex and regsub modules, which have been deprecated ever since Python 2.0, have finally been deleted. Other deleted modules: statcache, tzparse, whrandom.
Also deleted: the lib-old directory, which includes ancient modules such as dircmp and ni. lib-old wasn't on the default sys.path, so unless your programs explicitly added the directory to sys.path, this removal shouldn't affect your code.
The rlcompleter module is no longer dependent on importing the readline module and therefore now works on non-Unix platforms. (Patch from Robert Kiendl.)
The SimpleXMLRPCServer and DocXMLRPCServer classes now have a rpc_paths attribute that constrains XML-RPC operations to a limited set of URL paths; the default is to allow only '/' and '/RPC2'. Setting rpc_paths to None or an empty tuple disables this path checking.
The socket module now supports AF_NETLINK sockets on Linux, thanks to a patch from Philippe Biondi. Netlink sockets are a Linux-specific mechanism for communications between a user-space process and kernel code; an introductory article about them is at https://www.linuxjournal.com/article/7356. In Python code, netlink addresses are represented as a tuple of 2 integers, (pid, group_mask).

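Two new methods on socket objects, recv_into(buffer) and recvfrom_into(buffer), store the received data in an object that supports the buffer protocol instead of returning the data as a string. This means you can put the data directly into an array or a memory-mapped file.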

Socket objects also gained getfamily(), gettype(), and getproto() accessor methods to retrieve the family, type, and protocol values for the socket.
New module: the spwd module provides functions for accessing the shadow password database on systems that support shadow passwords.
The struct module is now faster because it compiles format strings into Struct objects with pack() and unpack() methods. This is similar to how the re module lets you create compiled regular expression objects. You can still use the module-level pack() and unpack() functions; they'll create Struct objects and cache them. Or you can use Struct instances directly:
```
s = struct.Struct('ih3s')

data = s.pack(1972, 187, 'abc')
year, number, name = s.unpack(data)
```
You can also pack and unpack data to and from buffer objects directly using the pack_into(buffer, offset, v1, v2, ...) and unpack_from(buffer, offset) methods. This lets you store data directly into an array or a memory-mapped file.

(Struct objects were implemented by Bob Ippolito at the NeedForSpeed sprint. Support for buffer objects was added by Martin Blais, also at the NeedForSpeed sprint.)
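A sketch of pack_into() and unpack_from() writing into an array at a non-zero offset. The format string uses an explicit '<' prefix (little-endian, no padding) so the record size is predictable; that choice, the offset, and the values are assumptions made for the example, and the byte literal is written for a current Python interpreter.

```python
import array
import struct

record = struct.Struct("<ih3s")      # int, short, 3-byte string, no padding

# A writable buffer with four spare bytes in front of the record.
buf = array.array("b", [0] * (record.size + 4))

offset = 4
record.pack_into(buf, offset, 1972, 187, b"abc")
year, number, name = record.unpack_from(buf, offset)

# The bytes before the offset are left untouched.
untouched = buf[:offset].tolist()
```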
The Python developers switched from CVS to Subversion during the 2.5 development process. Information about the exact build version is available as the sys.subversion variable, a 3-tuple of (interpreter-name, branch-name, revision-range). For example, at the time of writing 2.5 was reporting ('CPython', 'trunk', '45313:45315'). (Translator's note: between Python 3.2 and 3.3, during the 2.7 maintenance releases, development moved to Mercurial. As of the 3.6 development cycle everything is on Mercurial and sys.subversion no longer returns a meaningful value; it was removed outright in 3.3.)

This information is also available to C extensions via the Py_GetBuildInfo() function that returns a string of build information like this: "trunk:45355:45356M, Apr 13 2006, 07:42:19". (Contributed by Barry Warsaw.)
Another new function, sys._current_frames(), returns the current stack frames for all running threads as a dictionary mapping thread identifiers to the topmost stack frame currently active in that thread at the time the function is called. (Contributed by Tim Peters.)
The TarFile class in the tarfile module now has an extractall() method that extracts all members from the archive into the current working directory. It's also possible to set a different directory as the extraction target, and to unpack only a subset of the archive's members.

A tarfile's compression used in stream mode can now be autodetected using the mode 'r|*'. (Contributed by Lars Gustäbel.)
The threading module now lets you set the stack size used when new threads are created. The stack_size([size]) function returns the currently configured stack size, and supplying the optional size parameter sets a new value. Not all platforms support changing the stack size, but Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.)
The unicodedata module has been updated to use version 4.1.0 of the Unicode character database. Version 3.2.0 is required by some specifications, so it's still available as unicodedata.ucd_3_2_0.

New module: the uuid module generates universally unique identifiers (UUIDs) according to RFC 4122. The RFC defines several different UUID versions that are generated from a starting string, from system properties, or purely randomly. This module contains a UUID class and functions named uuid1(), uuid3(), uuid4(), and uuid5() to generate different versions of UUID. (Version 2 UUIDs are not specified in RFC 4122 and are not supported by this module.)

>>> import uuid
>>> # make a UUID based on the host ID and current time
>>> uuid.uuid1()
UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')

>>> # make a UUID using an MD5 hash of a namespace UUID and a name
>>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')

>>> # make a random UUID
>>> uuid.uuid4()
UUID('16fd2706-8baf-433b-82eb-8c7fada847da')

>>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
>>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')

(Contributed by Ka-Ping Yee.)

The weakref module's WeakKeyDictionary and WeakValueDictionary types gained new methods for iterating over the weak references contained in the dictionary. iterkeyrefs() and keyrefs() methods were added to WeakKeyDictionary, and itervaluerefs() and valuerefs() were added to WeakValueDictionary. (Contributed by Fred L. Drake, Jr.)
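A sketch of the list-returning variants, valuerefs() and keyrefs(), using a hypothetical Node class. Calling a weak reference returns the referent, or None if it has already been collected.

```python
import weakref

class Node(object):
    pass

a, b = Node(), Node()

cache = weakref.WeakValueDictionary()
cache["a"] = a
cache["b"] = b

refs = cache.valuerefs()           # weak references to the values
live = [r() for r in refs]         # dereference each one

owners = weakref.WeakKeyDictionary()
owners[a] = "first"
key_refs = owners.keyrefs()        # weak references to the keys
```

Because the references are weak, holding the returned list does not by itself keep the keys or values alive.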
The webbrowser module received a number of enhancements. It's now usable as a script with python -m webbrowser, taking a URL as the argument; there are a number of switches to control the behaviour (-n for a new browser window, -t for a new tab). New module-level functions, open_new() and open_new_tab(), were added to support this. The module's open() function supports an additional feature, an autoraise parameter that signals whether to raise the open window when possible. A number of additional browsers were added to the supported list such as Firefox, Opera, Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.)
The xmlrpclib module now supports returning datetime objects for the XML-RPC date type. Supply use_datetime=True to the loads() function or the Unmarshaller class to enable this feature. (Contributed by Skip Montanaro.)
The zipfile module now supports the ZIP64 version of the format, meaning that a .zip archive can now be larger than 4 GiB and can contain individual files larger than 4 GiB. (Contributed by Ronald Oussoren.)
The zlib module's Compress and Decompress objects now support a copy() method that makes a copy of the object's internal state and returns a new Compress or Decompress object. (Contributed by Chris AtLee.)
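One use for copy() is compressing several outputs that share a common prefix without recompressing the prefix each time. The following is a sketch with invented data, written with byte literals for a current Python interpreter:

```python
import zlib

prefix = b"common header " * 20

c1 = zlib.compressobj()
head = c1.compress(prefix)          # output produced so far (may be short)

c2 = c1.copy()                      # snapshot of the compressor state

# Each compressor continues independently from the shared state.
out1 = head + c1.compress(b"ending one") + c1.flush()
out2 = head + c2.compress(b"ending two") + c2.flush()

round_trip1 = zlib.decompress(out1)
round_trip2 = zlib.decompress(out2)
```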

The ctypes package¶

The ctypes package, written by Thomas Heller, has been added to the standard library. ctypes lets you call arbitrary functions in shared libraries or DLLs. Long-time users may remember the dl module, which provides functions for loading shared libraries and calling functions in them. The ctypes package is much fancier.

To load a shared library or DLL, you must create an instance of the CDLL class and provide the name or path of the shared library or DLL. Once that's done, you can call arbitrary functions by accessing them as attributes of the CDLL object.

import ctypes

libc = ctypes.CDLL('libc.so.6')
result = libc.printf("Line of output\n")

Type constructors for the various C types are provided: c_int(), c_float(), c_double(), c_char_p() (equivalent to char*), and so forth. Unlike Python's types, the C versions are all mutable; you can assign to their value attribute to change the wrapped value. Python integers and strings will be automatically converted to the corresponding C types, but for other types you must call the correct type constructor. (And I mean must; getting it wrong will often result in the interpreter crashing with a segmentation fault.)
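
The mutability of the wrapped values can be sketched without loading any library. The values are arbitrary, and the byte literal is written for a current Python interpreter:

```python
import ctypes

n = ctypes.c_int(42)
n.value = 7                       # the wrapped value can be reassigned

x = ctypes.c_double(2.5)
s = ctypes.c_char_p(b"hello")     # wraps a pointer to the given bytes

values = (n.value, x.value, s.value)
```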

You shouldn't use c_char_p() with a Python string when the C function will be modifying the memory area, because Python strings are supposed to be immutable; breaking this rule will cause puzzling bugs. When you need a modifiable memory area, use create_string_buffer():

s = "this is a string"
buf = ctypes.create_string_buffer(s)
libc.strfry(buf)

C functions are assumed to return integers, but you can set the restype attribute of the function object to change this:

>>> libc.atof('2.71828')
-1783957616
>>> libc.atof.restype = ctypes.c_double
>>> libc.atof('2.71828')
2.71828

ctypes also provides a wrapper for Python's C API as the ctypes.pythonapi object. This object does not release the global interpreter lock before calling a function, because the lock must be held when calling into the interpreter's code. There's a py_object type constructor that will create a PyObject* pointer. A simple usage:

import ctypes

d = {}
ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
          ctypes.py_object("abc"),  ctypes.py_object(1))
# d is now {'abc': 1}.

Don't forget to use py_object(); if it's omitted you end up with a segmentation fault.

ctypes has been around for a while, but people still write and distribute hand-coded extension modules because you can't rely on ctypes being present. Perhaps developers will begin to write Python wrappers atop a library accessed through ctypes instead of extension modules, now that ctypes is included with core Python.

See also

https://web.archive.org/web/20180410025338/http://starship.python.net/crew/theller/ctypes/: The pre-stdlib ctypes web page, with a tutorial, reference, and FAQ.

The documentation for the ctypes module.

The ElementTree package¶

A subset of Fredrik Lundh's ElementTree library for processing XML has been added to the standard library as xml.etree. The available modules are ElementTree, ElementPath, and ElementInclude from ElementTree 1.2.6. The cElementTree accelerator module is also included.

The rest of this section will provide a brief overview of using ElementTree. Full documentation for ElementTree is available at https://web.archive.org/web/20201124024954/http://effbot.org/zone/element-index.htm.

ElementTree represents an XML document as a tree of element nodes. The text content of the document is stored as the text and tail attributes of the element nodes. (This is one of the major differences between ElementTree and the Document Object Model; in the DOM there are many different types of node, including TextNode.)

The most commonly used parsing function is parse(), which takes either a string (assumed to contain a filename) or a file-like object and returns an ElementTree instance:

import urllib
from xml.etree import ElementTree as ET

tree = ET.parse('ex-1.xml')

feed = urllib.urlopen(
          'http://planet.python.org/rss10.xml')
tree = ET.parse(feed)

Once you have an ElementTree instance, you can call its getroot() method to get the root Element node.

There's also an XML() function that takes a string literal and returns an Element node (not an ElementTree). This function provides a tidy way to incorporate XML fragments, approaching the convenience of an XML literal:

svg = ET.XML("""<svg width="10px" version="1.0">
             </svg>""")
svg.set('height', '320px')
svg.append(elem1)

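Each XML element supports some dictionary-like and some list-like access methods. Dictionary-like operations are used to access attribute values, and list-like operations are used to access child nodes.

Operation	Result
`elem[n]`	Returns the n'th child element.
`elem[m:n]`	Returns the list of m'th through n'th child elements (translator's note: n itself is not included).
`len(elem)`	Returns the number of child elements.
`list(elem)`	Returns a list of child elements.
`elem.append(elem2)`	Adds elem2 as a child.
`elem.insert(index, elem2)`	Inserts elem2 at the specified location.
`del elem[n]`	Deletes the n'th child element.
`elem.keys()`	Returns a list of attribute names.
`elem.get(name)`	Returns the value of attribute name.
`elem.set(name, value)`	Sets a new value for attribute name.
`elem.attrib`	Retrieves the dictionary containing attributes.
`del elem.attrib[name]`	Deletes the attribute name.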
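
A few of the operations in the table can be sketched on a small invented document (the element and attribute names are made up for the example):

```python
from xml.etree import ElementTree as ET

root = ET.XML('<list kind="todo"><item>a</item><item>b</item></list>')

count = len(root)                 # number of child elements
first_text = root[0].text         # text of the first child
kind = root.get("kind")           # dictionary-like attribute access

extra = ET.Element("item")
extra.text = "c"
root.append(extra)                # list-like: add a child at the end
root.set("kind", "done")          # dictionary-like: change an attribute
del root[0]                       # list-like: remove the first child

texts = [child.text for child in root]
new_kind = root.get("kind")
```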

Comments and processing instructions are also represented as Element nodes. To check if a node is a comment or a processing instruction:

if elem.tag is ET.Comment:
    ...
elif elem.tag is ET.ProcessingInstruction:
    ...

To generate XML output, you should call the xml.etree.ElementTree.ElementTree.write() method. Like parse(), it can take either a string or a file-like object:

# Encoding is US-ASCII
tree.write('output.xml')

# Encoding is UTF-8
f = open('output.xml', 'w')
tree.write(f, encoding='utf-8')

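(Caution: the default encoding used for output is ASCII. For general XML work, where an element's name may contain arbitrary Unicode characters, ASCII isn't a very useful encoding because it will raise an exception if an element's name contains any characters with values greater than 127. Therefore, it's best to specify a different encoding such as UTF-8 that can handle any Unicode character.)

This section is only a partial description of the ElementTree interfaces. Please read the package's official documentation for more details.

See also

https://web.archive.org/web/20201124024954/http://effbot.org/zone/element-index.htm: Official documentation for ElementTree.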

The hashlib package¶

A new hashlib module, written by Gregory P. Smith, has been added to replace the md5 and sha modules. hashlib adds support for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When available, the module uses OpenSSL for fast platform optimized implementations of algorithms.

The old md5 and sha modules still exist as wrappers around hashlib to preserve backwards compatibility. The new module's interface is very close to that of the old modules, but not identical. The most significant difference is that the constructor functions for creating new hashing objects are named differently.

# Old versions
h = md5.md5()
h = md5.new()

# New version
h = hashlib.md5()

# Old versions
h = sha.sha()
h = sha.new()

# New version
h = hashlib.sha1()

# Hashes that weren't previously available
h = hashlib.sha224()
h = hashlib.sha256()
h = hashlib.sha384()
h = hashlib.sha512()

# Alternative form
h = hashlib.new('md5')          # Provide algorithm as a string

Once a hash object has been created, its methods are the same as before: update(string) hashes the specified string into the current digest state, digest() and hexdigest() return the digest value as a binary string or a string of hex digits, and copy() returns a new hashing object with the same digest state.
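
The methods described above can be sketched as follows. The input strings are arbitrary, and byte literals are used so that the sketch runs on a current Python interpreter:

```python
import hashlib

h = hashlib.sha256()
h.update(b"Nobody inspects")

snapshot = h.copy()                 # same digest state as h at this point
h.update(b" the spammish repetition")

full = h.hexdigest()                # digest of both chunks
partial = snapshot.hexdigest()      # digest of the first chunk only

# Feeding the data in pieces gives the same digest as hashing it at once.
one_shot = hashlib.sha256(b"Nobody inspects the spammish repetition").hexdigest()
```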

See also

The documentation for the hashlib module.

The sqlite3 package¶

The pysqlite module (https://www.pysqlite.org), a wrapper for the SQLite embedded database, has been added to the standard library under the package name sqlite3.

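SQLite is a C library that provides a lightweight disk-based database that doesn't require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. Some applications can use SQLite for internal data storage. It's also possible to prototype an application using SQLite and then port the code to a larger database such as PostgreSQL or Oracle.

pysqlite was written by Gerhard Häring and provides a SQL interface compliant with the DB-API 2.0 specification described by PEP 249.

If you're compiling the Python source yourself, note that the source tree doesn't include the SQLite code, only the wrapper module. You'll need to have the SQLite libraries and headers installed before compiling Python, and the build process will compile the module when the necessary headers are available.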

To use the module, you must first create a Connection object that represents the database. Here the data will be stored in the /tmp/example file:

conn = sqlite3.connect('/tmp/example')

You can also supply the special name :memory: to create a database in RAM.

Once you have a Connection, you can create a Cursor object and call its execute() method to perform SQL commands:

c = conn.cursor()

# Create table
c.execute('''create table stocks
(date text, trans text, symbol text,
 qty real, price real)''')

# Insert a row of data
c.execute("""insert into stocks
          values ('2006-01-05','BUY','RHAT',100,35.14)""")

Usually your SQL operations will need to use values from Python variables. You shouldn't assemble your query using Python's string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack.

Instead, use the DB-API's parameter substitution. Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor's execute() method. (Other database modules may use a different placeholder, such as %s or :1.) For example:

# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)

# Do this instead
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)

# Larger example
for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
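To make the difference concrete, here is a self-contained sketch that runs the same lookup both ways against an in-memory database. The hostile input string is invented for illustration, and executemany() is used as a shorthand for the loop above.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('create table stocks '
          '(date text, trans text, symbol text, qty real, price real)')

rows = [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
        ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
        ('2006-04-06', 'SELL', 'IBM', 500, 53.00)]
# executemany() runs the same statement once per parameter tuple.
c.executemany('insert into stocks values (?,?,?,?,?)', rows)

# Invented hostile input: it tries to turn the WHERE clause into a tautology.
hostile = "IBM' or '1'='1"

# Unsafe: the value becomes part of the SQL text, so every row matches.
c.execute("select count(*) from stocks where symbol = '%s'" % hostile)
unsafe_count = c.fetchone()[0]

# Safe: the value is passed as data, so it matches no symbol at all.
c.execute('select count(*) from stocks where symbol=?', (hostile,))
safe_count = c.fetchone()[0]

# An ordinary value behaves as expected with the placeholder form.
c.execute('select count(*) from stocks where symbol=?', ('IBM',))
ibm_count = c.fetchone()[0]

conn.close()
```

With string formatting the injected condition selects all three rows, while the placeholder version treats the whole string as a symbol name and finds none.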

To retrieve data after executing a SELECT statement, you can either treat the cursor as an iterator, call the cursor's fetchone() method to retrieve a single matching row, or call fetchall() to get a list of the matching rows.

This example uses the iterator form:

>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
...    print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
>>>
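The other two retrieval styles, fetchone() and fetchall(), can be sketched in the same way. The two-column table here is a simplified stand-in for the stocks table above, chosen only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()

# Simplified, hypothetical version of the stocks table.
c.execute('create table stocks (symbol text, price real)')
c.executemany('insert into stocks values (?,?)',
              [('RHAT', 35.14), ('IBM', 45.0), ('MSOFT', 72.0)])

c.execute('select symbol, price from stocks order by price')
first = c.fetchone()   # the next matching row as a tuple
rest = c.fetchall()    # all remaining rows as a list of tuples
done = c.fetchone()    # None once the result set is exhausted

conn.close()
```

fetchone() suits queries expected to return a single row, while fetchall() loads everything into memory at once, so the iterator form is usually the better fit for large result sets.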

For more information about the SQL dialect supported by SQLite, see http://www.sqlite.org.

See also

https://www.pysqlite.org: The pysqlite web page.
https://www.sqlite.org: The SQLite web page; the documentation there describes the syntax and the available data types for the supported SQL dialect.

The documentation for the sqlite3 module.

PEP 249 - Database API Specification 2.0: PEP written by Marc-Andre Lemburg.

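The wsgiref package¶

The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface between web servers and Python web applications and is described in PEP 333. The wsgiref package is a reference implementation of the WSGI specification.

The package includes a basic HTTP server that will run a WSGI application; this server is useful for debugging but isn't intended for production use. Setting up a server takes only a few lines of code: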

from wsgiref import simple_server

wsgi_app = ...

host = ''
port = 8000
httpd = simple_server.make_server(host, port, wsgi_app)
httpd.serve_forever()
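The wsgi_app placeholder above can be any WSGI callable. As a rough illustration, a minimal application might look like the sketch below; the greeting text and the fake_start_response helper are hypothetical and exist only so the callable can be exercised without starting a server.

```python
def wsgi_app(environ, start_response):
    """Hypothetical application: greet whatever path was requested."""
    path = environ.get('PATH_INFO', '/')
    body = ('Hello from %s\n' % path).encode('utf-8')
    headers = [('Content-Type', 'text/plain; charset=utf-8'),
               ('Content-Length', str(len(body)))]
    start_response('200 OK', headers)
    # A WSGI application returns an iterable of byte strings.
    return [body]


# Exercise the callable directly, the way a server would call it.
seen = {}

def fake_start_response(status, headers):
    # Record what the application reported instead of sending it.
    seen['status'] = status
    seen['headers'] = dict(headers)

result = wsgi_app({'PATH_INFO': '/demo'}, fake_start_response)
```

Because an application is just a callable taking environ and start_response, it can be tested this way before being handed to make_server().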

See also

https://web.archive.org/web/20160331090247/http://wsgi.readthedocs.org/en/latest/: A central web site for WSGI-related resources.
PEP 333 - Python Web Server Gateway Interface v1.0: PEP written by Phillip J. Eby.

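Build and C API Changes¶

Changes to Python's build process and to the C API include:

The Python source tree was converted from CVS to Subversion, in a complex migration procedure that was supervised and flawlessly carried out by Martin von Löwis. The procedure was developed as PEP 347.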
Coverity, a company that markets a source code analysis tool called Prevent, provided the results of their examination of the Python source code. The analysis found about 60 bugs that were quickly fixed. Many of the bugs were refcounting problems, often occurring in error-handling code. See https://scan.coverity.com for the statistics.
The largest change to the C API came from PEP 353, which modifies the interpreter to use a Py_ssize_t type definition instead of int. See the earlier section PEP 353: 添え字型に ssize_t を使う for a discussion of this change.
The design of the bytecode compiler has changed a great deal, no longer generating bytecode by traversing the parse tree. Instead the parse tree is converted to an abstract syntax tree (or AST), and it is the abstract syntax tree that's traversed to produce the bytecode.

It's possible for Python code to obtain AST objects by using the compile() built-in and specifying _ast.PyCF_ONLY_AST as the value of the flags parameter:
```
from _ast import PyCF_ONLY_AST
ast = compile("""a=0
for i in range(10):
    a += i
""", "<string>", 'exec', PyCF_ONLY_AST)

assignment = ast.body[0]
for_loop = ast.body[1]
```
No official documentation has been written for the AST code yet, but PEP 339 discusses the design. To start learning about the code, read the definition of the various AST nodes in Parser/Python.asdl. A Python script reads this file and generates a set of C structure definitions in Include/Python-ast.h. The PyParser_ASTFromString() and PyParser_ASTFromFile() functions, defined in Include/pythonrun.h, take Python source as input and return the root of an AST representing the contents. This AST can then be turned into a code object by PyAST_Compile(). For more information, read the source code, and then ask questions on python-dev.

The AST code was developed under Jeremy Hylton's management, and implemented by (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil Schemenauer, plus the participants in a number of AST sprints at conferences such as PyCon.
Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005, was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never freed arenas. With this patch, Python will free arenas when they're empty. The net effect is that on some platforms, when you allocate many objects, Python's memory usage may actually drop when you delete them and the memory may be returned to the operating system. (Implemented by Evan Jones, and reworked by Tim Peters.)

Note that this change means extension modules must be more careful when allocating memory. Python's API has many different functions for allocating memory that are grouped into families. For example, PyMem_Malloc(), PyMem_Realloc(), and PyMem_Free() are one family that allocates raw memory, while PyObject_Malloc(), PyObject_Realloc(), and PyObject_Free() are another family that's supposed to be used for creating Python objects.

Previously these different families all reduced to the platform's malloc() and free() functions. This meant it didn't matter if you got things wrong and allocated memory with the PyMem function but freed it with the PyObject function. With 2.5's changes to obmalloc, these families now do different things and mismatches will probably result in a segfault. You should carefully test your C extension modules with Python 2.5.
The built-in set types now have an official C API. Call PySet_New() and PyFrozenSet_New() to create a new set, PySet_Add() and PySet_Discard() to add and remove elements, and PySet_Contains() and PySet_Size() to examine the set's state. (Contributed by Raymond Hettinger.)
C code can now obtain information about the exact revision of the Python interpreter by calling the Py_GetBuildInfo() function, which returns a string of build information like this: "trunk:45355:45356M, Apr 13 2006, 07:42:19". (Contributed by Barry Warsaw.)
Two new macros can be used to indicate C functions that are local to the current file so that a faster calling convention can be used. Py_LOCAL declares the function as returning a value of the specified type and uses a fast-calling qualifier. Py_LOCAL_INLINE does the same thing and also requests the function be inlined. If the macro PY_LOCAL_AGGRESSIVE is defined before Python.h is included, a set of more aggressive optimizations is enabled for the module; you should benchmark the results to find out if these optimizations actually make the code faster. (Contributed by Fredrik Lundh at the NeedForSpeed sprint.)
PyErr_NewException(name, base, dict) can now accept a tuple of base classes as its base argument. (Contributed by Georg Brandl.)
The PyErr_Warn() function for issuing warnings is now deprecated in favour of PyErr_WarnEx(category, message, stacklevel) which lets you specify the number of stack frames separating this function and the caller. A stacklevel of 1 is the function calling PyErr_WarnEx(), 2 is the function above that, and so forth. (Added by Neal Norwitz.)
The CPython interpreter is still written in C, but the code can now be compiled with a C++ compiler without errors. (Implemented by Anthony Baxter, Martin von Löwis, Skip Montanaro.)
The PyRange_New() function was removed. It was never documented, never used in the core code, and had dangerously lax error checking. In the unlikely case that your extensions were using it, you can replace it by something like the following:
```
range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll",
                              start, stop, step);
```

Port-Specific Changes¶

MacOS X (10.3 and higher): dynamic loading of modules now uses the dlopen() function instead of MacOS-specific functions.
MacOS X: an --enable-universalsdk switch was added to the configure script that compiles the interpreter as a universal binary able to run on both PowerPC and Intel processors. (Contributed by Ronald Oussoren; bpo-2573.)
Windows: .dll is no longer supported as a filename extension for extension modules. .pyd is now the only filename extension that will be searched for.

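Porting to Python 2.5¶

This section lists previously described changes that may require changes to your code:

ASCII is now the default encoding for modules. (Translator's note: from Python 3 onward the default is UTF-8; see PEP 3120.) It's now a syntax error if a module contains 8-bit characters but doesn't have an encoding declaration. In Python 2.4 this triggered a warning, not a syntax error.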
Previously, the gi_frame attribute of a generator was always a frame object. Because of the PEP 342 changes described in section PEP 342: ジェネレータの新機能, it's now possible for gi_frame to be None.
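Attempting to compare a Unicode string with an 8-bit string that can't be converted to Unicode using the default ASCII encoding now triggers a new warning, UnicodeWarning. Previously such comparisons would raise a UnicodeDecodeError exception.
Library: the csv module is now stricter about multi-line quoted fields. If your files contain newlines embedded within fields, the input should be split into lines in a manner that preserves the newline characters.
Library: the locale module's format() function would previously accept any string as long as no more than one %char specifier appeared. In Python 2.5, the argument must be exactly one %char specifier with no surrounding text.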
Library: The pickle and cPickle modules no longer accept a return value of None from the __reduce__() method; the method must return a tuple of arguments instead. The modules also no longer accept the deprecated bin keyword parameter.
Library: The SimpleXMLRPCServer and DocXMLRPCServer classes now have a rpc_paths attribute that constrains XML-RPC operations to a limited set of URL paths; the default is to allow only '/' and '/RPC2'. Setting rpc_paths to None or an empty tuple disables this path checking.
C API: Many functions now use Py_ssize_t instead of int to allow processing more data on 64-bit machines. Extension code may need to make the same change to avoid warnings and to support 64-bit machines. See the earlier section PEP 353: 添え字型に ssize_t を使う for a discussion of this change.
C API: The obmalloc changes mean that you must be careful to not mix usage of the PyMem_* and PyObject_* families of functions. Memory allocated with one family's *_Malloc must be freed with the corresponding family's *_Free function.
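As an illustration of the gi_frame item above, the following sketch shows the attribute before and after a generator finishes. The countdown generator and the frame_lineno helper are hypothetical names introduced for this example.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(2)
has_frame_before = gen.gi_frame is not None   # live generator: frame object
values = list(gen)                            # run it to exhaustion
has_frame_after = gen.gi_frame is not None    # finished: gi_frame is None


def frame_lineno(g):
    """Hypothetical helper: current line number of g, or None if finished."""
    frame = g.gi_frame
    if frame is None:
        return None
    return frame.f_lineno
```

Code that inspects generators, such as debuggers or tracing tools, should check for None in this way before touching frame attributes.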

Acknowledgements¶

The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: Georg Brandl, Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W. Grosse-Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh, Andrew McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor, Mike Rovner, Scott Weikart, Barry Warsaw, Thomas Wouters.
