`zlib` --- Compression compatible with gzip¶

For applications that require data compression, the functions in this module allow compression and decompression, using the zlib library.

This is an optional module. If it is missing from your copy of CPython, look for documentation from your distributor (that is, whoever provided Python to you). If you are the distributor, see Requirements for optional modules.

zlib's functions have many options and often need to be used in a particular order. This documentation doesn't attempt to cover all of the permutations; consult the zlib manual for authoritative information.

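For reading and writing .gz files, see the gzip module.

The available exception and functions in this module are:

exception zlib.error¶: Exception raised on compression and decompression errors.

zlib.adler32(data[, value])¶: Computes an Adler-32 checksum of data. (An Adler-32 checksum is almost as reliable as a CRC32 but can be computed much more quickly.) The result is an unsigned 32-bit integer. If value is present, it is used as the starting value of the checksum; otherwise, a default value of 1 is used. Passing in value allows computing a running checksum over the concatenation of several inputs. The algorithm is not cryptographically strong, and should not be used for authentication or digital signatures. Since the algorithm is designed for use as a checksum algorithm, it is not suitable for use as a general hash algorithm.

Changed in version 3.0: The result is always unsigned.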
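As a minimal sketch of the running checksum described above, the following feeds several chunks through adler32() one at a time, passing the previous result as value. The chunk contents and the variable names are illustrative assumptions.

```python
import zlib

# Illustrative input split into several pieces.
chunks = [b"hello, ", b"zlib ", b"world"]

# Start from the documented default value of 1 and feed each chunk in turn.
running = 1
for chunk in chunks:
    running = zlib.adler32(chunk, running)

# The running checksum matches the checksum of the concatenated data.
whole = zlib.adler32(b"".join(chunks))
print(running == whole)
```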

zlib.compress(data, /, level=Z_DEFAULT_COMPRESSION, wbits=MAX_WBITS)¶

Compresses the bytes in data, returning a bytes object containing compressed data. level is an integer from 0 to 9 or -1 controlling the level of compression. See Z_BEST_SPEED (1), Z_BEST_COMPRESSION (9), Z_NO_COMPRESSION (0), and the default, Z_DEFAULT_COMPRESSION (-1) for more information about these values.

The wbits argument controls the size of the history buffer (or the "window size") used when compressing data, and whether a header and trailer are included in the output. It can take several ranges of values, defaulting to 15 (MAX_WBITS):

+9 to +15: The base-two logarithm of the window size, which therefore ranges between 512 and 32768. Larger values produce better compression at the expense of greater memory usage. The resulting output will include a zlib-specific header and trailer.
−9 to −15: Uses the absolute value of wbits as the window size logarithm, while producing a raw output stream with no header or trailing checksum.
+25 to +31 = 16 + (9 to 15): Uses the low 4 bits of the value as the window size logarithm, while including a basic gzip header and trailing checksum in the output.

Raises the error exception if any error occurs.

Changed in version 3.6: level can now be used as a keyword parameter.

Changed in version 3.11: The wbits parameter is now available to set window bits and compression type.

zlib.compressobj(level=Z_DEFAULT_COMPRESSION, method=DEFLATED, wbits=MAX_WBITS, memLevel=DEF_MEM_LEVEL, strategy=Z_DEFAULT_STRATEGY[, zdict])¶

Returns a compression object, to be used for compressing data streams that won't fit into memory at once.

level is the compression level -- an integer from 0 to 9 or -1. See Z_BEST_SPEED (1), Z_BEST_COMPRESSION (9), Z_NO_COMPRESSION (0), and the default, Z_DEFAULT_COMPRESSION (-1) for more information about these values.

method is the compression algorithm. Currently, the only supported value is DEFLATED.

The wbits parameter controls the size of the history buffer (or the "window size"), and what header and trailer format will be used. It has the same meaning as described for compress().

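The memLevel argument controls the amount of memory used for the internal compression state. Valid values range from 1 to 9. Higher values use more memory, but are faster and produce smaller output.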

strategy is used to tune the compression algorithm. Possible values are Z_DEFAULT_STRATEGY, Z_FILTERED, Z_HUFFMAN_ONLY, Z_RLE and Z_FIXED.

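zdict is a predefined compression dictionary. This is a sequence of bytes (such as a bytes object) containing subsequences that are expected to occur frequently in the data that is to be compressed. Those subsequences that are expected to be most common should come at the end of the dictionary.

Changed in version 3.3: Added the zdict parameter and keyword argument support.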
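A compression object is typically fed one chunk at a time and flushed once at the end. The following is a minimal sketch of that pattern; compress_chunks is a hypothetical helper name, and the generated input is illustrative.

```python
import zlib

def compress_chunks(chunks, level=zlib.Z_DEFAULT_COMPRESSION):
    """Yield compressed pieces for an iterable of bytes chunks (hypothetical helper)."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:  # compress() may buffer input and return b""
            yield piece
    # flush() with the default Z_FINISH mode ends the stream.
    yield compressor.flush()

original = [b"line %d\n" % i for i in range(1000)]
compressed = b"".join(compress_chunks(original))
print(zlib.decompress(compressed) == b"".join(original))
```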

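zlib.crc32(data[, value])¶: Computes a CRC (Cyclic Redundancy Check) checksum of data. The result is an unsigned 32-bit integer. If value is present, it is used as the starting value of the checksum; otherwise, a default value of 0 is used. Passing in value allows computing a running checksum over the concatenation of several inputs. The algorithm is not cryptographically strong, and should not be used for authentication or digital signatures. Since the algorithm is designed for use as a checksum algorithm, it is not suitable for use as a general hash algorithm.

Changed in version 3.0: The result is always unsigned.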
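The value argument makes it possible to checksum a file-like object without loading it whole. This sketch assumes a hypothetical helper, crc32_of_stream, and uses an in-memory io.BytesIO as a stand-in for a real file.

```python
import io
import zlib

def crc32_of_stream(stream, block_size=64 * 1024):
    """Compute a CRC-32 over a binary stream in fixed-size blocks (hypothetical helper)."""
    checksum = 0  # documented default starting value for crc32()
    while True:
        block = stream.read(block_size)
        if not block:
            break
        checksum = zlib.crc32(block, checksum)
    return checksum

payload = b"The quick brown fox jumps over the lazy dog"
# A deliberately small block size forces several crc32() calls.
result = crc32_of_stream(io.BytesIO(payload), block_size=8)
print(result == zlib.crc32(payload))
```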

zlib.decompress(data, /, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)¶

Decompresses the bytes in data, returning a bytes object containing the uncompressed data. The wbits parameter depends on the format of data, and is discussed further below. If bufsize is given, it is used as the initial size of the output buffer. Raises the error exception if any error occurs.

The wbits parameter controls the size of the history buffer (or "window size"), and what header and trailer format is expected. It is similar to the parameter for compressobj(), but accepts more ranges of values:

+8 to +15: The base-two logarithm of the window size. The input must include a zlib header and trailer.
0: Automatically determine the window size from the zlib header. Only supported since zlib 1.2.3.5.
−8 to −15: Uses the absolute value of wbits as the window size logarithm. The input must be a raw stream with no header or trailer.
+24 to +31 = 16 + (8 to 15): Uses the low 4 bits of the value as the window size logarithm. The input must include a gzip header and trailer.
+40 to +47 = 32 + (8 to 15): Uses the low 4 bits of the value as the window size logarithm, and automatically accepts either the zlib or gzip format.

When decompressing a stream, the window size must not be smaller than the size originally used to compress the stream; using a too-small value may result in an error exception. The default wbits value corresponds to the largest window size and requires a zlib header and trailer to be included.

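bufsize is the initial size of the buffer used to hold decompressed data. If more space is required, the buffer size will be increased as needed, so you don't have to get this value exactly right; tuning it will only save a few calls to malloc().

Changed in version 3.6: wbits and bufsize can be used as keyword arguments.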
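When the container format of the input is not known in advance, the 32 + (8 to 15) range is the relevant one. The sketch below builds one zlib stream and one gzip stream from illustrative data and decompresses both with the same wbits value; AUTO_DETECT is an assumed name, not a module constant.

```python
import zlib

data = b"format detection example " * 32  # illustrative input

zlib_stream = zlib.compress(data)

# Build a gzip-framed stream with a compression object (wbits = 16 + 15).
gzip_compressor = zlib.compressobj(wbits=16 + zlib.MAX_WBITS)
gzip_stream = gzip_compressor.compress(data) + gzip_compressor.flush()

# 32 + 15: accept either the zlib or the gzip format.
AUTO_DETECT = 32 + zlib.MAX_WBITS
from_zlib = zlib.decompress(zlib_stream, wbits=AUTO_DETECT)
from_gzip = zlib.decompress(gzip_stream, wbits=AUTO_DETECT)
print(from_zlib == data and from_gzip == data)

# The default wbits expects a zlib header, so gzip input is rejected.
try:
    zlib.decompress(gzip_stream)
    rejected = False
except zlib.error:
    rejected = True
print(rejected)
```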

zlib.decompressobj(wbits=MAX_WBITS[, zdict])¶

Returns a decompression object, to be used for decompressing data streams that won't fit into memory at once.

The wbits parameter controls the size of the history buffer (or the "window size"), and what header and trailer format is expected. It has the same meaning as described for decompress().

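The zdict parameter specifies a predefined compression dictionary. If provided, this must be the same dictionary as was used by the compressor that produced the data that is to be decompressed.

Note

If zdict is a mutable object (such as a bytearray), you must not modify its contents between the call to decompressobj() and the first call to the decompressor's decompress() method.

Changed in version 3.3: Added the zdict parameter.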
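The dictionary has to be supplied on both sides. The following sketch uses an illustrative JSON-like dictionary and message, both of which are assumptions made for the example.

```python
import zlib

# Illustrative dictionary: byte sequences expected to recur in the messages.
zdict = b'{"status": "ok", "user": "'
message = b'{"status": "ok", "user": "alice"}'

compressor = zlib.compressobj(zdict=zdict)
compressed = compressor.compress(message) + compressor.flush()

# The decompressor must be given the same dictionary.
decompressor = zlib.decompressobj(zdict=zdict)
restored = decompressor.decompress(compressed) + decompressor.flush()
print(restored == message)
```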

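Compression objects support the following methods:

Compress.compress(data)¶: Compress data, returning a bytes object containing compressed data for at least part of the data in data. This data should be concatenated to the output produced by any preceding calls to the compress() method. Some input may be kept in internal buffers for later processing.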

Compress.flush([mode])¶: All pending input is processed, and a bytes object containing the remaining compressed output is returned. mode can be selected from the constants Z_NO_FLUSH, Z_PARTIAL_FLUSH, Z_SYNC_FLUSH, Z_FULL_FLUSH, Z_BLOCK, or Z_FINISH, defaulting to Z_FINISH. Except Z_FINISH, all constants allow compressing further bytestrings of data, while Z_FINISH finishes the compressed stream and prevents compressing any more data. After calling flush() with mode set to Z_FINISH, the compress() method cannot be called again; the only realistic action is to delete the object.
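A non-final flush mode is useful when each message must be decodable as soon as it is sent. This sketch uses Z_SYNC_FLUSH after every message and Z_FINISH (the default) only at the end; the message contents are illustrative.

```python
import zlib

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

messages = [b"first message\n", b"second message\n"]
received = []
for message in messages:
    # Z_SYNC_FLUSH emits everything pending but keeps the stream open.
    packet = compressor.compress(message) + compressor.flush(zlib.Z_SYNC_FLUSH)
    received.append(decompressor.decompress(packet))

# The default mode, Z_FINISH, ends the stream; compress() cannot be called afterwards.
tail = compressor.flush()
decompressor.decompress(tail)

print(received == messages)
print(decompressor.eof)
```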

Compress.copy()¶: Returns a copy of the compression object. This can be used to efficiently compress a set of data that share a common initial prefix.

Changed in version 3.8: Added copy.copy() and copy.deepcopy() support to compression objects.
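The shared-prefix case for copy() might look like the sketch below: the prefix is compressed once, and each body is compressed on a copy of that state. The prefix and body contents are illustrative.

```python
import zlib

prefix = b"HEADER: shared preamble\n" * 50  # illustrative common prefix
bodies = [b"first body\n", b"second body\n"]

# Compress the shared prefix once.
base = zlib.compressobj()
head = base.compress(prefix)  # may be b"" if the input is still buffered

outputs = []
for body in bodies:
    # Each copy carries the compressor state reached after the prefix.
    clone = base.copy()
    outputs.append(head + clone.compress(body) + clone.flush())

print(all(zlib.decompress(out) == prefix + body
          for out, body in zip(outputs, bodies)))
```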

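Decompression objects support the following methods and attributes:

Decompress.unused_data¶: A bytes object which contains any bytes past the end of the compressed data. That is, this remains b"" until the last byte that contains compression data is available. If the whole bytestring turned out to contain compressed data, this is b"", an empty bytes object.

Decompress.unconsumed_tail¶: A bytes object that contains any data that was not consumed by the last decompress() call because it exceeded the limit for the uncompressed data buffer. This data has not yet been seen by the zlib machinery, so you must feed it (possibly with further data concatenated to it) back to a subsequent decompress() method call in order to get correct output.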

Decompress.eof¶

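A boolean indicating whether the end of the compressed data stream has been reached.

This makes it possible to distinguish between a properly formed compressed stream, and an incomplete or truncated one.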

Added in version 3.3.
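The eof and unused_data attributes can be observed together. In this sketch, is_complete is a hypothetical helper, and the truncated stream and the trailing bytes are made up for illustration.

```python
import zlib

complete = zlib.compress(b"payload " * 100)
truncated = complete[:-5]  # drop the end of the stream

def is_complete(stream):
    """Return True if the stream reaches its end marker (hypothetical helper)."""
    decompressor = zlib.decompressobj()
    decompressor.decompress(stream)
    return decompressor.eof

print(is_complete(complete))   # the full stream reaches its end
print(is_complete(truncated))  # the truncated stream does not

# Bytes after the end of the compressed data end up in unused_data.
decompressor = zlib.decompressobj()
decompressor.decompress(complete + b"TRAILING")
print(decompressor.unused_data)
```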

Decompress.decompress(data, max_length=0)¶

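Decompress data, returning a bytes object containing the uncompressed data corresponding to at least part of the data in data. This data should be concatenated to the output produced by any preceding calls to the decompress() method. Some of the input data may be preserved in internal buffers for later processing.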

If the optional parameter max_length is non-zero then the return value will be no longer than max_length. This may mean that not all of the compressed input can be processed; and unconsumed data will be stored in the attribute unconsumed_tail. This bytestring must be passed to a subsequent call to decompress() if decompression is to continue. If max_length is zero then the whole input is decompressed, and unconsumed_tail is empty.

Changed in version 3.6: max_length can be used as a keyword argument.
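One way to apply max_length is a loop that keeps passing unconsumed_tail back to decompress(). The sketch below is one possible arrangement, not the only one; iter_decompressed is a hypothetical helper, and the final flush() call is not bounded by max_length.

```python
import zlib

def iter_decompressed(compressed, max_length=1024):
    """Yield decompressed pieces of at most max_length bytes (hypothetical helper)."""
    decompressor = zlib.decompressobj()
    pending = compressed
    while pending:
        piece = decompressor.decompress(pending, max_length)
        if piece:
            yield piece
        # Input that was not processed must be passed to the next call.
        pending = decompressor.unconsumed_tail
    # Collect any output still held internally (not limited by max_length).
    rest = decompressor.flush()
    if rest:
        yield rest

original = b"0123456789" * 10000  # illustrative, highly compressible input
pieces = list(iter_decompressed(zlib.compress(original)))
print(b"".join(pieces) == original)
```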

Decompress.flush([length])¶

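All pending input is processed, and a bytes object containing the remaining uncompressed output is returned. After calling flush(), the decompress() method cannot be called again; the only realistic action is to delete the object.

The optional parameter length sets the initial size of the output buffer.

Decompress.copy()¶: Returns a copy of the decompression object. This can be used to save the state of the decompressor midway through the data stream in order to speed up random seeks into the stream at a future point.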

Changed in version 3.8: Added copy.copy() and copy.deepcopy() support to decompression objects.
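As a rough sketch of saving decompressor state, the following copies a decompression object midway through a stream and then finishes decompression from both the original and the copy. The split point and the input data are arbitrary choices for the example.

```python
import zlib

original = b"".join(b"record %d\n" % i for i in range(5000))
compressed = zlib.compress(original)
half = len(compressed) // 2  # arbitrary split point

decompressor = zlib.decompressobj()
first = decompressor.decompress(compressed[:half])

# Save the state reached after the first half of the input.
checkpoint = decompressor.copy()

rest_a = decompressor.decompress(compressed[half:]) + decompressor.flush()
# Resuming from the checkpoint yields the same remaining output.
rest_b = checkpoint.decompress(compressed[half:]) + checkpoint.flush()

print(rest_a == rest_b and first + rest_a == original)
```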

The following constants are available to configure compression and decompression behavior:

zlib.DEFLATED¶: The deflate compression method.

zlib.MAX_WBITS¶: The maximum window size, expressed as a power of 2. For example, if MAX_WBITS is 15 it results in a window size of 32 KiB.

zlib.DEF_MEM_LEVEL¶: The default memory level for compression objects.

zlib.DEF_BUF_SIZE¶: The default buffer size for decompression operations.

zlib.Z_NO_COMPRESSION¶: Compression level 0; no compression.

Added in version 3.6.

zlib.Z_BEST_SPEED¶: Compression level 1; fastest and produces the least compression.

zlib.Z_BEST_COMPRESSION¶: Compression level 9; slowest and produces the most compression.

zlib.Z_DEFAULT_COMPRESSION¶: Default compression level (-1); a compromise between speed and compression. Currently equivalent to compression level 6.
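The effect of the level constants can be observed by compressing the same input at each level. The input below is an illustrative, highly repetitive byte string, so the sizes it produces are not representative of other data.

```python
import zlib

data = b"abcabcabc" * 2000  # illustrative, highly repetitive input

levels = {
    "Z_NO_COMPRESSION": zlib.Z_NO_COMPRESSION,
    "Z_BEST_SPEED": zlib.Z_BEST_SPEED,
    "Z_DEFAULT_COMPRESSION": zlib.Z_DEFAULT_COMPRESSION,
    "Z_BEST_COMPRESSION": zlib.Z_BEST_COMPRESSION,
}

# Compressed size for each level.
sizes = {name: len(zlib.compress(data, level)) for name, level in levels.items()}
for name, size in sizes.items():
    print(f"{name}: {size} bytes (input: {len(data)} bytes)")
```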

zlib.Z_DEFAULT_STRATEGY¶: Default compression strategy, for normal data.

zlib.Z_FILTERED¶: Compression strategy for data produced by a filter (or predictor).

zlib.Z_HUFFMAN_ONLY¶: Compression strategy that forces Huffman coding only.

zlib.Z_RLE¶

Compression strategy that limits match distances to one (run-length encoding).

This constant is only available if Python was compiled with zlib 1.2.0.1 or greater.

Added in version 3.6.

zlib.Z_FIXED¶

Compression strategy that prevents the use of dynamic Huffman codes.

This constant is only available if Python was compiled with zlib 1.2.2.2 or greater.

Added in version 3.6.

zlib.Z_NO_FLUSH¶: Flush mode 0. No special flushing behavior.

Added in version 3.6.

zlib.Z_PARTIAL_FLUSH¶: Flush mode 1. Flush as much output as possible.

zlib.Z_SYNC_FLUSH¶: Flush mode 2. All output is flushed and the output is aligned to a byte boundary.

zlib.Z_FULL_FLUSH¶: Flush mode 3. All output is flushed and the compression state is reset.

zlib.Z_FINISH¶: Flush mode 4. All pending input is processed, no more input is expected.

zlib.Z_BLOCK¶

Flush mode 5. A deflate block is completed and emitted.

This constant is only available if Python was compiled with zlib 1.2.2.2 or greater.

Added in version 3.6.

zlib.Z_TREES¶

Flush mode 6, for inflate operations. Instructs inflate to return when it gets to the next deflate block boundary.

This constant is only available if Python was compiled with zlib 1.2.3.4 or greater.

Added in version 3.6.

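Information about the version of the zlib library in use is available through the following constants:

zlib.ZLIB_VERSION¶: The version string of the zlib library that was used for building the module. This may be different from the zlib library actually used at runtime, which is available as ZLIB_RUNTIME_VERSION.

zlib.ZLIB_RUNTIME_VERSION¶: The version string of the zlib library actually loaded by the interpreter.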

Added in version 3.3.

zlib.ZLIBNG_VERSION¶

The version string of the zlib-ng library that was used for building the module if zlib-ng was used. When present, the ZLIB_VERSION and ZLIB_RUNTIME_VERSION constants reflect the version of the zlib API provided by zlib-ng.

If zlib-ng was not used to build the module, this constant will be absent.

Added in version 3.14.
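Because ZLIBNG_VERSION may be absent, code that reports version information can look it up defensively. A minimal sketch, with variable names chosen for illustration:

```python
import zlib

build_version = zlib.ZLIB_VERSION            # version used when building the module
runtime_version = zlib.ZLIB_RUNTIME_VERSION  # version loaded by the interpreter

# ZLIBNG_VERSION only exists when the module was built against zlib-ng.
zlibng_version = getattr(zlib, "ZLIBNG_VERSION", None)

print("built against:", build_version)
print("running with: ", runtime_version)
print("zlib-ng:      ", zlibng_version if zlibng_version else "not used")
```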

See also

Module gzip: Reading and writing gzip-format files.
https://www.zlib.net: The zlib library home page.
https://www.zlib.net/manual.html: The zlib manual explains the semantics and usage of the library's many functions.

In case gzip (de)compression is a bottleneck, the python-isal package speeds up (de)compression with a mostly compatible API.
