11.5. marshal — 内部 Python 对象序列化

此模块包含一此能以二进制格式来读写 Python 值的函数。 这种格式是 Python 专属的,但是独立于特定的机器架构(即你可以在一台 PC 上写入某个 Python 值,将文件传到一台 Sun 上并在那里读取它)。 这种格式的细节有意不带文档说明;它可能在不同 Python 版本中发生改变(但这种情况极少发生)。 1

这不是一个通用的“持久化”模块。 对于通用的持久化以及通过 RPC 调用传递 Python 对象,请参阅 pickleshelve 等模块。 marshal 模块主要是为了支持读写 .pyc 文件形式“伪编译”代码的 Python 模块。 因此,Python 维护者保留在必要时以不向下兼容的方式修改 marshal 格式的权利。 如果你要序列化和反序列化 Python 对象,请改用 pickle 模块 – 其执行效率相当,版本独立性有保证,并且 pickle 还支持比 marshal 更多样的对象类型。

警告

marshal 模块对于错误或恶意构建的数据来说是不安全的。 永远不要 unmarshal 来自不受信任的或未经验证的来源的数据。

Not all Python object types are supported; in general, only objects whose value is independent from a particular invocation of Python can be written and read by this module. The following types are supported: booleans, integers, long integers, floating point numbers, complex numbers, strings, Unicode objects, tuples, lists, sets, frozensets, dictionaries, and code objects, where it should be understood that tuples, lists, sets, frozensets and dictionaries are only supported as long as the values contained therein are themselves supported; and recursive lists, sets and dictionaries should not be written (they will cause infinite loops). The singletons None, Ellipsis and StopIteration can also be marshalled and unmarshalled.

警告

On machines where C’s long int type has more than 32 bits (such as the DEC Alpha), it is possible to create plain Python integers that are longer than 32 bits. If such an integer is marshaled and read back in on a machine where C’s long int type has only 32 bits, a Python long integer object is returned instead. While of a different type, the numeric value is the same. (This behavior is new in Python 2.2. In earlier versions, all but the least-significant 32 bits of the value were lost, and a warning message was printed.)

There are functions that read/write files as well as functions operating on strings.

这个模块定义了以下函数:

marshal.dump(value, file[, version])

Write the value on the open file. The value must be a supported type. The file must be an open file object such as sys.stdout or returned by open() or os.popen(). It may not be a wrapper such as TemporaryFile on Windows. It must be opened in binary mode ('wb' or 'w+b').

如果值具有(或所包含的对象具有)不受支持的类型,则会引发 ValueError — 但是将向文件写入垃圾数据。 对象也将不能正确地通过 load() 重新读取。

2.4 新版功能: version 参数指明 dump 应当使用的数据格式(见下文)。

marshal.load(file)

Read one value from the open file and return it. If no valid value is read (e.g. because the data has a different Python version’s incompatible marshal format), raise EOFError, ValueError or TypeError. The file must be an open file object opened in binary mode ('rb' or 'r+b').

注解

如果通过 dump() marshal 了一个包含不受支持类型的对象,load() 将为不可 marshal 的类型替换 None

marshal.dumps(value[, version])

Return the string that would be written to a file by dump(value, file). The value must be a supported type. Raise a ValueError exception if value has (or contains an object that has) an unsupported type.

2.4 新版功能: version 参数指明 dumps 应当使用的数据类型(见下文)。

marshal.loads(string)

Convert the string to a value. If no valid value is found, raise EOFError, ValueError or TypeError. Extra characters in the string are ignored.

此外,还定义了以下常量:

marshal.version

Indicates the format that the module uses. Version 0 is the historical format, version 1 (added in Python 2.4) shares interned strings and version 2 (added in Python 2.5) uses a binary format for floating point numbers. The current version is 2.

2.4 新版功能.

备注

1

此模块的名称来源于 Modula-3 (及其他语言) 的设计者所使用的术语,他们使用术语 “marshal” 来表示以自包含的形式传输数据。 严格地说,将数据从内部形式转换为外部形式 (例如用于 RPC 缓冲区) 称为 “marshal” 而其逆过程则称为 “unmarshal”。