11.4. shelve: Python object persistence

Source code: Lib/shelve.py


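A "shelf" is a persistent, dictionary-like object. The difference from "dbm" databases is that the values (not the keys!) in a shelf can be essentially arbitrary Python objects, that is, anything the pickle module can handle. This includes most class instances, recursive data types, and objects containing lots of shared sub-objects. The keys are ordinary strings.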

shelve.open(filename, flag='c', protocol=None, writeback=False)

Open a persistent dictionary. The filename specified is the base filename for the underlying database. As a side-effect, an extension may be added to the filename and more than one file may be created. By default, the underlying database file is opened for reading and writing. The optional flag parameter has the same interpretation as the flag parameter of anydbm.open().
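The effect of the flag parameter can be sketched as follows. This is a minimal illustration, not part of the original text: the temporary directory and the name flags_demo are assumptions, and the files actually created depend on which database backend is available.

```python
import os
import shelve
import tempfile

# Hypothetical location: a fresh temporary directory, so no real data is touched.
path = os.path.join(tempfile.mkdtemp(), "flags_demo")

# 'c' (the default): open for reading and writing, creating the database if needed.
db = shelve.open(path, flag="c")
db["answer"] = 42
db.close()

# 'r': open an existing database for reading only.
db = shelve.open(path, flag="r")
value_read = db["answer"]
db.close()

# 'n': always create a new, empty database, discarding existing content.
db = shelve.open(path, flag="n")
keys_after_new = list(db.keys())
db.close()
```

After the last step the previously stored entry is gone, so 'n' should be used with care on paths that may already hold data.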

By default, version 0 pickles are used to serialize values. The version of the pickle protocol can be specified with the protocol parameter.

Changed in version 2.3: The protocol parameter was added.

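Because of Python semantics, a shelf cannot know when a mutable persistent-dictionary entry is modified. By default, modified objects are written only when they are assigned to the shelf again (see Example). If the optional writeback parameter is set to True, all entries accessed are also cached in memory and written back on sync() and close(). This makes it handier to mutate mutable entries in the persistent dictionary, but if many entries are accessed it can consume vast amounts of memory for the cache, and it can make the close operation very slow, since all accessed entries are written back (there is no way to determine which accessed entries are mutable, nor which ones were actually mutated).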

Like file objects, shelve objects should be closed explicitly to ensure that the persistent data is flushed to disk.
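One way to guarantee the explicit close is to wrap the shelf in contextlib.closing(), which calls close() when the block exits, even on an exception. The sketch below assumes a throwaway path in a temporary directory; the name closing_demo is made up for illustration.

```python
import contextlib
import os
import shelve
import tempfile

# Hypothetical location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "closing_demo")

# closing() calls shelf.close() on exit, even if the body raises.
with contextlib.closing(shelve.open(path)) as shelf:
    shelf["languages"] = ["python", "c"]

# Reopen to confirm that the data reached the disk.
with contextlib.closing(shelve.open(path)) as shelf:
    stored = shelf["languages"]
```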

Warning

Because the shelve module is backed by pickle, it is insecure to load a shelf from an untrusted source. As with pickle, loading a shelf can execute arbitrary code.

Shelf objects support most of the methods supported by dictionaries. This eases the transition from dictionary based scripts to those requiring persistent storage.

Note, the Python 3 transition methods (viewkeys(), viewvalues(), and viewitems()) are not supported.

Two additional methods are supported:

Shelf.sync()

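Write back all entries in the cache if the shelf was opened with writeback set to True. Also empty the cache and synchronize the persistent dictionary on disk, if feasible. This is called automatically when the shelf is closed with close().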

Shelf.close()

Synchronize and close the persistent dict object. Operations on a closed shelf will fail with a ValueError.
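The behaviour of sync() and close() described above can be illustrated with a short sketch. The path and the key name "items" are assumptions chosen for the example.

```python
import os
import shelve
import tempfile

# Hypothetical location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sync_demo")

shelf = shelve.open(path, writeback=True)
shelf["items"] = []
shelf["items"].append("a")   # mutates the cached object only
shelf.sync()                 # writes the cached entry back to the database
shelf.close()

# Any further access to the closed shelf is expected to raise ValueError.
try:
    shelf["items"]
except ValueError as exc:
    closed_error = exc
else:
    closed_error = None

# Reopen to check that the mutation was persisted.
reopened = shelve.open(path)
persisted = reopened["items"]
reopened.close()
```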

See also

Persistent dictionary recipe with widely supported storage formats and having the speed of native dictionaries.

11.4.1. Restrictions

  • The choice of which database package will be used (such as dbm, gdbm or bsddb) depends on which interface is available. Therefore it is not safe to open the database directly using dbm. The database is also (unfortunately) subject to the limitations of dbm, if it is used — this means that (the pickled representation of) the objects stored in the database should be fairly small, and in rare cases key collisions may cause the database to refuse updates.

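  • The shelve module does not support concurrent read/write access to shelved objects. (Multiple simultaneous read accesses are safe.) When a program has a shelf open for writing, no other program should have it open for reading or writing. Unix file locking can be used to solve this, but it differs across Unix versions and requires knowledge about the database implementation used.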
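One possible workaround for the concurrency restriction is to lock a separate side file rather than the database files themselves, which avoids depending on the database implementation. The sketch below is Unix-only and rests on several assumptions: the helper name locked_shelf, the ".lock" suffix, and the use of fcntl.flock() are all illustrative choices, and the advisory lock protects nothing unless every cooperating program goes through the same helper.

```python
import contextlib
import fcntl
import os
import shelve
import tempfile


@contextlib.contextmanager
def locked_shelf(path, flag="c"):
    """Open a shelf while holding an exclusive advisory lock on a side file.

    Hypothetical helper: the ".lock" suffix and the flock() strategy are
    assumptions made for this sketch.
    """
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)   # blocks until the lock is free
        try:
            shelf = shelve.open(path, flag)
            try:
                yield shelf
            finally:
                shelf.close()                   # flush before releasing the lock
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Hypothetical location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "locked_demo")

with locked_shelf(path) as shelf:
    shelf["counter"] = shelf.get("counter", 0) + 1

with locked_shelf(path) as shelf:
    shelf["counter"] = shelf.get("counter", 0) + 1
    final_count = shelf["counter"]
```

The shelf is closed before the lock is released, so another process that acquires the lock afterwards sees fully flushed data.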

class shelve.Shelf(dict, protocol=None, writeback=False)

A subclass of UserDict.DictMixin which stores pickled values in the dict object.

By default, version 0 pickles are used to serialize values. The version of the pickle protocol can be specified with the protocol parameter. See the pickle documentation for a discussion of the pickle protocols.

Changed in version 2.3: The protocol parameter was added.

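If the writeback parameter is True, the object will hold a cache of all entries accessed and write them back to the dict at sync and close times. This allows natural operations on mutable entries, but can consume much more memory and make sync and close take a long time.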
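To see what Shelf stores in the underlying dict, it can be wrapped around a plain in-memory dictionary. This is only a sketch: the key "point" and the choice of protocol 2 are assumptions made for illustration, and nothing is written to disk here.

```python
import pickle
import shelve

backing = {}                                # any dict-like object can serve as the store
shelf = shelve.Shelf(backing, protocol=2)   # protocol 2 chosen only for illustration
shelf["point"] = {"x": 1, "y": 2}

# The backing dict holds the pickled representation, not the object itself.
raw = list(backing.values())[0]
restored = pickle.loads(raw)                # the same value, recovered by hand

shelf.close()
```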

class shelve.BsdDbShelf(dict, protocol=None, writeback=False)

A subclass of Shelf which exposes first(), next(), previous(), last() and set_location() which are available in the bsddb module but not in other database modules. The dict object passed to the constructor must support those methods. This is generally accomplished by calling one of bsddb.hashopen(), bsddb.btopen() or bsddb.rnopen(). The optional protocol and writeback parameters have the same interpretation as for the Shelf class.

class shelve.DbfilenameShelf(filename, flag='c', protocol=None, writeback=False)

A subclass of Shelf which accepts a filename instead of a dict-like object. The underlying file will be opened using anydbm.open(). By default, the file will be created and opened for both read and write. The optional flag parameter has the same interpretation as for the open() function. The optional protocol and writeback parameters have the same interpretation as for the Shelf class.

11.4.2. Example

To summarize the interface (key is a string, data is an arbitrary object):

import shelve

d = shelve.open(filename) # open -- file may get suffix added by low-level
                          # library

d[key] = data   # store data at key (overwrites old data if
                # using an existing key)
data = d[key]   # retrieve a COPY of data at key (raise KeyError if no
                # such key)
del d[key]      # delete data stored at key (raises KeyError
                # if no such key)
flag = d.has_key(key)   # true if the key exists
klist = d.keys() # a list of all existing keys (slow!)

# as d was opened WITHOUT writeback=True, beware:
d['xx'] = range(4)  # this works as expected, but...
d['xx'].append(5)   # *this doesn't!* -- d['xx'] is STILL range(4)!

# having opened d without writeback=True, you need to code carefully:
temp = d['xx']      # extracts the copy
temp.append(5)      # mutates the copy
d['xx'] = temp      # stores the copy right back, to persist it

# or, d=shelve.open(filename,writeback=True) would let you just code
# d['xx'].append(5) and have it work as expected, BUT it would also
# consume more memory and make the d.close() operation slower.

d.close()       # close it
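The writeback pitfall shown in the summary can also be checked with a self-contained script. The sketch below assumes a throwaway path in a temporary directory and uses an explicit list instead of range(4) so that it behaves the same way regardless of interpreter version.

```python
import os
import shelve
import tempfile

# Hypothetical location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "writeback_demo")

plain = shelve.open(path)            # writeback defaults to False
plain["xx"] = [0, 1, 2, 3]
plain["xx"].append(5)                # mutates a temporary copy, which is discarded
without_writeback = plain["xx"]
plain.close()

cached = shelve.open(path, writeback=True)
cached["xx"].append(5)               # mutates the cached object
cached.close()                       # close() writes cached entries back

check = shelve.open(path)
with_writeback = check["xx"]
check.close()
```

Only the second append survives, because it was made on a cached entry that close() wrote back to the database.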

See also

Module anydbm

Generic interface to dbm-style databases.

Module bsddb

BSD db database interface.

Module dbhash

Thin layer around the bsddb which provides an open() function like the other database modules.

Module dbm

Standard Unix database interface.

Module dumbdbm

Portable implementation of the dbm interface.

Module gdbm

GNU database interface, based on the dbm interface.

Module pickle

Object serialization used by shelve.

Module cPickle

High-performance version of pickle.