15.2. io — Core tools for working with streams
New in version 2.6.
The io module provides the Python interfaces to stream handling. Under Python 2.x, this is proposed as an alternative to the built-in file object, but in Python 3.x it is the default interface to access files and streams.
Note
Since this module has been designed primarily for Python 3.x, you have to be aware that all uses of "bytes" in this document refer to the str type (of which bytes is an alias), and all uses of "text" refer to the unicode type. Furthermore, those two types are not interchangeable in the io APIs.
At the top of the I/O hierarchy is the abstract base class IOBase. It defines the basic interface to a stream. Note, however, that there is no separation between reading and writing to streams; implementations are allowed to raise an IOError if they do not support a given operation.

Extending IOBase is RawIOBase, which deals simply with the reading and writing of raw bytes to a stream. FileIO subclasses RawIOBase to provide an interface to files in the machine's file system.

BufferedIOBase deals with buffering on a raw byte stream (RawIOBase). Its subclasses BufferedReader, BufferedWriter, and BufferedRWPair buffer streams that are readable, writable, and both readable and writable, respectively. BufferedRandom provides a buffered interface to random access streams. BytesIO is a simple stream of in-memory bytes.

Another IOBase subclass, TextIOBase, deals with streams whose bytes represent text, and handles encoding and decoding from and to unicode strings. TextIOWrapper, which extends it, is a buffered text interface to a buffered raw stream (BufferedIOBase). Finally, StringIO is an in-memory stream for unicode text.
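The class relationships described in the overview can be checked directly with issubclass(). This is only a small sketch; the list name relationships and the variable hierarchy_ok are illustrative and not part of the module.

```python
import io

# Each pair is (subclass, base class), following the overview above.
relationships = [
    (io.RawIOBase, io.IOBase),
    (io.FileIO, io.RawIOBase),
    (io.BufferedIOBase, io.IOBase),
    (io.BufferedReader, io.BufferedIOBase),
    (io.BufferedWriter, io.BufferedIOBase),
    (io.BufferedRWPair, io.BufferedIOBase),
    (io.BufferedRandom, io.BufferedIOBase),
    (io.BytesIO, io.BufferedIOBase),
    (io.TextIOBase, io.IOBase),
    (io.TextIOWrapper, io.TextIOBase),
    (io.StringIO, io.TextIOBase),
]

# True only if every documented relationship holds.
hierarchy_ok = all(issubclass(sub, base) for sub, base in relationships)
print(hierarchy_ok)
```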
Argument names are not part of the specification, and only the arguments of open() are intended to be used as keyword arguments.
15.2.1. Module Interface

io.DEFAULT_BUFFER_SIZE
An int containing the default buffer size used by the module's buffered I/O classes. open() uses the file's blksize (as obtained by os.stat()) if possible.

io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)
Open file and return a corresponding stream. If the file cannot be opened, an IOError is raised.

file is either a string giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped. (If a file descriptor is given, it is closed when the returned I/O object is closed, unless closefd is set to False.)

mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r', which means open for reading in text mode. Other common values are 'w' for writing (truncating the file if it already exists), and 'a' for appending (which on some Unix systems means that all writes append to the end of the file regardless of the current seek position). In text mode, if encoding is not specified the encoding used is platform dependent. (For reading and writing raw bytes use binary mode and leave encoding unspecified.) The available modes are:

Character   Meaning
'r'         open for reading (default)
'w'         open for writing, truncating the file first
'a'         open for writing, appending to the end of the file if it exists
'b'         binary mode
't'         text mode (default)
'+'         open a disk file for updating (reading and writing)
'U'         universal newlines mode (for backwards compatibility; should not be used in new code)

The default mode is 'rt' (open for reading text). For binary random access, the mode 'w+b' opens and truncates the file to 0 bytes, while 'r+b' opens the file without truncation.

Python distinguishes between files opened in binary and text modes, even when the underlying operating system doesn't. Files opened in binary mode (including 'b' in the mode argument) return contents as bytes objects without any decoding. In text mode (the default, or when 't' is included in the mode argument), the contents of the file are returned as unicode strings, the bytes having been first decoded using a platform-dependent encoding or using the specified encoding if given.

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:

- Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device's "block size" and falling back on DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.
- "Interactive" text files (files for which isatty() returns True) use line buffering. Other text files use the policy described above for binary files.

encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any encoding supported by Python can be used. See the codecs module for the list of supported encodings.

errors is an optional string that specifies how encoding and decoding errors are to be handled; this cannot be used in binary mode. Pass 'strict' to raise a ValueError exception if there is an encoding error (the default of None has the same effect), or pass 'ignore' to ignore errors. (Note that ignoring encoding errors can lead to data loss.) 'replace' causes a replacement marker (such as '?') to be inserted where there is malformed data. When writing, 'xmlcharrefreplace' (replace with the appropriate XML character reference) or 'backslashreplace' (replace with backslashed escape sequences) can be used. Any other error handling name that has been registered with codecs.register_error() is also valid.

newline controls how universal newlines works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

- On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
- On output, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.

If closefd is False and a file descriptor rather than a filename was given, the underlying file descriptor will be kept open when the file is closed. If a filename is given, closefd has no effect and must be True (the default).

The type of file object returned by the open() function depends on the mode. When open() is used to open a file in a text mode ('w', 'r', 'wt', 'rt', etc.), it returns a subclass of TextIOBase (specifically TextIOWrapper). When used to open a file in a binary mode with buffering, the returned class is a subclass of BufferedIOBase. The exact class varies: in read binary mode, it returns a BufferedReader; in write binary and append binary modes, it returns a BufferedWriter; and in read/write mode, it returns a BufferedRandom. When buffering is disabled, the raw stream, a subclass of RawIOBase, FileIO, is returned.

It is also possible to use a unicode or bytes string as a file for both reading and writing. For unicode strings StringIO can be used like a file opened in text mode, and for bytes a BytesIO can be used like a file opened in a binary mode.
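The mapping from mode to returned class can be observed with a scratch file. The sketch below assumes a writable temporary directory; the variable names (text_type, reader_type and so on) are illustrative only.

```python
import io
import os
import tempfile

# Create a scratch file that can be reopened in several modes.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    with io.open(path, 'w', encoding='utf-8') as text_file:
        text_type = type(text_file)          # text mode
        text_file.write(u'Spam and eggs!\n')

    with io.open(path, 'rb') as reader:
        reader_type = type(reader)           # buffered binary read

    with io.open(path, 'ab') as writer:
        writer_type = type(writer)           # buffered binary append

    with io.open(path, 'r+b') as random_access:
        random_type = type(random_access)    # buffered binary read/write

    with io.open(path, 'rb', buffering=0) as raw:
        raw_type = type(raw)                 # buffering disabled
finally:
    os.remove(path)

for name, cls in [('w', text_type), ('rb', reader_type), ('ab', writer_type),
                  ('r+b', random_type), ('rb, buffering=0', raw_type)]:
    print(name, cls.__name__)
```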

exception io.BlockingIOError
Error raised when blocking would occur on a non-blocking stream. It inherits IOError.

In addition to those of IOError, BlockingIOError has one attribute:

characters_written
An integer containing the number of characters written to the stream before it blocked.

exception io.UnsupportedOperation
An exception inheriting IOError and ValueError that is raised when an unsupported operation is called on a stream.
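Because UnsupportedOperation inherits both IOError and ValueError, it can be caught under either name. The following sketch uses BytesIO.fileno() as an operation that an in-memory stream cannot support; the variable names are illustrative.

```python
import io

# The exception can be caught as either of its documented base classes.
inherits_both = (issubclass(io.UnsupportedOperation, IOError)
                 and issubclass(io.UnsupportedOperation, ValueError))

stream = io.BytesIO(b'data')
try:
    stream.fileno()              # an in-memory stream has no file descriptor
    raised = False
except io.UnsupportedOperation:
    raised = True

print(inherits_both, raised)
```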
15.2.2. I/O Base Classes

class io.IOBase
The abstract base class for all I/O classes, acting on streams of bytes. There is no public constructor.

This class provides empty abstract implementations for many methods that derived classes can override selectively; the default implementations represent a file that cannot be read, written or seeked.

Even though IOBase does not declare read(), readinto(), or write() because their signatures will vary, implementations and clients should consider those methods part of the interface. Also, implementations may raise an IOError when operations they do not support are called.

The basic type used for binary data read from or written to a file is bytes (also known as str). Method arguments may also be bytearray or memoryview of arrays of bytes. In some cases, such as readinto(), a writable object such as bytearray is required. Text I/O classes work with unicode data.

Changed in version 2.7: Implementations should support memoryview arguments.

Note that calling any method (even inquiries) on a closed stream is undefined. Implementations may raise IOError in this case.

IOBase (and its subclasses) support the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream. Lines are defined slightly differently depending on whether the stream is a binary stream (yielding bytes), or a text stream (yielding unicode strings). See readline() below.

IOBase is also a context manager and therefore supports the with statement. In this example, file is closed after the with statement's suite is finished, even if an exception occurs:

    with io.open('spam.txt', 'w') as file:
        file.write(u'Spam and eggs!')
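The iterator and context-manager behaviour can also be tried without touching the file system. In this sketch BytesIO and StringIO stand in for a binary and a text file; the variable names are illustrative.

```python
import io

# BytesIO stands in for a binary file; iteration yields bytes lines.
with io.BytesIO(b'first\nsecond\nthird') as stream:
    binary_lines = [line for line in stream]

# Leaving the with block closes the stream.
stream_closed = stream.closed

# StringIO stands in for a text file; iteration yields unicode lines.
text_lines = list(io.StringIO(u'alpha\nbeta\n'))

print(binary_lines, stream_closed, text_lines)
```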
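
IOBase provides these data attributes and methods:

close()
Flush and close this stream. This method has no effect if the file is already closed. Once the file is closed, any operation on the file (e.g. reading or writing) will raise a ValueError.

As a convenience, it is allowed to call this method more than once; only the first call, however, will have an effect.

closed
True if the stream is closed.

fileno()
Return the underlying file descriptor (an integer) of the stream if it exists. An IOError is raised if the IO object does not use a file descriptor.

flush()
Flush the write buffers of the stream if applicable. This does nothing for read-only and non-blocking streams.

isatty()
Return True if the stream is interactive (i.e., connected to a terminal/tty device).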

readline(limit=-1)
Read and return one line from the stream. If limit is specified, at most limit bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines(hint=-1)
Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

Note that it is already possible to iterate on file objects using for line in file: ... without calling file.readlines().
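
seek(offset, whence=SEEK_SET)
Change the stream position to the given byte offset. offset is interpreted relative to the position indicated by whence. The default value for whence is SEEK_SET. Values for whence are:

- SEEK_SET or 0: start of the stream (the default); offset should be zero or positive
- SEEK_CUR or 1: current stream position; offset may be negative
- SEEK_END or 2: end of the stream; offset is usually negative

Return the new absolute position.

New in version 2.7: The SEEK_* constants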
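The three whence values can be compared on a small in-memory stream. This is a sketch using BytesIO; the variable names are illustrative.

```python
import io

stream = io.BytesIO(b'0123456789')

pos_start = stream.seek(3)                  # whence defaults to io.SEEK_SET
chunk = stream.read(2)                      # reads bytes 3 and 4; position is now 5
pos_cur = stream.seek(-1, io.SEEK_CUR)      # one byte back from the current position
pos_end = stream.seek(-2, io.SEEK_END)      # two bytes before the end
tail = stream.read()

print(pos_start, chunk, pos_cur, pos_end, tail)
```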

seekable()
Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise IOError.

tell()
Return the current stream position.

truncate(size=None)
Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn't changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled; on Windows they're undetermined). The new file size is returned.
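A short sketch of truncate() on an in-memory stream, showing that the size changes while the position stays where it was. BytesIO is used only for convenience; the variable names are illustrative.

```python
import io

stream = io.BytesIO(b'0123456789')
stream.seek(2)

new_size = stream.truncate(4)    # shrink the stream to 4 bytes
position = stream.tell()         # truncate() does not move the position
contents = stream.getvalue()

print(new_size, position, contents)
```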
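
writable()
Return True if the stream supports writing. If False, write() and truncate() will raise IOError.

writelines(lines)
Write a list of lines to the stream. Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.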

class io.RawIOBase
Base class for raw binary I/O. It inherits IOBase. There is no public constructor.

Raw binary I/O typically provides low-level access to an underlying OS device or API, and does not try to encapsulate it in high-level primitives (this is left to Buffered I/O and Text I/O, described later in this page).

In addition to the attributes and methods from IOBase, RawIOBase provides the following methods:

read(n=-1)
Read up to n bytes from the object and return them. As a convenience, if n is unspecified or -1, readall() is called. Otherwise, only one system call is ever made. Fewer than n bytes may be returned if the operating system call returns fewer than n bytes.

If 0 bytes are returned, and n was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned.

readall()
Read and return all the bytes from the stream until EOF, using multiple calls to the stream if necessary.

readinto(b)
Read up to len(b) bytes into b, and return the number of bytes read. The object b should be a pre-allocated, writable array of bytes, either bytearray or memoryview. If the object is in non-blocking mode and no bytes are available, None is returned.

write(b)
Write b to the underlying raw stream, and return the number of bytes written. The object b should be an array of bytes, either bytes, bytearray, or memoryview. The return value can be less than len(b), depending on specifics of the underlying raw stream, and especially if it is in non-blocking mode. None is returned if the raw stream is set not to block and no single byte could be readily written to it. The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.
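The raw read methods can be exercised through FileIO, the concrete raw stream for files. The sketch below assumes a writable temporary directory and a regular (blocking) file; the variable names are illustrative.

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

try:
    with io.FileIO(path, 'w') as raw:
        # In general the return value may be smaller than len(b).
        written = raw.write(b'raw bytes')

    buf = bytearray(4)                       # pre-allocated, writable buffer
    with io.FileIO(path, 'r') as raw:
        count = raw.readinto(buf)            # fills at most len(buf) bytes
        rest = raw.readall()                 # reads everything up to EOF
        at_eof = raw.read(10)                # empty result signals end of file
finally:
    os.remove(path)

print(written, count, bytes(buf), rest, at_eof)
```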
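
class io.BufferedIOBase
Base class for binary streams that support some kind of buffering. It inherits IOBase. There is no public constructor.

The main difference with RawIOBase is that methods read(), readinto() and write() will try (respectively) to read as much input as requested or to consume all given output, at the expense of making perhaps more than one system call.

In addition, those methods can raise BlockingIOError if the underlying raw stream is in non-blocking mode and cannot take or give enough data; unlike their RawIOBase counterparts, they will never return None.

Besides, the read() method does not have a default implementation that defers to readinto().

A typical BufferedIOBase implementation should not inherit from a RawIOBase implementation, but wrap one, like BufferedWriter and BufferedReader do.

BufferedIOBase provides or overrides these methods and attribute in addition to those from IOBase:

raw
The underlying raw stream (a RawIOBase instance) that BufferedIOBase deals with. This is not part of the BufferedIOBase API and may not exist on some implementations.

detach()
Separate the underlying raw stream from the buffer and return it.

After the raw stream has been detached, the buffer is in an unusable state.

Some buffers, like BytesIO, do not have the concept of a single raw stream to return from this method. They raise UnsupportedOperation.

New in version 2.7.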
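A sketch of detach() on a BufferedReader. A BytesIO object is used here as a stand-in for a raw stream, which is a convenience rather than something the text above prescribes, and the ValueError raised by the detached buffer is what CPython does rather than a documented guarantee.

```python
import io

raw = io.BytesIO(b'payload')     # stand-in for a raw stream (an assumption)
buffered = io.BufferedReader(raw)

detached = buffered.detach()     # returns the wrapped stream
same_object = detached is raw

# The buffer is unusable after detach(); CPython raises ValueError on use.
try:
    buffered.read()
    unusable = False
except ValueError:
    unusable = True

# BytesIO has no separate raw stream, so detach() is unsupported there.
try:
    io.BytesIO().detach()
    unsupported = False
except io.UnsupportedOperation:
    unsupported = True

print(same_object, unusable, unsupported)
```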

read(n=-1)
Read and return up to n bytes. If the argument is omitted, None, or negative, data is read and returned until EOF is reached. An empty bytes object is returned if the stream is already at EOF.

If the argument is positive, and the underlying raw stream is not interactive, multiple raw reads may be issued to satisfy the byte count (unless EOF is reached first). But for interactive raw streams, at most one raw read will be issued, and a short result does not imply that EOF is imminent.

A BlockingIOError is raised if the underlying raw stream is in non-blocking mode and has no data available at the moment.

read1(n=-1)
Read and return up to n bytes, with at most one call to the underlying raw stream's read() method. This can be useful if you are implementing your own buffering on top of a BufferedIOBase object.

readinto(b)
Read up to len(b) bytes into b, and return the number of bytes read. The object b should be a pre-allocated, writable array of bytes, either bytearray or memoryview.

Like read(), multiple reads may be issued to the underlying raw stream, unless the latter is "interactive".

A BlockingIOError is raised if the underlying raw stream is in non-blocking mode and has no data available at the moment.

write(b)
Write b, and return the number of bytes written (always equal to len(b), since if the write fails an IOError will be raised). The object b should be an array of bytes, either bytes, bytearray, or memoryview. Depending on the actual implementation, these bytes may be readily written to the underlying stream, or held in a buffer for performance and latency reasons.

When in non-blocking mode, a BlockingIOError is raised if the data needed to be written to the raw stream but it couldn't accept all the data without blocking.

The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.

15.2.3. Raw File I/O

class io.FileIO(name, mode='r', closefd=True)
FileIO represents an OS-level file containing bytes data. It implements the RawIOBase interface (and therefore the IOBase interface, too).

The name can be one of two things:

- a string representing the path to the file which will be opened;
- an integer representing the number of an existing OS-level file descriptor to which the resulting FileIO object will give access.

The mode can be 'r', 'w' or 'a' for reading (default), writing, or appending. The file will be created if it doesn't exist when opened for writing or appending; it will be truncated when opened for writing. Add a '+' to the mode to allow simultaneous reading and writing.

The read() (when called with a positive argument), readinto() and write() methods on this class will only make one system call.

In addition to the attributes and methods from IOBase and RawIOBase, FileIO provides the following data attributes and methods:

mode
The mode as given in the constructor.

name
The file name. This is the file descriptor of the file when no name is given in the constructor.

15.2.4. Buffered Streams

Buffered I/O streams provide a higher-level interface to an I/O device than raw I/O does.

class io.BytesIO([initial_bytes])
A stream implementation using an in-memory bytes buffer. It inherits BufferedIOBase.

The optional argument initial_bytes is a bytes object that contains initial data.

BytesIO provides or overrides these methods in addition to those from BufferedIOBase and IOBase:

getvalue()
Return bytes containing the entire contents of the buffer.
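A brief sketch of BytesIO with initial data. The stream starts at position 0, so the example seeks to the end before writing in order to append rather than overwrite; the variable names are illustrative.

```python
import io

buf = io.BytesIO(b'abc')     # initial_bytes seeds the buffer
buf.seek(0, io.SEEK_END)     # move to the end so the next write appends
buf.write(b'def')

contents = buf.getvalue()    # whole buffer, regardless of the current position
print(contents)
```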
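
class io.BufferedReader(raw, buffer_size=DEFAULT_BUFFER_SIZE)
A buffer providing higher-level access to a readable, sequential RawIOBase object. It inherits BufferedIOBase. When reading data from this object, a larger amount of data may be requested from the underlying raw stream and kept in an internal buffer. The buffered data can then be returned directly on subsequent reads.

The constructor creates a BufferedReader for the given readable raw stream and buffer_size. If buffer_size is omitted, DEFAULT_BUFFER_SIZE is used.

BufferedReader provides or overrides these methods in addition to those from BufferedIOBase and IOBase:

peek([n])
Return bytes from the stream without advancing the position. At most one single read on the raw stream is done to satisfy the call. The number of bytes returned may be less or more than requested.

read([n])
Read and return n bytes, or if n is not given or negative, until EOF or if the read call would block in non-blocking mode.

read1(n)
Read and return up to n bytes with only one call on the raw stream. If at least one byte is buffered, only buffered bytes are returned. Otherwise, one raw stream read call is made.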
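The difference between peek(), read() and read1() can be seen on a small stream. As an assumption of this sketch, a BytesIO object stands in for the raw stream; the variable names are illustrative.

```python
import io

reader = io.BufferedReader(io.BytesIO(b'hello world'))

peeked = reader.peek(5)              # may hold more or fewer than 5 bytes
position_after_peek = reader.tell()  # peek() does not advance the position
first = reader.read(5)               # the peeked bytes are still unread
chunk = reader.read1(100)            # served from the buffer, no further raw read

print(peeked[:5], position_after_peek, first, chunk)
```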

class io.BufferedWriter(raw, buffer_size=DEFAULT_BUFFER_SIZE)
A buffer providing higher-level access to a writeable, sequential RawIOBase object. It inherits BufferedIOBase. When writing to this object, data is normally held in an internal buffer. The buffer will be written out to the underlying RawIOBase object under various conditions, including:

- when the buffer gets too small for all pending data;
- when flush() is called;
- when a seek() is requested (for BufferedRandom objects);
- when the BufferedWriter object is closed or destroyed.

The constructor creates a BufferedWriter for the given writeable raw stream. If the buffer_size is not given, it defaults to DEFAULT_BUFFER_SIZE.

A third argument, max_buffer_size, is supported, but unused and deprecated.

BufferedWriter provides or overrides these methods in addition to those from BufferedIOBase and IOBase:

flush()
Force bytes held in the buffer into the raw stream. A BlockingIOError should be raised if the raw stream blocks.

write(b)
Write b, and return the number of bytes written. The object b should be an array of bytes, either bytes, bytearray, or memoryview. When in non-blocking mode, a BlockingIOError is raised if the buffer needs to be written out but the raw stream blocks.
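The effect of the internal buffer can be made visible by inspecting the wrapped stream before and after flush(). This sketch assumes CPython's behaviour of holding a write that fits in the buffer, and uses a BytesIO object as a stand-in for the raw stream; the buffer size of 64 is arbitrary.

```python
import io

raw = io.BytesIO()                        # stand-in for a writable raw stream
writer = io.BufferedWriter(raw, buffer_size=64)

count = writer.write(b'small')            # fits in the buffer, so it is held there
before_flush = raw.getvalue()             # nothing has reached the raw stream yet
writer.flush()                            # forces the buffered bytes out
after_flush = raw.getvalue()

print(count, before_flush, after_flush)
```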

class io.BufferedRandom(raw, buffer_size=DEFAULT_BUFFER_SIZE)
A buffered interface to random access streams. It inherits BufferedReader and BufferedWriter, and further supports seek() and tell() functionality.

The constructor creates a reader and writer for a seekable raw stream, given in the first argument. If the buffer_size is omitted it defaults to DEFAULT_BUFFER_SIZE.

A third argument, max_buffer_size, is supported, but unused and deprecated.

BufferedRandom is capable of anything BufferedReader or BufferedWriter can do.
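
class io.BufferedRWPair(reader, writer, buffer_size=DEFAULT_BUFFER_SIZE)
A buffered I/O object combining two unidirectional RawIOBase objects (one readable, the other writeable) into a single bidirectional endpoint. It inherits BufferedIOBase.

reader and writer are RawIOBase objects that are readable and writeable respectively. If the buffer_size is omitted it defaults to DEFAULT_BUFFER_SIZE.

A fourth argument, max_buffer_size, is supported, but unused and deprecated.

BufferedRWPair implements all of BufferedIOBase's methods except for detach(), which raises UnsupportedOperation.

Warning
BufferedRWPair does not attempt to synchronize accesses to its underlying raw streams. You should not pass it the same object as reader and writer; use BufferedRandom instead.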
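One way to try BufferedRWPair is to join the two ends of an OS pipe into a single object. This is only a sketch: using os.pipe() is an assumption of the example, and the data must be flushed before it can be read back.

```python
import io
import os

# Two unidirectional raw streams: the read end and the write end of a pipe.
read_fd, write_fd = os.pipe()
pair = io.BufferedRWPair(io.FileIO(read_fd, 'r'), io.FileIO(write_fd, 'w'))

try:
    pair.write(b'ping')
    pair.flush()                 # push the bytes into the pipe
    echoed = pair.read(4)        # read them back from the other end
finally:
    pair.close()                 # closes both underlying raw streams

print(echoed)
```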

15.2.5. Text I/O

class io.TextIOBase
Base class for text streams. This class provides a unicode character and line based interface to stream I/O. There is no readinto() method because Python's unicode strings are immutable. It inherits IOBase. There is no public constructor.

TextIOBase provides or overrides these data attributes and methods in addition to those from IOBase:

encoding
The name of the encoding used to decode the stream's bytes into strings, and to encode strings into bytes.

errors
The error setting of the decoder or encoder.

newlines
A string, a tuple of strings, or None, indicating the newlines translated so far. Depending on the implementation and the initial constructor flags, this may not be available.

buffer
The underlying binary buffer (a BufferedIOBase instance) that TextIOBase deals with. This is not part of the TextIOBase API and may not exist on some implementations.

detach()
Separate the underlying binary buffer from the TextIOBase and return it.

After the underlying buffer has been detached, the TextIOBase is in an unusable state.

Some TextIOBase implementations, like StringIO, may not have the concept of an underlying buffer and calling this method will raise UnsupportedOperation.

New in version 2.7.
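
read(n=-1)
Read and return at most n characters from the stream as a single unicode. If n is negative or None, reads until EOF.

readline(limit=-1)
Read until newline or EOF and return a single unicode. If the stream is already at EOF, an empty string is returned.

If limit is specified, at most limit characters will be read.

seek(offset, whence=SEEK_SET)
Change the stream position to the given offset. Behaviour depends on the whence parameter. The default value for whence is SEEK_SET.

- SEEK_SET or 0: seek from the start of the stream (the default); offset must either be a number returned by TextIOBase.tell(), or zero. Any other offset value produces undefined behaviour.
- SEEK_CUR or 1: "seek" to the current position; offset must be zero, which is a no-operation (all other values are unsupported).
- SEEK_END or 2: seek to the end of the stream; offset must be zero (all other values are unsupported).

Return the new absolute position as a number.

New in version 2.7: The SEEK_* constants.

tell()
Return the current stream position as an opaque number. The number does not usually represent a number of bytes in the underlying binary storage.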
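Since text positions are opaque, the safe pattern is to store a value returned by tell() and hand it back to seek() unchanged. A sketch using a TextIOWrapper over in-memory bytes; the sample text and the name cookie are illustrative.

```python
import io

data = u'h\xe9llo\nworld\n'.encode('utf-8')
text = io.TextIOWrapper(io.BytesIO(data), encoding='utf-8')

first = text.readline()
cookie = text.tell()         # opaque number; not used for arithmetic
second = text.readline()

text.seek(cookie)            # only values from tell(), or zero, are valid here
again = text.readline()      # re-reads the second line

print(repr(first), repr(second), repr(again))
```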

class io.TextIOWrapper(buffer, encoding=None, errors=None, newline=None, line_buffering=False)
A buffered text stream over a BufferedIOBase binary stream. It inherits TextIOBase.

encoding gives the name of the encoding that the stream will be decoded or encoded with. It defaults to locale.getpreferredencoding().

errors is an optional string that specifies how encoding and decoding errors are to be handled. Pass 'strict' to raise a ValueError exception if there is an encoding error (the default of None has the same effect), or pass 'ignore' to ignore errors. (Note that ignoring encoding errors can lead to data loss.) 'replace' causes a replacement marker (such as '?') to be inserted where there is malformed data. When writing, 'xmlcharrefreplace' (replace with the appropriate XML character reference) or 'backslashreplace' (replace with backslashed escape sequences) can be used. Any other error handling name that has been registered with codecs.register_error() is also valid.

newline controls how line endings are handled. It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

- On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
- On output, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
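The input-side newline rules can be compared by reading the same bytes with different newline values. A sketch over in-memory data; the sample bytes and variable names are illustrative.

```python
import io

data = b'one\r\ntwo\rthree\n'

def wrap(newline):
    # Fresh text wrapper over the same bytes for each newline setting.
    return io.TextIOWrapper(io.BytesIO(data), encoding='ascii', newline=newline)

# newline=None: universal newlines, every ending is translated to '\n'.
translated = wrap(None).read()

# newline='': universal newlines, but endings are returned untranslated.
untranslated = wrap('').readlines()

# newline='\n': only '\n' ends a line; '\r' is treated as ordinary text.
only_lf = wrap('\n').readlines()

print(repr(translated), untranslated, only_lf)
```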

If line_buffering is True, flush() is implied when a call to write contains a newline character or a carriage return.

TextIOWrapper provides one attribute in addition to those of TextIOBase and its parents:

line_buffering
Whether line buffering is enabled.
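The errors handlers described for TextIOWrapper can be compared on a byte string that is not valid in the chosen encoding. This is a sketch; the sample data and variable names are illustrative.

```python
import io

raw_bytes = b'caf\xe9'    # Latin-1 bytes, not valid UTF-8

def decode_with(errors):
    # Decode the same bytes through a text wrapper with the given handler.
    wrapper = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding='utf-8', errors=errors)
    return wrapper.read()

# 'strict' raises on malformed input (UnicodeDecodeError is a ValueError).
try:
    decode_with('strict')
    strict_failed = False
except ValueError:
    strict_failed = True

ignored = decode_with('ignore')      # malformed bytes are dropped
replaced = decode_with('replace')    # malformed bytes become a replacement marker

# On output, unencodable characters can become XML character references.
sink = io.BytesIO()
writer = io.TextIOWrapper(sink, encoding='ascii', errors='xmlcharrefreplace')
writer.write(u'caf\xe9')
writer.flush()
encoded = sink.getvalue()

print(strict_failed, repr(ignored), repr(replaced), encoded)
```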

class io.StringIO(initial_value=u'', newline=u'\n')
An in-memory stream for unicode text. It inherits TextIOWrapper.

The initial value of the buffer can be set by providing initial_value. If newline translation is enabled, newlines will be encoded as if by write(). The stream is positioned at the start of the buffer.

The newline argument works like that of TextIOWrapper. The default is to consider only \n characters as ends of lines and to do no newline translation. If newline is set to None, newlines are written as \n on all platforms, but universal newlines decoding is still performed when reading.

StringIO provides this method in addition to those from TextIOWrapper and its parents:

getvalue()
Return a unicode containing the entire contents of the buffer at any time before the StringIO object's close() method is called. Newlines are decoded as if by read(), although the stream position is not changed.

Example usage:

    import io

    output = io.StringIO()
    output.write(u'First line.\n')
    output.write(u'Second line.\n')

    # Retrieve file contents -- this will be
    # u'First line.\nSecond line.\n'
    contents = output.getvalue()

    # Close object and discard memory buffer --
    # .getvalue() will now raise an exception.
    output.close()

class io.IncrementalNewlineDecoder
A helper codec that decodes newlines for universal newlines mode. It inherits codecs.IncrementalDecoder.

15.2.6. Advanced topics

Here we will discuss several advanced topics pertaining to the concrete I/O implementations described above.

15.2.6.1. Performance

15.2.6.1.1. Binary I/O

By reading and writing only large chunks of data even when the user asks for a single byte, buffered I/O is designed to hide any inefficiency in calling and executing the operating system's unbuffered I/O routines. The gain will vary very much depending on the OS and the kind of I/O which is performed (for example, on some contemporary OSes such as Linux, unbuffered disk I/O can be as fast as buffered I/O). The bottom line, however, is that buffered I/O will offer you predictable performance regardless of the platform and the backing device. Therefore, it is almost always preferable to use buffered I/O rather than unbuffered I/O.

15.2.6.1.2. Text I/O

Text I/O over a binary storage (such as a file) is significantly slower than binary I/O over the same storage, because it implies conversions from unicode to binary data using a character codec. This can become noticeable if you handle huge amounts of text data (for example very large log files). Also, TextIOWrapper.tell() and TextIOWrapper.seek() are both quite slow due to the reconstruction algorithm used.

StringIO, however, is a native in-memory unicode container and will exhibit similar speed to BytesIO.

15.2.6.2. Multi-threading

FileIO objects are thread-safe to the extent that the operating system calls (such as read(2) under Unix) they are wrapping are thread-safe too.

Binary buffered objects (instances of BufferedReader, BufferedWriter, BufferedRandom and BufferedRWPair) protect their internal structures using a lock; it is therefore safe to call them from multiple threads at once.

TextIOWrapper objects are not thread-safe.

15.2.6.3. Reentrancy

Binary buffered objects (instances of BufferedReader, BufferedWriter, BufferedRandom and BufferedRWPair) are not reentrant. While reentrant calls will not happen in normal situations, they can arise if you are doing I/O in a signal handler. If a thread tries to re-enter a buffered object that it is already accessing, a RuntimeError is raised.

The above implicitly extends to text files, since the open() function will wrap a buffered object inside a TextIOWrapper. This includes standard streams and therefore affects the built-in function print() as well.