19.1.3. email.generator: Generating MIME documents

Source code: Lib/email/generator.py


One of the most common tasks is to generate the flat text of the email message represented by a message object structure. You will need to do this if you want to send your message via the smtplib module or the nntplib module, or print the message on the console. Taking a message object structure and producing a flat text document is the job of the Generator class.

Again, as with the email.parser module, you aren't limited to the functionality of the bundled generator; you could write one from scratch yourself. However the bundled generator knows how to generate most email in a standards-compliant way, should handle MIME and non-MIME email messages just fine, and is designed so that the transformation from flat text, to a message structure via the Parser class, and back to flat text, is idempotent (the input is identical to the output) [1]. On the other hand, using the Generator on a Message constructed by program may result in changes to the Message object as defaults are filled in.

bytes output can be generated using the BytesGenerator class. If the message object structure contains non-ASCII bytes, this generator's flatten() method will emit the original bytes. Parsing a binary message and then flattening it with BytesGenerator should be idempotent for standards compliant messages.

Here are the public methods of the Generator class, imported from the email.generator module:

class email.generator.Generator(outfp, mangle_from_=True, maxheaderlen=78, *, policy=None)

The constructor for the Generator class takes a file-like object called outfp for an argument. outfp must support the write() method and be usable as the output file for the print() function.

Optional mangle_from_ is a flag that, when True, puts a > character in front of any line in the body that starts exactly as From, i.e. From followed by a space at the beginning of the line. This is the only guaranteed portable way to avoid having such lines be mistaken for a Unix mailbox format envelope header separator (see WHY THE CONTENT-LENGTH FORMAT IS BAD for details). mangle_from_ defaults to True, but you might want to set this to False if you are not writing Unix mailbox format files.

Optional maxheaderlen specifies the longest length for a non-continued header. When a header line is longer than maxheaderlen (in characters, with tabs expanded to 8 spaces), the header will be split as defined in the Header class. Set to zero to disable header wrapping. The default is 78, as recommended (but not required) by RFC 2822.

The policy keyword specifies a policy object that controls a number of aspects of the generator's operation. If no policy is specified, then the policy attached to the message object passed to flatten is used.

Berubah pada versi 3.3: Added the policy keyword.

The other public Generator methods are:

flatten(msg, unixfrom=False, linesep=None)

Print the textual representation of the message object structure rooted at msg to the output file specified when the Generator instance was created. Subparts are visited depth-first and the resulting text will be properly MIME encoded.

Optional unixfrom is a flag that forces the printing of the envelope header delimiter before the first RFC 2822 header of the root message object. If the root object has no envelope header, a standard one is crafted. By default, this is set to False to inhibit the printing of the envelope delimiter.

Note that for subparts, no envelope header is ever printed.

Optional linesep specifies the line separator character used to terminate lines in the output. If specified it overrides the value specified by the msg's or Generator's policy.

Because strings cannot represent non-ASCII bytes, if the policy that applies when flatten is run has cte_type set to 8bit, Generator will operate as if it were set to 7bit. This means that messages parsed with a Bytes parser that have a Content-Transfer-Encoding of 8bit will be converted to a use a 7bit Content-Transfer-Encoding. Non-ASCII bytes in the headers will be RFC 2047 encoded with a charset of unknown-8bit.

Berubah pada versi 3.2: Added support for re-encoding 8bit message bodies, and the linesep argument.

clone(fp)

Return an independent clone of this Generator instance with the exact same options.

write(s)

Write the string s to the underlying file object, i.e. outfp passed to Generator's constructor. This provides just enough file-like API for Generator instances to be used in the print() function.

As a convenience, see the Message methods as_string() and str(aMessage), a.k.a. __str__(), which simplify the generation of a formatted string representation of a message object. For more detail, see email.message.

class email.generator.BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, policy=None)

The constructor for the BytesGenerator class takes a binary file-like object called outfp for an argument. outfp must support a write() method that accepts binary data.

Optional mangle_from_ is a flag that, when True, puts a > character in front of any line in the body that starts exactly as From, i.e. From followed by a space at the beginning of the line. This is the only guaranteed portable way to avoid having such lines be mistaken for a Unix mailbox format envelope header separator (see WHY THE CONTENT-LENGTH FORMAT IS BAD for details). mangle_from_ defaults to True, but you might want to set this to False if you are not writing Unix mailbox format files.

Optional maxheaderlen specifies the longest length for a non-continued header. When a header line is longer than maxheaderlen (in characters, with tabs expanded to 8 spaces), the header will be split as defined in the Header class. Set to zero to disable header wrapping. The default is 78, as recommended (but not required) by RFC 2822.

The policy keyword specifies a policy object that controls a number of aspects of the generator's operation. If no policy is specified, then the policy attached to the message object passed to flatten is used.

Berubah pada versi 3.3: Added the policy keyword.

The other public BytesGenerator methods are:

flatten(msg, unixfrom=False, linesep=None)

Print the textual representation of the message object structure rooted at msg to the output file specified when the BytesGenerator instance was created. Subparts are visited depth-first and the resulting text will be properly MIME encoded. If the policy option cte_type is 8bit (the default), then any bytes with the high bit set in the original parsed message that have not been modified will be copied faithfully to the output. If cte_type is 7bit, the bytes will be converted as needed using an ASCII-compatible Content-Transfer-Encoding. In particular, RFC-invalid non-ASCII bytes in headers will be encoded using the MIME unknown-8bit character set, thus rendering them RFC-compliant.

Messages parsed with a Bytes parser that have a Content-Transfer-Encoding of 8bit will be reconstructed as 8bit if they have not been modified.

Optional unixfrom is a flag that forces the printing of the envelope header delimiter before the first RFC 2822 header of the root message object. If the root object has no envelope header, a standard one is crafted. By default, this is set to False to inhibit the printing of the envelope delimiter.

Note that for subparts, no envelope header is ever printed.

Optional linesep specifies the line separator character used to terminate lines in the output. If specified it overrides the value specified by the Generatoror msg's policy.

clone(fp)

Return an independent clone of this BytesGenerator instance with the exact same options.

write(s)

Write the string s to the underlying file object. s is encoded using the ASCII codec and written to the write method of the outfp outfp passed to the BytesGenerator's constructor. This provides just enough file-like API for BytesGenerator instances to be used in the print() function.

Baru pada versi 3.2.

The email.generator module also provides a derived class, called DecodedGenerator which is like the Generator base class, except that non-text parts are substituted with a format string representing the part.

class email.generator.DecodedGenerator(outfp, mangle_from_=True, maxheaderlen=78, fmt=None)

This class, derived from Generator walks through all the subparts of a message. If the subpart is of main type text, then it prints the decoded payload of the subpart. Optional _mangle_from_ and maxheaderlen are as with the Generator base class.

If the subpart is not of main type text, optional fmt is a format string that is used instead of the message payload. fmt is expanded with the following keywords, %(keyword)s format:

  • type -- Full MIME type of the non-text part
  • maintype -- Main MIME type of the non-text part
  • subtype -- Sub-MIME type of the non-text part
  • filename -- Filename of the non-text part
  • description -- Description associated with the non-text part
  • encoding -- Content transfer encoding of the non-text part

The default value for fmt is None, meaning

[Non-text (%(type)s) part of message omitted, filename %(filename)s]

Catatan kaki

[1]This statement assumes that you use the appropriate setting for the unixfrom argument, and that you set maxheaderlen=0 (which will preserve whatever the input line lengths were). It is also not strictly true, since in many cases runs of whitespace in headers are collapsed into single blanks. The latter is a bug that will eventually be fixed.