13 PEP 305: Comma-separated Files

Comma-separated files are a format frequently used for exporting data from databases and spreadsheets. Python 2.3 adds a parser for comma-separated files.

Comma-separated format is deceptively simple at first glance:


Read a line and call line.split(','): what could be simpler? But toss in string data that can contain commas, and things get more complicated:

"Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"

A big ugly regular expression can parse this, but using the new csv package is much simpler:

import csv

input = open('datafile', 'rb')
reader = csv.reader(input)
for line in reader:
    print line

The reader function takes a number of different options. The field separator isn't limited to the comma and can be changed to any character, and so can the quoting and line-ending characters.

Different dialects of comma-separated files can be defined and registered; currently there are two dialects, both used by Microsoft Excel. A separate csv.writer class will generate comma-separated files from a succession of tuples or lists, quoting strings that contain the delimiter.

See Also:

PEP 305, CSV File API
Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip Montanaro, Cliff Wells.

See About this document... for information on suggesting changes.