3 PEP 234: Iterators

Another significant addition to 2.2 is an iteration interface at both the C and Python levels. Objects can define how they can be looped over by callers.

In Python versions up to 2.1, the usual way to make for item in obj work is to define a __getitem__() method that looks something like this:

    def __getitem__(self, index):
        return <next item>

__getitem__() is more properly used to define an indexing operation on an object so that you can write obj[5] to retrieve the sixth element. It's a bit misleading when you're using this only to support for loops. Consider some file-like object that wants to be looped over; the index parameter is essentially meaningless, as the class probably assumes that a series of __getitem__() calls will be made with index incrementing by one each time. In other words, the presence of the __getitem__() method doesn't mean that using file[5] to randomly access the sixth element will work, though it really should.

In Python 2.2, iteration can be implemented separately, and __getitem__() methods can be limited to classes that really do support random access. The basic idea of iterators is simple. A new built-in function, iter(obj) or iter(C, sentinel), is used to get an iterator. iter(obj) returns an iterator for the object obj, while iter(C, sentinel) returns an iterator that will invoke the callable object C until it returns sentinel to signal that the iterator is done.

Python classes can define an __iter__() method, which should create and return a new iterator for the object; if the object is its own iterator, this method can just return self. In particular, iterators will usually be their own iterators. Extension types implemented in C can implement a tp_iter function in order to return an iterator, and extension types that want to behave as iterators can define a tp_iternext function.

So, after all this, what do iterators actually do? They have one required method, next(), which takes no arguments and returns the next value. When there are no more values to be returned, calling next() should raise the StopIteration exception.

>>> L = [1,2,3]
>>> i = iter(L)
>>> print i
<iterator object at 0x8116870>
>>> i.next()
>>> i.next()
>>> i.next()
>>> i.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?

In 2.2, Python's for statement no longer expects a sequence; it expects something for which iter() will return an iterator. For backward compatibility and convenience, an iterator is automatically constructed for sequences that don't implement __iter__() or a tp_iter slot, so for i in [1,2,3] will still work. Wherever the Python interpreter loops over a sequence, it's been changed to use the iterator protocol. This means you can do things like this:

>>> L = [1,2,3]
>>> i = iter(L)
>>> a,b,c = i
>>> a,b,c
(1, 2, 3)

Iterator support has been added to some of Python's basic types. Calling iter() on a dictionary will return an iterator which loops over its keys:

>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
...      'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>>> for key in m: print key, m[key]
Mar 3
Feb 2
Aug 8
Sep 9
May 5
Jun 6
Jul 7
Jan 1
Apr 4
Nov 11
Dec 12
Oct 10

That's just the default behaviour. If you want to iterate over keys, values, or key/value pairs, you can explicitly call the iterkeys(), itervalues(), or iteritems() methods to get an appropriate iterator. In a minor related change, the in operator now works on dictionaries, so key in dict is now equivalent to dict.has_key(key).

Files also provide an iterator, which calls the readline() method until there are no more lines in the file. This means you can now read each line of a file using code like this:

for line in file:
    # do something for each line

Note that you can only go forward in an iterator; there's no way to get the previous element, reset the iterator, or make a copy of it. An iterator object could provide such additional capabilities, but the iterator protocol only requires a next() method.

See Also:

PEP 234, Iterators
Written by Ka-Ping Yee and GvR; implemented by the Python Labs crew, mostly by GvR and Tim Peters.

See About this document... for information on suggesting changes.