Posts Tagged ‘list’

Articles

Thou Shalt Not Modify A List During Iteration

In Python on February 12, 2009 by Matt Giuca Tagged: , , ,

This came up recently when discussing a friend’s code. It relates specifically to Python, but I think the rule is more general.

Firstly, let me be clear that in this article, when I say “modify”, I mean inserting or removing items from the list. Merely updating or mutating the list items is fine.

Now, consider the following Python code:

elems = ['a','b','c']
for e in elems:
    print e
    elems.remove(e)

This prints:

a
c

The author of this code snippet probably expected it to iterate over each element of the list, print it out, then remove it so that the list ends up empty (this may be desirable if, for example, you are removing items conditionally). Hence, he is probably surprised that 'b' was skipped, and furthermore, remains in the list:

>>> elems
['b']

We can see why only 'a' and 'c' were touched by imagining that Python translates the loop into the following equivalent lower-level code:

elems = ['a','b','c']
i = 0
while i < len(elems):
    e = elems[i]
    print e
    elems.remove(e)
    i += 1

This has the same result. Now it’s clear that what’s happening: when you remove the 'a' from elems, the remaining elements slide down the list. The list is now ['b', 'c']. Now 'b' is at index 0, and 'c' is at index 1. Since the next iteration is going to look at index 1 (which is the 'c' element), the 'b' gets skipped entirely.

This rule doesn’t seem to be documented officially anywhere. It’s just a generally-followed rule. I believe the behaviour is, well, unspecified behaviour. Here is a discussion of the issue.

Also note that the same rule applies to the set and dict types (and similar), only those types will actually raise a RuntimeError. (If you know how hash tables work, it should be clear why — there’s no safe way to let that slide).

This rule should be applied in all languages, not just Python. While other languages have different rules, the basic principle above still applies — you’ll just get shot in different ways. (It would even apply to C, if you wrote your own code to insert or remove elements from an array). Also, this sort of code is invariably confusing.

So how can you work around it?

  • You could translate your code into the while-loop form, as above. Then you have explicit access to the variable i, and you can hand-code a fix. (For example, I could choose not to increment i when I am removing an element).
  • You could compute the set of modifications to make to the list during iteration, then apply them afterwards.
  • You could construct a new list during iteration rather than mutating the existing one. (For example, rather than removing all the elements which satisfy a condition, insert into a new list all the elements which don’t).
  • A solution recommended in the discussion linked above is to duplicate the input list and iterate over the copy, mutating the original.

There is always a way around. You just have to be creative!