Articles

Thou Shalt Not Modify A List During Iteration

In Python on February 12, 2009 by Matt Giuca Tagged: , , ,

This came up recently when discussing a friend’s code. It relates specifically to Python, but I think the rule is more general.

Firstly, let me be clear that in this article, when I say “modify”, I mean inserting or removing items from the list. Merely updating or mutating the list items is fine.

Now, consider the following Python code:

elems = ['a','b','c']
for e in elems:
    print e
    elems.remove(e)

This prints:

a
c

The author of this code snippet probably expected it to iterate over each element of the list, print it out, then remove it so that the list ends up empty (this may be desirable if, for example, you are removing items conditionally). Hence, he is probably surprised that 'b' was skipped, and furthermore, remains in the list:

>>> elems
['b']

We can see why only 'a' and 'c' were touched by imagining that Python translates the loop into the following equivalent lower-level code:

elems = ['a','b','c']
i = 0
while i < len(elems):
    e = elems[i]
    print e
    elems.remove(e)
    i += 1

This has the same result. Now it’s clear that what’s happening: when you remove the 'a' from elems, the remaining elements slide down the list. The list is now ['b', 'c']. Now 'b' is at index 0, and 'c' is at index 1. Since the next iteration is going to look at index 1 (which is the 'c' element), the 'b' gets skipped entirely.

This rule doesn’t seem to be documented officially anywhere. It’s just a generally-followed rule. I believe the behaviour is, well, unspecified behaviour. Here is a discussion of the issue.

Also note that the same rule applies to the set and dict types (and similar), only those types will actually raise a RuntimeError. (If you know how hash tables work, it should be clear why — there’s no safe way to let that slide).

This rule should be applied in all languages, not just Python. While other languages have different rules, the basic principle above still applies — you’ll just get shot in different ways. (It would even apply to C, if you wrote your own code to insert or remove elements from an array). Also, this sort of code is invariably confusing.

So how can you work around it?

  • You could translate your code into the while-loop form, as above. Then you have explicit access to the variable i, and you can hand-code a fix. (For example, I could choose not to increment i when I am removing an element).
  • You could compute the set of modifications to make to the list during iteration, then apply them afterwards.
  • You could construct a new list during iteration rather than mutating the existing one. (For example, rather than removing all the elements which satisfy a condition, insert into a new list all the elements which don’t).
  • A solution recommended in the discussion linked above is to duplicate the input list and iterate over the copy, mutating the original.

There is always a way around. You just have to be creative!

Advertisements

10 Responses to “Thou Shalt Not Modify A List During Iteration”

  1. I’m also a little concerned that sets and dicts only raise a RuntimeError if the *size* changes. I think it’s possible for the hash table to get mangled-during-iteration if any insertions or deletions are made, even if the net size difference is 0.

    I’m going to investigate further.

  2. I have a feeling that I’ve seen a programming language that explicitly forbids any sort of modification in loops (I have a feeling it might have been Visual Basic or Javascript). It probably had good intentions but it just was annoying having more or less syntactic sugar trying too hard to protect me from myself.

  3. I think it was VB: http://msdn.microsoft.com/en-us/library/5ebk1751.aspx

    This is a great feature. I agree that in general features shouldn’t artificially restrict you, but in this case the semantics of the loop break if you modify it, so yes, it should stop you from doing it.

  4. > You could translate your code into the while-loop form, as above. Then you have explicit access to the variable i, and you can hand-code a fix. (For example, I could choose not to increment i when I am removing an element).

    This would only work if the element you’re removing is elems[i].

  5. http://docs.python.org/reference/compound_stmts.html#for documents the behaviour. It would be nice if it was mentioned in the sequence types docs as well though.

    • Cheers for that link. Yes, that behaviour is documented. But I don’t really like relying on wacky behaviour that exists solely because “that’s the way the implementation works”, even if it is documented. Firstly, it makes the code unintuitive to read. Secondly, it depends on the level of strictness in the documentation. If it was Java (where every speck of behaviour is rigorously documented and unchangeable), then I might consider it okay, but in Python, the documentation often tends to be descriptive of how one particular implementation works, and other implementations generally are free to change things up if it suits.

  6. Regarding your issue of modifying a list while iterating over it, Java addresses it in two ways:

    * Iterator has a remove() method which is specifically designed to remove the most recent item produced.
    * The standard data structure implementations (ArrayList, etc.) have fail-fast iterators – if you modify the data during iteration, the next iteration will throw an exception.

    A general interesting point is that when you compare Python and Java, they have a lot in common in their underlying features and libraries, but oftentimes one language implements features or documents features significantly better than the other language. (As an opposite example, Python’s built-in list slicing is much better than Java’s need to pass offsets and lengths explicitly.)

  7. […] did manage to get rid of old errors (which were promptly replaced by new ones). I also learned that one must not modify lists while iterating, which makes a lot of sense. Other things I learned were that when trying to obtain a light curve […]

  8. […] Have alook at this: Thou Shalt Not Modify A List During Iteration […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: