Py3K: Solving the “outer scope” problem

In Python on June 27, 2008 by Matt Giuca

I recently built the beta of Python 3000 – the upcoming total revamp of Python (due to be released in September – 992 years before they promised!). Because Py3K is unashamedly “backwards incompatible”, they are finally fixing all the major language flaws and making things “the way they should be!” (Note that there will be a somewhat automated conversion process from Python 2 to Python 3 code.)

And I love it! Everything is fixed the way I hoped. Hence this is the first in the “Py3K rox my sox” series of blog posts. You can see a summary of new features here.

OK, so one of the major problems I’ve complained about (and heard others complain about) in Python is the so-called “outer scope” problem. This is a very definite limitation on what you can do in Python. Read on!

How globals really work

First a bit of background you may not know. This applies to all versions of Python, not just 3.0.

In Python, if you don’t declare a variable, Python figures out whether you’re referring to a local or a global based on whether you assign to it anywhere in the function. For example:

x = 4
def f():
    return x

Here, Python figures out that the x you refer to is actually the global x, and returns 4. It figures this out because the function never writes to x, anywhere: not just because it hasn’t written to x yet, but because it has no statement which assigns to x. (It figures this out statically, not at runtime.) So, for example:

x = 4
def f():
    if True:
        return x
    else:
        x = 2

This would be a neat quiz question actually: What does f() evaluate to?

Answer: UnboundLocalError: local variable ‘x’ referenced before assignment.

The mere fact that x is assigned somewhere in the function (even somewhere which will never be executed) causes Python to treat it as a local, and hence it is undefined when you go to return it.
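One way to see this static decision for yourself is to inspect the compiled function. The following is only an illustrative sketch: it relies on the standard `__code__.co_varnames` attribute (available in Python 2.6+ and 3.x), and it catches the error instead of letting it propagate. The names `is_local` and `raised` are made up for the example.

```python
x = 4

def f():
    if True:
        return x
    else:
        x = 2  # never runs, but its presence still makes x local to f

# The compiler has already recorded x as a local variable of f,
# before f is ever called.
is_local = "x" in f.__code__.co_varnames

try:
    f()
    raised = False
except UnboundLocalError:
    raised = True

print(is_local, raised)  # True True
```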

The correct solution is to declare it “global” explicitly, which is the standard way to make a function which assigns to a global.

x = 4
def f():
    global x
    if True:
        return x
    else:
        x = 2

This works well in practice: you can define constants like MAX_FOO and read them all over the place without declaring them global, but you need to be explicit if you want to update a global. Requiring that is usually a good idea, because updating globals is dangerous – see JavaScript for a counter-example.

The “outer scope” problem

On to the “outer scope” problem. Basically, Python lets you write nested functions, and the nested functions have access to the local variables of their containing code. For example:

def outer():
    x = 9
    def inner_read():
        return x
    return inner_read()

If you call outer(), it will return 9. The variable x is local to the outer function. But the inner function can read it, and return it.

The problem comes when you want to write to a non-local variable, like this:

def outer():
    x = 9
    def inner_read():
        return x
    def inner_write():
        x = 3
    inner_write()
    return inner_read()

As with global variables, Python can find outer scope variables if you only read them (as inner_read does), but if you write to them anywhere in the function, it assumes you are making a new local variable (as inner_write does). Hence inner_write creates a new local x, and assigns it 3, and the function outer returns 9. I would like for inner_write to update the existing x, and hence have outer return 3.
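You can check how Python classified x in each inner function by looking at their code objects. This is just an illustrative sketch: it uses the standard `__code__.co_freevars` and `__code__.co_varnames` attributes, and the extra values returned from outer (along with the names `result`, `read_free` and `write_locals`) exist only so the classification can be inspected.

```python
def outer():
    x = 9
    def inner_read():
        return x
    def inner_write():
        x = 3  # creates a brand new local x; outer's x is untouched
    inner_write()
    return (
        inner_read(),
        inner_read.__code__.co_freevars,   # names taken from an enclosing scope
        inner_write.__code__.co_varnames,  # names local to inner_write
    )

result, read_free, write_locals = outer()
print(result, read_free, write_locals)  # 9 ('x',) ('x',)
```

In inner_read, x shows up as a free variable (it belongs to outer), whereas in inner_write it shows up as an ordinary local.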

The solution is pretty simple: have a keyword like global, but rather than going all the way to the top scope, it just tells Python to look for the innermost enclosing function scope with a bound variable of that name.

Python 3.0 introduces exactly that: the nonlocal keyword. Let’s give it a try!

def outer():
    x = 9
    def inner_read():
        return x
    def inner_write():
        nonlocal x
        x = 3
    inner_write()
    return inner_read()

Woot! Python 3.0 compiles this code and the outer function returns 3.
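A typical place where this is handy is a closure that keeps its own state between calls. Here is a minimal sketch (Python 3 only); `make_counter` and `increment` are hypothetical names chosen for the example.

```python
def make_counter():
    count = 0
    def increment():
        # Rebind the count that belongs to make_counter,
        # instead of creating a new local.
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()
other = make_counter()  # each call to make_counter gets its own count

first = counter()
second = counter()
other_first = other()
print(first, second, other_first)  # 1 2 1
```

Without the nonlocal line, the `count += 1` would make count a local of increment and raise UnboundLocalError on the first call.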

The funny thing is, this problem seems to be specific to Python. In most static languages, all variables are declared. In Haskell, all variables are read-only. In Ruby, you refer to global variables by prefixing them with a $dollar. In JavaScript, it’s the inverse of Python: you declare local variables, and anything you assign to without declaring defaults to global (which is a hideous idea – if you forget to declare a variable you implicitly start sharing where you didn’t expect to be sharing). Of course there are probably other languages with this problem, but Python is the only one where I’ve ever seen it.

8 Responses to “Py3K: Solving the “outer scope” problem”

  1. Ahh. I see now!

  2. Effectively javascript is a must-declare language for that reason.

    Never leave home without your JSLint, it’ll point that out when you use an undeclared variable. Arguably, it should be built into the interpreter as a warning.

  3. I see it as a sign of a badly designed language. Guido wants his Python to be a hackish language.

  4. @pcdinh: Don’t know why you see this as “bad design”. If you don’t require variable declarations, and allow variable update, then this is going to be a problem, and I see this as a nice simple solution.

    (Now that I think about it, Ruby’s $vars won’t help access outer-scope vars, so I’m not sure how Ruby solves this problem).

    The fact that in JavaScript you need JSLint, as Justin pointed out, shows how horrible it can be in a dynamic language which requires variable declaration.

    (The problem with JavaScript versus a statically typed language is that if there is no variable, it creates a global, whereas in a static language it would be a compiler error).

  5. Before I start, I have a serious complaint. There is a background colour set that affects this textarea.. But no foreground colour is set. So, I’m writing white text on a white background. Very, very uncool.

    Anyway, on topic, my opinion differs from that of pcdinh. Maybe it’s because php is the first scripting language I ever learned. Maybe it’s because I am only a novice programmer, if I am even a programmer at all (I’m not, I promise;)) .. Or maybe (I find this hard to accept) it is because being able to control variable scope explicitly makes a language “hackish”..

    In javascript you have the option of declaring your variables explicitly. I always take that option because I consider it a matter of preventing not only bugs but insecurities.

    In php you don’t really have a choice. Although you automatically declare your variables, there are NO globally available variables (don’t disagree. I mean in javascript you can access global variables without any special keyword or procedure.. In php you can’t.) so by declaring which instance of a variable you will be referring to, you are explicitly stating which scope the variable you use applies to.

    Is javascript hackish? Well, yeah, I suppose. PHP? … DEFINITELY. (it’s a strength, really)…

    Does C care about variable scope? C++? I bet they do. They effectively force you to strictly type your variables (almost) so I’d be most surprised if it was a scopeless language.

    Does Java allow you to code without caring about variable scope?? I don’t think so.

    Anyway putting all this together, it looks like Python is the only popular high level language which can operate either procedurally or object-orientedly, which allows me no explicit control over variable scope (as of 2.5) …

    I’m very new to it all, particularly python, so please correct me if I’m wrong about it.

    But the way I see it, the less hackish a language (implementation actually) gets, the more it cares about variable scope (and the more sophisticated the scope control becomes)

    Does the author of the original post celebrate the ability to explicitly call local and/or global variables? I think so, it will make code a lot less buggy and a lot more secure. Without the feature Python is less powerful, less useful, and in many applications less appropriate.

    Does the added control make it a more hackish language?
    Hell no! It makes it less of a kid’s hobby toy and more of a serious language to be reckoned with!!

    (NB when I say “less of a toy” I don’t mean to say that it is actually a toy. I use a lot of truly excellent and highly sophisticated python applications on my home computer, so I don’t mean to put it down. In fact, I’m writing something in it right now, so ha! It is a serious language, but the added feature makes it even more so.)

  6. @SneakyWho_am_i: Thanks for a great response.

    Actually you can get globals in PHP – it seems to work the same as Python (though I am no expert on PHP). See here: http://au2.php.net/global

    I think if you look at the two alternatives: Python/PHP being local by default and JavaScript being global by default – note that BOTH give you control over scope, it just varies in what is the default. I would say JavaScript is more “hackish” because global-by-default makes it easier to make mistakes, but potentially do a few things more powerfully. (Though I’d rather simply say it’s badly designed than use the word “hackish”).

    Statically typed languages don’t really have this problem because they (usually) force you to declare all variables. That means you can’t accidentally use an undeclared variable and start manipulating globals. I think the issue is if a language doesn’t force you to declare variables, what should they default to – and local is definitely the “safe” option.

    The only problem with defaulting to local is (and I’ve long whinged about it) you can’t write to the outer scope at all. That is what this feature fixes. So I certainly think that if you have a language with implicit variable declaration, this is the way to do it.

  7. Yes Matt, you can get global variables in PHP but the access is local. I don’t know why I wrote it in a way that made it literally incorrect. I wonder what I was trying to say…

    I can’t agree more on the javascript thing, I’ve seen experienced javascript coders become confused momentarily because they did or didn’t declare a variable explicitly.

    At any rate, it all makes sense now. I moved away from Python (still wondering if it was a mistake) for C and C++. I’m still very green in them and yes, of course, as expected, it’s local all the way.
    Shockingly, I’ve never even tried to access a variable from the parent scope. I read somewhere that it’s a bad habit to use a variable that you didn’t parameterize (when habitually PHP-ing) and I guess I must have given it up at some point as a result of it.

    Now I have to go away and learn about that 😮

  8. I think it’s mostly a bad habit, but not as bad as a global. The reason you may want to do it is if you want a nested function which changes a local variable in the parent function.

    In Python, and many other languages, you *can’t* parameterize those arguments because there is no pass-by-reference (you can pass them in, but rebinding them inside the function isn’t seen by the caller). So you could either return all the values you want modified as a tuple, or just use nonlocal.
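The two options mentioned in the last response can be sketched side by side. This is only an illustration (Python 3, since it uses nonlocal); the function names `total_and_count_tuple` and `total_and_count_nonlocal` and the helper `add` are hypothetical.

```python
def total_and_count_tuple(values):
    # Option 1: pass the values in and return the updated ones as a tuple.
    def add(total, count, value):
        return total + value, count + 1
    total, count = 0, 0
    for value in values:
        total, count = add(total, count, value)
    return total, count

def total_and_count_nonlocal(values):
    # Option 2: let the nested function rebind the parent's locals directly.
    total, count = 0, 0
    def add(value):
        nonlocal total, count
        total += value
        count += 1
    for value in values:
        add(value)
    return total, count

print(total_and_count_tuple([1, 2, 3]))     # (6, 3)
print(total_and_count_nonlocal([1, 2, 3]))  # (6, 3)
```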
