Why Python’s whitespace rule is right

In Language design, Python on October 18, 2011 by Matt Giuca

Python is famous among programming languages for its fairly unique syntax: rather than being delimited by curly braces or “begin/end” keywords, blocks are delimited by indentation. Indenting a line is like adding an opening curly brace, and de-denting is like adding a closing curly brace. When people criticise Python, this is usually the first complaint: “why would I want to use a language which requires me to indent code?” Indeed, while programmers are very used to indenting their code, they are not at all used to being forced to do so, and I can understand why they may take it as an insult that a language tells them how to write code. I don’t usually like to get into syntax arguments, because I find them very superficial; it is much more important to discuss the semantics of a language than its syntax. But this is such a common argument among Python detractors that I wanted to address it. Python is right, and it’s just about the only language that is.

I think the rub is that programmers like to think of languages as a tool, and tools should be as flexible as possible. I think in general it is a good principle for programming languages not to enforce conventions. Languages that do enforce them tend to annoy people who don’t subscribe to the same conventions. For example, the Go programming language enforces the “One True Brace Style”: every opening curly brace must appear on the same line as the function header or control statement. This irritates me because that’s not my preferred convention. But the indentation convention is so universal that it is considered bad programming practice to not indent in all cases. (There is disagreement over tabs vs spaces, the number of spaces, etc, but we all agree that indentation is good.) There is not a single situation in any country, in any programming language, or at any skill level, in which it is acceptable to not indent your code the way Python requires it. Therefore, it is technically redundant to have a language that is not whitespace-sensitive. Any language that is not whitespace-sensitive requires (by universal convention) that programmers communicate the scoping of the code in two distinct manners for every single line of code: braces (or begin/end) and indentation. You are required to make sure that these two things match up, and if you don’t, then you have a program that doesn’t work the way it looks like it works, and the compiler isn’t going to tell you.

There are two solutions to this problem. 1: Make the compiler tell you. Force the programmer to indent and put in curly braces, and have the compiler check the indentation and give either a warning or error if they don’t match up. Now you’ve solved the problem of accidentally getting it wrong, but now what is the point of requiring curly braces at all? The programmer would just be doing extra work to please the compiler. We may as well go with 2: take out the curly braces and just have the compiler determine the blocks based on indentation.
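Solution 1 can be sketched in a few lines of Python. This is only a rough illustration, assuming a brace-delimited language indented with four spaces per level. The helper name `indentation_mismatches` and the sample source are made up for this example, and the brace counting is naive, so braces inside strings or comments would confuse it.

```python
def indentation_mismatches(source, indent_unit=4):
    """Return (line_number, expected_spaces, actual_spaces) for every line
    whose leading spaces disagree with its brace nesting depth."""
    mismatches = []
    depth = 0  # number of currently open braces
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines carry no indentation information
        # A line that begins with '}' closes a block, so it sits one level out.
        expected_depth = depth - (1 if stripped.startswith("}") else 0)
        expected_spaces = expected_depth * indent_unit
        actual_spaces = len(line) - len(line.lstrip(" "))
        if actual_spaces != expected_spaces:
            mismatches.append((number, expected_spaces, actual_spaces))
        # Naive assumption: every brace on the line is a real block delimiter.
        depth += stripped.count("{") - stripped.count("}")
    return mismatches


# bar() is indented as if it belonged to the if, but the braces say otherwise.
SAMPLE = """\
if (x) {
    foo();
}
    bar();
"""

report = indentation_mismatches(SAMPLE)
```

For the sample, the checker flags line 4, where the braces and the indentation tell different stories. Once a check like this is mandatory, the braces carry no information that the indentation does not already carry.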

When you really analyse it, Python’s whitespace sensitivity is actually the only logical choice for a programming language, because you only communicate your intent one way, and that intent is read the same way by humans and computers. The only reason to use a whitespace-insensitive language is that that’s the way we’ve always done things, and that’s never a good reason. That is why my programming language, Mars, has the same indentation rule as Python.
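To make “read the same way by humans and computers” concrete, here is a small sketch using the standard library’s `tokenize` module, which reports the INDENT and DEDENT tokens that Python’s tokenizer derives from leading whitespace. The sample program and the helper name `block_tokens` are made up for illustration.

```python
import io
import tokenize

SOURCE = """\
if x:
    if y:
        foo()
else:
    bar()
"""


def block_tokens(source):
    """Return the names in the source, interleaved with the INDENT/DEDENT
    markers that the tokenizer produces from the leading whitespace."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            result.append("INDENT")
        elif tok.type == tokenize.DEDENT:
            result.append("DEDENT")
        elif tok.type == tokenize.NAME:
            result.append(tok.string)
    return result


tokens = block_tokens(SOURCE)
```

The INDENT and DEDENT markers play the role that opening and closing braces play in C: two DEDENTs come before `else`, which is how the parser knows the `else` belongs to the outer `if`, exactly as the layout suggests to a human reader.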

* * *

An interesting aside: there is a related syntax rule in Python which doesn’t seem quite so logical: you are required to place a colon at the end of any line preceding an indent. I haven’t fully tested this, but I’m pretty sure there is no technical reason for that (the parser could still work unambiguously without that colon), and it doesn’t seem to add much to the readability either. I slavishly followed this rule in Mars too, because as a Python programmer it “feels right” to me. But perhaps it would be better to drop it.

25 Responses to “Why Python’s whitespace rule is right”

  1. The colon rule probably helps editors – after a colon and a newline, they know you _must_ indent (and will do so for you typically), without having to pay special attention to keywords.

    Also more importantly, it means you have the exact same syntax for a single-line conditional:

    if foo:
    continue

    naturally condenses (if you wish) to

    if foo: continue

    • the comment field ate my indentation, but you know what I mean…

    • Good point. The editor isn’t so important since I’m sure an editor could detect “Line beginning with def/if/while/for/etc and not containing a colon”. But the consistency is a good point.

  2. (I agree with you wholeheartedly.) No, the colon isn’t necessary. For example, Haskell has significant whitespace and no colon.

  3. I cannot imagine how a language that breaks if you indent wrong, could be considered not only a logical choice, but “the only logical choice”. It’s no doubt perfectly logical for the computer itself, and perhaps it might be for another species, but not for this species.

    Sometimes, the road less traveled is less traveled for a reason.

    • Could you explain why it is not for this species? Or what the “reason” is that you think the road is less traveled? I thought I did a pretty good job of explaining why it is perfect for this species in the post: this very species has near universal coding standards that say you must indent your code. So since you are indenting your code anyway, why not make sure you are indenting correctly?

  4. @Matt, consider the difference between Python’s

    if x:
    if y:
    foo()
    else:
    bar()

    and C’s

    if (x) {
    if (y) {
    foo();
    }
    } else {
    bar();
    }

    In the C case, the meaning is unambiguous; in the Python case, only the indenta— I’m sorry, what’s that? WordPress ate my indentation? Well, crap. You know what I *meant* to write, though… don’t you? I mean, source code is meant for humans to read, so it would be *pretty dumb* if you couldn’t paste source code into a blog comment without breaking it…

    • It’s true, but I consider this to be a deficiency of WordPress, not Python. While it would be less common to do so, I could write a blogging tool that strips out curly braces, and then C code would break. I consider “readability” and “looking like it does what it actually does” to be more important design goals for a programming language than “will the code break if pasted into random website X?”

      • Have you ever tried to paste C++ code using templates (or Java using generics) in a comment field that strips HTML tags?

  5. Sorry for commenting on your post so late, but I couldn’t resist 🙂

    I first saw whitespace-delimited code in Haskell – what a crazy language, and a good one too 🙂

    I have been working with C# for almost 10 years now. I can only say that the more I work with C#, the more I hate the braces. They are such a waste of time. At the company where I work we had tools (FxCop, StyleCop) that forced us to comply with company standards. Which is OK, of course, but then I noticed something: there was a strange rule in some of these tools that forces you to begin every block on a new line, and you also need to indent that block! So, in the end, there is a rule forcing you to do the same thing that Python enforces, but with additional braces, which you must maintain manually (or buy some third-party tool). You cannot check in (commit) your code unless you fix it. It is the same at companies that don’t have automated verification: in the end, if you have all agreed on some rules, why would it be OK to break them? There is only one reasonable answer to that: it is not OK, and it must be automated. Whether it is done by a compiler, an interpreter, or any other third-party tool doesn’t matter; it must be done by machines.

    And yes, whitespace is a piece of the program itself. We are already using it as input for our eyes, so why is it so bad to feed the same information to the compiler/interpreter?

  6. Python’s indentation rule makes it harder to do code generation. In a language like this you need to provide an alternative mode for the sake of code generation (see what’s available in Haskell). In Python we do not have this, which really sucks a lot.

    • I’ll admit it’s a little harder: it means your code generator has to keep track of the indentation level at all times, and every time it inserts a newline, to insert that many spaces/tabs. But that’s a fairly trivial task compared to all the other things you have to be aware of when writing a code generator, for any language. I don’t think this is a big issue, especially since programming languages should be designed primarily for people to write, not computers.

      Haskell’s whitespace rule is much more complicated than Python’s, and yes, it would be much harder to write a code generator to whitespace-delimited-Haskell than Python, so it’s not really comparable (i.e., it’s good that Haskell provides a brace-delimited mode; Python doesn’t really need one).
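The bookkeeping described in the reply above (track the current indentation level and prefix every emitted line with it) might look roughly like this. It is a minimal sketch, not a real code-generation library: the `Emitter` class, its methods, and the generated `clamp` function are all hypothetical names chosen for the example.

```python
from contextlib import contextmanager


class Emitter:
    """Hypothetical helper that collects lines of generated Python source."""

    def __init__(self, indent_unit="    "):
        self.indent_unit = indent_unit
        self.level = 0  # current nesting depth
        self.lines = []

    def line(self, text):
        # Every emitted line is prefixed with the current indentation.
        self.lines.append(self.indent_unit * self.level + text)

    @contextmanager
    def block(self, header):
        # Emit "header:" and indent everything emitted inside the with-block.
        self.line(header + ":")
        self.level += 1
        try:
            yield
        finally:
            self.level -= 1

    def source(self):
        return "\n".join(self.lines) + "\n"


out = Emitter()
with out.block("def clamp(value, low, high)"):
    with out.block("if value < low"):
        out.line("return low")
    with out.block("if value > high"):
        out.line("return high")
    out.line("return value")

generated = out.source()
namespace = {}
exec(generated, namespace)  # the generated text is valid, runnable Python
```

The only state the generator has to carry is a single integer, which is about the same effort as remembering to emit a matching closing brace.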

  7. Having worked with braceless and braced programming in YAML, JSON, Haskell, Java, HTML, HAML, Ruby, Python, XML, etc…

    You’re dead wrong about this: “The only reason to use a whitespace-insensitive language is that that’s the way we’ve always done things”

    No, the reason I love braces in my code is that I can press one key combo and automatically reformat my code while being absolutely sure that I have not changed its meaning. This is a good thing. It means that I don’t have to count spaces. I don’t have to worry about mixed tabs and spaces. I can bang out a few expressions on one line, hit a key combo, and have perfectly readable code without constantly wasting mental resources on meeting lame formatting standards.

    • I don’t follow this argument. Let’s trace back to my original point: with braces, there are *two* separate representations of the code structure: one machine-readable and one human-readable. If you don’t make sure they are exactly in agreement, then humans will most likely misunderstand your code. Whereas with whitespace-sensitivity, there is only one representation of the code structure, which both the machine and human can read and agree upon.

      Your point seems to be that you have a tool that takes the machine-readable representation and ensures the human-readable one matches. That’s great (I, too, use this tool — although it didn’t exist when I wrote this blog post in 2011 and I don’t know of any that did). But if you were using a whitespace-sensitive language, you wouldn’t need that tool at all, because the human-readable representation *always* matches the machine-readable one! “X is better than Y because X can achieve the same benefits of Y with additional tooling” is not an argument in favour of X.

      You should not be “counting spaces” — your eyes do that for you (you only have to observe what is more indented than what). You should *never* be mixing tabs and spaces, in any language, so that is a non-issue (and you will find that, while crappy C code mixes tabs and spaces a lot, it almost never happens in even crappy Python code because the program would break, so Python fixes that issue too). My point is that you do not *need* to waste mental resources reformatting code in Python because any correct code *is* correctly formatted already!
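As a rough illustration of the tabs-and-spaces point, the sketch below compiles a snippet whose two body lines look aligned in an editor with 8-column tabs, but where one uses a tab and the other uses eight spaces. This assumes Python 3, which rejects such code outright; Python 2 only did so when run with the `-tt` flag. The helper name `compile_error_name` is made up for the example.

```python
# Line 2 is indented with one tab, line 3 with eight spaces.
MIXED = "if True:\n\tx = 1\n        y = 2\n"

# The same program indented consistently with four spaces.
CONSISTENT = "if True:\n    x = 1\n    y = 2\n"


def compile_error_name(source):
    """Return the name of the syntax-related exception raised when compiling
    the source, or None if it compiles cleanly."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:  # TabError and IndentationError are subclasses
        return type(exc).__name__
    return None


mixed_result = compile_error_name(MIXED)
consistent_result = compile_error_name(CONSISTENT)
```

Under Python 3 the mixed version is rejected with a `TabError` before any code runs, while the consistent version compiles normally.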

  8. You are a stupid pencil neck idiot. You are like someone who accepts to be bent over a barrel and be rammed up the ass by a guy with a huge strap on and say, “it’s ok, he’s using a condom so I won’t catch anything from my other geek pals who think the same shit as me”.
    No, it’s not all right to be forced to use whitespace you moron. It takes away choice. Any language that takes away choice should be binned immediately. Same goes for breaking backwards compatibility in under 10yrs. Invisible characters for code marking… sheer stupidity…. and fools like you who accept this make it all possible for the great creators of these garbage to state they have a product.

    • Decided not to mark this hideously offensive comment as spam just so that I can argue against it.

      > No, it’s not all right to be forced to use whitespace you moron. It takes away choice.

      What a ridiculous argument. Anything that takes away choice should be binned? C doesn’t let me choose between using double quotes or single quotes for strings, so that’s out. Java prevents me from converting an int into a pointer and crashing the computer; what a nasty restriction! Ruby insists that my instructions are executed from top to bottom, and not bottom to top.

      Language design — or rather, any design of anything ever — is about carefully choosing restrictions for the user to try and increase the chance of a good outcome. My microwave won’t let me use it while the door is open, to avoid me radiating myself. Do you think that’s wrong because it’s “taking away choice”? As I carefully argued in this post, the designers of Python deliberately take away your choice to write a program that looks one way, and behaves another. If you don’t like it, well, you’re free to choose a different language. But that is, in my opinion, a helpful restriction.

      As for your attitude, grow the fuck up if you want to be taken seriously.

  9. Here’s why the golang creators apparently didn’t like it:
    http://betterlogic.com/roger/2015/05/go-creators-why-curly-braces-instead-of-significant-indentation/ (I’m not arguing either way, but I had heard a quote from them once)

  10. The problem is that the entire argument for Pythonic whitespace relies on developers having to indent manually.
    That’s not the case with a state-of-the-art editor. You set the braces and the code gets formatted for you. That’s zero time spent on indentation and perfect output.
    Python, on the other hand, frequently forces the programmer to deal with indentation manually.
    That’s because the right indentation level can’t be deduced from anything else.
    For example, each and every time I would add a closing bracket, I have to manually de-indent.
    So no time saved there. But things get worse from there.
    In particular, when refactoring code and moving pieces around, the indentation has to be adjusted manually most of the time.
    And while that’s only mildly annoying when moving larger blocks, which can be (manually) indented en bloc, it’s a huge waste of time for all the little changes such as adding an additional condition or loop.

    • While you’re certainly right that automatic formatting tools make brace-based languages bearable, my basic argument still holds. Note that I wrote this article in 2011 when (to my knowledge), there were no good automatic formatting engines for C/C++, but of course now there is ClangFormat and I use it all the time.

      But I don’t see how this places Python at a *disadvantage* compared to a C programmer with ClangFormat. Surely any good text editor has a block indent/outdent feature. So moving Python blocks into or out of a loop is very simple: select all the lines you wish to indent, hit Shift+> (in Vim, for example), and you’re done. It’s still simpler than manually inserting braces and then running ClangFormat.

      My original point, which I still stand by, is that in Python you shouldn’t think of indentation as “extra time spent prettying up the code”. Manually setting the indentation in Python *is exactly equivalent* to manually setting braces in other languages, except with the added bonus (admittedly less so now that we have ClangFormat) that your code is automatically readable.

      There’s still the problem of any C code that *hasn’t* been run through ClangFormat, such as old code, and code your colleagues may write. The advantage of Python is that correct indentation is fundamental to the language, not an optional tool you can run. This bug simply can’t happen in Python:

      if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
          goto fail;
          goto fail;
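For comparison, here is a loose Python analogue of that snippet, parsed with the standard `ast` module to show which statements the parser places inside the `if`. Python has no `goto`, so `return` stands in for it. The function `verify` and its return values are invented for this sketch.

```python
import ast

# Two statements indented under the `if`, mirroring the duplicated line.
SOURCE = """\
def verify(err):
    if err != 0:
        return "fail"
        return "fail"
    return "ok"
"""

function = ast.parse(SOURCE).body[0]   # the `def verify` node
if_statement = function.body[0]        # the `if err != 0` node

statements_inside_if = len(if_statement.body)
statements_after_if = len(function.body) - 1

namespace = {}
exec(SOURCE, namespace)
```

Both indented `return` statements end up inside the `if` body, so the duplicate is merely unreachable and `verify(0)` still reaches the final `return "ok"`. For the duplicate to escape the `if`, it would have to be visibly dedented, and a reader would see exactly the structure the parser sees.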

  11. This whitespace thing with python really makes me laugh. Except that I have to program in it. My favorite thing is when the pythonistas claim that it is okay to have the language not enforce object member privacy (such as is done in java and c++), because, and I quote, “We are all adults here” and you’re just not supposed to access private data outside the object. Okay, fair enough, except that apparently, we are “all adults here” only when it suits their fairly lame arguments, but are we “adult enough” to decide how and when to indent? Hell, no. Welcome to python, where everyone is assumed to be a novice programmer. The very first set of lines I have to mark as continued over several lines makes less readable crap than any set of curly braces ever will.

    Python is the woodpecker of programming languages. Really, you may be around for a while, but guaranteed using your head as a hammer will not set you up on the path to higher evolution… And rarely has a language developed such an amazing number of fanboys. A fanboy is one who cannot be convinced, no matter the evidence or argument, that they may be overstating things…just…a…little…bit. “why-pythons-whitespace-rule-is-right” is a perfect example of such. [face palm]

  12. […] without most of the keywords, parentheses and brackets. It uses whitespace to determine scope, like Python. It also supports classes. These two lines define a method of the TextEditor class called […]

  13. It is true that in general we all agree indentation is important and great, but I don’t think we can always agree on the specifics.
    If you spare me the “all-code-should-be-beautiful” argument, which holds in an ideal world that we do not live in:
    if I want to write a statement across multiple lines, Python makes this ambiguous or cumbersome with ‘\’.
    In C# statements like
    thing.Where(i => i func(j));
    Read nicely. All I have to do to make sure my spacing is fine is to do Ctrl+k+f (format)
    I can’t do that in Python.
    I can’t do
    int func (a) { return func(a, default) }
    If I collaborate on code, I’ll likely have to enforce spaces vs. tabs, the number of spaces, and so on,
    so I don’t really GAIN anything.

    While it is true that braces are in general redundant, in specific cases they are useful,
    1. They make whitespace not-significant, for those annoying cases (which are the ones that matter) where we need them to be insignificant
    2. They permit auto-formatting
    3. It is totally unambiguous.
    4. I have no idea how Python programmers handle collaborating across different teams, or refactoring, where the whitespace can easily go to hell, but if whitespace can be a pain in whitespace-insensitive languages, I can only imagine what it is like in Python.

    It’s not the only logical choice. Just a pedantic one.
    I’d rather control my whitespace for maximal readability.
    The compiler doesn’t understand readability; I do.

    I seriously love most design decisions of Python, but that one is just such a waste.
    Either way, we may as well argue about the existence of ghosts; you’ll never change your mind.

  14. “Note that I wrote this article in 2011 when (to my knowledge), there were no good automatic formatting engines for C/C++.” The Visual Studio team will be very upset…
