Literate programming is a terrible idea

In Language design on June 4, 2010 by Matt Giuca

The code samples for the subject I’m teaching are all written in Literate Haskell. Having worked with a bit of this code, I can say it’s a complete nightmare. The Wikipedia article discusses the philosophy behind it, and I’m not here to debate that. My criticism is primarily that literate programs are god-awful to work with in a text editor.

What I’m talking about is a programming style (specially supported by the Haskell language) in which comments are the default, and if you want code, you need to explicitly mark each line of it with a leading “>”. In other words, it inverts the usual comment/code relationship. It means that instead of this:

-- This program calculates the factorial of a number.
fact 0 = 1
fact n = n * (fact (n-1))

You write this:

This program calculates the factorial of a number.

>fact 0 = 1
>fact n = n * (fact (n-1))
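To make the convention concrete, here is a minimal sketch of what a preprocessor roughly does with a file in this style before the compiler sees it. This is not GHC's actual implementation; the function name unlitBird is made up for illustration, and it assumes only the “>” style shown above.

```haskell
-- Hypothetical helper for illustration; not GHC's real "unlit" pass.
-- Lines starting with '>' are code: the marker is swapped for a space.
-- Every other line is prose and is blanked out, which keeps line
-- numbers in compiler messages aligned with the original file.
unlitBird :: String -> String
unlitBird = unlines . map convert . lines
  where
    convert ('>' : rest) = ' ' : rest
    convert _            = ""
```

Note that the “>” has to be the very first character of the line for this to work, which is exactly why an editor's indent commands cause so much trouble.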

I’ve compiled this list of gripes after working with this style for just a few minutes:

It encourages extremely verbose comments. This seems at odds with the programmers’ mantra that “good code is self-documenting” and therefore, ideally, code doesn’t need comments at all (except to briefly explain what each function does, and explain any tricky bits).
Text editors don’t highlight the comments (as I demonstrate above). At least vim doesn’t. As far as I can tell, this is by design — the idea is that the program is actually an essay with code snippets, and thus you wouldn’t want the majority of the characters to be blue/green.
You have to write a > on every line of code. I find this much more annoying than having to write a “--” on comments, possibly because when I’m coding, I need to concentrate more than when I’m just writing English.
Working with code is a nightmare. The main reason is that code is indented. So if you forget to add a > on an indented line, you have to go to the left and add it, then you’ve offset your line by 1 character and have to fix it up. If you join a line (Shift+J in vim), it won’t nicely delete the indentation from the subsequent line, because there’ll be a “>” in the way followed by a lot of spaces. You spend a good percentage of your time messing about with “>” characters.
Can’t use line or block indentation commands (such as Shift+> in vim to indent the current line or block), as they’ll indent the > (which must be in column 0). Shift+< doesn’t work either, as vim considers all lines to be unindented. You have to put the cursor at the real start of each line and hit tab.
Shift+^ no longer takes you to the first character of the code on the line; it takes you to the “>” in column 0.
Forgetting a “--” usually means a syntax error. Forgetting a “>” means a missing line of code, which may or may not generate a compiler error (could be a missing pattern, for example, which is a runtime error in Haskell).
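As a rough illustration of that last gripe, here is a sketch of what the compiler would effectively see if the recursive clause of fact lost its “>” and was treated as prose. This assumes the toolchain accepts the file and silently drops the unmarked line; the name factMissingClause is made up for the example.

```haskell
-- Hypothetical result of losing the ">" on the recursive clause:
-- only the base case survives as code, so the function is partial.
factMissingClause :: Integer -> Integer
factMissingClause 0 = 1
-- factMissingClause n = n * factMissingClause (n - 1)   (lost to the prose)

-- The complete definition, for comparison.
fact :: Integer -> Integer
fact 0 = 1
fact n = n * fact (n - 1)
```

Here factMissingClause 0 still works, but any other argument fails at runtime with a non-exhaustive patterns error. Compiling with -Wall would at least produce an incomplete-patterns warning.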

This could be a useful style for writing an actual essay with code snippets, but not for an actual program.

19 Responses to “Literate programming is a terrible idea”

Alok June 5, 2010 at 3:33 am

Not the whole idea, just the specific implementation and guidelines you refer to seem to be bad.

Check out http://www.toolness.com/wp/?p=441
- Matt Giuca October 5, 2010 at 9:12 pm
  
  Cool. But from the look of it, that isn’t literate programming at all. That’s just programming with verbose docstrings which are marked up very nicely. I am definitely a fan of such things.
  
  Literate is when the code itself is the other way around, and every line is a comment unless explicitly marked up as code.
Peter Schachte October 5, 2010 at 4:57 pm

I think your title should have been “Vim is a terrible editor”.

> Text editors don’t highlight the comments
Emacs does.

> You have to write a > on every line of code
Emacs will insert them for you.

> Working with code is a nightmare. The main reason is that code is indented. So if you
> forget to add a > on an indented line, you have to go to the left and add it
Emacs takes care of this.

With a proper editor, Literate Haskell is not hard to edit. But it’s not clear to me that Literate Haskell handles the 3 different kinds of program documentation very well:
1) user documentation for the end user of a system
2) interface documentation for the client of a module
3) code documentation for the developer/maintainer of the code
and I see no reason that inverting the comment/code dichotomy by making comment the default and making you specify code would make handling all of those disparate needs any easier. It’s really designed for producing
4) an impressive-looking description of your implementation
at which, with the help of LaTeX, it excels. But item 3 is handled just fine by comments in the code (they’re not prettily formatted, but they’re always there when you’re editing the code, where you want them). Haddock handles the second (barely) adequately. And I don’t think Literate Haskell is any better at the first than illiterate Haskell.
- Matt Giuca October 5, 2010 at 9:19 pm
  
  Well, “Vim wasn’t explicitly designed to handle literate programming” or “Emacs was”. Either way, the fact that Emacs knows how to deal with this is good for Emacs+Literate Haskell users, but it means that either all text editors need to be retrofitted with special modes for editing this rather unnatural syntax, or you can’t use them with it. By the same logic, I could propose a language where every line had to start with the token “#*%@^!”, and write a Vim script to automatically insert it, and that would be OK.
  
  On the second point, note that LH does let you separate at least (1+2) from (3), because you can use regular “>--” style comments for code documentation (which are ignored by LaTeX).
  
  I think that Haddock, Doxygen, and the example Alok posted above provide an equally impressive-looking description of the implementation. (I’ve seen whole architecture HTML documents written inside a comment block in C++, for the purpose of being displayed nicely inside Doxygen output, and that seems good to me.)
  - Peter Schachte October 6, 2010 at 12:27 pm
    
    > “Vim wasn’t explicitly designed to handle literate programming” or “Emacs was”.
    
    Neither editor was designed with literate programming in mind. A better conclusion is that either “the design of vim isn’t extensible enough” or “someone should develop a good literate haskell mode for vim”.
    
    I don’t think it’s fair to say that a language where you indicate which parts are code is any less “natural” than one where you indicate which parts are comments. It’s just that the latter is more traditional. They really are quite symmetrical.
    
    I think a better argument against Literate Haskell is that you don’t really get much benefit from this contravention of tradition, so why bother? Another argument is that the intent of literate programming is to produce beautiful documents showing your source code, but most reading of source code is done in a text editor, so that’s where it’s most important that it be readable, and prose laced with LaTeX markup is not terribly readable. Something like reStructuredText is much better for that, and still produces beautiful documents when you want.
- gzmask August 14, 2012 at 4:19 am
  
  Sounds like vi. Vim does all that. Most people complaining about Vim do so because they are in Vim’s vi-compatible mode, which is the default.
Matt Giuca October 6, 2010 at 12:34 pm

>I don’t think it’s fair to say that a language where you indicate which parts are code is
>any less “natural” than one where you indicate which parts are comments. It’s just that
>the latter is more traditional. They really are quite symmetrical.

Well no, that was really the point of the original blog post. They aren’t symmetrical, because unlike the prose of the comments, code has to be indented. The indentation has to come after the “>” in literate programming, which means a) if you join lines, the indentation space can’t easily be eliminated, b) you can’t “indent” a line of code because then the “>” will get indented and you need to fix it up, and c) you can’t “dedent” a line of code because the “>” is already at column 0 (so the editor thinks the code isn’t indented at all). Plus the other issues I raised in the post.

Again, an editor could be programmed around all of these issues, but my point is that writing code with an explicit delimiter on each line is far less “natural” (i.e., requiring far more special rules in the editor) than writing comments with an explicit delimiter.

And yes, I agree with your other arguments too.
- svat March 23, 2011 at 12:19 am
  
  All your arguments seem to be: literate programming requires support from tools such as the text editor. Surely this is obvious to anyone? Yes, attempting literate programming without the right tools is painful, but this is true for anything: imagine trying to run Haskell programs without having a compiler. :p
  
  The trouble here is “literate programming is different from what I’m used to, and is not supported by the tools I’m used to”. This is a problem with anything sufficiently “different”, but it doesn’t make it “a terrible idea”.
  - Matt Giuca March 23, 2011 at 3:19 pm
    
    Well no, many of my arguments are about text editor support, but I believe the key argument (my first point) was “good code is self-documenting” — literate programming encourages extremely verbose comments which get in the way of code.
    
    My point about text editors is that I shouldn’t require a special text editor for a given language — editors are generic tools (of course, it’s nice to have special syntax highlighting support, but I can live without it). It is a bit silly to compare that to not having a compiler. Of course I need a special compiler for each language, but I shouldn’t need a special text editor.
    
    If I wrote a text editor that was good for all existing languages, and then you wrote a language which is inconvenient to edit with my editor, is the editor terrible at supporting that language, or does the language have terrible syntax? Sure, you *could* write a special text editor to make editing the language easier, but that doesn’t make the syntax any better.
Peter Schachte October 7, 2010 at 10:26 am

> They aren’t symmetrical, because unlike with prose of the comments,
> code has to be indented.

Prose needs to be re-formatted as you edit, flowing words forward and backward across line boundaries. Sometimes you want to include code snippets in documentation, and they need to be formatted properly. You may want to include centered headings, enumerations, bullet lists, descriptions and even tables that all need proper layout, and all become more difficult when put in end-of-line comments, in much the same way as code becomes more difficult when put in end-of-line anti-comments. It’s not a perfect analogy, since in Haskell code layout is less forgiving than documentation layout, but then, documentation layout exhibits a lot more variety. But either way, the key is having an editor that supports editing both code and documentation.
Toby Davies February 19, 2011 at 6:11 pm

I’m with Peter on this one. Vi(m) just doesn’t cut the mustard sometimes. You see quite a lot of more esoteric languages and paradigms trend towards Emacs as a text editor because it’s just so much more powerful when you need to customize behaviour yourself. Haskell, Scheme, CL, Clojure (and I’d assume Prolog) all disproportionately favour Emacs.

Though I do think you kind of have a point, Matt, in so far as code is much more finicky with regards to syntax: I can comprehend a paragraph with explicit --s at the start with zero extra effort, whereas a compiler can’t. Though what is really being discussed is text editor support, a text editor _does_ have to add extra rules for block comments or blocks of code either way, so it really boils down to vim having a terrible literate haskell mode…
AG December 27, 2012 at 8:57 pm

LP tools are bad and the syntax is terrible. LP has only 2 goals: code that is easy to support and maintain, and readable code that avoids bugs.

But you need to use weird syntax constructs and work in this weird textual mix, and this makes the programming process worse and more difficult (remember goal 2?).

LP is good. The tools are not good.

And lastly: placeholders alone are not enough to make coding cleaner. So LP is only a first step in the direction of clean programming. This technique should grow more and more…
Py3000 February 17, 2013 at 5:51 pm

Self-documenting code is a myth. A program is not only code: code is the extract of ideas, even of a view of the problem. A program is a solution to a problem: it’s a project, ideas, architecture, design principles, a general concept and additional reference info (algorithms, tricks, much else). LP aims to capture this, to be a tool for a COMPLEX VIEW of the “program as solution”. It’s not code plus many, many comments. It’s a notebook of the project’s design. On the other hand, many firms use an intranet wiki for their projects, libraries and so on, and each developer at such a firm has to add records to the wiki after finishing work. And the wiki and the code live different lives. My LP tool aims to solve many problems, this one included. It’s ideal for collaborative use: LP code serves as a) human-readable online documentation, b) online LP libraries, and c) the source (repository) of the real native program code… and you can even print such code and read it on paper, like a regular book about programming 🙂 See http://code.google.com/p/nano-lp/
LP is even usable for teaching children to program 🙂
Alberte Romero June 1, 2013 at 11:49 am

I tend to disagree with you. I’d agree, but literate programming, at least the way I think about it, is not about making the code a comment and the comment the code. That’s silly, and in that context I would agree with you.

But as I see it, literate programming goes further than a simple comment. It’s like trying to write an article about the code, containing the code. Tons of plain text explaining the algorithms, decisions, etc., and then the code, but *as* an exemplification of the text. But also, it’s compilable.
In that context, for each one line of code you have 10 or 20 of plain text.

Try to write a document about a new data structure with examples. It’s better to read the text as the document body and the code as examples. Then you can compile the article. That’s literate programming!

PS. Both Vim and Emacs, if well configured, can handle that.
- Matt Giuca June 19, 2013 at 11:45 pm
  
  Hey. I think you are right about this in the context of writing a document. If you’re writing an article that is, maybe, a tutorial that works through the construction of a particular program, or maybe a book that has a collection of example functions, then it works out pretty well, I think. What I was speaking out against, at the time I wrote this article, was the practice of writing actual software in this style (and I saw people doing it). I don’t think that’s very maintainable.
» Writing Literate Haskell with Code Block Notation August 1, 2013 at 12:49 pm

[…] especially for larger blocks of code. To read a few reasons against standard Literate Haskell, see this post by Matt Giuca. While I quite enjoy Literate Programming, many of the arguments he puts forth […]
Mastr Mastic September 12, 2014 at 2:02 am

Forgive me if this is incorrect, but, instead of opening every line with ‘>’ (which truly is repetitively awful), can’t you just make a code section with \begin{code} & \end{code}?
- msanatan March 16, 2015 at 9:48 am
  
  Same thing I was thinking; even for small code bits I use the TeX style over the bird style.
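For comparison, here is a minimal sketch of how the \begin{code} and \end{code} style mentioned in this thread could be reduced to plain Haskell. It is an illustration under simplifying assumptions, not how GHC actually does it: the name unlitLatex is hypothetical, and the delimiters are assumed to sit alone on their own lines.

```haskell
-- Hypothetical helper for illustration; not GHC's real "unlit" pass.
-- Lines between \begin{code} and \end{code} are kept untouched;
-- the delimiters and all surrounding prose are blanked out so that
-- line numbers stay aligned with the original file.
unlitLatex :: String -> String
unlitLatex = unlines . go False . lines
  where
    go _ [] = []
    go inCode (l : ls)
      | l == "\\begin{code}" = "" : go True ls
      | l == "\\end{code}"   = "" : go False ls
      | inCode               = l : go True ls
      | otherwise            = "" : go inCode ls
```

Because code lines inside the block carry no per-line marker, they start in column 0, so an editor's ordinary indent, dedent and join commands behave as they would in a regular source file.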
Konstantinos September 30, 2017 at 12:42 am

>code doesn’t need comments at all except to briefly explain what each function does

As you said, good code doesn’t need an explanation of what it does or how it does it. So what’s the point of comments? That is where literate programming comes in. Comments are supposed to explain **why** your code is doing something. And usually this part cannot be given briefly. That’s why literate programming inverts code and comments, giving comments the spotlight. Nevertheless, it is more convenient to just apply those concepts to current programming practices.