Python 2: My new URI/Unicode crusade

In Python on March 14, 2010 by Matt Giuca

You may recall that in 2008 I filed a bug on Unicode URIs in Python 3, had a massive argument with the Python community, and eventually got a patch (a complete rewrite of urllib.parse.quote and unquote) accepted into Python 3.
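
For anyone who hasn’t seen what this looks like in practice, here is a minimal sketch of Python 3’s quote and unquote handling non-ASCII text. The sample path and variable names are my own illustration, not taken from the patch, and it only shows the default UTF-8 behaviour plus the optional encoding argument.

```python
# Python 3 only: quote/unquote live in urllib.parse.
from urllib.parse import quote, unquote

# Hypothetical example path containing a non-ASCII character.
path = "/café"

# quote() encodes the string as UTF-8 by default, then percent-encodes
# the resulting bytes; "/" is left alone because it is in the default safe set.
encoded = quote(path)

# unquote() reverses the process, decoding the bytes as UTF-8 by default.
decoded = unquote(encoded)

# A different encoding can be requested explicitly when the URI was
# produced by a system that did not use UTF-8.
latin1_decoded = unquote("%E9", encoding="latin-1")

print(encoded)
print(decoded)
print(latin1_decoded)
```

The round trip only works when both sides agree on the encoding, which is why the functions take an explicit encoding argument instead of guessing.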

Well, two years later, I finally had the stamina to check out the situation with Unicode URIs in Python 2. It’s just as bad as it was in Python 3, if not worse. So I’m doing it all over again!

I’ve just submitted three patches (1, 2, 3) on four separate bugs relating to urllib.quote and urllib.unquote, all of which I already fixed in Python 3. Hopefully this time the existing Python 3 precedent will mean less arguing. Splitting the work into three separate patches also means each one can be accepted or rejected individually, unlike last time, when I had to maintain one giant patch fixing a dozen bugs for two months.