Bidirectional text flips glyphs?

In Unicode on July 19, 2010 by Matt Giuca

I just noticed something strange while playing around with bidirectional Unicode text. If you haven’t seen bidirectional (“bidi”) text, it’s weird. This is the part of Unicode that deals with right-to-left scripts, like Hebrew, and their insanely complicated interactions with left-to-right text.

Consider the five characters stored in sequence:

“ר” then “ ” then “∈” then “ ” then “מ”

I separated them with English words so your browser’s bidi doesn’t touch them yet. These five characters might appear in a Hebrew maths paper (?), to mean “ר is a member of the set מ”, much like “x ∈ s” means “x is a member of the set s”.
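The stored order and the bidi class of each character can be inspected without involving a renderer at all. The following is a minimal sketch using Python's standard `unicodedata` module; the names `TEXT` and `describe` are made up for this illustration, and the bidi classes reported depend on the Unicode version bundled with your Python.

```python
import unicodedata

# The five characters from the post, written as escapes so that no bidi
# reordering happens in the source code itself.
TEXT = "\u05E8 \u2208 \u05DE"


def describe(text):
    """Return (code point, name, bidi class) for each character, in stored order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]


for row in describe(TEXT):
    print(*row)
```

The Hebrew letters should report the strong right-to-left class `R`, while the spaces (`WS`) and “∈” (`ON`) are neutral classes with no direction of their own.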

Let’s render this out:

ר ∈ מ

If your browser is doing what mine is (Firefox 3), you’ll see that, as Hebrew dictates, the five characters are written from right to left. (The spaces and “∈” are neutrals, so they take the directionality of the surrounding text and count as right-to-left characters in this instance.) What amazes me is that on my display, the “∈” (U+2208) has actually been rendered flipped horizontally, so that it correctly reads “ר is a member of the set מ” even though “ר” is on the right of the operator. It looks like a “∋” (U+220B). I’m not sure if it’s specifically rendering U+220B instead of U+2208, or if it’s actually flipping the U+2208 glyph horizontally. I can’t find any mention of horizontal flipping in the bidi spec.

Can anyone explain what’s going on?

(Note: I doubt Hebrew-speaking mathematicians use Hebrew characters as variable names; it was just an example.)

2 Responses to “Bidirectional text flips glyphs?”

  1. Well, a year and a half later, I was reading through the Unicode spec and I came across a feature called “bidirectional mirroring”. A specific set of characters is marked as mirrored, many of them with a designated “mirror” character, and when one of them appears in right-to-left text the rendering system is supposed to display it mirrored. The pairs are defined in the file BidiMirroring.txt in the Unicode Database:
    http://www.unicode.org/Public/6.0.0/ucd/BidiMirroring.txt

    What’s funny is that above, I wrote “I’m not sure if it’s specifically rendering U+220B instead of U+2208, or if it’s actually flipping the U+2208 glyph horizontally,” and the answer is that the spec can call for either. For a character like U+2208 (∈), the BidiMirroring file lists U+220B (∋) as its mirror, so the renderer can simply display that glyph instead. But some characters, like U+2211 (∑), don’t have a corresponding mirror character, and the implementation is required to horizontally mirror the glyph itself!

    Let’s try it out:
    ף ∑ מ
    Nope, didn’t work in my browser (Firefox). I wonder if other implementations support it?
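The two cases described in the comment above can be told apart programmatically. This is only a sketch: Python's `unicodedata.mirrored` exposes the Bidi_Mirrored property, but the standard library does not expose the pairs from BidiMirroring.txt, so `MIRROR_PAIRS` and `NO_PAIR` below are tiny hand-copied excerpts (an assumption of this example, not a complete table), and `classify` is a hypothetical helper name.

```python
import unicodedata

# Hand-copied excerpt of BidiMirroring.txt (NOT the full table).
MIRROR_PAIRS = {"\u2208": "\u220B", "\u220B": "\u2208"}  # ∈ <-> ∋
# Mirrored characters that the file lists as having no mirror character.
NO_PAIR = {"\u2211"}  # ∑


def classify(ch):
    """Say how a renderer would have to mirror ch in right-to-left text.

    Returns one of:
      ("none", None)     ch is not a mirrored character
      ("pair", partner)  ch can be shown using its partner's glyph
      ("flip", None)     no partner exists; the glyph itself must be flipped
      ("unknown", None)  mirrored, but missing from the excerpts above
    """
    if not unicodedata.mirrored(ch):
        return ("none", None)
    if ch in MIRROR_PAIRS:
        return ("pair", MIRROR_PAIRS[ch])
    if ch in NO_PAIR:
        return ("flip", None)
    return ("unknown", None)


for ch in "\u2208\u2211\u05DE":
    print(f"U+{ord(ch):04X}", classify(ch))
```

A real implementation would load the complete BidiMirroring.txt rather than relying on an excerpt; the "unknown" result exists only so that this sketch does not misreport characters it has no data for.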
