Taming em dashes on the Kindle
I like em dashes, and I use them in my own writing (and keep them when I’m formatting older books, to preserve their style), but they need some taming for the Kindle, as discussions on MobileRead will show you.
If you attempt to insert them between two words without any spaces, the Kindle will stubbornly keep both words together when breaking lines. A simple way of getting round this would be to use a spaced en dash instead, as Jaye Manus suggests, but can we do any better?
The Unicode-compliant method of hinting that a line break can occur without a visible space is to put U+200B ZERO WIDTH SPACE either side of the dash, but the Kindle doesn’t recognise this character. However, it does recognise U+200C ZERO WIDTH NON-JOINER, and appears to treat that in exactly the way that ZWSP should be treated.
So, I would mark up the first em dash in Jaye’s example sentence: “I think he’s the best—and I use that loosely—so will let him live.” as
...the best&#x200C;—&#x200C;and I use...
in the XHTML that I submit to Kindlegen.
With the U+200C ZWNJ included, the Kindle allows line breaking around the dash, and will even insert some space around the dash if necessary to justify the line. Dictionary lookups will also work exactly as you’d expect.
If you were sure that you always wanted a possible break point on both sides of your em dashes, you could just run a search and replace after marking up your document, but you may want to avoid breaks before a trailing dash at the end of a paragraph, for example.
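A search and replace along those lines might look like the Python sketch below. It is only an illustration under stated assumptions: the function name add_break_hints is hypothetical, it expects the plain text of a single paragraph rather than a whole XHTML file, and it treats a dash as "trailing" when nothing but closing quotes or whitespace follows it. It inserts ZWSP (the character I keep in the master file) and leaves a trailing dash without any break hint.

```python
import re

EM_DASH = "\u2014"
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# Characters that may follow a trailing dash at the end of a paragraph
# (assumption: whitespace plus straight and curly closing quotes).
CLOSERS = " \t\r\n\"'\u2019\u201d"

# An em dash, together with any break hints already placed around it.
_DASH = re.compile(f"{ZWSP}?{EM_DASH}{ZWSP}?")


def add_break_hints(paragraph: str) -> str:
    """Put a ZWSP on both sides of each em dash, except a trailing one."""

    def hint(match: re.Match) -> str:
        rest = paragraph[match.end():]
        if rest.strip(CLOSERS) == "":
            # Trailing dash: leave it bare so no line break can precede it.
            return EM_DASH
        return ZWSP + EM_DASH + ZWSP

    return _DASH.sub(hint, paragraph)
```

Because the pattern also swallows any hints that are already present, running the function twice over the same paragraph should not double them up.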
The only further wrinkle here (for me) is that ZWNJ is the wrong character if you’re trying to maintain a ‘clean’ master XHTML file for EPUB conversion or as a web page. So I actually mark up the master document with ZWSP, and make the ZWSP-to-ZWNJ conversion one of the set of conversions that I run on a copy before handing it to Kindlegen.
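As a rough sketch of that conversion step in Python: the names zwsp_to_zwnj and make_kindle_copy and the file layout are hypothetical, and I am assuming the master file is UTF-8 and may hold the ZWSP either as a literal character or as a numeric character reference. The master file is left untouched; only the copy destined for Kindlegen gets ZWNJ.

```python
import re
from pathlib import Path

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE, used in the clean master file
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER, used in the Kindle copy

# Numeric character references for U+200B, hex or decimal form.
_ZWSP_REF = re.compile(r"&#(?:x0*200[bB]|0*8203);")


def zwsp_to_zwnj(xhtml: str) -> str:
    """Swap every ZWSP (literal or character reference) for ZWNJ."""
    xhtml = xhtml.replace(ZWSP, ZWNJ)
    return _ZWSP_REF.sub("&#x200C;", xhtml)


def make_kindle_copy(master: Path, kindle: Path) -> None:
    """Write a Kindle-specific copy of the master XHTML file."""
    text = master.read_text(encoding="utf-8")
    kindle.write_text(zwsp_to_zwnj(text), encoding="utf-8")
```

Calling make_kindle_copy(Path("book.xhtml"), Path("book-kindle.xhtml")) with your own file names would then produce the file to pass to Kindlegen.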