slide 30

How the character analysis works

Grab every chunk of text (text node)
Remove all the characters we know to be okay
Examine what's left:
- For each such character, report it
- Look it up in the table of Unicode codepoints to get its name
- Report an XPath to the text node while we're at it
Stylesheet is XSLT 2.0 (for its Unicode functions)
Extra input is a table of Unicode codepoints with their names
(see examples/Unicode-codepoint-lookup.xml)
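
A rough sketch of the same steps, in Python rather than XSLT, just to make the logic concrete. This is not the stylesheet itself. Assumptions: "okay" is taken to mean tab, newline, carriage return, and printable ASCII; Python's built-in unicodedata module stands in for the codepoint lookup table; the names OKAY, text_nodes, and report are made up for this illustration; and the input has no namespaces, comments, or processing instructions.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Assumption: the "known to be okay" characters are tab, newline,
# carriage return, and printable ASCII. The real list may differ.
OKAY = set("\t\n\r") | {chr(cp) for cp in range(0x20, 0x7F)}


def text_nodes(elem, path):
    """Yield (xpath, text) for every text node under elem, in document order.

    Sketch only: assumes no namespaces, comments, or processing instructions.
    """
    n = 0  # position of the current text node among elem's text children
    if elem.text:
        n += 1
        yield f"{path}/text()[{n}]", elem.text
    seen = {}  # per-tag counters, for positional steps such as p[2]
    for child in elem:
        seen[child.tag] = seen.get(child.tag, 0) + 1
        yield from text_nodes(child, f"{path}/{child.tag}[{seen[child.tag]}]")
        if child.tail:
            # Text following a child element is a text node of the parent.
            n += 1
            yield f"{path}/text()[{n}]", child.tail


def report(xml_string):
    """Return (codepoint, name, xpath) for each character left after filtering."""
    root = ET.fromstring(xml_string)
    findings = []
    for xpath, text in text_nodes(root, f"/{root.tag}[1]"):
        # Remove the characters we know to be okay; examine what's left.
        for ch in text:
            if ch not in OKAY:
                # unicodedata stands in for the codepoint lookup table here.
                name = unicodedata.name(ch, "(no name found)")
                findings.append((f"U+{ord(ch):04X}", name, xpath))
    return findings
```

Calling report() on a small document returns one entry per leftover character, each with its codepoint, its Unicode name, and an XPath to the text node that contains it.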