Wednesday·24·November·2010
Useful but Unknown Unix Tools: Convert UTF-8 text files to PostScript with paps //at 02:23 //by abe
Sometimes you get a UTF-8 encoded text file you want to print. But most text-to-PostScript converters (often invoked automatically by your print server) can only render ISO-Latin-1 text files properly, so you get the notorious “Ã¤” et al. on your printout.
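The garbling is easy to reproduce. Here is a minimal Python sketch, assuming the converter simply treats every byte of the UTF-8 file as one ISO-Latin-1 character (the variable names are only illustrative):

```python
# UTF-8 encodes "ä" as two bytes; read as ISO-Latin-1, each byte
# turns into a character of its own.
text = "ä"
utf8_bytes = text.encode("utf-8")
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes.hex())  # c3a4
print(misread)           # Ã¤
```

The same happens to every non-ASCII character, which is why names with umlauts or accents suffer most.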
This is especially annoying at PGP/GnuPG keysigning parties where nowadays most people have the names in their UIDs encoded in UTF-8.
Fortunately there is paps (Debian package), a Pango-based command-line tool to convert UTF-8 encoded text files into PostScript.
See the paps home page for a neat example.
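If you want to call paps from a script, something like the following might do. It is only a sketch: the helper name utf8_text_to_postscript is made up for this example, and it assumes a paps binary on the PATH that writes PostScript to standard output when given a text file as its only argument. Check the manual page of your paps version before relying on it.

```python
import shutil
import subprocess


def utf8_text_to_postscript(text_path, ps_path):
    """Convert a UTF-8 text file to PostScript by running paps.

    Hypothetical helper: assumes paps writes PostScript to stdout
    when called with a single text file argument.
    Returns False if no paps binary is found on the PATH.
    """
    paps = shutil.which("paps")
    if paps is None:
        return False
    # Redirect paps' standard output into the target file.
    with open(ps_path, "wb") as out:
        subprocess.run([paps, text_path], stdout=out, check=True)
    return True
```

Calling utf8_text_to_postscript("keys.txt", "keys.ps") would then leave a printable keys.ps next to the (made-up) input file keys.txt.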
Update 02:10 (CET): Funnily, the “Ã¤” above, which I explicitly wrote as
the HTML entities for “Ã” and “¤”, got rendered as “ä” in Liferea, but
only in the Planet Debian and Planet Symlink feeds, and only until I put
a literal “ä” in this paragraph. Both entities got converted to their
8-bit ISO-Latin-1 equivalent bytes, so that without the literal “ä”,
the “Ã¤”, converted to 8-bit ISO-Latin-1 characters,
also looks like a UTF-8 “ä”. And Liferea seems to guess the
character set somehow: if the content validates as UTF-8, it
uses UTF-8 even if it isn’t UTF-8. This is a strange
Planet.
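The effect can be mimicked in a few lines of Python. This is not Liferea's actual code, just a sketch of the guessing behaviour suspected above; guess_decode is a made-up name for this example.

```python
def guess_decode(raw):
    """Hypothetical guesser: use UTF-8 if the bytes validate, else ISO-Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


# "Ã¤" as ISO-Latin-1 bytes (0xC3 0xA4) is also a valid UTF-8 sequence for "ä".
only_entities = "Ã¤".encode("latin-1")
print(guess_decode(only_entities))  # ä

# A literal ISO-Latin-1 "ä" (byte 0xE4) makes the data invalid as UTF-8,
# so the guess falls back to ISO-Latin-1 and the "Ã¤" survives.
with_literal = "Ã¤ ä".encode("latin-1")
print(guess_decode(with_literal))  # Ã¤ ä
```

That would explain why adding one real “ä” to the post changed how the earlier “Ã¤” was displayed.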
Tagged as: CLI, Converter, GnuPG, HTML Entities, KSP, Liferea, nuggets, Pango, paps, PGP, Planet Debian, Planet Symlink, PostScript, UTF-8