There is no magic here - it's all documented in ISO 32000-1:2008 (see 7.3.5, Name Objects). First you decode the string according to the rules for Name objects, i.e. replace each #xx escape with the byte it stands for, then treat the result as UTF-8.
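The two steps above can be sketched roughly as follows. This is only an illustration in Python, not poppler code: the function names and the LEGACY_CODECS table are made up for this example, and the fallback from character collection to legacy codec simply follows the guess in the quoted message below; it is not confirmed Adobe behaviour.

```python
import re

# Hypothetical table for this sketch: character collection -> Python codec.
# The pairing follows the guess in the quoted message, not any specification.
LEGACY_CODECS = {
    "Adobe-CNS1": "big5",
    "Adobe-GB1": "gbk",
    "Adobe-Japan1": "cp932",
    "Adobe-Japan2": "cp932",
    "Adobe-Korea1": "euc_kr",
}

_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_name(raw: bytes) -> bytes:
    """Drop the leading slash and replace each #xx escape with its byte."""
    if raw.startswith(b"/"):
        raw = raw[1:]
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def base_font_to_text(raw: bytes, collection=None) -> str:
    """Decode a BaseFont name: UTF-8 first, then an assumed legacy codec."""
    data = decode_name(raw)
    try:
        # What the spec describes: the decoded bytes are UTF-8.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Heuristic fallback (an assumption) for names written in a legacy charset.
    codec = LEGACY_CODECS.get(collection)
    if codec is not None:
        try:
            return data.decode(codec)
        except UnicodeDecodeError:
            pass
    # Last resort: keep whatever is decodable and mark the rest.
    return data.decode("utf-8", errors="replace")
```

With the name from the original report, base_font_to_text(b"/#CB#CE#CC#E5", "Adobe-GB1") fails the UTF-8 step (0xCE is not a valid continuation byte) and falls through to GBK, which yields 宋体. A name that really is UTF-8 never reaches the fallback.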
Leonard

On 8/29/11 1:51 PM, "suzuki toshiya" <[email protected]> wrote:

>Hi,
>
>I appreciate your interest & effort about non-Unicode font names!
>
>Albert Astals Cid wrote:
>> Today I've been working on trying to fix the names reported by pdffonts for
>> non latin1 fonts, I have not got anything very clear while reading the spec,
>> but I understood that the BaseFont string is encoded using the /Encoding
>> encoding. This has worked fine for some files but not for all like one that
>> says
>>   /BaseFont /#CB#CE#CC#E5
>>   /Encoding /UniGB-UCS2-H
>> If i try to map that to Unicode i get nothing. And Adobe Reader properly maps
>> that to 宋体
>
>Although I've not tested comprehensively yet, I guess
>Adobe implementation has some heuristic workaround for
>the font names coded by legacy localization mechanism.
>
>0xCB 0xCE 0xCC 0xE5 is GB-2312 encoding of 宋体.
>
># you can check as:
># perl -le '{printf("%c%c%c%c\n", 0xCB, 0xCE, 0xCC, 0xE5);}' | iconv -f gbk -t utf-8
>
>I guess, Adobe implementation processes as following:
>
>1) check font name if it is in hexadecimal syntax "/#xx#xx#xx..."
>2) if its encoding is one of the predefined CJK CMaps,
>   try to decode the font name by
>     Adobe-CNS1 -> Big5
>     Adobe-GB1 -> GB-2312 (or GBK)
>     Adobe-Japan1 or Adobe-Japan2 -> Shift_JIS (or Windows-31J)
>     Adobe-Korea1 -> Wansung
>
>Fortunately, core part of these legacy localizations are
>almost same in MS Windows and Mac OS, the coverage of possible
>legacy encoding is not so wide.
>
>> Any idea what is the proper manipulation one has to do over BaseFont to get
>> the Unicode value?
>
>I think if we can request iconv for the users who are interested
>in non-Unicode or non-ASCII font name, the conversion is not so
>difficult.
>
>One of my concern is that I don't know about the handling of
>non-CJK (or CJK-but-not-predefined) localized font names, like,
>Adobe-Vietnam1, etc.
>
>This is urgent issue? If not, I will try to write some workaround
>for this issue.
>
>Regards,
>mpsuzuki
>_______________________________________________
>poppler mailing list
>[email protected]
>http://lists.freedesktop.org/mailman/listinfo/poppler
