Dear Jeroen,

Please let me prepare some data for a regression test. The data I have
tested so far are mainly ASCII or UTF-16BE. I should also check
PDFDocEncoding cases (if anybody already has something appropriate,
please let me know).

Regards,
mpsuzuki

Jeroen Ooms wrote:
> FYI the encoding problems still exist in the master branch today. I am
> very interested in this patch by mpsuzuki; what can we do to move this
> forward?
>
> On Wed, Mar 28, 2018 at 2:26 PM, suzuki toshiya
> <[email protected]> wrote:
>> Dear Adam,
>>
>> Adam Reichold wrote:
>>>> I see. Where is the appropriate place to add documentation for
>>>> the poppler::ustring class itself?
>>> Personally, I would suggest Doxygen comments in the public header.
>> Thanks! I am now trying to write them. I also found that the Doxygen
>> comments for text_list need improvement.
>>
>> While checking the existing functions (in order to document them),
>> I found a few inconsistencies in how the BOM is handled.
>>
>> * ustring::to_latin1()
>>   This function does not use iconv(); it just casts the types
>>   between unsigned short and char. A BOM cannot be converted to
>>   Latin-1, but the existence of a BOM is not checked. If the stored
>>   UTF-16 has a BOM, a broken 8-bit character is inserted at the
>>   beginning of the result.
>>
>> * ustring::from_latin1()
>>   This function does not use iconv() either. No BOM is inserted at
>>   the beginning; a BOM-less UTF-16 string is created.
>>
>> * ustring::to_utf8()
>>   Whether the result has a BOM is decided by iconv().
>>
>> * ustring::from_utf8()
>>   This assumes that iconv() returns UTF-16 with a BOM.
>>
>> I will collect the Debian software packages that depend on
>> libpoppler-cpp and check how they use ustring objects. From a rough
>> check there are fewer than 10, so checking all of them should not
>> be too time-consuming. If any software always skips the first
>> character of the UTF-16 (on the assumption that a ustring is always
>> UTF-16 with a BOM), some discussion is needed.
>>
>> Regards,
>> mpsuzuki
>>
>> _______________________________________________
>> poppler mailing list
>> [email protected]
>> https://lists.freedesktop.org/mailman/listinfo/poppler
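
The to_latin1() behaviour described in the quoted message can be
illustrated with a small, self-contained sketch. This is not poppler
code: utf16_units is a hypothetical stand-in for poppler::ustring, and
naive_to_latin1() and bom_aware_to_latin1() are illustrative names only.
The first function narrows each 16-bit unit to one byte without looking
for a BOM, as the message describes. The second shows one possible
adjustment (skipping a single leading U+FEFF), which is an assumption
about what a fix might look like, not the patch under discussion.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for poppler::ustring: a sequence of UTF-16 code units.
using utf16_units = std::vector<unsigned short>;

// U+FEFF, the byte order mark as a single UTF-16 code unit.
constexpr unsigned short kBom = 0xFEFF;

// Narrow every 16-bit unit to its low byte, with no BOM check.
// A leading 0xFEFF therefore becomes the stray byte 0xFF.
std::string naive_to_latin1(const utf16_units &s)
{
    std::string out;
    out.reserve(s.size());
    for (unsigned short u : s) {
        out.push_back(static_cast<char>(u & 0xFF));
    }
    return out;
}

// Same narrowing, but skip one leading BOM first (assumed fix, for
// illustration only).
std::string bom_aware_to_latin1(const utf16_units &s)
{
    const std::size_t start = (!s.empty() && s[0] == kBom) ? 1 : 0;
    std::string out;
    out.reserve(s.size() - start);
    for (std::size_t i = start; i < s.size(); ++i) {
        out.push_back(static_cast<char>(s[i] & 0xFF));
    }
    return out;
}
```

For the input {0xFEFF, 'H', 'i'}, the naive version returns three bytes
starting with 0xFF, while the BOM-aware version returns "Hi". For input
without a BOM, both return the same string.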
