On 01/20/2017 03:25 AM, Takeshi Abe wrote:
Preparing a patch for tdf#105382 [1], I came across a question about the character encoding of the path part of a URL representing a com.sun.star.frame.XStorable's location. I wonder whether the original (before percent-encoding) path of such a URL can be in an encoding other than UTF-8, or even in a different charset, due to e.g. the code page of some legacy filesystem. Is that possible? And, if so, is there any reasonable way to tell the encoding?
A conforming URL itself, by definition, consists only of characters from a subset of ASCII; any other octet has to appear percent-encoded.
For file URLs, there never was a definition of how to interpret the octets encoded in the URL's path component, so OOo/LO came up with the convention of always interpreting them as UTF-8. (So any code that converts between file URLs and native pathnames needs to map between UTF-8 and the relevant native pathname encoding, which LO assumes to be the one reported by osl_getThreadTextEncoding.)
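
To make the mapping concrete, here is a rough Python sketch of that convention. It is not LO's actual code: the function names are made up for illustration, and the native_encoding parameter merely stands in for whatever osl_getThreadTextEncoding would report on a given system.

```python
from urllib.parse import quote, unquote_to_bytes


def file_url_path_to_native(url_path: str, native_encoding: str) -> bytes:
    """Convert a percent-encoded URL path to a native pathname (hypothetical helper)."""
    # Undo the percent-encoding to get the raw octets.
    octets = unquote_to_bytes(url_path)
    # By the OOo/LO convention, those octets are UTF-8.
    # This raises UnicodeDecodeError if they are not valid UTF-8.
    text = octets.decode("utf-8")
    # Re-encode in the native pathname encoding (an assumed parameter here).
    return text.encode(native_encoding)


def native_to_file_url_path(native_path: bytes, native_encoding: str) -> str:
    """Convert a native pathname to a percent-encoded URL path (hypothetical helper)."""
    text = native_path.decode(native_encoding)
    # Percent-encode the UTF-8 octets, leaving "/" as the path separator.
    return quote(text.encode("utf-8"), safe="/")
```

With a Latin-1 native encoding, for example, file_url_path_to_native("/tmp/caf%C3%A9.odt", "latin-1") gives b"/tmp/caf\xe9.odt", and native_to_file_url_path maps that back to "/tmp/caf%C3%A9.odt". A path such as "/tmp/caf%E9.odt", whose octets are not valid UTF-8, is rejected under this convention rather than guessed at.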
