On 01/20/2017 03:25 AM, Takeshi Abe wrote:
While preparing a patch for tdf#105382 [1], I came across a question about
character encoding for the path part of a URL representing a
com.sun.star.frame.XStorable's location.
I wonder if the original (before percent-encoding) path of such a URL can
be in an encoding other than UTF-8, or even in a different charset due
to, e.g., the code page of some legacy filesystem.
Is it possible?
And, if so, is there any reasonable way to tell the encoding?

A conforming URL itself, by definition, is written using only a subset of the ASCII characters.

For file URLs, there never was a definition of how to interpret the octets encoded in the URL's path component, so OOo/LO came up with the convention of always interpreting those as UTF-8. (So any code that converts between file URLs and native pathnames needs to map between UTF-8 and the relevant native pathname encoding, which LO assumes to be the one reported by osl_getThreadTextEncoding.)
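
To illustrate that mapping, here is a minimal sketch in Python. It is not LO's actual code (which is C++ and goes through the osl layer); the function names and the explicit native_encoding parameter are made up for this example, with native_encoding standing in for whatever osl_getThreadTextEncoding would report.

```python
from urllib.parse import quote, unquote_to_bytes


def url_path_to_native(url_path: str, native_encoding: str) -> bytes:
    """Map a percent-encoded file URL path to a native pathname (hypothetical helper)."""
    # Undo percent-encoding to recover the raw octets of the path.
    octets = unquote_to_bytes(url_path)
    # Convention: those octets are always UTF-8. Anything else raises
    # UnicodeDecodeError instead of being silently misread.
    text = octets.decode("utf-8")
    # Re-encode for the filesystem; native_encoding is an assumption here.
    return text.encode(native_encoding)


def native_to_url_path(native_path: bytes, native_encoding: str) -> str:
    """Inverse mapping: native pathname octets to a percent-encoded URL path."""
    text = native_path.decode(native_encoding)
    # Percent-encode the UTF-8 octets, leaving "/" as the path separator.
    return quote(text.encode("utf-8"), safe="/")
```

For example, with a Latin-1 native encoding, the URL path "/tmp/%C3%A9.odt" maps to a native pathname containing the single octet 0xE9, and mapping that pathname back yields the original URL path. The URL itself never tells you the native encoding; only the UTF-8 side of the conversion is fixed by the convention.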
