https://bugs.documentfoundation.org/show_bug.cgi?id=149462

Mike Kaganski <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

--- Comment #4 from Mike Kaganski <[email protected]> ---
I do not quite see how this is a bug.

Any file without a BOM that contains only bytes 32-127 is *at the same time*
valid UTF-8 *and* valid ASCII. There is nothing in such a file that would
allow detecting it as UTF-8. Hence, the "current Windows codepage"
detection would indeed trigger, and the file would be opened using the 8-bit
system encoding. This detection has been correctly remembered since version 7.2
(bug 120574), and is correctly used when saving. Whether the original
detection was what the OP expected is a different story.
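
To illustrate the ambiguity, here is a minimal Python sketch. It is not
LibreOffice code: the helper name `decodes_identically` is hypothetical, and
cp1252 merely stands in for "current Windows codepage" as an assumption.

```python
def decodes_identically(data: bytes,
                        encodings=("ascii", "utf-8", "cp1252")) -> bool:
    """Return True if every listed encoding decodes `data` to the same text."""
    results = set()
    for enc in encodings:
        try:
            results.add(data.decode(enc))
        except UnicodeDecodeError:
            # At least one encoding rejects the bytes, so they are not
            # interchangeable between the encodings.
            return False
    return len(results) == 1


plain = b"Hello, world"              # only bytes in the 32-127 range, no BOM
extended = "caf\u00e9".encode("utf-8")  # contains bytes above 127
```

For `plain`, all three encodings yield the same text, so the bytes alone give
a detector nothing to choose between them. For `extended`, the encodings
disagree, which is the only situation in which content-based detection has
something to work with.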

OTOH, if you opened it using the "Text - choose encoding" filter, and specified
UTF-8 on opening, the extended characters must be kept on save.

So the possible enhancement would be to treat pure ASCII (the first 128 Unicode
code points) files as UTF-8. That is reasonable, and in line with e.g. the
resolution of tdf#148413.
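
A rough Python sketch of what such a rule could look like. This is only an
illustration under stated assumptions: `guess_encoding` and its
`system_encoding` parameter are hypothetical names, the order of the checks is
assumed, and none of this reflects the actual LibreOffice detection code.

```python
import codecs


def guess_encoding(data: bytes, system_encoding: str = "cp1252") -> str:
    """Guess a text encoding; hypothetical helper, not LibreOffice's logic."""
    # An explicit UTF-8 BOM settles the question.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Proposed enhancement: pure ASCII is treated as UTF-8 instead of
    # falling back to the 8-bit system encoding.
    if all(b < 0x80 for b in data):
        return "utf-8"
    # Assumed step: non-ASCII bytes that form valid UTF-8 are taken as UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Otherwise fall back to the system codepage (assumed cp1252 here).
        return system_encoding
```

With this rule, a file that starts out as pure ASCII would be remembered as
UTF-8, so extended characters typed later would survive saving.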
