https://bugs.kde.org/show_bug.cgi?id=506187
--- Comment #15 from Stefan Brüns ---
Git commit 9fa1aaaf4a841224161e791cb8ffd366485dc7e3 by Stefan Brüns.
Committed on 06/07/2025 at 18:16.
Pushed by bruns into branch 'master'.

[PlaintextExtractor] Fix various issues with UTF-16

Read the file in binary mode, feed the complete data into QStringDecoder with the detected encoding, and split the lines last.

Opening a file with open mode "QIODevice::Text" mangles Carriage Return sequences, and the UTF-16LE sequence "\r\0\n\0" ends up as "\0\n\0", i.e. an invalid sequence.

QIODevice::readLine() only supports 8-bit encodings (see QTBUG-121812), and the fixup attempts here were not working in general.

Unfortunately, QTextStream::setEncoding supports only UTF encodings, and none of the legacy ISO-8859 or Windows encodings, nor e.g. GB18030.

M +0 -2 autotests/indexerextractortests.cpp
M +53 -25 src/extractors/plaintextextractor.cpp

https://invent.kde.org/frameworks/kfilemetadata/-/commit/9fa1aaaf4a841224161e791cb8ffd366485dc7e3
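
The byte-level effect described in the commit message can be reproduced outside Qt. The following is a minimal Python sketch, not the code from the commit. It assumes the text-mode mangling can be modeled as dropping every 0x0D byte before decoding, which matches the "\r\0\n\0" to "\0\n\0" result quoted above. The helper names drop_cr_bytes and decode_then_split are hypothetical, and the encoding is passed in directly instead of being detected.

```python
def drop_cr_bytes(raw: bytes) -> bytes:
    """Model of the mangling (an assumption): every 0x0D byte is
    removed from the raw data before any decoding takes place."""
    return raw.replace(b"\r", b"")


def decode_then_split(raw: bytes, encoding: str) -> list[str]:
    """Order of operations named in the commit message: decode the
    complete buffer with the known encoding, split into lines last."""
    return raw.decode(encoding).splitlines()


raw = "first\r\nsecond\r\n".encode("utf-16-le")

# A UTF-16LE line ending is four bytes; dropping 0x0D leaves three,
# so every following code unit is shifted by one byte.
print(drop_cr_bytes("\r\n".encode("utf-16-le")))

# Decoding the untouched binary data first keeps the lines intact.
print(decode_then_split(raw, "utf-16-le"))
```

Splitting after decoding means the line-break logic only ever sees characters, never half of a two-byte code unit, so the same function works for 8-bit and 16-bit encodings alike.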
