Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-11 Thread Magnus Ihse Bursie
On Fri, 11 Apr 2025 03:35:11 GMT, Sergey Bylokhov wrote: >> I have checked the entire code base for incorrect encodings, but luckily >> enough these were the only remaining problems I found. >> >> BOM (byte-order mark) is a method used for distinguishing big and little >> endian UTF-16 encodi

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-11 Thread Eirik Bjørsnøs
On Fri, 11 Apr 2025 10:21:32 GMT, Magnus Ihse Bursie wrote: >> src/demo/share/java2d/J2DBench/resources/textdata/arabic.ut8.txt line 11: >> >>> 9: تخصص الشفرة الموحدة "يونِكود" رقما وحيدا لكل محرف في جميع اللغات >>> العالمية، وذلك بغض النظر عن نوع الحاسوب أو البرامج المستخدمة. وقد تـم تبني >>>

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Sergey Bylokhov
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Magnus Ihse Bursie
On Thu, 10 Apr 2025 18:30:22 GMT, Eirik Bjørsnøs wrote: >> If this is a French name, it's e acute: é. > > Supported by this Wikipedia page listing S.L as an LCMS developer: > > https://en.wikipedia.org/wiki/Little_CMS It's not a mistake in capita

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Magnus Ihse Bursie
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Magnus Ihse Bursie
On Thu, 10 Apr 2025 19:06:35 GMT, Eirik Bjørsnøs wrote: > (BTW, I enjoyed seeing separate commits for the encoding and BOM changes, > makes it easier to verify each!) Thanks! I very much like reviewing PRs that have separate logical commits, so I try to produce such PRs myself. I'm glad t

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Eirik Bjørsnøs
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Eirik Bjørsnøs
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Eirik Bjørsnøs
On Thu, 10 Apr 2025 17:23:37 GMT, Raffaello Giulietti wrote: > If this is a French name, it's e acute: é. Supported by this Wikipedia page listing S.L as an LCMS developer: https://en.wikipedia.org/wiki/Little_CMS PR Review Comment: https://git.openjdk.org/jdk/pull/24566#discus

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Naoto Sato
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Erik Joelsson
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Raffaello Giulietti
On Thu, 10 Apr 2025 17:09:27 GMT, Naoto Sato wrote: >> I have checked the entire code base for incorrect encodings, but luckily >> enough these were the only remaining problems I found. >> >> BOM (byte-order mark) is a method used for distinguishing big and little >> endian UTF-16 encodings.

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Naoto Sato
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Magnus Ihse Bursie
On Thu, 10 Apr 2025 11:46:45 GMT, Raffaello Giulietti wrote: > I guess the difference at L.1 in the various files is just the BOM? Yes. PR Review Comment: https://git.openjdk.org/jdk/pull/24566#discussion_r2037357899

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Raffaello Giulietti
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Raffaello Giulietti
On Thu, 10 Apr 2025 10:14:40 GMT, Magnus Ihse Bursie wrote: >> I have checked the entire code base for incorrect encodings, but luckily >> enough these were the only remaining problems I found. >> >> BOM (byte-order mark) is a method used for distinguishing big and little >> endian UTF-16 enc

Re: RFR: 8354266: Fix non-UTF-8 text encoding

2025-04-10 Thread Magnus Ihse Bursie
On Thu, 10 Apr 2025 10:10:49 GMT, Magnus Ihse Bursie wrote: > I have checked the entire code base for incorrect encodings, but luckily > enough these were the only remaining problems I found. > > BOM (byte-order mark) is a method used for distinguishing big and little > endian UTF-16 encoding
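
The thread concerns two kinds of problems: files that are not valid UTF-8, and files that start with a BOM. As a rough illustration only, the sketch below shows one way such files could be detected. It is not the method used in the PR, and the helper name `classify_utf8` is hypothetical. It relies on the fact that the UTF-8 BOM is the three-byte sequence EF BB BF, which Python exposes as `codecs.BOM_UTF8`.

```python
import codecs


def classify_utf8(data: bytes) -> str:
    """Classify raw file bytes as 'bom', 'invalid', or 'ok'.

    Hypothetical helper for illustration; not taken from the PR.
    """
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    # The BOM check comes first, so the rest of such a file is not validated.
    if data.startswith(codecs.BOM_UTF8):
        return "bom"
    try:
        # Strict decoding raises UnicodeDecodeError on malformed input.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid"
    return "ok"
```

To scan a source tree with this, one would read each candidate file in binary mode and pass its bytes to the helper. Which files count as text is a separate decision that this sketch does not attempt to make.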