On 1/11/22 10:38, Jakub Jelinek wrote:
On Tue, Jan 11, 2022 at 10:27:19AM +0100, Martin Liška wrote:
On 1/10/22 17:14, Martin Liška wrote:
Are you fine with the suggested changes?
Hello.
Jakub had comments, so I'm sending v2, where I added a few parsing
exceptions. Now it reports:
I'm still surprised by what the sort is doing,
( echo Chene; echo Chêne; echo Chfne ) | LC_ALL=en_US.UTF-8 sort
Chene
Chêne
Chfne
That is on glibc 2.32. On glibc 2.34.9000 I get a different order, though:
Chêne comes last.
That partly ruins the idea of the checking script. When the sorting isn't
the same for many people, either the script will fail for many of them,
or various people will keep changing the order back and forth.
Jakub
Or we can utilize the https://pypi.org/project/Unidecode Python package, which
provides:
In [7]: unidecode.unidecode('Jääskeläinen')
Out[7]: 'Jaaskelainen'
and sort it by that.
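
A minimal sketch of what such a locale-independent sort key could look like.
Note that it does not use Unidecode itself: to stay within the standard
library it swaps in unicodedata.normalize("NFKD") plus removal of combining
marks, which covers names like the ones above but, unlike Unidecode, leaves
letters without a decomposition (for example "ø" or "ł") untouched. The
function name ascii_sort_key and the sample list are hypothetical and only
serve the illustration.

```python
import unicodedata


def ascii_sort_key(name):
    """Hypothetical helper: a sort key that does not depend on the locale."""
    # Split accented letters into base letter + combining mark (NFKD) ...
    decomposed = unicodedata.normalize("NFKD", name)
    # ... and drop the combining marks, so "ê" is compared as "e".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Use the original string as a tie-breaker so that names which differ
    # only in accents still get a deterministic order.
    return (stripped.casefold(), name)


# Sample data (assumption): the names used earlier in this thread.
names = ["Chfne", "Chêne", "Jääskeläinen", "Chene"]
print(sorted(names, key=ascii_sort_key))
```

Because the key is computed from the strings alone and never consults
LC_ALL or the glibc collation tables, the resulting order should be the same
on every machine, which is the property the checking script needs.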
Martin