Re: Losing my latin on Ordering...

2023-02-14 Thread Laurenz Albe
On Tue, 2023-02-14 at 13:06 +0100, Dominique Devienne wrote:
> > Sure, just make sure to use the definition of C that uses UTF-8 encoding
> > (I think it's typically called C.UTF-8).
>
> OK, so for new DBs, sounds like we need to
>
> CREATE DATABASE ... WITH LOCALE 'C.UTF-8' ENCODING UTF8
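For readers following the thread, here is a minimal sketch of why a C-style collation gives the "constant left-to-right" ordering asked about. It assumes that a C / C.UTF-8 collation compares strings by their raw UTF-8 bytes. The sample values and the helper names c_utf8_order and prefix_decides are hypothetical illustrations, not taken from the thread or from PostgreSQL.

```python
# Sketch only: emulates byte-wise (C-style) comparison in plain Python.
# Assumption: a C / C.UTF-8 collation orders strings by their UTF-8 bytes.


def c_utf8_order(values):
    """Return the values sorted by their UTF-8 byte representation."""
    return sorted(values, key=lambda s: s.encode("utf-8"))


def prefix_decides(a, b, suffix_a, suffix_b):
    """Check that appending suffixes does not change how a and b compare.

    Only meaningful when a and b differ and have the same byte length,
    so the first differing byte always lies inside the prefix.
    """
    ka, kb = a.encode("utf-8"), b.encode("utf-8")
    before = ka < kb
    after = ka + suffix_a.encode("utf-8") < kb + suffix_b.encode("utf-8")
    return before == after


if __name__ == "__main__":
    # Hypothetical sample values, chosen to mix case and punctuation.
    names = ["ab-d", "ab c", "abc", "Abd", "ab_c"]
    print(c_utf8_order(names))

    # Under byte-wise comparison the prefix alone settles the order,
    # whatever comes after it.
    print(prefix_decides("a-", "a_", "zzz", "aaa"))
```

Linguistic collations such as en_US.UTF-8 do not offer this guarantee, because they weigh punctuation and case at a later comparison level than letters. That is the behaviour discussed further down in this thread.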

Re: Losing my latin on Ordering...

2023-02-14 Thread Dominique Devienne
On Tue, Feb 14, 2023 at 12:35 PM Alvaro Herrera wrote:
> On 2023-Feb-14, Dominique Devienne wrote:
> > Honestly, who expects the same prefix to sort differently based on what
> > comes after, in left-to-right languages?
> Look, we don't define the collation rules.
>
Ok, ok, sorry. To you, Lauren

Re: Losing my latin on Ordering...

2023-02-14 Thread Laurenz Albe
On Tue, 2023-02-14 at 12:17 +0100, Dominique Devienne wrote:
> On Tue, Feb 14, 2023 at 11:23 AM Laurenz Albe wrote:
> > On Tue, 2023-02-14 at 10:31 +0100, Dominique Devienne wrote:
> > > Surely sorting should be "constant left-to-right", no? What are we
> > > missing?
> >
> > No, it isn't.  T

Re: Losing my latin on Ordering...

2023-02-14 Thread Alvaro Herrera
On 2023-Feb-14, Dominique Devienne wrote:
> Honestly, who expects the same prefix to sort differently based on what
> comes after, in left-to-right languages?

Look, we don't define the collation rules. We just grab the collation rules defined by experts in collations. In this case the experts h

Re: Losing my latin on Ordering...

2023-02-14 Thread Dominique Devienne
On Tue, Feb 14, 2023 at 11:23 AM Laurenz Albe wrote:
> On Tue, 2023-02-14 at 10:31 +0100, Dominique Devienne wrote:
> > Hi. Porting a unit test to PostgreSQL, we got a failure related to ordering.
> >
> > We've distilled it to the below. The DB is en_US.UTF-8, and the sorting we get
> > does

Re: Losing my latin on Ordering...

2023-02-14 Thread Laurenz Albe
On Tue, 2023-02-14 at 10:31 +0100, Dominique Devienne wrote:
> Hi. Porting a unit test to PostgreSQL, we got a failure related to ordering.
>
> We've distilled it to the below. The DB is en_US.UTF-8, and the sorting we get
> does not make sense to me. The same prefix can be sorted differently base