On Wed, Mar 25, 2026 at 6:21 PM Masahiko Sawada <[email protected]> wrote:
>
> On Wed, Mar 25, 2026 at 6:09 PM Masahiko Sawada <[email protected]> wrote:
> >
> > On Wed, Mar 25, 2026 at 5:35 PM Tom Lane <[email protected]> wrote:
> > >
> > > Tomas Vondra <[email protected]> writes:
> > > > On 3/26/26 00:40, Tom Lane wrote:
> > > >> I believe what's happening there is that in cs_CZ locale,
> > > >> "V" doesn't follow simple ASCII sort ordering.
> > >
> > > > With cs_CZ all letters sort *before* numbers, while in en_US it's the
> > > > other way around. V is not special in any way.
> > >
> > > Ah, sorry, I should have researched a bit instead of relying on
> > > fading memory.  The quirk I was thinking of is that in cs_CZ,
> > > "ch" sorts after "h":
> > >
> > > u8=# select 'h' < 'ch'::text collate "en_US";
> > >  ?column?
> > > ----------
> > >  f
> > > (1 row)
> > >
> > > u8=# select 'h' < 'ch'::text collate "cs_CZ";
> > >  ?column?
> > > ----------
> > >  t
> > > (1 row)
> > >
> > > Regular hex encoding isn't bitten by that because it doesn't
> > > use 'h' in the text form ... but this base32hex thingie does.
> > >
> > > However, your point is also correct:
> > >
> > > u8=# select '0' < 'C'::text ;
> > >  ?column?
> > > ----------
> > >  t
> > > (1 row)
> > >
> > > u8=# select '0' < 'C'::text collate "cs_CZ";
> > >  ?column?
> > > ----------
> > >  f
> > > (1 row)
> > >
> > > and that breaks "text ordering matches numeric ordering"
> > > for both traditional hex and base32hex.  So maybe this
> > > is not as big a deal as I first thought.  We need a fix
> > > for the new test though.  Probably adding COLLATE "C"
> > > would be enough.
> >
> > Thank you for the report and the analysis.
> >
> > I've reproduced the issue with "cs_CZ" collation and adding COLLATE
> > "C" to the query resolves it. It seems also a good idea to add a note
> > in the documentation too as users might face the same issue. For
> > example,
> >
> > To maintain the lexicographical sort order of the encoded data, ensure
> > that the text is sorted using the C collation (e.g., using COLLATE
> > "C"). Natural language collations may sort characters differently and
> > break the ordering.
> >
>
> Attached the patch doing the above idea.

I've pushed the fix (adding COLLATE "C" to the test query) without the
documentation changes, to make the buildfarm animals happy first.
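
For illustration only, here is a rough sketch of why the collation
matters. It is not the regression test or the patch. encode() and
letters_first_key() are made-up helpers, the 8-character fixed width is
an arbitrary choice, and the letters-before-digits key is only a crude
stand-in for what cs_CZ does, not a real locale implementation. The
alphabet is the base32hex one from RFC 4648.

```python
# base32hex alphabet from RFC 4648: digits first, then A..V.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUV"


def encode(n: int, width: int = 8) -> str:
    """Hypothetical helper: fixed-width base32hex text for a 40-bit integer."""
    # Emit 5 bits per character, most significant group first.
    return "".join(ALPHABET[(n >> (5 * i)) & 31] for i in reversed(range(width)))


def letters_first_key(s: str):
    """Crude stand-in for a locale that sorts letters before digits."""
    return [(0 if ch.isalpha() else 1, ch) for ch in s]


values = [0, 9, 10, 31, 32, 1000, 2**39]   # already in numeric order
encoded = [encode(v) for v in values]

# Python compares str by code point, which for ASCII text behaves like
# COLLATE "C": digits sort before uppercase letters.
c_sorted = sorted(encoded)

# With letters sorting before digits, the numeric order is lost.
locale_sorted = sorted(encoded, key=letters_first_key)

print(c_sorted == encoded)       # code-point order matches numeric order
print(locale_sorted == encoded)  # letters-first order does not
```

The same argument applies to traditional hex, since it also mixes
digits and letters in its text form.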

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

