Re: `texindex` output depends on locale settings

Eli Zaretskii Sun, 06 Nov 2022 07:05:43 -0800

> From: arn...@skeeve.com
> Date: Sun, 06 Nov 2022 07:57:03 -0700
> Cc: w...@gnu.org, bug-texinfo@gnu.org
> 
> Eli Zaretskii <e...@gnu.org> wrote:
> 
> > Are you sure this request of Werner's can be fulfilled:
> >
> > > >   I consider it very bad that `texindex` is locale-dependent.  IMHO
> > > >   the proper solution is to make `texinfo.tex` emit a document
> > > >   encoding statement to the (unsorted) index file that in turn gets
> > > >   acknowledged by `texindex`.
> 
> Sure? No. But I have some thoughts.
> 
> > FWIW, I don't even understand how this can be accomplished, unless the
> > program reinvents all the library functions that deal with characters
> > from scratch, instead of using libc functions (which are
> > locale-dependent).  And Gawk does use libc functions for that.
> 
> The current islower() is
> 
> function islower(c)
> {
>       return index("abcdefghijklmnopqrstuvwxyz", c) > 0
> }
> 
> It could instead be
> 
> function islower(c)
> {
>       return c ~ /[[:lower:]]/
> }
> 
> And similarly for the others.  That would work for any Unicode character.


Sure, but is the issue only with lower-case letters?  What about
collation order, or even determining what is and isn't a character (as
opposed to an incomplete byte sequence)?
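
To make the question concrete, here is a minimal sketch (not taken from
texindex itself) that runs the two islower() variants quoted above under
a chosen locale and also reports length(), which is where the "what is a
character" question shows up.  The helper name probe, the names
islower_list and islower_class, and the en_US.UTF-8 locale are
assumptions for illustration; the results for non-ASCII input depend on
the awk implementation and on which locales are installed.

```sh
#!/bin/sh
# probe LOCALE STRING
# Hypothetical helper: report how awk sees STRING under LOCALE.
# "env" is used so that only awk, not the calling shell, changes locale.
probe() {
    env LC_ALL="$1" awk -v s="$2" '
        # Explicit ASCII list, as in the "current" islower() quoted above.
        function islower_list(c)  { return index("abcdefghijklmnopqrstuvwxyz", c) > 0 }
        # Character-class variant, as proposed above; consults the locale.
        function islower_class(c) { return c ~ /[[:lower:]]/ }
        BEGIN {
            printf "length=%d list=%d class=%d\n",
                   length(s), islower_list(s), islower_class(s)
        }'
}

probe C a
probe C é
# Assumes the en_US.UTF-8 locale is installed; otherwise awk may fall
# back to the C locale without saying so.
probe en_US.UTF-8 é
```

If the last two lines of output differ, then the character class, and
even the length of the string, depend on the locale in effect.  The
sketch does not touch collation order, which is a separate
locale-dependent matter.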
