Re: `texindex` output depends on locale settings

2022-11-06 Thread Werner LEMBERG
> A (non-ideal) workaround is to avoid the use of non-ASCII characters
> in the first character of index entries, using accent commands
> instead:
>
>   @findex ax
>   @findex @'ay
>   @findex ux
>   @findex @`uy

This helps, thanks.

    Werner
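To make the ordering problem behind this workaround concrete, here is a minimal Python sketch. It is not how `texindex` sorts; it only contrasts plain code-point order, where accented letters land after every ASCII letter, with an order that compares base letters first. The helper name `strip_accents` is hypothetical and introduced only for illustration.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The four entries from the workaround, written with explicit escapes:
# "\u00e1" is a-acute, "\u00f9" is u-grave.
entries = ["ax", "\u00e1y", "ux", "\u00f9y"]

# Plain code-point order: accented letters sort after all ASCII letters.
by_code_point = sorted(entries)

# Accent-insensitive order; the original string breaks ties.
by_base_letter = sorted(entries, key=lambda e: (strip_accents(e), e))

print(by_code_point)   # ax, ux, then the two accented entries
print(by_base_letter)  # ax, a-acute y, ux, u-grave y
```

Using accent commands such as `@'a` in the Texinfo source sidesteps the issue because the first character seen by the sorter is then plain ASCII.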

Re: problem with 'txiindexbackslashignore'

2022-11-06 Thread Werner LEMBERG
> You can avoid the error with @sortas:
>
>   @findex @sortas{\\} \\

Aah, nice, didn't think of it. This solution is good enough, thanks.

    Werner

Re: problem with 'txiindexbackslashignore'

2022-11-06 Thread Werner LEMBERG
> +\ifx\indexsortkey\empty
> +  \message{Empty index sort key near line \the\inputlineno}%
> +  \xdef\indexsortkey{ }%
> +\fi
>
> Do you think this would help fix this kind of problem?

Yes, thanks. Maybe it makes sense to also emit Are you trying to index a bac
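The logic of the quoted patch can be sketched in Python as follows. This is an illustration only, under the assumption that `txiindexbackslashignore` causes backslashes to be dropped from the sort key, so that an entry consisting solely of backslashes ends up with an empty key. The function name `index_sort_key` and its parameter are hypothetical and do not exist in `texinfo.tex`.

```python
import warnings


def index_sort_key(entry: str, ignore_backslash: bool = True) -> str:
    """Compute a sort key for an index entry (hypothetical helper).

    Assumption: with backslash-ignoring enabled, all backslashes are
    removed from the key before sorting.
    """
    key = entry.replace("\\", "") if ignore_backslash else entry
    if key == "":
        # Mirror the patch: warn, then fall back to a single space so
        # the sorter never sees an empty key.
        warnings.warn(f"Empty index sort key for entry {entry!r}")
        key = " "
    return key


print(repr(index_sort_key("\\\\")))  # falls back to a single space
print(repr(index_sort_key("A")))     # unchanged
```

An explicit `@sortas` in the source, as suggested elsewhere in the thread, avoids the fallback altogether because the key is then supplied by the author.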

Re: problem with 'txiindexbackslashignore'

2022-11-06 Thread Gavin Smith
On Sun, Nov 06, 2022 at 06:38:24AM +, Werner LEMBERG wrote: > If this is not possible, please make `texinfo.tex` suppress index > entries that consist of whitespace only – perhaps together with a > warning (or even an error). I also suggest to add some words to the > manual about this case. I

Re: problem with 'txiindexbackslashignore'

2022-11-06 Thread Gavin Smith
On Sun, Nov 06, 2022 at 06:38:24AM +, Werner LEMBERG wrote: > If this is not possible, please make `texinfo.tex` suppress index > entries that consist of whitespace only – perhaps together with a > warning (or even an error). I also suggest to add some words to the > manual about this case.

Re: `texindex` output depends on locale settings

2022-11-06 Thread Werner LEMBERG
> I very much hope that your ideas of half-solving this for half of
> the OSes out there are not shared by the Texinfo maintainers. I
> personally think it would be shameful to have such a "solution".

For the short term, I very much disagree. Better indexing support can be quickly added as a fea

Re: `texindex` output depends on locale settings

2022-11-06 Thread arnold
Eli Zaretskii wrote: > Since texi2any just went through the same process, I think Perl is > probably a good candidate to replace Gawk as an implementation language > for texindex. Another possibility is Python. I have no objection to someone else (re)writing texindex in another language. It's

Re: problem with 'txiindexbackslashignore'

2022-11-06 Thread Gavin Smith
On Sun, Nov 06, 2022 at 06:38:24AM +, Werner LEMBERG wrote:
>
> [texinfo.tex 2022-10-18.18]
>
>
> Consider the following input file
>
> ```
> \input texinfo.tex
>
> @set txiindexbackslashignore
>
> @findex \\
> @findex A
>
> @printindex fn
>
> @bye
> ```
>
> The created `.fn` file cont

Re: `texindex` output depends on locale settings

2022-11-06 Thread Gavin Smith
On Sun, Nov 06, 2022 at 10:02:44AM +, Werner LEMBERG wrote: > > [texindex (GNU texinfo) 6.8dev] > [GNU Awk 4.2.1, API: 2.0] > [openSUSE Leap 15.4] > > > There are two bugs with texindex, making it basically unusable for > everything except English as the main document language. For the > re

Re: `texindex` output depends on locale settings

2022-11-06 Thread Eli Zaretskii
> From: arn...@skeeve.com > Date: Sun, 06 Nov 2022 11:25:27 -0700 > Cc: bug-texinfo@gnu.org, arn...@skeeve.com > > Eli Zaretskii wrote: > > > > and similarly capable C libraries > > > > Are there such libraries in existence, when locale data is considered? > > Which ones? > > macOS, and Solaris

Re: `texindex` output depends on locale settings

2022-11-06 Thread Patrice Dumas
On Sun, Nov 06, 2022 at 04:09:59PM +, Werner LEMBERG wrote: > * Let's assume that GNU awk behaves similar to, say, GNU sort. The > collation order and input encoding gets controlled with `LANG` – > looking into the awk info manual this seems like a reasonable > assumption. > > As far

Re: `texindex` output depends on locale settings

2022-11-06 Thread Gavin Smith
On Sun, Nov 06, 2022 at 05:05:00PM +0200, Eli Zaretskii wrote: > Sure, but is the issue only with lower-case letters? What about > collation order or even determining what is and isn't a character (as > opposed to incomplete byte sequence)? > As you say, this information isn't in the input index

Re: `texindex` output depends on locale settings

2022-11-06 Thread arnold
Eli Zaretskii wrote: > > and similarly capable C libraries > > Are there such libraries in existence, when locale data is considered? > Which ones? macOS, and Solaris, to name two. I think AIX as well. > > provided this is well documented and probably shown in the output of > > `--version`. Th

Re: `texindex` output depends on locale settings

2022-11-06 Thread Eli Zaretskii
> Date: Sun, 06 Nov 2022 17:19:23 + (UTC) > Cc: arn...@skeeve.com, bug-texinfo@gnu.org > From: Werner LEMBERG > > Again, I think it's OK if i18n support only works with glibc It is? > and similarly capable C libraries Are there such libraries in existence, when locale data is considered? W

Re: `texindex` output depends on locale settings

2022-11-06 Thread Werner LEMBERG
>> * I think it would be OK if the documentation says that i18n >> support for sorting only works with awk programs that understand >> `LANG`. > > Gawk understands and uses any LC_* and LANG variables that Posix > requires, if Gawk uses glibc. Otherwise you depend on the > particular library

Re: `texindex` output depends on locale settings

2022-11-06 Thread Eli Zaretskii
> Date: Sun, 06 Nov 2022 16:09:59 + (UTC) > Cc: arn...@skeeve.com, bug-texinfo@gnu.org > From: Werner LEMBERG > > * I think it would be OK if the documentation says that i18n support > for sorting only works with awk programs that understand `LANG`. Gawk understands and uses any LC_* and L

Re: `texindex` output depends on locale settings

2022-11-06 Thread Werner LEMBERG
>> > > > I consider it very bad that `texindex` is locale-dependent. >> > > > IMHO the proper solution is to make `texinfo.tex` emit a >> > > > document encoding statement to the (unsorted) index file >> > > > that in turn gets acknowledged by `texindex`. >> >> Sure? No. But I have some tho

Re: `texindex` output depends on locale settings

2022-11-06 Thread arnold
Eli Zaretskii wrote: > Sure, but is the issue only with lower-case letters? What about > collation order or even determining what is and isn't a character (as > opposed to incomplete byte sequence)? I have some thoughts about this. I'd have to do some research and try them out. > I mean: what

Re: `texindex` output depends on locale settings

2022-11-06 Thread Eli Zaretskii
> Date: Sun, 06 Nov 2022 17:05:00 +0200
> From: Eli Zaretskii
> Cc: w...@gnu.org, bug-texinfo@gnu.org
>
> > It could instead be
> >
> >     function islower(c)
> >     {
> >         return c ~ /[[:lower:]]/
> >     }
> >
> > And similar for the others. That would work for any unicode character.
>
> Sure, but
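For comparison, the difference between an ASCII-range test and a character-class test can be sketched in Python. This is only an analogy: Python's `str.islower()` consults the Unicode character database and does not depend on the locale, whereas the awk bracket expression `[[:lower:]]` depends on the locale and on the awk implementation in use. Both function names below are hypothetical.

```python
def is_lower_ascii(ch: str) -> bool:
    """True only for the 26 ASCII lowercase letters."""
    return "a" <= ch <= "z"


def is_lower_unicode(ch: str) -> bool:
    """True for any character Unicode classifies as lowercase."""
    return ch.islower()


# "\u00e9" is e-acute, "\u00c9" is its uppercase counterpart.
for ch in ["a", "Z", "\u00e9", "\u00c9"]:
    print(repr(ch), is_lower_ascii(ch), is_lower_unicode(ch))
```

As the quoted reply points out, a correct lowercase test is only one part of the problem; it says nothing about collation order or about how bytes are decoded into characters.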

Re: `texindex` output depends on locale settings

2022-11-06 Thread Eli Zaretskii
> From: arn...@skeeve.com > Date: Sun, 06 Nov 2022 07:57:03 -0700 > Cc: w...@gnu.org, bug-texinfo@gnu.org > > Eli Zaretskii wrote: > > > Are you sure this Werner's request can be fulfilled: > > > > > > I consider it very bad that `texindex` is locale-dependent. IMHO > > > > the proper solut

Re: `texindex` output depends on locale settings

2022-11-06 Thread arnold
Eli Zaretskii wrote: > Are you sure this Werner's request can be fulfilled: > > > > I consider it very bad that `texindex` is locale-dependent. IMHO > > > the proper solution is to make `texinfo.tex` emit a document > > > encoding statement to the (unsorted) index file that in turn gets >

Re: `texindex` output depends on locale settings

2022-11-06 Thread Eli Zaretskii
> From: arn...@skeeve.com > Date: Sun, 06 Nov 2022 06:33:33 -0700 > Cc: arn...@skeeve.com > > Hi. > > Thanks for the report. As written, texindex is indeed suitable only > for English; when I wrote it ~ 9 years ago, nobody said anything about > support for other languages. > > I think this can b

Re: `texindex` output depends on locale settings

2022-11-06 Thread arnold
Hi. Thanks for the report. As written, texindex is indeed suitable only for English; when I wrote it ~ 9 years ago, nobody said anything about support for other languages. I think this can be remedied, although there may be issues with awk versions besides gawk as most don't support Unicode or ot

`texindex` output depends on locale settings

2022-11-06 Thread Werner LEMBERG
[texindex (GNU texinfo) 6.8dev] [GNU Awk 4.2.1, API: 2.0] [openSUSE Leap 15.4] There are two bugs with texindex, making it basically unusable for everything except English as the main document language. For the report below, here is an input file. ``` \input texinfo.tex @documentencoding UTF-