This issue has come up before: https://stat.ethz.ch/pipermail/r-help/2013-February/346721.html ("gettext wierdness"), https://stat.ethz.ch/pipermail/r-devel/2007-December/047893.html ("gettext() and messages in 'pkg' domain").
Using 'ngettext' is a workaround, like in https://rdrr.io/cran/svMisc/src/R/svMisc-internal.R .

It is documented: "For 'gettext', leading and trailing whitespace is ignored when looking for the translation."

------------
>>>>> Martin Maechler
>>>>>     on Fri, 5 Nov 2021 17:55:24 +0100 writes:

>>>>> Tomas Kalibera
>>>>>     on Fri, 5 Nov 2021 16:15:19 +0100 writes:

>> On 11/5/21 4:12 PM, Duncan Murdoch wrote:
>>> On 05/11/2021 10:51 a.m., Henrik Bengtsson wrote:
>>>> I'm trying to reuse some of the translations available in base R by
>>>> using:
>>>>
>>>>   gettext(msgid, domain="R")
>>>>
>>>> This works great for most 'msgid's, e.g.
>>>>
>>>> $ LANGUAGE=de Rscript -e 'gettext("cannot get working directory",
>>>> domain="R")'
>>>> [1] "kann das Arbeitsverzeichnis nicht ermitteln"
>>>>
>>>> However, it does not work for all. For instance,
>>>>
>>>> $ LANGUAGE=de Rscript -e 'gettext("Execution halted\n", domain="R")'
>>>> [1] "Execution halted\n"
>>>>
>>>> This despite that 'msgid' existing in:
>>>>
>>>> $ grep -C 2 -F 'Execution halted\n' src/library/base/po/de.po
>>>>
>>>> #: src/main/main.c:342
>>>> msgid "Execution halted\n"
>>>> msgstr "Ausführung angehalten\n"
>>>>
>>>> It could be that the trailing newline causes problems, because the
>>>> same happens also for:
>>>>
>>>> $ LANGUAGE=de Rscript --vanilla -e 'gettext("error during cleanup\n",
>>>> domain="R")'
>>>> [1] "error during cleanup\n"
>>>>
>>>> Is this meant to work, and if so, how do I get it to work, or is it a
>>>> bug?
>>>
>>> I don't know the solution, but I think the cause is different than you
>>> think, because I also have the problem with other strings not
>>> including "\n":
>>>
>>> $ LANGUAGE=de Rscript -e 'gettext("malformed version string",
>>> domain="R")'
>>> [1] "malformed version string"

> You need domain="R-base" for the "malformed version string"

>> I can reproduce Henrik's report and the problem there is that the
>> trailing \n is stripped by R before doing the lookup, in do_gettext

>>     /* strip leading and trailing white spaces and
>>        add back after translation */
>>     for(p = tmp;
>>         *p && (*p == ' ' || *p == '\t' || *p == '\n');
>>         p++, ihead++) ;

>> But, calling dgettext with the trailing \n does translate correctly for me.

>> I'd leave to translation experts how this should work (e.g. whether the
>> .po files should have trailing newlines).

> Thanks a lot, Tomas.

> This is "interesting" .. and I think an R bug one way or the
> other (and I also note that Henrik's guess was right on!).

> We have the following:

> - New translation *.po source files are to be made from the original *.pot
>   files.

>   In our case it's our code that produces R.pot and R-base.pot
>   (and more for the non-base packages, and more e.g. for
>   Recommended packages 'Matrix' and 'cluster' I maintain).

>   And notably the R.pot (from all the "base" C error/warn/.. messages)
>   contains tons of msgid strings of the form ".......\n"
>   i.e., ending in \n.
>   From that automatically the translator's *.po files should also
>   end in \n.

> Additionally, the GNU gettext FAQ has
> (here : https://www.gnu.org/software/gettext/FAQ.html#newline )
> ------------------------------------------------
> Q: What does this mean: “'msgid' and 'msgstr' entries do not both end with
> '\n'”
> A: It means that when the original string ends in a newline, your
> translation must also end in a newline.
> And if the original string does not
> end in a newline, then your translation should likewise not have a newline
> at the end.
> ------------------------------------------------

> From all that I'd conclude that we (R base code) are the source
> of the problem.
> Given the above FAQ, it seems common in other projects also to
> have such trailing \n and so we should really change the C code
> you cite above.

> On the other hand, this is from almost the very beginning of
> when Brian added translation to R,
> ------------------------------------------------------------------------
> r32938 | ripley | 2005-01-30 20:24:04 +0100 (Sun, 30 Jan 2005) | 2 lines

> include \n in whitespace ignored for R-level gettext
> ------------------------------------------------------------------------

> I think this has been because simultaneously we had started to
> emphasize to useRs they should *not* end message/format strings
> in stop() / warning() with a newline, but rather stop() and
> warning() would *add* the newline(s) themselves.

> Still, currently we have a few such cases in R-base.pot,
> but just these few and maybe they really are "in error", in the
> sense we could drop the ending '\n' (and do the same in all the *.po files!),
> and newlines would be appended later {{not just by RStudio which
> graciously adds final newlines in its R console, even for say
> cat("abc") }}

> However, this is quite different for all the message strings from C, as
> used there in error() or warn() e.g., and so in R.pot
> we see many many msg strings ending in "\n" (which must then
> also be in the *.po files).

> My current conclusion is we should try simplifying the
> do_gettext() code and *not* remove and re-add the '\n' (nor the
> '\t' I think ...)
After such a change, I indeed do see

$ LANGUAGE=de bin/Rscript --vanilla -e 'gettext("Execution halted\n", domain="R")'
[1] "Ausführung angehalten\n"
$ LANGUAGE=de bin/Rscript --vanilla -e 'message("Execution halted\n", domain="R")'
Ausführung angehalten

$ LANGUAGE=de bin/Rscript --vanilla -e 'warning("Execution halted\n", domain="R")'
Warnmeldung:
Ausführung angehalten

$

(note the extra newline after the German translation!)

whereas before, not only did using gettext() directly not work, but
messages passed to warning() or message() {with or without trailing \n}
were never translated either.

... and my simple #ifdef .. #endif change around the head/tail save and
restore seems to pass make check-devel ...

so I will be looking into dropping all those "head" and "tail" add and
remove parts in do_gettext(), as they really seem to do harm given the
current translation databases, which indeed *are* full of final '\n' in
`msgid` and the corresponding translated `msgstr` ....

So, no need for a bugzilla PR nor a patch, please.
Maybe further examples which add something interesting in addition to
the ones we have here.

Thank you again, Henrik, Duncan, and Tomas!
Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
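
For readers who want to see the mechanism discussed in this thread in isolation, here is a minimal sketch in C. It is not the actual do_gettext() code: the catalog array and the names lookup(), translate_stripped() and translate_verbatim() are hypothetical stand-ins invented for illustration, and the exact-match lookup() only imitates what dgettext() would do with a real message catalog. It compares stripping whitespace before the lookup (the behaviour described above) with passing the string through unchanged (the proposed simplification).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical two-entry catalog; a real lookup would go through dgettext(). */
struct entry { const char *msgid; const char *msgstr; };

static const struct entry catalog[] = {
    { "cannot get working directory",
      "kann das Arbeitsverzeichnis nicht ermitteln" },
    { "Execution halted\n", "Ausführung angehalten\n" },
};

/* Exact-match lookup; like gettext, returns the msgid itself when not found. */
static const char *lookup(const char *msgid)
{
    for (size_t i = 0; i < sizeof catalog / sizeof catalog[0]; i++)
        if (strcmp(catalog[i].msgid, msgid) == 0)
            return catalog[i].msgstr;
    return msgid;
}

static int is_ws(char c) { return c == ' ' || c == '\t' || c == '\n'; }

/* Strip leading/trailing whitespace, look up the core, then add the
   whitespace back around whatever the lookup returned. */
static void translate_stripped(const char *msg, char *out, size_t outsz)
{
    size_t len = strlen(msg), head = 0, tail = 0;
    char core[256];

    while (head < len && is_ws(msg[head])) head++;
    while (tail < len - head && is_ws(msg[len - 1 - tail])) tail++;

    size_t corelen = len - head - tail;
    if (corelen >= sizeof core) corelen = sizeof core - 1;  /* sketch only: truncate */
    memcpy(core, msg + head, corelen);
    core[corelen] = '\0';

    snprintf(out, outsz, "%.*s%s%s",
             (int) head, msg, lookup(core), msg + len - tail);
}

/* Proposed simplification: look up the string exactly as given. */
static void translate_verbatim(const char *msg, char *out, size_t outsz)
{
    snprintf(out, outsz, "%s", lookup(msg));
}
```

With this toy catalog, translate_stripped("Execution halted\n", ...) searches for "Execution halted" without the newline, finds nothing, and returns the English text. translate_verbatim() finds the entry whose msgid ends in "\n". A msgid stored without surrounding whitespace, such as "cannot get working directory", is found either way.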