sent from my phone
On May 23, 2013 4:43 PM, "Dongsheng Song" <dongsheng.s...@gmail.com> wrote:
>
> On Thu, May 23, 2013 at 10:06 PM, Philip Martin
> <philip.mar...@wandisco.com> wrote:
> > Dongsheng Song <dongsheng.s...@gmail.com> writes:
> >
> >> On Thu, May 23, 2013 at 9:28 PM, Philip Martin
> >> <philip.mar...@wandisco.com> wrote:
> >>> Dongsheng Song <dongsheng.s...@gmail.com> writes:
> >>>
> >>>> On Thu, May 23, 2013 at 9:11 PM, Philip Martin
> >>>> <philip.mar...@wandisco.com> wrote:
> >>>>> Philip Martin <philip.mar...@wandisco.com> writes:
> >>>>>
> >>>>>> So it appears the UTF8 to native conversion is missing from
> >>>>>> repos_notify_handler. I think repos_notify_handler should be using
> >>>>>> svn_stream_printf_from_utf8 rather than svn_stream_printf.
> >>>>>
> >>>>> I've fixed trunk to use svn_cmdline_cstring_from_utf8 and proposed it
> >>>>> for 1.8.
> >>>>>
> >>>>
> >>>> As the GETTEXT(3) man page says, if and only if
> >>>> defined(HAVE_BIND_TEXTDOMAIN_CODESET),
> >>>> your commit is OK.
> >>>>
> >>>> So you should check HAVE_BIND_TEXTDOMAIN_CODESET when you use
> >>>> svn_cmdline_cstring_from_utf8.
> >>>
> >>> Are you saying there is a problem with my change? If there is a problem,
> >>> doesn't it already apply to all other uses of svn_cmdline_cstring_from_utf8?
> >>>
> >>
> >> I think so. In the subversion/libsvn_subr/nls.c file:
> >>
> >> #ifdef HAVE_BIND_TEXTDOMAIN_CODESET
> >> bind_textdomain_codeset(PACKAGE_NAME, "UTF-8");
> >> #endif /* HAVE_BIND_TEXTDOMAIN_CODESET */
> >>
> >> bind_textdomain_codeset is only called when HAVE_BIND_TEXTDOMAIN_CODESET
> >> is defined. In this case, you can assume the string returned by
> >> GETTEXT(3) is UTF-8 encoded.
> >
> > I still don't understand if you are claiming my change has a problem or
> > if there is a problem in all uses of svn_cmdline_cstring_from_utf8.
> >
> > I recall a related thread from last year:
> >
> > http://svn.haxx.se/dev/archive-2012-08/index.shtml#34
> > http://mail-archives.apache.org/mod_mbox/subversion-dev/201208.mbox/%3Cop.wilcelggnngjn5@tortoise%3E
> >
> > I think we assume that the translations are UTF-8.
> >
> > Is there some code change you think we should make?
> >
>
> Even if ALL the translations are UTF-8, GETTEXT(3) still returns the
> string encoded in the ***current locale's codeset***.
>
> Here is a snippet from the GETTEXT(3) man page:
>
> In both cases, the functions also use the LC_CTYPE locale facet in
> order to convert the translated message from the translator's
> codeset to the ***current locale's codeset***, unless overridden by a
> prior call to the bind_textdomain_codeset function.
>
> So svn_cmdline_printf SHOULD NOT assume the input string is UTF-8
> encoded; it is encoded in the ***current locale's codeset***.

But we call the codeset function precisely to make sure we do not
generate output in the current locale's encoding.

> I think the best solution is: DO NOT convert the messages returned by
> GETTEXT(3), write them ***AS IS***, since GETTEXT(3) already does the
> correct conversion for us.

Well, even though gettext may want us to believe otherwise, this doesn't
work for cross-platform applications: e.g. on Windows the locale for
output on the console may be different from the locale for other uses.

Back when we went with gettext (2004?), we hashed this through pretty
thoroughly. I hope that discussion is still available in the archives.

Bye,

Erik.
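
To make the "UTF-8 internally, convert once at the output boundary" idea concrete, here is a minimal sketch in C. It is not Subversion code: utf8_to_codeset is a hypothetical helper built on POSIX iconv, standing in for what svn_cmdline_cstring_from_utf8 does. It assumes the message catalog has already been bound to UTF-8 with bind_textdomain_codeset, so every translated string reaching it is UTF-8.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (NOT a Subversion API): convert the NUL-terminated
   UTF-8 string `utf8` to the encoding named by `codeset`, storing a
   NUL-terminated result in `out` (capacity `outlen` bytes).
   Returns 0 on success, -1 on any failure (unknown codeset,
   unrepresentable character, or output buffer too small). */
int
utf8_to_codeset(const char *utf8, const char *codeset,
                char *out, size_t outlen)
{
  iconv_t cd;
  char *src = (char *)utf8;      /* iconv() takes a non-const pointer */
  size_t srcleft = strlen(utf8);
  char *dst = out;
  size_t dstleft;
  size_t rc;

  if (outlen == 0)
    return -1;
  dstleft = outlen - 1;          /* keep one byte for the terminator */

  cd = iconv_open(codeset, "UTF-8");   /* (to, from) */
  if (cd == (iconv_t)-1)
    return -1;

  rc = iconv(cd, &src, &srcleft, &dst, &dstleft);
  if (rc != (size_t)-1)
    /* Flush any pending shift sequence for stateful encodings. */
    rc = iconv(cd, NULL, NULL, &dst, &dstleft);
  iconv_close(cd);

  if (rc == (size_t)-1)
    return -1;

  *dst = '\0';
  return 0;
}
```

A caller would pass whatever encoding the output device really uses. On POSIX that is typically nl_langinfo(CODESET); on Windows the console code page can differ from the locale's codeset, which is why the target has to be chosen by the application rather than left to gettext.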