Re: Missing LOCALE in post-commit hook leads to weird behaviour of `svnlook log` with unicode characters – broken transliterations

2018-01-29 Thread Johan Corveleyn
On Sat, Jan 27, 2018 at 6:35 PM, H.-Dirk Schmitt wrote:
> I found a very weird behaviour of `svnlook log` that IMHO is a bug (or
> at least a serious missing documentation issue).
>
> Introduction
> ------------
>
> Consider a log message like: 'Unicode Test → ø ÄÖÜ'
>
> `svnlook log` invoked in a normal terminal session shows the proper
> content.
> This works because the environment is set to 'en_US.UTF-8'.
>
> Now start to experiment: `env LC_ALL=C.UTF-8 svnlook log` also shows a
> correct result.
>
> Problem
> -------
> But falling back to `env LC_ALL=C svnlook log`, I get a badly mangled
> result:
>
> Unicode Test {U+2192} {U+00F8} AOU
>
> → and ø are replaced with their code point notation.
> The German umlauts are transliterated in a very unusual way.
> In the old ASCII/typewriter days, Ä was transliterated as Ae (Ö → Oe,
> …).
>
> Why this behaviour is not a cosmetic problem
> --------------------------------------------
>
> Consider a post-commit hook fetching the commit message with `svnlook
> log`.
> The purpose is to post-process the log message, e.g. to append it to
> Bugzilla issues.
>
> The actual setup is svn+apache2 with a bash script as the post-commit
> hook. The machine locale as reported by `localectl`: System Locale:
> LANG=en_US.utf8
>
> All transferred commit message content is broken as described
> above.
>
> This happens because the post-commit hook is running with a very
> reduced set of environment variables:
>    PWD=/
>    SHLVL=1
>
> In particular, `LC_ALL` is not set (nor is `LANG`), which is
> equivalent to `LC_ALL=C`.
>
> Suggested Mitigation/Fixing
> ---------------------------
> 1. Subversion should ensure that the system locale is forwarded to the
>    post-commit hook.
> 2. `svnlook` should support the `--encoding` switch.
> 3. German umlauts (and surely some other national characters in the
>    8-bit range) shouldn't be transliterated differently from other
>    Unicode characters (see ø / {U+00F8}).
>
>
> PS: Google et al. haven't shown that this issue is well documented.

This is documented in the official documentation (the "SVN Book"):
http://svnbook.red-bean.com/nightly/en/svn.reposadmin.create.html#svn.reposadmin.hooks.configuration

(see the first sentence there: "By default, Subversion executes hook
scripts with an empty environment—that is, no environment variables
are set at all, not even $PATH (or %PATH%, under Windows).")
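
Given that empty environment, one common workaround is for the hook to set its own locale before it calls `svnlook`. The following is only a minimal sketch, not an official template: the locale name `C.UTF-8` (reported as working in the original post) and the `svnlook` location `/usr/bin/svnlook` are assumptions that need adjusting for the actual server, and the helper name `fetch_log` is made up for illustration.

```shell
#!/bin/sh
# Post-commit hook sketch. Subversion passes the repository path and the
# new revision number as the first two arguments.

# Assumption: the C.UTF-8 locale is installed on the server;
# en_US.UTF-8 would serve the same purpose.
LC_ALL=C.UTF-8
export LC_ALL

# Assumption: location of svnlook. $PATH is not set inside hooks, so an
# absolute path is used instead of relying on command lookup.
SVNLOOK=/usr/bin/svnlook

# Hypothetical helper: print the log message of revision $2 in repository $1.
fetch_log() {
    "$SVNLOOK" log -r "$2" "$1"
}

# Only act when called with the arguments Subversion supplies.
if [ "$#" -ge 2 ]; then
    fetch_log "$1" "$2"
fi
```

With the locale exported this way, `svnlook log` should emit the log message as UTF-8 instead of falling back to the `{U+2192}`-style substitutions shown in the original post.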

-- 
Johan


Re: Missing LOCALE in post-commit hook leads to weird behaviour of `svnlook log` with unicode characters – broken transliterations

2018-01-29 Thread Stefan Sperling
On Sat, Jan 27, 2018 at 06:35:17PM +0100, H.-Dirk Schmitt wrote:
> All transferred commit message content is broken as described
> above.
> 
> This happens because the post-commit hook is running with a very
> reduced set of environment variables:
>    PWD=/
>    SHLVL=1

See http://subversion.apache.org/docs/release-notes/1.8.html#hooks-env
and http://subversion.apache.org/docs/release-notes/1.8.html#mod-dav-svn-utf8
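
As a rough illustration of the first link: since Subversion 1.8 a repository can carry a `conf/hooks-env` file that defines environment variables for its hook scripts. The sketch below writes such a file into a throwaway directory so that it can be run safely; the helper name `write_hooks_env`, the temporary path and the chosen locale value are assumptions for the demonstration, and the file layout (a `[default]` section with `NAME = value` lines) should be checked against the `hooks-env.tmpl` shipped with the installed version.

```shell
#!/bin/sh
# Hypothetical helper: create conf/hooks-env below the repository path
# given as $1, setting a UTF-8 locale for all hooks of that repository.
write_hooks_env() {
    mkdir -p "$1/conf"
    cat > "$1/conf/hooks-env" <<'EOF'
[default]
LANG = en_US.UTF-8
EOF
}

# Demonstration against a temporary directory instead of a real repository.
demo_repos=$(mktemp -d)
write_hooks_env "$demo_repos"
cat "$demo_repos/conf/hooks-env"
```

For repositories served through Apache, the second link describes the matching mod_dav_svn setting (`SVNUseUTF8 On` in the HTTPD configuration); whether it is needed depends on the setup, so treat this as a starting point rather than a complete recipe.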


Searching for a C++ API

2018-01-29 Thread R developer
Hello all,

I have been involved in writing a closed-source application that, among
other things, maintains a checkout from an SVN repository. The application
used to be written in C#, so at the moment we are used to SharpSvn. Recently
the decision was made to incorporate the functionality into a mobile C++
app (yes, I do wish we could use a different language, but for now that's
out of our scope).

Is there a C++ library available somewhere?
One that can be used in a closed-source app and compiles both on Windows
and Linux (and preferably other platforms like OS X, iOS, Android).
Usage of C++11, Boost and/or Qt frameworks is fine as our app already uses
them.

When we started searching, we found several dead ends: either GPL code,
which we can't use, or outdated code that no longer compiles.

What is the preferred/easiest way of doing a simple checkout/update of a
svn repository folder in a c++ application?
(Any leads will be appreciated as we are quite new to this topic)

Thanks,
Richard


Re: Searching for a C++ API

2018-01-29 Thread Branko Čibej
On 29.01.2018 11:45, R developer wrote:
> Hello all,
>
> I have been involved in writing a closed-source application that, among
> other things, maintains a checkout from an SVN repository. The
> application used to be written in C#, so at the moment we are used to
> SharpSvn. Recently the decision was made to incorporate the
> functionality into a mobile C++ app (yes, I do wish we could use a
> different language, but for now that's out of our scope).
>
> Is there a C++ library available somewhere?
> One that can be used in a closed-source app and compiles both on
> Windows and Linux (and preferably other platforms like OS X, iOS, Android).
> Usage of C++11, Boost and/or Qt frameworks is fine as our app already
> uses them.
>
> When we started searching, we found several dead ends: either GPL code,
> which we can't use, or outdated code that no longer compiles.
>
> What is the preferred/easiest way of doing a simple checkout/update of
> a svn repository folder in a c++ application?
> (Any leads will be appreciated as we are quite new to this topic)


There is no C++ API that I'm aware of. However, the C API can quite
handily be used from C++.

-- Brane


Re: Missing LOCALE in post-commit hook leads to weird behaviour of `svnlook log` with unicode characters – broken transliterations

2018-01-29 Thread H.-Dirk Schmitt

Stefan Sperling wrote:
> On Sat, Jan 27, 2018 at 06:35:17PM +0100, H.-Dirk Schmitt wrote:
> > All transferred commit message content is broken as described
> > above.
> > 
> > This happens because the post-commit hook is running with a very
> > reduced set of environment variables:
> >    PWD=/
> >    SHLVL=1
> 
> See http://subversion.apache.org/docs/release-notes/1.8.html#hooks-env
> and http://subversion.apache.org/docs/release-notes/1.8.html#mod-dav-svn-utf8

Johan Corveleyn wrote:
> This is documented in the official documentation (the "SVN Book"):
> [...]
> (see the first sentence there: "By default, Subversion executes hook
> scripts with an empty environment—that is, no environment variables
> are set at all, not even $PATH (or %PATH%, under Windows).")

OK - My „Postscriptum“ was not correct - my apologies.

But these points are still valid:

- Broken transliteration of German umlauts.
- Subversion ignores the machine locale settings, which should
  normally be the default unless overridden in the environment. This is
  considerably bad behaviour for a Linux/Unix application.




-- 
H.-Dirk Schmitt
Dipl.Math.
eMail: dirk.schm...@computer42.org
mobile: +49 177 616 8564
phone: +49 2642 99 41 14
fax: +49 2642 99 41 15
Schillerstr. 42, D-53489 Sinzig
pgp: http://www.computer42.org/~dirk/OpenPGP-fingerprint.html


Re: Missing LOCALE in post-commit hook leads to weird behaviour of `svnlook log` with unicode characters – broken transliterations

2018-01-29 Thread Stefan Sperling
On Mon, Jan 29, 2018 at 04:46:09PM +0100, H.-Dirk Schmitt wrote:
> OK - My „Postscriptum“ was not correct - my apologies.
> 
> But these points are still valid:
> 
> - Broken transliteration of German umlauts.

I don't see a reason to add support for transliteration if
the locale is incompatible. Just use UTF-8. Paths and log
messages are always stored as UTF-8 inside Subversion anyway.

> - Subversion ignores the machine locale settings, which should
> normally be the default unless overridden in the environment. This is
> considerably bad behaviour for a Linux/Unix application.

Generally, I agree that unix applications should heed locale settings,
but servers are a special case.

As mentioned in 
http://subversion.apache.org/docs/release-notes/1.8.html#mod-dav-svn-utf8
the locale behaviour is the result of a policy decision made by the
Apache HTTPD project, namely that all Apache modules run in the "C"
locale and only the "C" locale, even if the system default locale is
something else! Apache HTTPD does not call the setlocale() function.
This is a reasonable trade-off because locale-dependent behaviour could
potentially result in security issues in the webserver. And therefore,
having a webserver module like mod_dav_svn fiddle with the locale and/or
the environment of the running server would be frowned upon.

Hook scripts are generally only interested in the character set
anyway, i.e. LC_CTYPE. All the other locale settings (LC_TIME,
LC_MESSAGES, LC_NUMERIC, etc.) are not critical for hook scripts.
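
To make the role of LC_CTYPE concrete, here is a small sketch of the standard POSIX lookup order for the character-set category: `LC_ALL` wins if set, then `LC_CTYPE`, then `LANG`, and otherwise the "C" locale applies. The function name `effective_ctype` and the sample locale names are purely illustrative; the function only inspects variables and does not check whether a locale is actually installed.

```shell
#!/bin/sh
# Print the locale name that governs character handling, following the
# POSIX precedence: LC_ALL, then LC_CTYPE, then LANG, else "C".
# Empty values are treated like unset ones.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'
    fi
}

# Near-empty hook environment: none of the locale variables are set.
hook_default=$(unset LC_ALL LC_CTYPE LANG; effective_ctype)

# Only the character-set category is configured.
ctype_only=$(unset LC_ALL LANG; LC_CTYPE=C.UTF-8; effective_ctype)

# LC_ALL overrides both LC_CTYPE and LANG.
all_wins=$(LANG=de_DE.UTF-8; LC_CTYPE=C.UTF-8; LC_ALL=en_US.UTF-8; effective_ctype)

printf '%s\n' "$hook_default"   # prints: C
printf '%s\n' "$ctype_only"     # prints: C.UTF-8
printf '%s\n' "$all_wins"       # prints: en_US.UTF-8
```

The first case mirrors the hook environment discussed in this thread: with nothing set, character handling falls back to "C", which is why non-ASCII log messages cannot be represented.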

So we added a custom UTF-8 option to mod_dav_svn to allow SVN users to
configure hook script environments in a way that the default HTTPD
behaviour won't allow for, and to set the character set to UTF-8.
Environment variables set this way are only seen by hook scripts and
do not affect the HTTPD server in any way.

I believe this solution gives you the best of both worlds.

Note that using character sets other than ASCII in hook scripts was
impossible for many years. And the move from ASCII to UTF-8 did happen
a couple of years ago already. I don't think changing this behaviour
again would be worthwhile at this point.
See https://issues.apache.org/jira/browse/SVN-2487 in our bug database.