"Cooke, Mark" <mark.co...@siemens.com> writes: > Quick Summary: subversion (both TortoiseSVN and the command-line > client provided by TSVN) is changing certain characters whilst using > Basic Authentication (over https, from Windows XP) to apache 2.2 (on > Windows Server 2003). So far I have confirmed this for the UK > keyboard `£` (SHIFT-3): > >> When using a browser, I get the following for <shift>-1 >> through <shift>-0 on my UK keyboard (bounded by '[]'): >> >> 2012-04-17 16:03:09.734000 : svntest [!"£$%^&*()] >> >> ...but when I use the svn command line client I log instead: >> >> 2012-04-17 16:01:52.124000 : svntest [!"œ$%^&*()] >> >> Note that the `£` is now different. I think that this explains >> the `Password Mismatch` error? > > Philip Martin has already responded (thanks!) with: > >> Non-ascii passwords are a problem for HTTP because there is >> no standard for encoding the password before constructing the >> digest, nor is there a standard for the client to tell the >> server which encoding it used. Because there is no standard >> clients tend to do different things. Some clients will >> convert the password to UTF-8, some clients will convert to >> some other encoding, and some clients will leave it in whatever >> encoding the user entered. > > ...which helps to explain the problem (except we are using `basic` > plain text, not digest) but I cannot believe that we are the only > subversion users with this problem, what about other users with > non-latin character sets (Russia, Israel etc)?

You have exactly the same problem with basic auth: there is no
standard for encoding non-ASCII passwords. It's generally possible to
adjust the password storage on the server so that any given client
works, but it is not possible to get all clients to work.

Suppose I have a password consisting of a single '£' character. In
ISO-8859-1 that is the single byte 0xA3, in UTF-8 it is the two bytes
0xC2 0xA3. If I combine that with the username pm2, the basic auth
token is given by

  $ echo -n pm2:£ | base64

In ISO-8859-1 this gives 'cG0yOqM=' while in UTF-8 it gives
'cG0yOsKj'.

When you store the password on the server in an htpasswd file you
choose to store either the literal password or a password hash. If
you store the password literally as the line

  pm2:£

you have to choose how to encode the password. If you use the
one-byte form the 'cG0yOqM=' auth token will work, if you use the
two-byte form the 'cG0yOsKj' auth token will work. If you use some
other form, such as UTF-16, then neither of those tokens will work.

It's more usual to store password hashes but the same problem occurs.
If you store the password hash, using a salt AA, it's typically

  $ mkpassword £ AA

In ISO-8859-1 this leads to the line

  pm2:AACiVWnPwZTeE

and the 'cG0yOqM=' token will work. In UTF-8 it leads to the line

  pm2:AAzOZFufPfaOQ

and the 'cG0yOsKj' token will work.

A client like curl does no password encoding conversion, so the
command

  $ curl http://... -u pm2:£

will send the token 'cG0yOqM=' when running in an ISO-8859-1
environment and the token 'cG0yOsKj' when running in UTF-8. Only one
of these will work, depending on how the htpasswd file is set up.

A client like svn converts passwords from the command line or
keyboard to UTF-8, so the command

  $ svn cat http://... --username pm2 --password £

will always send the 'cG0yOsKj' auth token.
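
The echo | base64 command above depends on the encoding of the
terminal it is typed into, so the two tokens are easier to reproduce
when the bytes are spelled out explicitly. The following is only a
minimal Python sketch of that calculation; the helper names
basic_auth_token and token_matches_literal are made up for
illustration, and the byte-for-byte comparison merely stands in for
the literal-password case described above, not for how Apache
actually checks a password.

```python
import base64

POUND = "\u00a3"  # the '£' character, written as an escape to avoid source-encoding surprises


def basic_auth_token(username: str, password: str, encoding: str) -> str:
    """Base64 of 'username:password' after encoding the text with `encoding`."""
    raw = f"{username}:{password}".encode(encoding)
    return base64.b64encode(raw).decode("ascii")


def token_matches_literal(token: str, stored_line: bytes) -> bool:
    """Illustrative check only: compare the decoded token with a literal
    'user:password' line byte for byte."""
    return base64.b64decode(token) == stored_line


latin1_token = basic_auth_token("pm2", POUND, "iso-8859-1")
utf8_token = basic_auth_token("pm2", POUND, "utf-8")
print(latin1_token)  # cG0yOqM=
print(utf8_token)    # cG0yOsKj

# A literal line stored in UTF-8 matches only the UTF-8 token.
stored_utf8 = f"pm2:{POUND}".encode("utf-8")
print(token_matches_literal(utf8_token, stored_utf8))    # True
print(token_matches_literal(latin1_token, stored_utf8))  # False

# A line stored in some other form, such as UTF-16, matches neither token.
stored_utf16 = f"pm2:{POUND}".encode("utf-16")
print(token_matches_literal(utf8_token, stored_utf16))    # False
print(token_matches_literal(latin1_token, stored_utf16))  # False
```

In these terms curl sends whichever of the two tokens matches its
environment, while svn sends the UTF-8 one, 'cG0yOsKj', regardless of
environment.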
The UTF-8 token that svn sends will work if the htpasswd file has
been set up for UTF-8, whatever environment is being used by the
client, but it will fail if the htpasswd file has not been set up for
UTF-8.

Other clients such as TSVN or web browsers may behave like curl, or
they may behave like svn, or they may do something else. By adjusting
the setup on the server you can generally get any given client to
work in any given encoding, but there is no way to get all clients to
work in all encodings.

It gets even more complicated when you consider password caching: the
passwords that Subversion stores are in UTF-8 and Subversion assumes
that they are still UTF-8 when retrieved. However, if the password
store is shared with other clients, say a web browser, then those
other clients may have stored non-UTF-8 passwords and this will cause
Subversion to send non-UTF-8 auth tokens. That works if the server is
set up so that non-UTF-8 tokens work.

--
Philip