On Feb 8, 2012, at 20:00, Phil Pinkerton wrote:

> We have been doing a few hundred svnsyncs from 1.6.5 repositories
> to 1.7.2 repositories.
>
> For the most part this has gone quite well, but we have encountered an
> error that is not too clear, and we seek any insight into this error:
>
> svnsync: E000022: Valid UTF-8 data
> (hex: 53 65 72 76 65 72 20 43 75 72 72 65 6e 63 79 20)
> followed by invalid UTF-8 sequence
> (hex: 96 20 42 61)

Indeed, 0x96 cannot start a UTF-8 sequence (it is only valid as a continuation byte).

    >>> "".join([chr(int(n, 16)).decode("utf-8") for n in "53 65 72 76 65 72 20 43 75 72 72 65 6e 63 79 20".split()])
    u'Server Currency '
    >>> "".join([chr(int(n, 16)).decode("utf-8") for n in "96 20 42 61".split()])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 0: invalid start byte

Does that text appear in a log message? The 1.7 server is stricter about UTF-8. The svnsync command has a new option, --source-prop-encoding, which may be useful if some old client committed a log message in some other encoding.

Regards,
Steve
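
P.S. One way to guess which encoding the old client used is to decode the offending bytes with a few candidates and see which gives sensible text. The sketch below is written for Python 3 and assumes Windows-1252 (cp1252) as the fallback. That is only a guess on my part, since the error message does not say what the source encoding was. The helper name decode_with_fallback is made up for this example. In cp1252, 0x96 is an en dash (U+2013), which would be a plausible character to follow "Server Currency " in a log message.

```python
# Bytes copied from the svnsync error message.
valid_hex = "53 65 72 76 65 72 20 43 75 72 72 65 6e 63 79 20"
invalid_hex = "96 20 42 61"


def decode_with_fallback(data, fallback="cp1252"):
    """Try strict UTF-8 first; if that fails, decode with the fallback.

    The cp1252 default is an assumption, not something the error reports.
    Returns a (text, encoding_used) tuple.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode(fallback), fallback


valid = bytes.fromhex(valid_hex)      # bytes.fromhex() skips the spaces
invalid = bytes.fromhex(invalid_hex)

print(decode_with_fallback(valid))
print(decode_with_fallback(invalid))
```

If the second result reads naturally in the context of the original log message, the same encoding is a reasonable candidate to try with --source-prop-encoding. If it does not, try other fallbacks (for example "latin-1") in the same way.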