[issue10066] xmlrpclib does not handle some non-printable characters properly
New submission from Peter Gyorko:

If I add a string containing non-printable characters to the response, most XML parsers cannot parse the output (I tried with XML-RPC for PHP).

Here is my quick and dirty fix:

--- a/Lib/xmlrpclib.py
+++ b/Lib/xmlrpclib.py
@@ -165,9 +165,18 @@ def _decode(data, encoding, is8bit=re.compile("[\x80-\xff]").search):
     return data
 
 def escape(s, replace=string.replace):
-    s = replace(s, "&", "&amp;")
-    s = replace(s, "<", "&lt;")
-    return replace(s, ">", "&gt;",)
+    res = ''
+    for char in s:
+        char_code = ord(char)
+        if (char_code < 32 and char_code not in (9, 10, 13)) or char_code > 126:
+            res += '\\x%02x' % ord(char)
+        else:
+            res += char
+
+    res = replace(res, "&", "&amp;")
+    res = replace(res, "<", "&lt;")
+    res = replace(res, ">", "&gt;")
+    return res
 
 if unicode:
     def _stringify(string):

--
components: XML
messages: 118376
nosy: gyorkop
priority: normal
severity: normal
status: open
title: xmlrpclib does not handle some non-printable characters properly
type: behavior
versions: Python 2.6

Python tracker <http://bugs.python.org/issue10066>
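For readers on Python 3, the escaping idea in the patch can be sketched as a standalone function. This is only an illustration under stated assumptions: the name escape_nonprintable is hypothetical, the code works on Python 3 str rather than the Python 2 byte strings the patch targets, and it is not what xmlrpclib itself does.

```python
import xml.etree.ElementTree as ET


def escape_nonprintable(s):
    """Hypothetical helper mirroring the patch: backslash-escape, then XML-escape."""
    out = []
    for ch in s:
        code = ord(ch)
        # Keep tab, LF, CR and printable ASCII; spell everything else as \xNN.
        if (code < 32 and code not in (9, 10, 13)) or code > 126:
            out.append('\\x%02x' % code)
        else:
            out.append(ch)
    res = ''.join(out)
    # Same entity escaping as the original escape(), with '&' handled first.
    res = res.replace('&', '&amp;')
    res = res.replace('<', '&lt;')
    return res.replace('>', '&gt;')


escaped = escape_nonprintable('a\x01<b>&\x7f')
# The escaped text can now sit inside an element and still parse.
parsed = ET.fromstring('<string>%s</string>' % escaped).text
```

Note that the receiver sees the literal four characters \x01, not the original byte, so the transformation is lossy unless literal backslashes are escaped as well.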
[issue10066] xmlrpclib does not handle some non-printable characters properly
Peter Gyorko added the comment:

The shortest code that triggers this error is the following:

>>> import xmlrpclib
>>> print xmlrpclib.dumps(('\x01',))

As you can see, the escape function ignores non-printable characters, which can cause a parsing error on the other side.

My previous patch used \x escapes to tell the other side that the value contains some binary garbage. If you want to reject these bytes instead (which was not acceptable in my case), use this patch:

--- a/xmlrpclib.py 2010-10-13 14:45:02.0 +0200
+++ b/xmlrpclib.py 2010-10-13 16:03:14.0 +0200
@@ -165,6 +165,9 @@
     return data
 
 def escape(s, replace=string.replace):
+    if (None != re.search('[\x00-\x08\x0b-\x0c\x0e-\x1f\x7f-\xff]', s)):
+        raise Fault(INVALID_ENCODING_CHAR, 'Non-printable character in string')
+
     s = replace(s, "&", "&amp;")
     s = replace(s, "<", "&lt;")
     return replace(s, ">", "&gt;",)

Another idea: we could use CDATA (http://www.w3schools.com/xml/xml_cdata.asp) to transfer binary values...
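The options discussed in this message can be compared with a small Python 3 sketch, where xmlrpclib is available as xmlrpc.client. The helpers is_well_formed and checked_dumps are hypothetical names introduced for illustration. Unlike the patch, checked_dumps rejects only the C0 control characters that XML 1.0 forbids, since Python 3 strings are Unicode. The last case uses the Binary wrapper (the XML-RPC base64 type), which is a different technique from the CDATA idea above.

```python
import re
import xml.etree.ElementTree as ET
import xmlrpc.client as xmlrpclib  # Python 3 name of the xmlrpclib module

# C0 control characters that XML 1.0 forbids (tab, LF and CR are allowed).
_CONTROL_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def is_well_formed(xml_text):
    """Return True if expat (via ElementTree) accepts xml_text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def checked_dumps(params):
    """Hypothetical wrapper: refuse string params with control characters."""
    for value in params:
        if isinstance(value, str) and _CONTROL_CHARS.search(value):
            raise xmlrpclib.Fault(xmlrpclib.INVALID_ENCODING_CHAR,
                                  'Non-printable character in string')
    return xmlrpclib.dumps(params)


# 1. The reproduction: the control character is written out unchanged.
raw = xmlrpclib.dumps(('\x01',))

# 2. CDATA only changes markup handling; \x01 is still an illegal character.
cdata = '<value><![CDATA[\x01]]></value>'

# 3. Wrapping the bytes in Binary sends them base64-encoded.
binary = xmlrpclib.dumps((xmlrpclib.Binary(b'\x01'),))

for label, payload in (('raw', raw), ('cdata', cdata), ('binary', binary)):
    print(label, is_well_formed(payload))
```

Of the three payloads, only the base64 one is accepted by the parser, and xmlrpclib.loads(binary) returns a Binary object whose data attribute holds the original bytes.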