[issue4426] UTF7 decoding is far too strict

2009-05-04 Thread Nick Barnes
Nick Barnes added the comment: This was my first contribution to Python. I don't know what the rules are on changing the arguments of an internal function such as PyUnicode_EncodeUTF7(). Since I was rewriting the whole function anyway, I tried to give it arguments which made more sense

[issue4426] UTF7 decoding is far too strict

2008-12-01 Thread Nick Barnes
Nick Barnes added the comment: Here is my patch. This is a rewrite of the UTF7 encoder and decoder. It now handles surrogate pairs correctly, so non-BMP characters work with this codec. And my motivating example ('/'.decode('utf7')) works OK. I

[issue4426] UTF7 decoding is far too strict

2008-12-01 Thread Nick Barnes
Nick Barnes added the comment: My original defect report here was incorrect, or possibly only relates to a particular older Python installation. It is still the case that UTF-7 decoding is fussier than it need be (decoding should be permissive), and is broken specif

[issue4426] UTF7 decoding is far too strict

2008-11-27 Thread Nick Barnes
Nick Barnes added the comment: I'll try to get to this next week. Right now I'm snowed under. I don't promise to do any refactoring.

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread Nick Barnes
Nick Barnes added the comment: Well, I could submit a diff for unicodeobject.c, but I have never contributed to Python (or used this particular tracking system) before. Is there a standard form for contributing changes? Unifie

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread Nick Barnes
Nick Barnes added the comment:
# Note, this test covers issues 4425 and 4426
# Direct encoded characters:
set_d = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'(),-./:?"
# Optional direct characters:
set_o = '!"#$%&*

[issue4426] UTF7 decoding is far too strict

2008-11-25 Thread Nick Barnes
New submission from Nick Barnes: UTF-7 decoding raises an exception for any character not in the RFC2152 "Set D" (directly encoded characters). In particular, it raises an exception for characters in "Set O" (optional direct characters), such
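
The permissive decoding asked for in this report can be checked with a short sketch. It assumes a current CPython 3 interpreter, where the UTF-7 codec is expected to accept Set O characters directly; the report itself was filed against Python 2. The name SET_O and its contents are taken from RFC 2152 for illustration and are not copied from the tracker attachment.

```python
# Sketch only: assumes CPython 3 with a UTF-7 codec that decodes permissively.
# SET_O is the "optional direct characters" set as listed in RFC 2152
# (an illustrative constant, not taken from the original test file).
SET_O = '!"#$%&*;<=>@[]^_`{|}'

# Decode the raw ASCII bytes of Set O as UTF-7. A strict decoder would
# raise UnicodeDecodeError here; a permissive one returns them unchanged.
decoded = SET_O.encode('ascii').decode('utf-7')

# Every Set O character should also survive an encode/decode round trip,
# whichever form the encoder chooses to emit.
round_trip_ok = all(ch.encode('utf-7').decode('utf-7') == ch for ch in SET_O)

print(decoded == SET_O, round_trip_ok)
```

On an interpreter that still has the strict decoder described in the report, the first decode would be expected to raise instead.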

[issue4425] UTF7 encoding of slash (character 47) is incorrect

2008-11-25 Thread Nick Barnes
New submission from Nick Barnes: '/'.encode('utf7') returns '+AC8-'. It should return '/'. See RFC 2152. '/'.decode('utf7') raises an exception (this is a special case of a general problem with UTF-7 decodi
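
As a rough illustration of the behaviour this report expects, the following sketch assumes a current CPython 3 interpreter. The report used Python 2, where str had a decode method; in Python 3 the decode calls operate on bytes instead. The variable names are illustrative only.

```python
# Sketch only: assumes CPython 3, where the slash is encoded directly.
# '/' is in RFC 2152 Set D, so it needs no base64 escape.
encoded = '/'.encode('utf-7')

# Decoding a bare slash should succeed rather than raise.
decoded_direct = b'/'.decode('utf-7')

# The escaped form '+AC8-' (base64 of U+002F) is still valid UTF-7 input
# and decodes to the same character.
decoded_escaped = b'+AC8-'.decode('utf-7')

print(encoded, decoded_direct, decoded_escaped)
```

Both input forms decode to the same character; the report only concerns which form the encoder emits and whether the decoder accepts the direct form.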