[issue28531] Improve utf7 encoder memory usage

2016-11-18 Thread STINNER Victor
STINNER Victor added the comment:
> I remember that we fixed bugs introduced by using _PyUnicodeWriter or _PyBytesWriter many months after changing the code.
Yeah, now I recall it (vaguely), that's why I closed the bug :-)

[issue28531] Improve utf7 encoder memory usage

2016-11-18 Thread STINNER Victor
STINNER Victor added the comment: Oh no, now I'm afraid of breaking something :-D I no longer trust our test suite for the UTF-7 codec, so I'm closing the issue :-) Sorry Xiang, but as we said, this rarely used codec is not important enough to require optimization.

[issue28531] Improve utf7 encoder memory usage

2016-11-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I fixed many long-standing bugs in the UTF-7 codec in the past, and I remember that we fixed bugs introduced by using _PyUnicodeWriter or _PyBytesWriter many months after changing the code. Since the UTF-7 codec is rarely used, there is a risk of introducing n

[issue28531] Improve utf7 encoder memory usage

2016-11-18 Thread STINNER Victor
STINNER Victor added the comment: Serhiy Storchaka: "The performance of the UTF-7 codec is not important." Right. "Actually I'm going to propose replacing it with Python implementation." Oh. Sadly, PyUnicode_DecodeUTF7() is part of the stable ABI. Do you want to call the Python codec from th

[issue28531] Improve utf7 encoder memory usage

2016-10-27 Thread Xiang Zhang
Xiang Zhang added the comment: Actually, the patch is not meant to speed up the encoder but to improve the memory allocation strategy by tightening the memory upper bound. The speedup is just a good side effect.
> It is rather in the line of idna and punycode than UTF-8 and UTF-32.
Agre

[issue28531] Improve utf7 encoder memory usage

2016-10-27 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The performance of the UTF-7 codec is not important. Unlike other UTF-* encodings, this is not a standard Unicode encoding. It is used in a minority of applications and is unlikely to be a bottleneck. It is rather in the line of idna and punycode than UTF-8 and UT

[issue28531] Improve utf7 encoder memory usage

2016-10-27 Thread Xiang Zhang
Xiang Zhang added the comment: v2 uses _PyBytesWriter so we can use an on-stack buffer for short strings. Added file: http://bugs.python.org/file45243/utf7_encoder_v2.patch

[issue28531] Improve utf7 encoder memory usage

2016-10-25 Thread Ezio Melotti
Changes by Ezio Melotti: nosy: +ezio.melotti

[issue28531] Improve utf7 encoder memory usage

2016-10-25 Thread Xiang Zhang
Changes by Xiang Zhang: Added file: http://bugs.python.org/file45220/utf7_encoder.patch

[issue28531] Improve utf7 encoder memory usage

2016-10-25 Thread Xiang Zhang
Changes by Xiang Zhang: Removed file: http://bugs.python.org/file45219/utf7_encoder.patch

[issue28531] Improve utf7 encoder memory usage

2016-10-25 Thread Xiang Zhang
New submission from Xiang Zhang: Currently the utf7 encoder uses an aggressive memory allocation strategy: use the worst case 8. We can tighten the worst case. For 1-byte and 2-byte unicodes, the worst case could be 3*n + 2. For 4-byte unicodes, the worst case could be 6*n + 2. There are 2 cases.
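The bounds proposed in the submission can be checked empirically from Python. The following is only a rough sketch, not the patch itself: the helper name utf7_upper_bound is hypothetical, and it assumes that "1 byte and 2 byte unicodes" means strings whose code points all fit in 16 bits, and that "8" means 8 bytes per character.

```python
def utf7_upper_bound(text):
    """Proposed worst-case size in bytes of text.encode('utf-7').

    Hypothetical helper: 3*n + 2 when every code point fits in 16 bits,
    6*n + 2 otherwise (assumed reading of the submission).
    """
    factor = 6 if any(ord(ch) > 0xFFFF for ch in text) else 3
    return factor * len(text) + 2


samples = [
    "\u00e9",            # one non-ASCII BMP character
    "\u00e9a\u00e9",     # shift sequences interrupted by a direct character
    "\U0001F600",        # one non-BMP character (a surrogate pair in UTF-7)
    "plain ascii",
]

for s in samples:
    actual = len(s.encode("utf-7"))
    proposed = utf7_upper_bound(s)
    assumed_current = 8 * len(s)  # assumption: 8 bytes per character
    assert actual <= proposed
    print(f"{s!r}: actual={actual} proposed={proposed} 8*n={assumed_current}")
```

The first three samples are chosen because they reach the proposed bound exactly, which suggests the + 2 term accounts for the opening '+' and closing '-' of a shift sequence.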