[issue14738] Amazingly faster UTF-8 decoding

2012-05-12 Thread STINNER Victor
STINNER Victor added the comment: If the commit makes Python 3.3 faster than Python 3.2, it is an optimisation that should be documented in the What's New in Python 3.3 document.

[issue14738] Amazingly faster UTF-8 decoding

2012-05-11 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thanks to Martin for the review, which allowed me to make a quality patch, and for encouraging further research. Thanks to Antoine for the review, benchmarks, commit, and the original optimization, which served as the basis for my patch.

[issue14738] Amazingly faster UTF-8 decoding

2012-05-10 Thread Antoine Pitrou
Antoine Pitrou added the comment: The patch is now committed. Well done and thanks for your contribution. -- resolution: -> fixed; stage: patch review -> committed/rejected; status: open -> closed

[issue14738] Amazingly faster UTF-8 decoding

2012-05-10 Thread Roundup Robot
Roundup Robot added the comment: New changeset e08c3791f035 by Antoine Pitrou in branch 'default': Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka. http://hg.python.org/cpython/rev/e08c3791f035 -- nosy: +python-dev

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread STINNER Victor
STINNER Victor added the comment:
> The difficulty is that you need to check on both Macs
> with 16-bit and with 32-bit wchar_t.
I don't think that the size of wchar_t is configurable: it should always be 32 bits on Mac OS X.
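One way to check the wchar_t width on a given machine, rather than assume it, is to ask ctypes. This is only a minimal sketch: the name `wchar_bits` is made up for illustration, and the value reflects the C compiler and platform that built the running interpreter.

```python
import ctypes

# Width of the platform's C wchar_t, in bits, as seen by this interpreter.
# `wchar_bits` is a hypothetical name used only for this illustration.
wchar_bits = ctypes.sizeof(ctypes.c_wchar) * 8

print("wchar_t is", wchar_bits, "bits wide")
```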

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread Mark Dickinson
Mark Dickinson added the comment:
> Actually, it should be enough to run the test suite, since we should
> have tests for this.
I just ran the test suite ("python -m test") on OS X 10.6.8 with 'decode_utf8_5.patch' applied. (64-bit --with-pydebug build of Python.) No test failures. test h

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I hacked the code (commented out "#if __APPLE__" in Objects/unicodeobject.c and Modules/python.c) to exercise this branch on Linux and ran the test (test_cmd_line) with the C locale. It passed. Then I broke the decoder and ran the test again to get the error. I can now c

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread Antoine Pitrou
Changes by Antoine Pitrou: -- nosy: +janssen

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> It would be good if someone checked that command-line arguments
> work on Macs, including invalid UTF-8. The difficulty is that you need
> to check on both Macs with 16-bit and with 32-bit wchar_t.
Actually, it should be enough to run the test suite, since w

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Issue4388 is related to this Mac-specific portion of the patch.

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: It would be good if someone checked that command-line arguments work on Macs, including invalid UTF-8. The difficulty is that you need to check on both Macs with 16-bit and with 32-bit wchar_t.
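To illustrate what "invalid UTF-8" means for such a check, here is a small sketch in pure Python. It assumes the command-line decoder behaves like the standard 'surrogateescape' error handler (PEP 383); it does not exercise the C code path touched by the patch, and the sample bytes are made up for illustration.

```python
# Hypothetical argument bytes: "caf" followed by a Latin-1 e-acute (0xE9),
# which is not a complete UTF-8 sequence on its own.
raw = b"caf\xe9"

# Strict decoding rejects the input.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With surrogateescape, the undecodable byte 0xE9 is mapped to the lone
# surrogate U+DCE9 instead of raising an error.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", "surrogateescape")
```

The round trip is the property worth testing: an argument with undecodable bytes should survive decoding and re-encoding unchanged.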

[issue14738] Amazingly faster UTF-8 decoding

2012-05-09 Thread Antoine Pitrou
Antoine Pitrou added the comment: There's a Mac-specific portion in the patch; it would be nice if someone could check that it works. -- nosy: +ned.deily, ronaldoussoren

[issue14738] Amazingly faster UTF-8 decoding

2012-05-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The patch is updated in accordance with Antoine's cosmetic comments. -- Added file: http://bugs.python.org/file25485/decode_utf8_5.patch

[issue14738] Amazingly faster UTF-8 decoding

2012-05-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thank you, Antoine. Finally, Intel Core is defeated! If someone wants to repeat the tests, see the benchmark tools in issue14624.

[issue14738] Amazingly faster UTF-8 decoding

2012-05-06 Thread Antoine Pitrou
Antoine Pitrou added the comment: 64-bit Linux, Intel Core i5 2500K:

                        3.2             3.3             patched
utf-8 'A'*1             2550 (+198%)    6828 (+11%)     7607
utf-8 'A'*+'\x80'       2501 (+118%)    2415 (+126%)
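For readers who want a rough comparison of their own without the issue14624 tools, a timing loop along the following lines can be used. This is only a sketch, not the benchmark behind the figures above: the helper `decode_rate`, the input length `SIZE`, and the MB/s unit are assumptions made for illustration, since the excerpt shows neither the string lengths nor the units.

```python
import timeit


def decode_rate(data: bytes, number: int = 200) -> float:
    """Return an approximate decode throughput in megabytes per second."""
    seconds = timeit.timeit(lambda: data.decode("utf-8"), number=number)
    return len(data) * number / seconds / 1e6


# Assumed input length; the lengths used in the original runs are not shown.
SIZE = 10000

# Two hypothetical inputs in the spirit of the rows above:
# pure ASCII, and ASCII followed by a single non-ASCII character.
samples = {
    "ascii": ("A" * SIZE).encode("utf-8"),
    "ascii + U+0080": ("A" * SIZE + "\x80").encode("utf-8"),
}

if __name__ == "__main__":
    for name, data in samples.items():
        print(f"{name}: {decode_rate(data):.0f} MB/s")
```

Running the same script under two interpreter builds gives a per-build figure for each input; absolute numbers depend heavily on the machine and build options.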

[issue14738] Amazingly faster UTF-8 decoding

2012-05-06 Thread Ezio Melotti
Changes by Ezio Melotti: -- components: +Unicode; nosy: +ezio.melotti; stage: -> patch review

[issue14738] Amazingly faster UTF-8 decoding

2012-05-06 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: I propose a complex patch that significantly speeds up UTF-8 decoding. The decoder is now faster even than the decoder in 3.2 (except in a few unrealistic pathological cases). The decoder code is also reduced and simplified (formerly, decoding code was repeated in at least three