New submission from Serhiy Storchaka <[email protected]>:
Charmap decoders are not as important as UTF decoders, but are still widely
used. In Python 3.3, after PEP 393, they became about 4x slower. The proposed
patch restores the performance.
The patch optimizes only the most common case, where the decoder is specified
by a UCS2 table of length >= 256. Map-based decoders are translated to
table-based ones. UCS1 tables are widened to UCS2 by appending a fake 257th
character.
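The two conversions described above can be illustrated at the Python level. This is only a sketch, not the C code from the patch: the helper names `map_to_table` and `widen_to_ucs2` are hypothetical, and the choice of U+FFFE as the padding character is an assumption. It relies on `codecs.charmap_decode` accepting either a dict or a string table, and on U+FFFE in a table marking an undefined byte.

```python
import codecs


def map_to_table(decoding_map):
    """Turn a {byte: code point or None} dict into a 256-character table."""
    chars = []
    for byte in range(256):
        code_point = decoding_map.get(byte)
        # U+FFFE marks an undefined byte in a charmap decoding table.
        chars.append("\ufffe" if code_point is None else chr(code_point))
    return "".join(chars)


def widen_to_ucs2(table):
    """Append one character above U+00FF to a table that has none.

    Under PEP 393 this makes the string use 2-byte storage.
    """
    if max(table) < "\u0100":
        # Hypothetical padding character (an assumption, not taken from the
        # patch). Index 256 is never reached, since bytes are 0..255.
        return table + "\ufffe"
    return table


# Illustration with an identity (Latin-1-like) map.
identity_map = {byte: byte for byte in range(256)}
table = widen_to_ucs2(map_to_table(identity_map))
text, consumed = codecs.charmap_decode(b"A\xa0", "strict", table)
```

Decoding through the widened table gives the same text as decoding through the original dict, since the extra character sits at an index no byte can select.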
Benchmark results:

                          3.2           3.3 (vanilla)   3.3 (patched)
cp1251 'A'*10000          111 (+10%)    31 (+294%)      122
cp1251 '\xa0'*10000       111 (+8%)     29 (+314%)      120
cp1251 '\u0402'*10000     111 (+6%)     25 (+372%)      118

Percentages in parentheses show how much faster the patched build is than
that column.
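A comparable measurement can be sketched with `timeit`. This is not the script used for the numbers above; the helper name `time_decode` and the repeat counts are assumptions, and it reports time per decode rather than the figures in the table.

```python
import timeit


def time_decode(encoding, text, number=1000):
    """Return the best observed time in seconds for one decode of text."""
    data = text.encode(encoding)
    # Sanity check: the round trip must reproduce the input.
    assert data.decode(encoding) == text
    timings = timeit.repeat(lambda: data.decode(encoding),
                            number=number, repeat=3)
    return min(timings) / number


# Same three inputs as in the benchmark table.
for sample in ("A", "\xa0", "\u0402"):
    seconds = time_decode("cp1251", sample * 10000)
    print(f"cp1251 {sample!r}*10000: {seconds * 1e6:.1f} us per decode")
```

Running the same script under each interpreter build gives numbers that can be compared across versions.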
----------
components: Interpreter Core, Unicode
files: decode_charmap.patch
keywords: patch
messages: 161301
nosy: ezio.melotti, haypo, lemburg, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster charmap decoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25664/decode_charmap.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue14874>
_______________________________________