Matthew Barnett added the comment:
It appears that in your tests Python 3.2 is faster with Unicode than with
bytestrings, and that unpatched Python 3.4 is a lot slower.
I get somewhat different results (Windows XP Pro, 32-bit):
C:\Python32\python.exe -m timeit -s "import re; f = re.compile(b'abc').search; x = b'x'*100000" "f(x)"
1000 loops, best of 3: 449 usec per loop
C:\Python32\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = 'x'*100000" "f(x)"
1000 loops, best of 3: 506 usec per loop
C:\Python32\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = '\u20ac'*100000" "f(x)"
1000 loops, best of 3: 506 usec per loop
C:\Python34\python.exe -m timeit -s "import re; f = re.compile(b'abc').search; x = b'x'*100000" "f(x)"
1000 loops, best of 3: 227 usec per loop
C:\Python34\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = 'x'*100000" "f(x)"
1000 loops, best of 3: 339 usec per loop
C:\Python34\python.exe -m timeit -s "import re; f = re.compile('abc').search; x = '\u20ac'*100000" "f(x)"
1000 loops, best of 3: 504 usec per loop
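The three re measurements above can also be collected in one script, so that
each interpreter runs exactly the same cases. This is only a rough sketch: the
helper names (best_usec, SUBJECTS, run) are made up for illustration, and the
loop count is fixed by hand rather than chosen automatically the way the
timeit command line does.

```python
import re
import timeit

def best_usec(pattern, subject, number=1000, repeat=3):
    """Best-of-`repeat` time for one search call, in microseconds."""
    search = re.compile(pattern).search
    timings = timeit.repeat(lambda: search(subject),
                            number=number, repeat=repeat)
    return min(timings) / number * 1e6

# Same three inputs as the command lines above: bytes, ASCII str, and a
# str made of U+20AC. The labels are illustrative only.
SUBJECTS = [
    ("bytes", b"abc", b"x" * 100000),
    ("str, ASCII", "abc", "x" * 100000),
    ("str, U+20AC", "abc", "\u20ac" * 100000),
]

def run(number=1000):
    """Return {label: usec per loop} for every case in SUBJECTS."""
    return {label: best_usec(pat, subj, number=number)
            for label, pat, subj in SUBJECTS}

if __name__ == "__main__":
    for label, usec in run().items():
        print("%-12s %8.1f usec per loop" % (label, usec))
```

Running the same file under each python.exe should give numbers comparable
to the ones quoted here, apart from the small overhead of the lambda call.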
For comparison, in the regex module I don't duplicate whole sections of code,
but instead have a pointer to one of 3 functions (for UCS1, UCS2 and UCS4) that
gets the codepoint, except for some tight loops. Doing that might be too much
of a change for re.
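To make the idea concrete, here is a toy Python model of that kind of
dispatch. It is not the regex module's actual code (which is C); every name
in it (char_at_ucs1, CHAR_AT, encode_fixed_width, find_literal) is
hypothetical. The matching loop is written once and reads characters through
a function selected up front from the character width.

```python
import struct

# Hypothetical readers: each returns the code point at index i from a raw
# buffer that stores every character at a fixed width of 1, 2 or 4 bytes.
def char_at_ucs1(buf, i):
    return buf[i]

def char_at_ucs2(buf, i):
    return struct.unpack_from("<H", buf, 2 * i)[0]

def char_at_ucs4(buf, i):
    return struct.unpack_from("<I", buf, 4 * i)[0]

CHAR_AT = {1: char_at_ucs1, 2: char_at_ucs2, 4: char_at_ucs4}

def encode_fixed_width(text):
    """Return (buffer, width), using the narrowest width that fits text."""
    top = max(map(ord, text), default=0)
    if top < 0x100:
        return text.encode("latin-1"), 1
    if top < 0x10000:
        return text.encode("utf-16-le", "surrogatepass"), 2
    return text.encode("utf-32-le", "surrogatepass"), 4

def find_literal(text, pattern):
    """Naive literal search; one loop body serves all three widths."""
    buf, width = encode_fixed_width(text)
    char_at = CHAR_AT[width]  # the "pointer", chosen once before the loop
    want = [ord(c) for c in pattern]
    n, m = len(text), len(want)
    for start in range(n - m + 1):
        if all(char_at(buf, start + k) == want[k] for k in range(m)):
            return start
    return -1
```

For example, find_literal("\u20ac\u20acabc", "abc") takes the 2-byte reader
and find_literal("xxabc", "abc") takes the 1-byte one, yet both run the same
loop. The cost is an indirect call per character, which is presumably why
the tight loops are kept as the exception.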
However, the speed of the regex module appears to be a lot more consistent
across string types and Python versions:
C:\Python32\python.exe -m timeit -s "import regex; f = regex.compile(b'abc').search; x = b'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop
C:\Python32\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = 'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop
C:\Python32\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = '\u20ac'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop
C:\Python34\python.exe -m timeit -s "import regex; f = regex.compile(b'abc').search; x = b'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop
C:\Python34\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = 'x'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop
C:\Python34\python.exe -m timeit -s "import regex; f = regex.compile('abc').search; x = '\u20ac'*100000" "f(x)"
10000 loops, best of 3: 113 usec per loop
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue18685>
_______________________________________