Terry J. Reedy <[email protected]> added the comment:
I settled on the following to compare ParseMap implementations.
from idlelib.pyparse import Parser
import timeit

class ParseGet(dict):
    # Every lookup goes through Python-level code.
    def __getitem__(self, key): return self.get(key, ord('x'))

class ParseMis(dict):
    # Python-level code runs only when the key is absent.
    def __missing__(self, key): return ord('x')

for P in (ParseGet, ParseMis):
    print(P.__name__, 'hit', 'miss')
    p = P({i:i for i in (10, 34, 35, 39, 40, 41, 91, 92, 93, 123, 125)})
    # Hits: every key is in the map.
    print(timeit.timeit(
        "p[10],p[34],p[35],p[39],p[40],p[41],p[91],p[92],p[93],p[125]",
        number=100000, globals = globals()))
    # Misses: nine absent keys, plus p[125], which is present.
    print(timeit.timeit(
        "p[11],p[33],p[36],p[45],p[50],p[61],p[71],p[82],p[99],p[125]",
        number=100000, globals = globals()))
ParseGet hit miss
1.104342376
1.112531999
ParseMis hit miss
0.3530207070000002
1.2165967760000003
ParseGet hit miss
1.185322191
1.1915449519999999
ParseMis hit miss
0.3477272720000002
1.317010653
Avoiding custom code for all ASCII chars will be a win. I am sure that calling
__missing__ for non-ASCII chars will be at least as fast as the current code.
I will commit a revision tomorrow.
I may then compare to Serhiy's sub/replace suggestion. My experiments with
'code.translate(tran)' indicate that time grows sub-linearly up to 1000 or
10000 chars. This suggests that there are significant constant or log-like
terms.
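
A rough way to look for such constant terms is to time translate on inputs of increasing length and compare the cost per character. The sketch below is only an illustration under stated assumptions: XMap, tran, and per_char_times are hypothetical names, the sample text is arbitrary, and the numbers will vary by machine and Python version. If the total time is roughly a + b*n, the per-char figure should fall as n grows until the constant a is amortized.

```python
import timeit

class XMap(dict):
    """Hypothetical map: absent code points translate to ord('x')."""
    def __missing__(self, key):
        return ord('x')

# Same 11 keys as in the benchmark earlier in this message.
tran = XMap({i: i for i in (10, 34, 35, 39, 40, 41, 91, 92, 93, 123, 125)})

def per_char_times(sizes=(10, 100, 1000, 10000), number=1000):
    """Return {n: seconds per character} for translating n-char strings."""
    results = {}
    for n in sizes:
        # Arbitrary sample text, repeated and cut to exactly n characters.
        code = ('ab(c)\n' * n)[:n]
        total = timeit.timeit(lambda: code.translate(tran), number=number)
        results[n] = total / (number * n)
    return results

if __name__ == '__main__':
    for n, t in per_char_times().items():
        print(f'{n:>6} chars: {t * 1e9:8.1f} ns/char')
```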
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue32940>
_______________________________________