Dan Gheorghe Haiduc added the comment:
rhettinger wrote [1]:
> The existing setobject code has been finely tuned and micro-optimized over
> the years, giving it excellent performance on workloads we care about.
This concerns me as well, but I suppose the effect of that tuning should show
up in benchmarks. Does Python have "canonical" benchmarks or a speed test
suite like PyPy's [2]?
I am not sure how useful this is, but I made a synthetic line_profiler
benchmark of a naive implementation of set on top of a dict (see the
attached file).
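For context, the dict-backed set is roughly along these lines (a minimal
sketch only; the DictSet class and its method names are mine, and the actual
set_test.py may be structured differently):

    class DictSet:
        """Hypothetical minimal set emulated with a dict; only the keys matter."""

        def __init__(self, iterable=()):
            # dict.fromkeys() gives every key the same dummy value (None).
            self._d = dict.fromkeys(iterable)

        def add(self, item):
            self._d[item] = None

        def discard(self, item):
            self._d.pop(item, None)

        def __contains__(self, item):
            return item in self._d

        def __len__(self):
            return len(self._d)

        def __iter__(self):
            return iter(self._d)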
The benchmark showed roughly the following differences for the dict-based
version:
* Initialization from a 1M-element list: 41-66% slower
* Inserting items until the size doubled: about the same (perhaps slightly
faster, likely because the size I chose did not trigger a rebuild of the
dict's hash table)
* Deletion: 7-11% slower
* Membership test: 2-15% slower
Running the benchmarks in the opposite order (dict first, then set) produced
the other end of each range.
I have not considered memory usage (I have not profiled memory before), but I
suspect it would be higher, since a dict keeps values in addition to keys.
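A rough way to compare the footprint of the two containers themselves (not
the element objects) would be something like the following sketch; these
numbers are not from the attached benchmark:

    import sys

    keys = list(range(1_000_000))
    as_set = set(keys)
    as_dict = dict.fromkeys(keys)   # dict-as-set: every value is the same None

    # sys.getsizeof() measures only the container's own hash table, not the
    # elements, which is exactly the part that differs between set and dict.
    print("set :", sys.getsizeof(as_set))
    print("dict:", sys.getsizeof(as_dict))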
Additionally, initializing smaller structures (length = 100) is worse for the
dict-based version: initialization takes about 2x as long (100% slower),
while the O(1) operations take about the same time.
I suspect methane's implementation linked by xtreak is better (but I have not
tried it).
Profiled with:
kernprof -l set_test.py
python -m line_profiler set_test.py.lprof
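For anyone reproducing this: line_profiler only reports on functions marked
with the profile decorator that kernprof injects at run time, so the
benchmark functions are decorated roughly like this (a hypothetical shape;
membership_test is an illustrative name, not necessarily what is in
set_test.py):

    try:
        profile                     # injected into builtins by `kernprof -l`
    except NameError:
        def profile(func):          # no-op fallback so the script also runs bare
            return func

    @profile
    def membership_test(container, items):
        hits = 0
        for item in items:
            if item in container:   # line_profiler attributes time to this line
                hits += 1
        return hits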
[1] https://mail.python.org/pipermail/python-dev/2019-February/156475.html
[2] https://speed.pypy.org/
--
Added file: https://bugs.python.org/file49602/set_test.py
___
Python tracker
<https://bugs.python.org/issue42368>
___