On 2018-06-27 11:11:37 +1200, Gregory Ewing wrote: > Bart wrote: > > x = set(range(10_000_000)) > > > > This used up 460MB of RAM (the original 100M I tried exhausted the memory). > > > > The advantage of Pascal-style sets is that that same set will occupy > > only 1.25MB, as it is a bit-map. > > That's true, but they're also extremely limited compared to > the things you can do with Python sets. They're really two > quite different things, designed for different use cases. > > Compact sets of integers are possible in Python, but not > as a built-in type. This suggests that they're not needed > much
Also, when such sets are needed, a simple array of bits may not be
sufficient. For example, sets may be sparse, or they may have almost all
except a few bits set. In these cases a tree or RLE representation is
much smaller and faster. There are a number of such implementations
(Judy Arrays and Roaring Bitmaps come to mind[1]). Each has its
advantages and limitations, so its probably better to leave the choice
to the author.
hp
[1] Judy is a C library. Roaring bitmaps are a data structure which has
been implemented in several languages. I haven't checked whether
there are packages on pypi. Shouldn't be too hard to implement if
one needs it.
--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | [email protected] | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
signature.asc
Description: PGP signature
-- https://mail.python.org/mailman/listinfo/python-list
