Hi Adrian,

Very interesting idea; I don't recall seeing this used in any of the reference implementations. On the surface I agree it looks compatible, but I need to think a little more deeply about it.
Cheers,
Micah

On Mon, Mar 30, 2026 at 3:27 PM Adrian Garcia Badaracco <[email protected]> wrote:

> I think I've found a neat trick for making smaller bloom filters:
> https://github.com/apache/arrow-rs/pull/9628
>
> The idea is that you choose a largeish initial bloom filter size and once
> you're done populating it you compress it by folding it onto itself if it
> is sparse.
>
> Does anyone know if this trick is used in any other Parquet implementation?
> As far as I can tell it is compatible with the spec and should cause no
> issues, but I haven't heard of anyone doing this before.
>
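
For anyone following the thread, the folding idea quoted above can be sketched roughly as follows. This is a toy illustration only, not the code from the linked PR and not Parquet's split-block layout. It assumes a plain bit-array Bloom filter whose size is a power of two and whose bit positions are computed as `hash % num_bits`. The names `BloomFilter`, `fold`, `fold_while_sparse`, `max_fill`, and `min_bits` are all hypothetical. Under that assumption, OR-ing the upper half onto the lower half keeps every lookup valid, because `(h % m) % (m / 2) == h % (m / 2)`.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical); num_bits must be a power of two."""

    def __init__(self, num_bits, num_hashes=3):
        assert num_bits > 0 and num_bits & (num_bits - 1) == 0
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive one bit position per seed; positions depend on num_bits
        # only through the final modulo, which is what makes folding safe.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

    def fill_ratio(self):
        return sum(self.bits) / self.num_bits

    def fold(self):
        # OR the upper half onto the lower half, halving the filter size.
        half = self.num_bits // 2
        self.bits = [self.bits[i] or self.bits[i + half] for i in range(half)]
        self.num_bits = half


def fold_while_sparse(bf, max_fill=0.25, min_bits=64):
    # One fold can at most double the fill ratio, so keep folding while the
    # doubled ratio would still stay within max_fill (an assumed threshold).
    while bf.num_bits > min_bits and bf.fill_ratio() * 2 <= max_fill:
        bf.fold()
```

A usage sketch: create `BloomFilter(1024)`, `add` the values, then call `fold_while_sparse(bf)` before serializing. Folding can only set more bits relative to the filter's size, so it raises the false-positive rate but never introduces false negatives.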
