Hi Adrian,
Very interesting idea. I don't recall seeing this used in any of the
reference implementations. On the surface I agree it looks compatible, but
I need to think about it a little more deeply.

Cheers,
Micah

On Mon, Mar 30, 2026 at 3:27 PM Adrian Garcia Badaracco <[email protected]>
wrote:

> I think I've found a neat trick for making smaller bloom filters:
> https://github.com/apache/arrow-rs/pull/9628
>
> The idea is that you choose a largeish initial bloom filter size and, once
> you're done populating it, you compress it by folding it onto itself if it
> is sparse.
>
> Does anyone know if this trick is used in any other Parquet implementation?
> As far as I can tell it is compatible with the spec and should cause no
> issues, but I haven't heard of anyone doing this before.
>
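
For anyone following the thread, here is a minimal sketch of the folding idea
in Python. It is not the code from the PR and it is not Parquet's split-block
bloom filter: it assumes a plain bloom filter whose bit positions are taken
modulo the filter size, so that OR-ing the upper half onto the lower half
gives exactly the filter that would have been built at half the size. All
names (bit_positions, build, fold, fold_while_sparse), the hash scheme, and
the max_fill threshold are hypothetical choices made for illustration.

```python
import hashlib


def bit_positions(item: bytes, num_bits: int, num_hashes: int = 4):
    """Derive bit positions in [0, num_bits) by double hashing (assumed scheme)."""
    digest = hashlib.sha256(item).digest()
    h1 = int.from_bytes(digest[:8], "little")
    h2 = int.from_bytes(digest[8:16], "little") | 1
    return [(h1 + i * h2) % num_bits for i in range(num_hashes)]


def build(items, num_bits: int) -> int:
    """Build a filter as a Python int used as a bitset."""
    bits = 0
    for item in items:
        for pos in bit_positions(item, num_bits):
            bits |= 1 << pos
    return bits


def might_contain(bits: int, num_bits: int, item: bytes) -> bool:
    """True if every bit for the item is set (false positives are possible)."""
    return all((bits >> pos) & 1 for pos in bit_positions(item, num_bits))


def fold(bits: int, num_bits: int):
    """Halve the filter by OR-ing the upper half onto the lower half.

    Works because x % (m // 2) == (x % m) % (m // 2) when m is even.
    """
    half = num_bits // 2
    return (bits & ((1 << half) - 1)) | (bits >> half), half


def fold_while_sparse(bits: int, num_bits: int,
                      max_fill: float = 0.25, min_bits: int = 64):
    """Keep folding while the fraction of set bits stays below max_fill.

    max_fill and min_bits are arbitrary illustration values, not tuned ones.
    """
    while (num_bits % 2 == 0 and num_bits > min_bits
           and bin(bits).count("1") / num_bits < max_fill):
        bits, num_bits = fold(bits, num_bits)
    return bits, num_bits
```

Under these assumptions, a lookup against the folded filter only needs the new
size, since the positions are recomputed modulo that size. Each fold roughly
doubles the fill ratio, so the false-positive rate rises as the filter shrinks.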
