madeirak commented on issue #11918: URL: https://github.com/apache/iceberg/issues/11918#issuecomment-2572732344
> How about using [parquet-cli](https://github.com/apache/parquet-java/tree/master/parquet-cli)? The `footer` option provides the bloom filter's offset and length. Also, we can use the `bloom-filter` option to check whether a column contains a bloom filter, and whether the provided value doesn't exist or maybe exists in the file:
>
> ```shell
> ➜ parquet bloom-filter -c a -v 2 20250106_091836_00029_airuh-4a353534-10e4-4d7e-8d38-4b9edab7bf1b.parquet
>
> Row group 0:
> --------------------------------------------------------------------------------
> column a has no bloom filter
>
> ➜ parquet bloom-filter -c b -v 1 20250106_091836_00029_airuh-4a353534-10e4-4d7e-8d38-4b9edab7bf1b.parquet
>
> Row group 0:
> --------------------------------------------------------------------------------
> value 1 NOT exists.
> ➜ parquet bloom-filter -c b -v 2 20250106_091836_00029_airuh-4a353534-10e4-4d7e-8d38-4b9edab7bf1b.parquet
>
> Row group 0:
> --------------------------------------------------------------------------------
> value 2 maybe exists.
> ```

Thanks for the reply. It seems to work; I'll try it.
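
For context on the two answers in the quoted output ("NOT exists" versus "maybe exists"): a bloom filter can rule a value out with certainty, but it can only say a value is possibly present. The toy Python sketch below illustrates that asymmetry. It is not how Parquet stores its filters (Parquet uses a split-block bloom filter with xxHash), and the class name `TinyBloomFilter`, its sizes, and its SHA-256-based hashing are assumptions made purely for illustration.

```python
import hashlib


class TinyBloomFilter:
    """Toy bloom filter for illustration; not Parquet's split-block format."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value: str):
        # Derive num_hashes bit positions by salting the value with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def check(self, value: str) -> str:
        # Any unset bit proves the value was never added.
        if any(not (self.bits >> pos) & 1 for pos in self._positions(value)):
            return "NOT exists"
        # All bits set: the value was added, or other values set the same bits.
        return "maybe exists"


bf = TinyBloomFilter()
bf.add("2")
print(bf.check("2"))  # a value that was added always reports "maybe exists"
print(bf.check("1"))  # a value that was never added is usually ruled out
```

The practical reading for the CLI output is the same: "NOT exists" means the row group can be skipped for that value, while "maybe exists" means the data still has to be read to know for sure.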