pitrou opened a new issue, #48334:
URL: https://github.com/apache/arrow/issues/48334

   ### Describe the enhancement requested
   
   Currently, trying to read a bloom filter from an encrypted Parquet file raises an exception.
   
   It would be nice to implement this at some point. Two pieces of data need to be [decrypted separately](https://github.com/apache/parquet-format/blob/master/Encryption.md#442-aad-suffix): the Thrift-serialized bloom filter header ("BloomFilter Header" with module id 8), and the bloom filter data that follows it ("BloomFilter Bitset" with module id 9).
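
To illustrate why the two pieces need separate decryption, here is a minimal sketch of how the AAD suffix for each module could be built. It assumes, based on my reading of the linked section of the encryption spec, that the suffix for bloom filter modules is the file-unique identifier, then one module-type byte, then the row group and column ordinals as 2-byte little-endian values, with no page ordinal. `AppendInt16LE` and `MakeBloomFilterAadSuffix` are hypothetical names for this sketch, not existing Arrow functions.

```cpp
#include <cstdint>
#include <string>

// Module ids for the two bloom filter modules, as quoted in the issue text.
constexpr int8_t kBloomFilterHeader = 8;
constexpr int8_t kBloomFilterBitset = 9;

// Hypothetical helper: append a 16-bit ordinal in little-endian byte order.
void AppendInt16LE(std::string* out, int16_t value) {
  const auto u = static_cast<uint16_t>(value);
  out->push_back(static_cast<char>(u & 0xFF));
  out->push_back(static_cast<char>((u >> 8) & 0xFF));
}

// Hypothetical helper: build the AAD suffix for a bloom filter module.
// Assumed layout: file_unique | module type | row group ordinal | column ordinal.
std::string MakeBloomFilterAadSuffix(const std::string& file_unique,
                                     int8_t module_type,
                                     int16_t row_group_ordinal,
                                     int16_t column_ordinal) {
  std::string aad = file_unique;
  aad.push_back(static_cast<char>(module_type));
  AppendInt16LE(&aad, row_group_ordinal);
  AppendInt16LE(&aad, column_ordinal);
  return aad;
}
```

Because the module-type byte differs (8 versus 9), the header and the bitset of the same column chunk get different AADs, so each has to be authenticated and decrypted in its own call.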
   
   Some inspiration can be found in the `PageIndex` implementation: see
   https://github.com/apache/arrow/blob/b2e8f2505ba3eafe65a78ece6ae87fa7d0c1c133/cpp/src/parquet/page_index.cc#L259-L269
   and
   https://github.com/apache/arrow/blob/b2e8f2505ba3eafe65a78ece6ae87fa7d0c1c133/cpp/src/parquet/page_index.cc#L970-L973
   
   ### Component(s)
   
   C++, Parquet


