rameshkanna3 opened a new issue, #12888: URL: https://github.com/apache/iceberg/issues/12888
### Query engine

### Description

Hi Iceberg Community,

I'm facing a performance issue when querying an Iceberg table that is partitioned on a high-cardinality field (`doc_id`) while applying a filter on a different column (`user`).

**Context:**

My table is defined as:

```sql
CREATE TABLE catalog.schem.table (
   "doc_id" varchar NOT NULL,
   "user" varchar NOT NULL
)
WITH (
   delete_mode = 'copy-on-write',
   format = 'PARQUET',
   format_version = '2',
   location = '<table-location>',
   partitioning = ARRAY['doc_id']
)
```

At query time, I'm using the Java API and applying a filter on the `user` column, as shown below:

```java
Expression userOrGroupFilter = Expressions.in("user", usersNameList); // ['usr1', 'usr2', 'usr3', ...]
CloseableIterable<Record> records = IcebergGenerics.read(table)
    .where(userOrGroupFilter)
    .select("doc_id")
    .build();
```

For example, when I query a table with ~2000 unique `doc_id` values:

- There are ~2000 unique `doc_id` values, and therefore ~2000 partitions / Parquet files.
- When the filter query above runs, it allocates 2000+ decompressor instances, making the application slow or appear stuck (it takes nearly 3 minutes to fetch 2k records using IcebergGenerics).

```
Got brand-new decompressor [.gz]
Got brand-new decompressor [.gz]
...
```

- Although the filter is applied only on `user`, the table is partitioned on `doc_id`, so the scan ends up reading every partition. This leads to unnecessary reads and decompressor overhead, especially at scale (~2000 small Parquet files).
- This makes the application extremely slow and, in some cases, seemingly stuck due to the overhead.

**Environment:**

- Iceberg version: 1.6.1
- Format: Parquet
- Querying via: Java API (IcebergGenerics)
- Data volume: ~2000 partitions (one per `doc_id`)

### Question

1. Is there a better way to efficiently read from such high-cardinality partitioned tables using the Iceberg APIs without causing so many decompressor allocations, instead of this approach?

```java
Expression userOrGroupFilter = Expressions.in("user", usersToFilter);
CloseableIterable<Record> records = IcebergGenerics.read(table)
    .where(userOrGroupFilter)
    .select("doc_id")
    .build();
```

2. Are there any recommendations to reduce the decompressor overhead (e.g., config tuning, scan planning optimizations)?
3. Is this behavior expected when filtering on a column (`user`) that is not part of the partition spec?

**Goal:** Improve read performance when filtering by `user` on a table partitioned by `doc_id`, and reduce the decompressor overhead caused by scanning every partition unnecessarily.

Any best practices or tuning guidance would be greatly appreciated. Thank you!
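For reference, here is a minimal sketch (assuming a loaded `table` handle; the helper name `countPlannedFiles` is just for illustration and not part of my setup) that checks how many data files Iceberg's scan planning keeps after applying the same filter. If it reports ~2000 files, the metadata-level filtering (partition pruning plus column min/max stats) on `user` is not pruning anything, and every small Parquet file will be opened and decompressed:

```java
import java.io.IOException;

import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.expressions.Expression;
import org.apache.iceberg.expressions.Expressions;
import org.apache.iceberg.io.CloseableIterable;

// Counts how many data files survive metadata-level filtering
// (partition pruning + column stats) for the given filter expression.
static long countPlannedFiles(Table table, Expression filter) throws IOException {
  long fileCount = 0;
  try (CloseableIterable<FileScanTask> tasks =
      table.newScan().filter(filter).select("doc_id").planFiles()) {
    for (FileScanTask task : tasks) {
      fileCount++;
    }
  }
  return fileCount;
}

// Example usage with the same kind of filter as above:
// long n = countPlannedFiles(table, Expressions.in("user", "usr1", "usr2", "usr3"));
```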