qlong opened a new issue, #16448:
URL: https://github.com/apache/iceberg/issues/16448

   ### Feature Request / Improvement
   
   Iceberg can write shredded variant columns to Parquet (#14297). On the read 
path,
   `SparkScanBuilder` does not implement Spark 4.1's 
`SupportsPushDownVariantExtractions`,
   so Spark never rewrites `variant_get(...)` into struct field accesses and 
never prunes the
   scan output schema to the requested shredded fields.
   
   As a result, queries that only need one or two paths (e.g. 
`variant_get(payload, '$.size', 'long')`)
   still read the full shredded Parquet layout for each variant column 
(hundreds of
   `typed_value.*` columns in practice), materialize a full `VARIANT` value per 
row, and evaluate
   `variant_get` in a Spark `Filter` above `BatchScan`. That reconstruction is 
expensive: on the
   GitHub Activities 1-day shredded Iceberg table (GHA), a simple filter/count 
query was ~14×
   slower than the same workload on a plain JSON string column (~63s vs ~4.4s), 
despite only
   ~1.6× more storage.
   
   This issue implements the DSv2 contract (plan rewrite). A follow-on change 
will wire the annotated readSchema() into the Parquet reader and avoid 
full-variant reconstruction (I/O reduction).
   
   **Plan rewrite example**
   For this query:
   ```sql
   CREATE TABLE events (
     id INT,
     type STRING,
     payload VARIANT
   ) USING iceberg
   TBLPROPERTIES ('format-version' = '3'); 
   
   SELECT count(*) AS large_events
   FROM events
   WHERE type = 'PushEvent'
     AND variant_get(payload, '$.size', 'long') > 5;
   ```
   
   **Before (today — no SupportsPushDownVariantExtractions)**
   ```
   
   Aggregate [count(1)]
   +- Filter (... type = PushEvent)
                 AND (variant_get(payload#22, $.size, LongType, ...) > 5))    ← 
still a function call
         +- RelationV2[type#19, payload#22]
                       ↑
                 payload is VariantType (full variant)
   
   Filter (variant_get(payload#22, $.size, ...) > 5)     ← runs per row AFTER 
scan
     +- BatchScan [type#15, payload#18]
          IcebergScan(..., filters=type IS NOT NULL, payload IS NOT NULL, type 
= 'PushEvent')
          ReadSchema: full payload variant / all shredded columns
   ```
   
   **After (DSv2 plan rewrite)**
   ```
   Aggregate [count(1)]
   +- Filter (... type = PushEvent)
                 AND (payload#25.0 > 5))                    ← struct field 
access, not variant_get
         +- RelationV2[type#24, payload#25]
                       ↑
                 payload is StructType { "0": LongType [VariantMetadata 
path=$.size] }
   
   Filter (isnotnull(payload#25) AND (payload#25.0 > 5))     ← compares long 
field directly
     +- BatchScan [type#24, payload#25]
          readSchema(): struct<payload: struct<0: bigint, ...>>
   ```
   
   **Note**
   
   This issue changes the logical/physical plan shape and readSchema() 
contract. Parquet may still read all shredded columns  until the follow-on 
wires readSchema() into the reader.
   
   
   ### Query engine
   
   Spark
   
   ### Willingness to contribute
   
   - [x] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to