dejangvozdenac opened a new issue, #13328:
URL: https://github.com/apache/iceberg/issues/13328

   ### Apache Iceberg version
   
   1.9.1 (latest release)
   
   ### Query engine
   
   Trino
   
   ### Please describe the bug 🐞
   
   In Spark, we create a table with a nested struct field `address.street`. The 
outer field `address` is optional, but the inner field `street` is required. 
When Trino runs a query with the condition `address.street is null` and 
projection pushdown disabled, it reads the entire file and returns the rows 
where `address` is null (and thus `address.street` is null). With projection 
pushdown enabled, Trino delegates the planning decision to Iceberg, which seems 
to find no eligible files to read, so no rows are returned.
   
   I can't find anything in the docs that specifies the correct behavior here 
(that is, whether `address.street is null` means that `address` exists and 
`address.street` is null, or that `address.street` is not set in that row in any 
way), but Iceberg and Trino need to agree on it.
   
   Here are the spark-sql commands that I used to create the table:
   ```
   spark-sql>  CREATE TABLE default.dejan_test (
     id INT NOT NULL,
     name STRING NOT NULL,
     age INT NOT NULL,
     address STRUCT<street: STRING NOT NULL, address_info: STRUCT<city: STRING NOT NULL, county: STRING NOT NULL, state: STRING NOT NULL>>)
   USING iceberg;
   spark-sql> INSERT INTO default.dejan_test (id, name, age, address)
   VALUES (
     0, 
     'Jane Doe', 
     27, 
     NULL
   );
   spark-sql> INSERT INTO default.dejan_test (id, name, age, address)
   VALUES (
     1, 
     'John Doe', 
     30, 
     STRUCT(
       '123 Main St',
       STRUCT('San Francisco', 'San Francisco County', 'California')
     )
   );
   ```
   
   Here are the two different results we get from Trino:
   ```
   trino> 
   set session iceberg.projection_pushdown_enabled=false;
   SET SESSION
   trino> 
   select
     id
   from
     iceberg.default.dejan_test
   where
     address.street is null;
    id 
   ----
     0 
   (1 row)
   
   Query 20250613_033713_00001_xn59q, FINISHED, 1 node
   Splits: 2 total, 2 done (100.00%)
   2.85 [2 rows, 4.43KiB] [0 rows/s, 1.56KiB/s]
   
   
   trino> 
   set session iceberg.projection_pushdown_enabled=true;
   SET SESSION
   trino> 
   select
     id
   from
     iceberg.default.dejan_test
   where
     address.street is null;
    id 
   ----
   (0 rows)
   
   Query 20250613_034027_00008_xn59q, FINISHED, 1 node
   Splits: 1 total, 1 done (100.00%)
   0.36 [0 rows, 0B] [0 rows/s, 0B/s]
   ```
   
   
   
   Full issue and reproduction steps can be found here: 
https://github.com/trinodb/trino/issues/20511#issuecomment-2968932230
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [x] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time

