dor-bernstein opened a new issue, #2452:
URL: https://github.com/apache/iceberg-rust/issues/2452

   ### Describe the bug
   
   When using Apache Comet 0.16 with Iceberg (Spark 3.5.6, AWS Glue catalog), scanning certain Parquet files fails with:
   
   ```
   org.apache.comet.CometNativeException: Iceberg scan error: Unexpected => file scan task generate failed, source: Unexpected => Parquet file metadata does not contain a column index
   ```
   
   This appears to affect Parquet files that were written without column indexes, for example older or migrated files produced before writers emitted column index metadata by default.
   
   ### Steps to reproduce
   
   1. Use Apache Comet 0.16 with Spark 3.5.6 and Iceberg (AWS Glue catalog)
   2. Run a query against an Iceberg table whose Parquet files lack a column index in their metadata
   
   ### Expected behavior
   
   Iceberg should gracefully handle Parquet files that have no column index, either by falling back to row group statistics or by skipping column index pruning.
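
To illustrate the requested behavior, here is a minimal, std-only Rust sketch of the fallback. It is not iceberg-rust code: `MinMax`, `RowGroupMeta`, `ScanPlan`, and `plan_row_group` are hypothetical names invented for this example, and the predicate is simplified to an equality check on a single integer column. The point is only that a missing column index is treated as "no page-level pruning available" rather than as an error.

```rust
// Hypothetical, simplified model. None of these types come from
// iceberg-rust or the parquet crate.

/// Min/max statistics for one column over some range of rows.
#[derive(Debug, Clone, Copy)]
struct MinMax {
    min: i64,
    max: i64,
}

impl MinMax {
    /// True if a row equal to `target` could fall inside this range.
    fn may_contain(&self, target: i64) -> bool {
        self.min <= target && target <= self.max
    }
}

/// Metadata for one row group. `column_index` is `None` for files
/// written without page-level index metadata.
struct RowGroupMeta {
    stats: Option<MinMax>,
    column_index: Option<Vec<MinMax>>, // one entry per data page
}

/// What the scan should do with a row group.
#[derive(Debug, PartialEq)]
enum ScanPlan {
    SkipRowGroup,
    ReadPages(Vec<usize>),
    ReadAllPages,
}

/// Plan a scan for the predicate `column == target`.
fn plan_row_group(meta: &RowGroupMeta, target: i64) -> ScanPlan {
    // Row group statistics, when present, can rule out the whole group.
    if let Some(stats) = meta.stats {
        if !stats.may_contain(target) {
            return ScanPlan::SkipRowGroup;
        }
    }
    match &meta.column_index {
        // Column index present: keep only pages that may match.
        Some(pages) => ScanPlan::ReadPages(
            pages
                .iter()
                .enumerate()
                .filter(|(_, page)| page.may_contain(target))
                .map(|(i, _)| i)
                .collect(),
        ),
        // Column index absent: not an error, just no page pruning.
        None => ScanPlan::ReadAllPages,
    }
}
```

With this shape, a row group that has statistics but no column index is still pruned at row-group granularity, and a row group with neither is simply read in full.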
   
   ### Additional context
   
   Reported in [apache/datafusion-comet#4125](https://github.com/apache/datafusion-comet/issues/4125#issuecomment-4431270259). A Comet maintainer suggested this is likely an iceberg-rust issue since it relates to migrated Parquet files that lack column index metadata.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

