GabrielM98 opened a new issue, #404: URL: https://github.com/apache/iceberg-go/issues/404
### Apache Iceberg version

main (development)

### Please describe the bug 🐞

There appears to be an issue with the way in which partition filtering is applied to manifest entries when partitioning by a struct field. In my case, I have a table with the following partition spec:

```json
{
  "spec-id": 3,
  "fields": [
    {
      "name": "event_metadata.timing.created_at_year",
      "transform": "year",
      "source-id": 19,
      "field-id": 1000
    },
    {
      "name": "event_metadata.timing.created_at_month",
      "transform": "month",
      "source-id": 19,
      "field-id": 1001
    },
    {
      "name": "user_uuid_bucket_256",
      "transform": "bucket[256]",
      "source-id": 5,
      "field-id": 1002
    }
  ]
}
```

When I then attempt to query said table, the call to `(*table.Scan).PlanFiles` returns an empty `[]table.FileScanTask`. With Delve & GoLand, I believe I've managed to narrow down the issue to the `getPartitionRecord` function [here](https://github.com/apache/iceberg-go/blob/091352672b4191a4bb11b603c1fb9bd2ab6c2aaf/table/scanner.go#L116):

```go
func getPartitionRecord(dataFile iceberg.DataFile, partitionType *iceberg.StructType) partitionRecord {
	partitionData := dataFile.Partition()

	out := make(partitionRecord, len(partitionType.FieldList))
	for i, f := range partitionType.FieldList {
		out[i] = partitionData[f.Name]
	}

	return out
}
```

It's returning a `partitionRecord` like so:

<img width="315" alt="Image" src="https://github.com/user-attachments/assets/1485f26e-2f56-4090-a279-e4ed105d0bb6" />

Whereas the first and second element in the slice should be `55` and `663` respectively.

If we look at the `Name` field values for the first two elements of `partitionType.FieldList`, they're given as `event_metadata.timing.created_at_year` and `event_metadata.timing.created_at_month`, whereas in the `partitionData` map, the keys that correspond to these fields are given as `event_metadata_x2Etiming_x2Ecreated_at_year` and `event_metadata_x2Etiming_x2Ecreated_at_month`.
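To illustrate the mismatch in isolation, here is a minimal standalone sketch. It does not use iceberg-go: `sanitizeName` and `lookup` are hypothetical helpers, and the escaping rule (any character other than an ASCII letter, digit, or underscore becomes `_x` plus its uppercase hex code) is an assumption inferred from the `_x2E` pattern in the decoded keys above. It is not necessarily how a fix should be implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// isPlain reports whether r can appear unescaped under the assumed rule.
func isPlain(r rune) bool {
	return r == '_' ||
		(r >= 'a' && r <= 'z') ||
		(r >= 'A' && r <= 'Z') ||
		(r >= '0' && r <= '9')
}

// sanitizeName is a hypothetical helper, not an iceberg-go API. It mimics the
// escaping seen in the decoded manifest keys, e.g. '.' becomes "_x2E".
func sanitizeName(name string) string {
	var sb strings.Builder
	for _, r := range name {
		if isPlain(r) {
			sb.WriteRune(r)
		} else {
			fmt.Fprintf(&sb, "_x%X", r)
		}
	}
	return sb.String()
}

// lookup tries the field name as-is first, then its escaped form.
func lookup(partitionData map[string]any, fieldName string) any {
	if v, ok := partitionData[fieldName]; ok {
		return v
	}
	return partitionData[sanitizeName(fieldName)]
}

func main() {
	// Keys as they appear after decoding the Avro manifest entry.
	partitionData := map[string]any{
		"event_metadata_x2Etiming_x2Ecreated_at_year":  55,
		"event_metadata_x2Etiming_x2Ecreated_at_month": 663,
	}
	name := "event_metadata.timing.created_at_year"

	fmt.Println(partitionData[name])         // direct access by field name: <nil>
	fmt.Println(lookup(partitionData, name)) // access via the escaped name: 55
}
```

The direct map access reproduces the `nil` value, while the lookup through the escaped name finds the expected partition value.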
Consequently, the access to the `partitionData` map by field name returns a `nil` empty interface. The partition filter is then evaluated against these `nil` values rather than the real partition values, so the manifest entry is filtered out, resulting in an empty slice of `table.FileScanTask`.

From what I can tell, the `partitionData` map comes from the `iceberg.DataFile` that is instantiated from decoding the Avro manifest entry. Placing a breakpoint [here](https://github.com/apache/iceberg-go/blob/091352672b4191a4bb11b603c1fb9bd2ab6c2aaf/manifest.go#L489) in the `fetchManifestEntries` function, I can see the following as the result of the decoded Avro:

<img width="1848" alt="Image" src="https://github.com/user-attachments/assets/6dbaee1c-c5ee-4e9e-813d-88047d45e339" />

Is this maybe some Avro decoding quirk that hasn't been accounted for? I don't believe there are any issues with the manner in which the data is being written, as I've been able to reproduce this regardless of whether I've written the data via Spark or the Iceberg sink connector.

Thanks in advance!