weijiii commented on issue #12971:
URL: https://github.com/apache/iceberg/issues/12971#issuecomment-2864714088

   > It seems that in your case someone is producing the equality delete file 
with custom code using Iceberg Java SDK. Are the identifier fields set for the 
table schema? Validation above should fail such schema change. If a writer is 
not conforming to the spec, it is an implementation bug of the writer.
   
   Ah, interesting. I take it you mean that when we use `RowDelta::addDeletes` 
(assuming this is the correct API) to add a `DeleteFile`, the 
`identifier-field-ids` would also be set if they are not already, and that this 
would constitute a schema change? I am new to the project, so I will have to get 
familiar with the code base here 😄 
   
   > Schema class already validates the field types of identifier fields.
   
https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/Schema.java#L157
   
   But I am curious: isn't this method validating the constraints for 
identifier fields specifically? The [equality-delete 
spec](https://iceberg.apache.org/spec/?h=equality#equality-delete-files) says 
that columns used in equality delete files are exempt from some of those 
constraints, right?
   ```
   The column restrictions for columns used in equality delete files are the 
same as those for [identifier 
fields](https://iceberg.apache.org/spec/?h=equality#identifier-field-ids) with 
the exception that optional columns and columns nested under optional structs 
are allowed (if a parent struct column is null it implies the leaf column is 
null).
   ```
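
   To make the difference concrete, here is a minimal sketch of the two rules 
as I read them from the spec. It is a toy model only, not Iceberg's 
implementation: the `Field` class, the type names, and both function names are 
made up for illustration, and I am assuming the identifier-field restrictions 
are "required, primitive, not float/double, nested only under required structs".

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for a schema column; not an Iceberg class.
@dataclass(frozen=True)
class Field:
    name: str
    type: str                      # e.g. "int", "string", "double", "struct", "list", "map"
    required: bool
    parent: Optional["Field"] = None

FLOATING = {"float", "double"}
NESTED = {"struct", "list", "map"}


def _violations(field: Field, allow_optional: bool) -> List[str]:
    """Collect rule violations; allow_optional relaxes the nullability checks."""
    problems = []
    if field.type in NESTED:
        problems.append(f"{field.name}: must be a primitive type")
    if field.type in FLOATING:
        problems.append(f"{field.name}: float/double is not allowed")
    if not field.required and not allow_optional:
        problems.append(f"{field.name}: must be required")

    # Walk up the ancestors: only structs may contain the column.
    parent = field.parent
    while parent is not None:
        if parent.type != "struct":
            problems.append(f"{field.name}: nested under {parent.type} '{parent.name}'")
        elif not parent.required and not allow_optional:
            problems.append(f"{field.name}: nested under optional struct '{parent.name}'")
        parent = parent.parent
    return problems


def identifier_field_violations(field: Field) -> List[str]:
    """Strict rule set, as assumed for identifier fields."""
    return _violations(field, allow_optional=False)


def equality_delete_violations(field: Field) -> List[str]:
    """Same rules, except optional columns and optional parent structs are allowed."""
    return _violations(field, allow_optional=True)


# A required leaf under an optional struct: the case the exception is about.
location = Field("location", "struct", required=False)
zip_code = Field("zip", "int", required=True, parent=location)
print(identifier_field_violations(zip_code))
print(equality_delete_violations(zip_code))
```

   Under this reading, a check written for identifier fields would reject 
columns that the equality-delete rules still permit, which is what I am 
asking about.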
   
   FWIW, the case I am seeing uses the delete writer directly; e.g. see this 
[usage](https://github.com/trinodb/trino/blob/993a08840651a3e7c7a6e16ffbbc6bc2c02346af/plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/util/EqualityDeleteUtils.java#L74)
 in the Trino tests. I was working on this Trino 
[PR](https://github.com/trinodb/trino/pull/25729) to allow equality deletes on 
columns of struct type (row type in Trino).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

