mdwint commented on issue #2426:
URL: https://github.com/apache/iceberg-python/issues/2426#issuecomment-3257392284
> So I think the filter expressions are now correct, but the upsert logic still doesn't handle updates with null values correctly. Wild guess: maybe it has to do with the way null values are compared (null doesn't equal itself?).

I've isolated the problem to step 3 in `get_rows_to_update`:

https://github.com/apache/iceberg-python/blob/075a966c96fbb113197652df1ccc8997f39bdedf/pyiceberg/table/upsert_util.py#L123-L124

This join ignores `None` values. It returns no matches, but I want it to match `bar: [[1]] foo: [[null]]`:

```python
(Pdb) p source_index
pyarrow.Table
bar: int32
foo: large_string
__source_index: int64
----
bar: [[1,null]]
foo: [[null,"lemon"]]
__source_index: [[0,1]]

(Pdb) p target_index
pyarrow.Table
bar: int32
foo: large_string
__target_index: int64
----
bar: [[1]]
foo: [[null]]
__target_index: [[0]]

(Pdb) p matching_indices
pyarrow.Table
bar: int32
foo: large_string
__source_index: int64
__target_index: int64
----
bar: []
foo: []
__source_index: []
__target_index: []
```
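
To illustrate the suspected cause, here is a minimal sketch in plain Python. It does not use PyArrow or PyIceberg. It only models the guess from the quoted comment, namely that a join with SQL-style key equality never matches a null key, while a null-safe comparison would. The names `sql_equal`, `null_safe_equal`, and `find_matches` are hypothetical and do not come from either library, and `None` stands in for null. The rows mirror the `(bar, foo)` values from the debugger output above.

```python
from typing import Callable, List, Optional, Tuple

# One row of key columns (bar, foo); None models a null value.
Row = Tuple[Optional[int], Optional[str]]


def sql_equal(a: object, b: object) -> bool:
    """SQL-style equality: a comparison involving null never counts as a match."""
    if a is None or b is None:
        return False
    return a == b


def null_safe_equal(a: object, b: object) -> bool:
    """Null-safe equality: two nulls are treated as equal to each other."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def find_matches(
    source: List[Row],
    target: List[Row],
    key_equal: Callable[[object, object], bool],
) -> List[Tuple[int, int]]:
    """Return (source_index, target_index) pairs whose key columns all match."""
    return [
        (s_idx, t_idx)
        for s_idx, s_row in enumerate(source)
        for t_idx, t_row in enumerate(target)
        if all(key_equal(s, t) for s, t in zip(s_row, t_row))
    ]


source = [(1, None), (None, "lemon")]  # same values as source_index above
target = [(1, None)]                   # same values as target_index above

strict_matches = find_matches(source, target, sql_equal)
null_safe_matches = find_matches(source, target, null_safe_equal)

print(strict_matches)     # []
print(null_safe_matches)  # [(0, 0)]
```

Under the strict comparison the result is empty, which matches the empty `matching_indices` shown in the debugger. Under the null-safe comparison, source row 0 pairs with target row 0, which is the match the upsert needs. Whether the real join can be made null-safe directly, or needs a different matching strategy, is not settled by this sketch.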
