mdwint commented on issue #2426:
URL: https://github.com/apache/iceberg-python/issues/2426#issuecomment-3257392284

   > So I think the filter expressions are now correct, but the upsert logic still doesn't handle updates with null values correctly. Wild guess: maybe it has to do with the way null values are compared (null doesn't equal itself?).
   
   I've isolated the problem to step 3 in `get_rows_to_update`: 
https://github.com/apache/iceberg-python/blob/075a966c96fbb113197652df1ccc8997f39bdedf/pyiceberg/table/upsert_util.py#L123-L124
   
   This join never matches rows whose key columns contain `None`. It returns no 
matches, but I want it to match the target row `bar: [[1]] foo: [[null]]`:
   ```python
   (Pdb) p source_index
   pyarrow.Table
   bar: int32
   foo: large_string
   __source_index: int64
   ----
   bar: [[1,null]]
   foo: [[null,"lemon"]]
   __source_index: [[0,1]]
   
   (Pdb) p target_index
   pyarrow.Table
   bar: int32
   foo: large_string
   __target_index: int64
   ----
   bar: [[1]]
   foo: [[null]]
   __target_index: [[0]]
   
   (Pdb) p matching_indices
   pyarrow.Table
   bar: int32
   foo: large_string
   __source_index: int64
   __target_index: int64
   ----
   bar: []
   foo: []
   __source_index: []
   __target_index: []
   ```
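
To illustrate the matching behaviour I'm after, here is a minimal sketch in plain Python. It is not the PyIceberg implementation: `null_safe_matches` is a hypothetical helper, and the row dicts are assumed to be what `Table.to_pylist()` would return for the tables above. It relies on Python treating `None == None` as true, so a key tuple such as `(1, None)` matches itself:

```python
from collections import defaultdict

# Rows mirroring the source_index and target_index tables from the pdb session.
source_rows = [
    {"bar": 1, "foo": None},
    {"bar": None, "foo": "lemon"},
]
target_rows = [
    {"bar": 1, "foo": None},
]


def null_safe_matches(source, target, keys):
    """Return (source_index, target_index) pairs whose key columns are equal,
    treating None as equal to None (hypothetical helper, for illustration)."""
    # Index target rows by their key tuple; None is hashable and equal to itself.
    target_by_key = defaultdict(list)
    for target_index, row in enumerate(target):
        target_by_key[tuple(row[k] for k in keys)].append(target_index)

    # Look up each source row's key tuple in the target index.
    pairs = []
    for source_index, row in enumerate(source):
        for target_index in target_by_key.get(tuple(row[k] for k in keys), []):
            pairs.append((source_index, target_index))
    return pairs


matches = null_safe_matches(source_rows, target_rows, ["bar", "foo"])
print(matches)
```

With the data above, only source row 0 pairs with target row 0. A real fix would presumably need to stay in Arrow rather than round-trip through Python objects, so treat this only as a statement of the expected semantics.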


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

