mattmartin14 commented on PR #1534:
URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2628065002

   @Fokko - I added some additional smoke tests for cases where the primary 
key is a string or a date. The filter list code you wrote works fine for ints 
and strings, but with dates I get a type error like this:
   
   ```text
   TypeError: Invalid literal value: datetime.date(2021, 1, 1)
   ```
   
   For reference, here is the function, to help jog your memory. Do you know 
how we can update it to handle cases where one of the join columns is a 
date?
   
   ```python
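   import pyarrow as pa
   from pyiceberg.expressions import And, BooleanExpression, EqualTo, In, Or

   def get_filter_list(df: pa.Table, join_cols: list) -> BooleanExpression:
       # Distinct combinations of the join-column values.
       unique_keys = df.select(join_cols).group_by(join_cols).aggregate([])
   
       pred = None
   
       if len(join_cols) == 1:
           # Single key column: one IN predicate over its distinct values.
           pred = In(join_cols[0], unique_keys[0].to_pylist())
       else:
           # Composite key: one AND-of-equalities per distinct key row.
           row_preds = [
               And(*[
                   EqualTo(col, row[col])
                   for col in join_cols
               ])
               for row in unique_keys.to_pylist()
           ]
           # Or needs at least two operands, so a single row is returned as-is.
           pred = row_preds[0] if len(row_preds) == 1 else Or(*row_preds)
   
       return pred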
   ```
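
One possible workaround, as a rough sketch only: convert date values to ISO-8601 strings before they reach `In` / `EqualTo`. This assumes that a string literal gets coerced to a date when the expression is bound to a date column, which is not confirmed here. `to_filter_value` is a hypothetical helper name, not part of pyiceberg.

```python
import datetime
from typing import Any


def to_filter_value(value: Any) -> Any:
    """Hypothetical helper: turn date-like key values into ISO-8601 strings.

    Assumption: strings are accepted as literals where datetime.date
    objects currently raise "Invalid literal value".
    """
    # datetime.datetime is a subclass of datetime.date, so this single
    # check covers both dates and timestamps.
    if isinstance(value, datetime.date):
        return value.isoformat()
    # ints, strings, None, etc. pass through unchanged.
    return value


# Mixed key values, as they might come back from to_pylist().
keys = [datetime.date(2021, 1, 1), "abc", 42]
converted = [to_filter_value(k) for k in keys]
```

If that assumption holds, `get_filter_list` could pass each key through the helper, for example `EqualTo(col, to_filter_value(row[col]))` in the composite-key branch and a list comprehension over `unique_keys[0].to_pylist()` in the single-column branch.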


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

