mattmartin14 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2628065002
@Fokko - I added some additional smoke tests to cover cases where the primary key is a string or a date. The filter list code you wrote works fine for ints and strings, but for dates I'm getting a type error like this:

```bash
TypeError: Invalid literal value: datetime.date(2021, 1, 1)
```

For reference, here is the function to help jog your memory. Do you know how we can update it to handle cases where a date is one of the join columns?

```python
def get_filter_list(df: pyarrow_table, join_cols: list) -> BooleanExpression:
    # Distinct combinations of the join column values
    unique_keys = df.select(join_cols).group_by(join_cols).aggregate([])

    pred = None

    if len(join_cols) == 1:
        # Single key column: one IN predicate over all distinct values
        pred = In(join_cols[0], unique_keys[0].to_pylist())
    else:
        # Composite key: OR together one AND-of-equalities per distinct row
        pred = Or(*[
            And(*[
                EqualTo(col, row[col])
                for col in join_cols
            ])
            for row in unique_keys.to_pylist()
        ])

    return pred
```
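One possible workaround, as a minimal sketch rather than a confirmed fix: convert `datetime.date` values to ISO-8601 strings before they reach `In` or `EqualTo`. This assumes PyIceberg can convert a string literal to a date when the predicate is bound to a date column, which would need to be verified against the version used in this PR. `to_literal_safe` and `normalize_rows` are hypothetical helper names, not part of PyIceberg.

```python
import datetime
from typing import Any, Dict, List


def to_literal_safe(value: Any) -> Any:
    """Return a value that is hopefully acceptable as an expression literal.

    Hypothetical helper: dates are turned into ISO-8601 strings on the
    assumption that the string is converted back to a date at bind time.
    """
    # datetime.datetime subclasses datetime.date, so this branch covers both.
    if isinstance(value, datetime.date):
        return value.isoformat()
    return value


def normalize_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply to_literal_safe to every value of every row dict."""
    return [
        {col: to_literal_safe(val) for col, val in row.items()}
        for row in rows
    ]
```

In `get_filter_list`, the single-column branch could then pass `[to_literal_safe(v) for v in unique_keys[0].to_pylist()]` to `In`, and the multi-column branch could iterate over `normalize_rows(unique_keys.to_pylist())` instead of the raw rows.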