Fokko commented on code in PR #1534:
URL: https://github.com/apache/iceberg-python/pull/1534#discussion_r1935497033

##########
pyiceberg/table/__init__.py:
##########
@@ -1064,6 +1064,125 @@ def name_mapping(self) -> Optional[NameMapping]:
         """Return the table's field-id NameMapping."""
         return self.metadata.name_mapping()

+    def merge_rows(self, df: pa.Table, join_cols: list
+                    ,merge_options: dict = {'when_matched_update_all': True, 'when_not_matched_insert_all': True}
+                ) -> Dict:
+        """
+        Shorthand API for performing an upsert/merge to an iceberg table.
+
+        Args:
+            df: The input dataframe to merge with the table's data.
+            join_cols: The columns to join on.

Review Comment:
The primary-key equivalent of Iceberg is the identifier fields, so we could also get it from the table like this:

```python
if join_cols is None:
    identifier_field_ids = self.schema().identifier_field_ids
    if len(identifier_field_ids) > 0:
        join_cols = [
            self.schema().find_column_name(identifier_field_id)
            for identifier_field_id in identifier_field_ids
        ]
    else:
        raise ValueError("The table doesn't have identifier fields, please set join_cols.")
```

We can also do this in a follow-up PR.
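
To make the identifier-field fallback easier to try outside the table class, here is a minimal standalone sketch. It is not code from the PR: `SchemaStub` and `resolve_join_cols` are hypothetical names introduced for illustration, and the stub only mimics the two schema members used in the review comment (`identifier_field_ids` and `find_column_name`). The helper is duck-typed, so it should in principle accept a real schema object that exposes those same two members.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SchemaStub:
    """Hypothetical stand-in exposing only the two schema members used here."""

    names_by_id: Dict[int, str]
    identifier_field_ids: List[int] = field(default_factory=list)

    def find_column_name(self, field_id: int) -> Optional[str]:
        # Map a field ID to its column name, or None if the ID is unknown.
        return self.names_by_id.get(field_id)


def resolve_join_cols(schema, join_cols: Optional[List[str]] = None) -> List[str]:
    """Return explicit join columns, else fall back to the identifier fields."""
    if join_cols is not None:
        # The caller's explicit choice always wins.
        return list(join_cols)

    names = [schema.find_column_name(fid) for fid in schema.identifier_field_ids]
    if not names:
        raise ValueError("The table doesn't have identifier fields, please set join_cols.")
    return names


# Assumed example schema: one identifier field (ID 1) and two ordinary columns.
schema = SchemaStub({1: "order_id", 2: "customer", 3: "amount"}, identifier_field_ids=[1])

print(resolve_join_cols(schema))                # falls back to identifier fields
print(resolve_join_cols(schema, ["customer"]))  # explicit columns are kept
```

Keeping the fallback in a small helper like this would let `join_cols` default to `None` in the method signature while the error for tables without identifier fields stays in one place.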

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org