Re: [PR] Write support [iceberg-python]

via GitHub Sun, 14 Jan 2024 16:13:13 -0800


rdblue commented on code in PR #41:
URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1451832048



##########
pyiceberg/table/__init__.py:
##########
@@ -831,6 +887,46 @@ def history(self) -> List[SnapshotLogEntry]:
     def update_schema(self, allow_incompatible_changes: bool = False, case_sensitive: bool = True) -> UpdateSchema:
         return UpdateSchema(self, allow_incompatible_changes=allow_incompatible_changes, case_sensitive=case_sensitive)
 
+    def append(self, df: pa.Table) -> None:
+        if len(self.spec().fields) > 0:
+            raise ValueError("Cannot write to partitioned tables")
+
+        snapshot_id = self.new_snapshot_id()
+
+        data_files = _dataframe_to_data_files(self, df=df)
+        merge = _MergeAppend(operation=Operation.APPEND, table=self, snapshot_id=snapshot_id)
+        for data_file in data_files:
+            merge.append_datafile(data_file)
+
+        if current_snapshot := self.current_snapshot():
+            for manifest in current_snapshot.manifests(io=self.io):
+                for entry in manifest.fetch_manifest_entry(io=self.io):
+                    merge.append_datafile(entry.data_file, added=False)

Review Comment:
   I think that the `_MergeAppend` should be responsible for handling the 
existing data. It doesn't make sense to me that an append operation would 
require the caller to re-add the data files that were in the table already. 
That puts too much on the caller, which should just add files and not worry 
about existing data or state.
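
A minimal sketch of the shape suggested here, assuming toy stand-ins rather than the real pyiceberg classes: `Table`, `Snapshot`, `DataFile`, `MergeAppend`, `append_data_file`, and `commit` below are all hypothetical and exist only to illustrate the division of responsibility. The caller adds new files only; the operation looks up the current snapshot and carries the existing files forward when it commits. A real implementation would presumably work with manifests rather than a flat list of files.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class DataFile:
    # Hypothetical stand-in for a data file entry.
    path: str


@dataclass
class Snapshot:
    # Hypothetical stand-in: a snapshot is just an id plus its live files.
    snapshot_id: int
    data_files: List[DataFile]


@dataclass
class Table:
    # Hypothetical stand-in for a table that tracks its snapshot history.
    snapshots: List[Snapshot] = field(default_factory=list)

    def current_snapshot(self) -> Optional[Snapshot]:
        return self.snapshots[-1] if self.snapshots else None


class MergeAppend:
    """Hypothetical append operation that handles existing data itself."""

    def __init__(self, table: Table, snapshot_id: int) -> None:
        self._table = table
        self._snapshot_id = snapshot_id
        self._added: List[DataFile] = []

    def append_data_file(self, data_file: DataFile) -> "MergeAppend":
        # The caller only ever registers new files.
        self._added.append(data_file)
        return self

    def commit(self) -> Snapshot:
        # The operation, not the caller, carries existing files forward.
        current = self._table.current_snapshot()
        existing = list(current.data_files) if current else []
        snapshot = Snapshot(self._snapshot_id, existing + self._added)
        self._table.snapshots.append(snapshot)
        return snapshot


table = Table()
MergeAppend(table, snapshot_id=1).append_data_file(DataFile("a.parquet")).commit()
MergeAppend(table, snapshot_id=2).append_data_file(DataFile("b.parquet")).commit()
paths = [f.path for f in table.current_snapshot().data_files]
```

With this shape, `Table.append` would reduce to creating the operation, adding the new data files, and committing, with no loop over the current snapshot's manifests in the caller.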



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

