Fokko commented on code in PR #498:
URL: https://github.com/apache/iceberg-python/pull/498#discussion_r1537251443


##########
pyiceberg/catalog/rest.py:
##########
@@ -524,7 +537,32 @@ def create_table(
         except HTTPError as exc:
             self._handle_non_200_response(exc, {409: TableAlreadyExistsError})
 
-        table_response = TableResponse(**response.json())
+        return TableResponse(**response.json())
+
+    @retry(**_RETRY_ARGS)
+    def _create_staged_table(

Review Comment:
   The inheritance feels off here. I would just expect to 
`create_table_transaction`. The current `_create_staged_table` on the `Catalog` 
is now very specific for non-REST catalogs. Should we introduce another layer 
in the OOP hierarchy there? Then we can override `create_table_transaction` 
there. 



##########
pyiceberg/table/__init__.py:
##########
@@ -1707,7 +1831,8 @@ def union_by_name(self, new_schema: Union[Schema, 
"pa.Schema"]) -> UpdateSchema:
         visit_with_partner(
             Catalog._convert_schema_if_needed(new_schema),
             -1,
-            UnionByNameVisitor(update_schema=self, 
existing_schema=self._schema, case_sensitive=self._case_sensitive),  # type: 
ignore
+            UnionByNameVisitor(update_schema=self, 
existing_schema=self._schema, case_sensitive=self._case_sensitive),

Review Comment:
   Can you revert this change?



##########
pyiceberg/catalog/__init__.py:
##########
@@ -717,6 +791,10 @@ def _get_updated_props_and_update_summary(
 
         return properties_update_summary, updated_properties
 
+    @staticmethod
+    def empty_table_metadata() -> TableMetadata:
+        return TableMetadataV1(location="", last_column_id=-1, schema=Schema())

Review Comment:
   I recommend creating a V2 table by default. This is also the case for Java. 
Partition evolution is very awkward for V1 tables (keeping the old partitions 
as null-transforms).



##########
pyiceberg/catalog/__init__.py:
##########
@@ -288,6 +291,78 @@ def __init__(self, name: str, **properties: str):
     def _load_file_io(self, properties: Properties = EMPTY_DICT, location: 
Optional[str] = None) -> FileIO:
         return load_file_io({**self.properties, **properties}, location)
 
+    def _create_staged_table(
+        self,
+        identifier: Union[str, Identifier],
+        schema: Union[Schema, "pa.Schema"],
+        location: Optional[str] = None,
+        partition_spec: PartitionSpec = UNPARTITIONED_PARTITION_SPEC,
+        sort_order: SortOrder = UNSORTED_SORT_ORDER,
+        properties: Properties = EMPTY_DICT,
+    ) -> StagedTable:
+        """Create a table and return the table instance without committing the 
changes.
+
+        Args:
+            identifier (str | Identifier): Table identifier.
+            schema (Schema): Table's schema.
+            location (str | None): Location for the table. Optional Argument.
+            partition_spec (PartitionSpec): PartitionSpec for the table.
+            sort_order (SortOrder): SortOrder for the table.
+            properties (Properties): Table properties that can be a string 
based dictionary.
+
+        Returns:
+            Table: the created table instance.
+
+        Raises:
+            TableAlreadyExistsError: If a table with the name already exists.
+        """
+        schema: Schema = self._convert_schema_if_needed(schema)  # type: ignore
+
+        database_name, table_name = 
self.identifier_to_database_and_table(identifier)
+
+        location = self._resolve_table_location(location, database_name, 
table_name)
+        metadata_location = self._get_metadata_location(location=location)
+        metadata = new_table_metadata(
+            location=location, schema=schema, partition_spec=partition_spec, 
sort_order=sort_order, properties=properties
+        )
+        io = load_file_io(properties=self.properties, 
location=metadata_location)
+        return StagedTable(
+            identifier=(self.name, database_name, table_name),
+            metadata=metadata,
+            metadata_location=metadata_location,
+            io=io,
+            catalog=self,
+        )
+
+    def create_table_transaction(

Review Comment:
   I was leaning towards staged since we use the same name for the REST 
catalog. I'm not super strong on this one. Let's keep it transaction unless 
anyone else jumps in here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to