rdblue commented on code in PR #8374:
URL: https://github.com/apache/iceberg/pull/8374#discussion_r1313615392
##########
python/mkdocs/docs/api.md:
##########
@@ -133,52 +118,140 @@ partition_spec = PartitionSpec(
from pyiceberg.table.sorting import SortOrder, SortField
from pyiceberg.transforms import IdentityTransform
-sort_order = SortOrder(SortField(source_id=4, transform=IdentityTransform()))
-
-catalog = load_catalog("prod")
+# Sort on the symbol
+sort_order = SortOrder(SortField(source_id=2, transform=IdentityTransform()))
catalog.create_table(
- identifier="default.bids",
- location="/Users/fokkodriesprong/Desktop/docker-spark-iceberg/wh/bids/",
+ identifier="docs_example.bids",
schema=schema,
partition_spec=partition_spec,
sort_order=sort_order,
)
```
-### Update table schema
+## Load a table
+
+### Catalog table
+
+Loading the `bids` table:
+
+```python
+table = catalog.load_table("docs_example.bids")
+# Equivalent to:
+table = catalog.load_table(("docs_example", "bids"))
+# The tuple syntax can be used if the namespace or table contains a dot.
+```
+
+This returns a `Table` that represents an Iceberg table that can be queried and altered.
+
+### Static table
+
+To load a table directly from a metadata file (i.e., **without** using a catalog), you can use a `StaticTable` as follows:
+
+```python
+from pyiceberg.table import StaticTable
+
+static_table = StaticTable.from_metadata(
+ # For example:
+ # "s3a://warehouse/wh/nyc.db/taxis/metadata/00002-6ea51ce3-62aa-4197-9cf8-43d07c3440ca.metadata.json",
+ tbl.metadata_location,
+ properties={
+ "s3.endpoint": "http://127.0.0.1:9000",
+ "py-io-impl": "pyiceberg.io.pyarrow.PyArrowFileIO",
+ "s3.access-key-id": "admin",
+ "s3.secret-access-key": "password",
+ },
+)
+```
+
+The static table is considered read-only.
-Add new columns through the `Transaction` or `UpdateSchema` API:
+## Schema evolution
-Use the Transaction API:
+PyIceberg supports full schema evolution through the Python API. It takes care of setting the field-IDs and makes sure that only non-breaking changes are done (can be overridden).
+
+In the examples below, `.update_schema()` is called from the table itself.
+
+```python
+with table.update_schema() as update:
+ update.add_column("some_field", IntegerType(), "doc")
+```
+
+You can also initiate a transaction if you want to make more changes than just evolving the schema:
```python
with table.transaction() as transaction:
- transaction.update_schema().add_column("x", IntegerType(), "doc").commit()
+ with transaction.update_schema() as update_schema:
+ update_schema.add_column("some_other_field", IntegerType(), "doc")
+ # ... Update properties etc
```
-Or, without a context manager:
+### Add column
+
+Using `add_column` you can add a column, without having to worry about the field-id:
```python
-transaction = table.transaction()
-transaction.update_schema().add_column("x", IntegerType(), "doc").commit()
-transaction.commit_transaction()
+with table.update_schema() as update:
+ update.add_column("retries", IntegerType(), "Number of retries to place the bid")
+ # In a struct
+ update.add_column("details.confirmed_by", StringType(), "Name of the exchange")
```
-Or, use the UpdateSchema API directly:
+### Rename column
+
+Renaming a field in an Iceberg table is simple:
```python
with table.update_schema() as update:
- update.add_column("x", IntegerType(), "doc")
+ update.rename("retries", "num_retries")
+ # In a struct, only the new name field
+ update.rename("properties.confirmed_by", "exchange")
```
-Or, without a context manager:
+### Move column
+
+Move a field inside of a struct:
```python
-table.update_schema().add_column("x", IntegerType(), "doc").commit()
+with table.update_schema() as update:
+ update.move_first("symbol")
+ update.move_after("bid", "ask")
+ # In a struct, only the new name field
Review Comment:
What do you mean here?
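
For context on the hunk above, here is a minimal sketch of the reordering that `move_first` and `move_after` appear intended to perform, modelled on a plain list of column names. It is not PyIceberg code: the two helper functions are hypothetical stand-ins, and the column list is an assumption about the `bids` schema (only `symbol`, `bid`, `ask`, and `details` are named in the diff).

```python
from typing import List


def move_first(fields: List[str], name: str) -> List[str]:
    """Hypothetical helper: return a new ordering with `name` in front."""
    if name not in fields:
        raise ValueError(f"unknown column: {name}")
    return [name] + [f for f in fields if f != name]


def move_after(fields: List[str], name: str, after_name: str) -> List[str]:
    """Hypothetical helper: return a new ordering with `name` placed
    directly after `after_name`."""
    if name == after_name:
        raise ValueError("cannot move a column after itself")
    if name not in fields or after_name not in fields:
        raise ValueError(f"unknown column: {name} or {after_name}")
    rest = [f for f in fields if f != name]
    idx = rest.index(after_name)
    return rest[: idx + 1] + [name] + rest[idx + 1 :]


# Assumed top-level column order of the example table.
columns = ["datetime", "symbol", "bid", "ask", "details"]
columns = move_first(columns, "symbol")
columns = move_after(columns, "bid", "ask")
print(columns)
```

If the trailing comment in the hunk is meant to introduce a move inside a struct, the same model would presumably apply to the struct's own field list rather than to the top-level columns.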
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]