Fokko commented on code in PR #361:
URL: https://github.com/apache/iceberg-python/pull/361#discussion_r1477436914

##########
mkdocs/docs/index.md:
##########
@@ -158,6 +177,14 @@ df = table.scan(row_filter="tip_per_mile > 0").to_arrow()
 len(df)
 ```
 
+### Explore Iceberg data and metadata files
+
+Since the catalog was configured to use the local filesystem, we can explore how Iceberg saved data and metadata files from the above operations.
+
+```shell
+ls /tmp/warehouse/

Review Comment:
   This shows the whole tree:
   
   ```suggestion
   find /tmp/warehouse/
   ```

##########
mkdocs/docs/index.md:
##########
@@ -62,6 +62,27 @@ You either need to install `s3fs`, `adlfs`, `gcs`, or `pyarrow` to be able to fe
 
 Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.
 
+For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store.

Review Comment:
   ```suggestion
   For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store. This should not be used in production due to the limited scalability.
   ```

##########
mkdocs/docs/index.md:
##########
@@ -62,6 +62,27 @@ You either need to install `s3fs`, `adlfs`, `gcs`, or `pyarrow` to be able to fe
 
 Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.
 
+For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store.
+
+Create a temporary location for Iceberg:
+
+```shell
+mkdir /tmp/warehouse
+```
+
+```python

Review Comment:
   ```suggestion
   Open a Python 3 REPL to set up the in-memory catalog:
   
   ```python
   ```