Re: [PR] Get Started: Add sqlcatalog and local fs warehouse [iceberg-python]

via GitHub Sun, 04 Feb 2024 12:58:46 -0800


Fokko commented on code in PR #361:
URL: https://github.com/apache/iceberg-python/pull/361#discussion_r1477436914



##########
mkdocs/docs/index.md:
##########
@@ -158,6 +177,14 @@ df = table.scan(row_filter="tip_per_mile > 0").to_arrow()
 len(df)
 ```
 
+### Explore Iceberg data and metadata files
+
+Since the catalog was configured to use the local filesystem, we can explore how Iceberg saved data and metadata files from the above operations.
+
+```shell
+ls /tmp/warehouse/

Review Comment:
   This shows the whole tree:
   ```suggestion
   find /tmp/warehouse/
   ```
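
For readers following the quickstart from a Python REPL rather than a shell, the same recursive listing can be sketched with the standard library. This is only an illustration: `list_tree` is a hypothetical helper name, not part of PyIceberg, and it returns the same set of paths as `find` but not necessarily in the same order.

```python
import os
import tempfile


def list_tree(root):
    """Return root plus every directory and file beneath it, like `find root`."""
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place so the traversal order is deterministic.
        dirnames.sort()
        for name in dirnames + sorted(filenames):
            paths.append(os.path.join(dirpath, name))
    return paths


if __name__ == "__main__":
    # /tmp/warehouse is the location used in the diff above.
    for path in list_tree("/tmp/warehouse"):
        print(path)
```

Unlike `ls /tmp/warehouse/`, which stops at the top level, this walks into every subdirectory, which is the point of the suggested change.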



##########
mkdocs/docs/index.md:
##########
@@ -62,6 +62,27 @@ You either need to install `s3fs`, `adlfs`, `gcs`, or `pyarrow` to be able to fe
 
 Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.
 
+For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store.

Review Comment:
   ```suggestion
   For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store. This should not be used in production due to the limited scalability.
   ```



##########
mkdocs/docs/index.md:
##########
@@ -62,6 +62,27 @@ You either need to install `s3fs`, `adlfs`, `gcs`, or `pyarrow` to be able to fe
 
 Iceberg leverages the [catalog to have one centralized place to organize the tables](https://iceberg.apache.org/catalog/). This can be a traditional Hive catalog to store your Iceberg tables next to the rest, a vendor solution like the AWS Glue catalog, or an implementation of Icebergs' own [REST protocol](https://github.com/apache/iceberg/tree/main/open-api). Checkout the [configuration](configuration.md) page to find all the configuration details.
 
+For the sake of demonstration, we'll configure the catalog to use the `SqlCatalog` implementation, which will store information in a local `sqlite` database. We'll also configure the catalog to store data files in the local filesystem instead of an object store.
+
+Create a temporary location for Iceberg:
+
+```shell
+mkdir /tmp/warehouse
+```
+
+```python

Review Comment:
   ```suggestion
   
   Open a Python 3 REPL to set up the in-memory catalog:
   
   ```python
   ```
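
For context, the catalog properties that the paragraph under review describes (a local `sqlite` database plus a local-filesystem warehouse) might be assembled roughly as follows. This is a sketch under assumptions: the helper name `sql_catalog_properties` and the database file name `pyiceberg_catalog.db` are illustrative, and the diff hunk shown here does not include the actual code.

```python
def sql_catalog_properties(warehouse_path):
    """Build SqlCatalog properties that keep everything on the local disk.

    The database file name below is an assumption made for illustration.
    """
    return {
        # SQLAlchemy-style SQLite URL: "sqlite:///" followed by the file path.
        "uri": f"sqlite:///{warehouse_path}/pyiceberg_catalog.db",
        # file:// scheme so data and metadata land on the local filesystem.
        "warehouse": f"file://{warehouse_path}",
    }


props = sql_catalog_properties("/tmp/warehouse")

# With PyIceberg and its SQL extras installed, the properties would be passed on:
# from pyiceberg.catalog.sql import SqlCatalog
# catalog = SqlCatalog("default", **props)
```

Under these assumptions the catalog database is a file inside `/tmp/warehouse`, so it sits alongside the table data and shows up in a `find /tmp/warehouse/` listing.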


