Fokko commented on PR #8075:
URL: https://github.com/apache/iceberg/pull/8075#issuecomment-1636879833

   I use the [`docker-spark-iceberg` docker-compose 
setup](https://github.com/tabular-io/docker-spark-iceberg). You can easily 
point PyIceberg to it in your `~/.pyiceberg.yaml`:
   
   ```yaml
   default-catalog: local
   
   catalog:
       local:
           uri: http://127.0.0.1:8181
           s3.endpoint: http://127.0.0.1:9000
           py-io-impl: pyiceberg.io.pyarrow.PyArrowFileIO
           s3.access-key-id: admin
           s3.secret-access-key: password
   ```
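
   For reference, a minimal sketch of connecting to that catalog from Python. 
It assumes PyIceberg is installed and the docker-compose setup is running; 
`LOCAL_CATALOG_PROPERTIES` and `load_local_catalog` are hypothetical names used 
only for illustration. With the YAML file above in place, `load_catalog("local")` 
on its own should pick up the same settings, so passing them explicitly is 
optional.

```python
# Same settings as the YAML above, expressed as a Python dict (illustrative).
LOCAL_CATALOG_PROPERTIES = {
    "uri": "http://127.0.0.1:8181",
    "s3.endpoint": "http://127.0.0.1:9000",
    "py-io-impl": "pyiceberg.io.pyarrow.PyArrowFileIO",
    "s3.access-key-id": "admin",
    "s3.secret-access-key": "password",
}


def load_local_catalog():
    """Hypothetical helper: load the catalog named `local` with explicit properties."""
    # Imported lazily so this module can be imported without PyIceberg installed.
    from pyiceberg.catalog import load_catalog

    # Keyword properties are passed through to the catalog named "local".
    return load_catalog("local", **LOCAL_CATALOG_PROPERTIES)
```

Calling `load_local_catalog().list_namespaces()` is a quick way to check that 
the catalog is reachable.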
   
   The `PyIceberg - Getting Started.ipynb` notebook creates a table with five months of 
taxi data. You could add more data to it, since 
[more is available](https://github.com/tabular-io/docker-spark-iceberg/blob/main/spark/Dockerfile#L81-L95).
 Or you could change the daily partitioning to an hourly one, creating a huge 
number of partitions (which Iceberg handles fine, but which should create large 
manifest files).
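
   To get a rough feel for the difference, here is a back-of-the-envelope 
sketch of how many partitions daily versus hourly partitioning could produce. 
The function name and the date range in the usage note are assumptions for 
illustration (the actual months in the notebook are not stated here), and the 
numbers are upper bounds, since a partition only exists for a day or hour that 
actually contains data.

```python
from datetime import date


def partition_upper_bounds(start: date, end: date) -> tuple:
    """Return (daily, hourly) upper bounds on partition counts for [start, end)."""
    days = (end - start).days
    # One partition per day versus one partition per hour of each day.
    return days, days * 24
```

For a hypothetical five-month window, 
`partition_upper_bounds(date(2021, 1, 1), date(2021, 6, 1))` gives 151 daily 
partitions versus 3624 hourly ones.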


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

