Fokko commented on PR #8075: URL: https://github.com/apache/iceberg/pull/8075#issuecomment-1636879833
I use the [`docker-spark-iceberg` docker-compose setup](https://github.com/tabular-io/docker-spark-iceberg). You can easily point PyIceberg to it in your `~/.pyiceberg.yaml`:

```yaml
default-catalog: local
catalog:
  local:
    uri: http://127.0.0.1:8181
    s3.endpoint: http://127.0.0.1:9000
    py-io-impl: pyiceberg.io.pyarrow.PyArrowFileIO
    s3.access-key-id: admin
    s3.secret-access-key: password
```

The `PyIceberg - Getting Started.ipynb` notebook creates a table with five months of taxi data. You could add more data to it, since there is [more](https://github.com/tabular-io/docker-spark-iceberg/blob/main/spark/Dockerfile#L81-L95) available. Or you could change the daily partitioning to an hourly one, creating a crazy number of partitions (which Iceberg handles fine, but which should produce large manifest files).
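As a rough sketch of how the same settings could be used without the YAML file, the properties can be passed straight to PyIceberg's `load_catalog`. The names `LOCAL_CATALOG_PROPERTIES` and `connect` are hypothetical helpers introduced here for illustration; the values are copied from the config above and assume the docker-compose stack is running locally.

```python
# Hypothetical helper values: these mirror the `local` catalog entry from
# ~/.pyiceberg.yaml so the catalog can be loaded without the YAML file.
LOCAL_CATALOG_PROPERTIES = {
    "uri": "http://127.0.0.1:8181",
    "s3.endpoint": "http://127.0.0.1:9000",
    "py-io-impl": "pyiceberg.io.pyarrow.PyArrowFileIO",
    "s3.access-key-id": "admin",
    "s3.secret-access-key": "password",
}


def connect(name="local", properties=None):
    """Load a PyIceberg catalog from explicit properties.

    Assumes pyiceberg is installed and the docker-spark-iceberg
    services are reachable on the ports listed above.
    """
    # Imported lazily so this module can be imported without pyiceberg.
    from pyiceberg.catalog import load_catalog

    props = properties if properties is not None else LOCAL_CATALOG_PROPERTIES
    return load_catalog(name, **props)
```

With the stack up, `connect().list_namespaces()` is a quick way to check that the catalog is reachable before running anything heavier.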
