Fokko commented on code in PR #364: URL: https://github.com/apache/iceberg-python/pull/364#discussion_r1478159369
##########
mkdocs/docs/configuration.md:
##########
@@ -46,7 +46,20 @@ The environment variable picked up by Iceberg starts with `PYICEBERG_` and then
 For example, `PYICEBERG_CATALOG__DEFAULT__S3__ACCESS_KEY_ID`, sets `s3.access-key-id` on the `default` catalog.
-## FileIO
+# Tables
+
+Iceberg tables support table properties to configure table behavior.
+
+## Write options
+
+| Key                               | Options                           | Default | Description                                                                                 |
+| --------------------------------- | --------------------------------- | ------- | ------------------------------------------------------------------------------------------- |
+| `write.parquet.compression-codec` | `{uncompressed,zstd,gzip,snappy}` | zstd    | Sets the Parquet compression coddec.                                                        |
+| `write.parquet.compression-level` | Integer                           | null    | Parquet compression level for the codec. If not set, it is up to PyIceberg                  |
+| `write.parquet.page-size-bytes`   | Size in bytes                     | 1MB     | Set a target threshold for the approximate encoded size of data pages within a column chunk |
+| `write.parquet.dict-size-bytes`   | Size in bytes                     | 1MB     | Set the dictionary page size limit per row group                                            |

Review Comment:
   Ah, Arrow has a default of 1MB. I've set it to 2MB. Note that I didn't put it in a constant, so we can do that as part of https://github.com/apache/iceberg-python/issues/365
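
For anyone following along, here is a minimal sketch of how the write options from the quoted diff might be assembled and sanity-checked before being used as table properties. The helper `build_write_properties` is hypothetical and not part of PyIceberg. It assumes that "1MB" means 1024 * 1024 bytes, that property values are stored as strings, and that the dictionary size default is 2MB as stated in the review comment rather than the 1MB shown in the quoted table.

```python
MB = 1024 * 1024  # assumption: "1MB" in the table means 1024 * 1024 bytes

# Property keys exactly as listed in the quoted diff.
CODEC = "write.parquet.compression-codec"
LEVEL = "write.parquet.compression-level"
PAGE_SIZE = "write.parquet.page-size-bytes"
DICT_SIZE = "write.parquet.dict-size-bytes"

ALLOWED_CODECS = {"uncompressed", "zstd", "gzip", "snappy"}

# The compression level has no default (null in the table), so it is left out.
# The dictionary size uses 2MB, following the review comment.
DEFAULTS = {CODEC: "zstd", PAGE_SIZE: str(1 * MB), DICT_SIZE: str(2 * MB)}


def build_write_properties(overrides=None):
    """Hypothetical helper: merge overrides into the defaults and validate them."""
    props = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        props[key] = str(value)  # assumption: properties are stored as strings

    unknown = set(props) - {CODEC, LEVEL, PAGE_SIZE, DICT_SIZE}
    if unknown:
        raise ValueError(f"unknown write option(s): {sorted(unknown)}")
    if props[CODEC] not in ALLOWED_CODECS:
        raise ValueError(f"unsupported codec: {props[CODEC]!r}")
    if LEVEL in props:
        int(props[LEVEL])  # raises ValueError if the level is not an integer
    for key in (PAGE_SIZE, DICT_SIZE):
        if int(props[key]) <= 0:
            raise ValueError(f"{key} must be a positive number of bytes")
    return props


if __name__ == "__main__":
    print(build_write_properties({LEVEL: 3}))
```

The resulting dict could then be passed as the `properties` argument when creating a table through a catalog. Which values PyIceberg itself accepts for each key is not checked here.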