youth0526 opened a new issue, #15195: URL: https://github.com/apache/iceberg/issues/15195
### What happened

When using Apache Iceberg with Spark and MinIO in a Docker environment, Spark may try to access the S3 bucket using a virtual-host-style endpoint (e.g. `warehouse.minio`). In such cases, Docker cannot resolve `warehouse.minio`, which results in an `UnknownHostException` during Iceberg write or commit operations.

### Why this is confusing for beginners

This behavior is confusing because:

- Reading data may succeed
- Writing data fails at the commit phase
- The error message points to DNS resolution rather than S3 configuration

Without prior knowledge of S3 access styles, it is not obvious why this happens.

### Environment

- Apache Iceberg: 1.6.0
- Spark: 3.5.x
- Catalog: REST catalog
- Storage: MinIO (Docker)
- Access: S3-compatible endpoint

### Related issues

This looks related to issues such as #7709, where S3 bucket names are interpreted as part of the hostname and cause DNS resolution failures.

However, this issue focuses on Spark + Iceberg + MinIO in Docker environments and the lack of documentation about explicit path-style configuration.

### Suggestion

I think it would be helpful to add a short note to the documentation explaining that when using MinIO with Spark, users may need to explicitly enable path-style S3 access to avoid DNS resolution issues.
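For illustration, here is a minimal sketch of what such a note could show. It assumes Iceberg's `S3FileIO` is the file IO in use and relies on its `s3.endpoint` and `s3.path-style-access` catalog properties. The catalog name `demo`, the REST URI `http://rest:8181`, the MinIO endpoint `http://minio:9000`, and both helper functions are hypothetical and not taken from the report. Credentials and region settings are left out.

```python
from urllib.parse import urlsplit


def s3_request_host(endpoint: str, bucket: str, path_style: bool) -> str:
    """Return the hostname an S3 client would try to resolve for `bucket`."""
    host = urlsplit(endpoint).hostname
    # Virtual-host style prepends the bucket name to the endpoint host.
    # Path style keeps the host unchanged and puts the bucket in the URL path.
    return host if path_style else f"{bucket}.{host}"


def minio_catalog_conf(catalog: str, rest_uri: str, s3_endpoint: str) -> dict:
    """Build Spark properties for an Iceberg REST catalog backed by MinIO.

    All argument values are placeholders supplied by the caller.
    """
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": rest_uri,
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        f"{prefix}.s3.endpoint": s3_endpoint,
        # The setting this issue is about: address buckets by path,
        # so only the MinIO service name needs to resolve in Docker.
        f"{prefix}.s3.path-style-access": "true",
    }


def apply_conf(builder, conf: dict):
    """Apply each property to a SparkSession builder (or any object with .config)."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


# Hypothetical values for a docker-compose setup with services "rest" and "minio".
conf = minio_catalog_conf("demo", "http://rest:8181", "http://minio:9000")
```

With these placeholder values, `s3_request_host("http://minio:9000", "warehouse", path_style=False)` yields `warehouse.minio`, the unresolvable name from the report, while `path_style=True` yields `minio`. In a PySpark session the properties could be applied with `apply_conf(SparkSession.builder, conf).getOrCreate()`. If the Hadoop S3A filesystem is used instead of `S3FileIO`, the corresponding switch is `spark.hadoop.fs.s3a.path.style.access`.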
