Al-Moatasem opened a new issue, #974: URL: https://github.com/apache/iceberg-python/issues/974
### Apache Iceberg version

0.6.1 (latest release)

### Please describe the bug 🐞

Hi,

I am trying to use the **rest** catalog and write the data into **Minio**. The script I am using can communicate with Minio (it creates the `metadata.json` file under the `metadata` directory); however, it raises

`OSError: When initiating multiple part upload for key 'poc_new/coordinates/data/00000-0-f27b7921-a6d7-4c7e-b034-2d12221e5054.parquet' in bucket 'warehouse': AWS Error NETWORK_CONNECTION during CreateMultipartUpload operation: Encountered network error when sending http request`

This is the docker compose file that I use:

```yaml
version: '3'

services:
  rest:
    image: tabulario/iceberg-rest:1.5.0
    container_name: iceberg-rest
    ports:
      - 8181:8181
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
      - CATALOG_WAREHOUSE=s3://warehouse/
      - CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO
      - CATALOG_S3_ENDPOINT=http://minio:9000
    networks:
      iceberg-rest:

  minio:
    image: minio/minio:RELEASE.2024-05-10T01-41-38Z
    container_name: minio
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=password
      - MINIO_DOMAIN=minio
    ports:
      - 9001:9001
      - 9000:9000
    command: [ "server", "/data", "--console-address", ":9001" ]
    networks:
      iceberg-rest:
        aliases:
          - warehouse.minio

  mc:
    depends_on:
      - minio
    image: minio/mc:RELEASE.2024-05-09T17-04-24Z
    container_name: mc
    entrypoint: |
      /bin/sh -c "
      until (/usr/bin/mc config host add minio http://minio:9000 admin password) do echo '...waiting...' && sleep 1; done;
      /usr/bin/mc rm -r --force minio/warehouse;
      /usr/bin/mc mb minio/warehouse;
      /usr/bin/mc policy set public minio/warehouse;
      tail -f /dev/null
      "
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
    networks:
      iceberg-rest:

networks:
  iceberg-rest:
```

And this is the script file:

```py
import pyarrow as pa
from pyiceberg.catalog import load_rest
from pyiceberg.exceptions import NamespaceAlreadyExistsError, TableAlreadyExistsError

catalog = load_rest(
    name="rest",
    conf={
        "uri": "http://localhost:8181/",
    },
)

namespace = "poc_new"
try:
    catalog.create_namespace(namespace)
except NamespaceAlreadyExistsError as e:
    pass

df = pa.Table.from_pylist(
    [
        {"lat": 52.371807, "long": 4.896029},
        {"lat": 52.387386, "long": 4.646219},
        {"lat": 52.078663, "long": 4.288788},
    ],
)
schema = df.schema

table_name = "coordinates"
table_identifier = f"{namespace}.{table_name}"
try:
    table = catalog.create_table(
        identifier=table_identifier,
        schema=schema,
    )
except TableAlreadyExistsError as e:
    pass

table = catalog.load_table(table_identifier)
table.append(df)
```

The traceback:

```
Traceback (most recent call last):
  File "d:\flink_iceberg\poc_01_iceberg_rest.py", line 40, in <module>
    table.append(df)
  File "D:\flink_iceberg\.venv2\Lib\site-packages\pyiceberg\table\__init__.py", line 1068, in append
    for data_file in data_files:
  File "D:\flink_iceberg\.venv2\Lib\site-packages\pyiceberg\table\__init__.py", line 2423, in _dataframe_to_data_files
    yield from write_file(table, iter([WriteTask(write_uuid, next(counter), df)]))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\flink_iceberg\.venv2\Lib\site-packages\pyiceberg\io\pyarrow.py", line 1726, in write_file
    with fo.create(overwrite=True) as fos:
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\flink_iceberg\.venv2\Lib\site-packages\pyiceberg\io\pyarrow.py", line 299, in create
    output_file = self._filesystem.open_output_stream(self._path, buffer_size=self._buffer_size)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow\_fs.pyx", line 868, in pyarrow._fs.FileSystem.open_output_stream
  File "pyarrow\error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow\error.pxi", line 115, in pyarrow.lib.check_status
OSError: When initiating multiple part upload for key 'poc_new/coordinates/data/00000-0-efc0be57-453d-442d-af13-2e0b2382a53d.parquet' in bucket 'warehouse': AWS Error NETWORK_CONNECTION during CreateMultipartUpload operation: Encountered network error when sending http request
```

In Minio, the `metadata` directory is created and it stores the `metadata.json` file, but there is no `data` directory.

Also, this is the requirements.txt file:

```
annotated-types==0.7.0
apache-beam==2.48.0
apache-flink==1.19.1
apache-flink-libraries==1.19.1
avro-python3==1.10.2
certifi==2024.7.4
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==2.2.1
colorama==0.4.6
confluent-kafka==2.5.0
crcmod==1.7
dill==0.3.1.1
dnspython==2.6.1
docopt==0.6.2
duckdb==0.9.2
duckdb_engine==0.13.0
Faker==26.0.0
fastavro==1.9.5
fasteners==0.19
fsspec==2023.12.2
greenlet==3.0.3
grpcio==1.65.1
hdfs==2.7.3
httplib2==0.22.0
idna==3.7
kafka-python==2.0.2
markdown-it-py==3.0.0
mdurl==0.1.2
mmhash3==3.0.1
numpy==1.24.4
objsize==0.6.1
orjson==3.10.6
packaging==24.1
pandas==2.2.2
polars==1.2.1
proto-plus==1.24.0
protobuf==4.23.4
py4j==0.10.9.7
pyarrow==11.0.0
pydantic==2.8.2
pydantic-settings==2.3.4
pydantic_core==2.20.1
pydot==1.4.2
Pygments==2.18.0
pyiceberg==0.6.1
pymongo==4.8.0
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
pytz==2024.1
regex==2024.7.24
requests==2.32.3
rich==13.7.1
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.8
six==1.16.0
sortedcontainers==2.4.0
SQLAlchemy==2.0.31
strictyaml==1.7.3
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.2
zstandard==0.23.0
```

I checked [this Slack thread](https://apache-iceberg.slack.com/archives/C029EE6HQ5D/p1707633685716559) for the same issue, but
it doesn't contain any fix for my case.

OS: Windows 10

Environment variables containing `aws` in the three containers:

`iceberg-rest` container

```
iceberg@ce79d3f11b5f:/usr/lib/iceberg-rest$ env | grep -i aws
AWS_REGION=us-east-1
CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO
AWS_SECRET_ACCESS_KEY=password
AWS_ACCESS_KEY_ID=admin
```

`minio` container: doesn't have any ENV with `aws`

`mc` container

```
AWS_REGION=us-east-1
AWS_SECRET_ACCESS_KEY=password
AWS_ACCESS_KEY_ID=admin
```
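
A possible way to narrow this down, offered as a sketch and not a confirmed diagnosis: the script runs on the Windows host, so the data files are written by the client, not by the `iceberg-rest` container. The sketch assumes the `NETWORK_CONNECTION` error means the client cannot reach the S3 endpoint it ends up using (for example `http://minio:9000`, a name that only resolves inside the compose network). `endpoint_reachable` and `client_conf` are hypothetical helper names; `s3.endpoint`, `s3.access-key-id` and `s3.secret-access-key` are PyIceberg FileIO properties, and whether setting them resolves this particular case is an assumption.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    if parsed.hostname is None:
        # Not a usable URL (no host part), so nothing to connect to.
        return False
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def client_conf(catalog_uri: str, s3_endpoint: str, access_key: str, secret_key: str) -> dict:
    """Build a catalog conf that points the client-side FileIO at an explicit endpoint.

    Assumption: the client needs these settings itself, because the REST
    catalog's own CATALOG_S3_ENDPOINT only applies inside the container.
    """
    return {
        "uri": catalog_uri,
        "s3.endpoint": s3_endpoint,
        "s3.access-key-id": access_key,
        "s3.secret-access-key": secret_key,
    }
```

On the host, `endpoint_reachable("http://localhost:9000")` and `endpoint_reachable("http://minio:9000")` would show which address the script can actually reach. If only the first one works, `client_conf("http://localhost:8181/", "http://localhost:9000", "admin", "password")` could be passed as `conf` to `load_rest` in the script above to test the assumption.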