eladc opened a new issue, #45432: URL: https://github.com/apache/arrow/issues/45432
### Describe the bug, including details regarding any error messages, version, and platform.

Hello,

This is very similar to bug [#36007](https://github.com/apache/arrow/issues/36007).

The requesting machine is in the same region as the S3 bucket. joblib is used to parallelize the download, with up to 56 threads. The error is very difficult to reproduce: it happens at least once a day to random users who all run the same download code, but on different Parquet files.

Installed packages:
**arrow** 1.3.0
**pyarrow** 14.0.1

```
  File "/opt/venv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 3003, in read_table
    return dataset.read(columns=columns, use_threads=use_threads,
  File "/opt/venv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 2631, in read
    table = self._dataset.to_table(
  File "pyarrow/_dataset.pyx", line 556, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 3713, in pyarrow._dataset.Scanner.to_table
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
Error: IOError: AWS Error NETWORK_CONNECTION during GetObject operation: curlCode: 28, Timeout was reached
```

How can I debug this further?

Thank you.

### Component(s)

Python

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org