apmorton opened a new issue, #46214:
URL: https://github.com/apache/arrow/issues/46214
### Describe the bug, including details regarding any error messages, version, and platform.
When constructing an `S3FileSystem` object with a region explicitly specified,
Arrow still triggers AWS EC2 metadata lookups unless they are
explicitly (and globally) disabled with the `AWS_EC2_METADATA_DISABLED`
environment variable.
This goes against the documented (and I believe intended?) behavior of the
`region` kwarg:
```
AWS region to connect to. If not set, the AWS SDK will attempt to determine
the region using heuristics such as environment variables, configuration
profile, EC2 metadata, or default to ‘us-east-1’ when SDK version <1.8.
```
```python
pyarrow.fs.S3FileSystem(
access_key='key',
secret_key='secret',
endpoint_override='https://my.appliance.uri',
region='region',
)
```
Using py-spy, I can observe the following stack during construction:
```
Aws::Http::CurlHttpClient::MakeRequest (libaws-cpp-sdk-core.so)
Aws::Internal::AWSHttpResourceClient::GetResourceWithAWSWebServiceResult[abi:cxx11]
(libaws-cpp-sdk-core.so)
Aws::Internal::EC2MetadataClient::GetCurrentRegion[abi:cxx11]
(libaws-cpp-sdk-core.so)
Aws::Client::ClientConfiguration::ClientConfiguration
(libaws-cpp-sdk-core.so)
Aws::S3::S3ClientConfiguration::S3ClientConfiguration
(libaws-cpp-sdk-s3.so)
__gnu_cxx::new_allocator<arrow::fs::S3FileSystem::Impl>::construct<arrow::fs::S3FileSystem::Impl,
arrow::fs::S3Options const&, arrow::io::IOContext const&>
(libarrow.so.1801.0.0)
arrow::fs::S3FileSystem::S3FileSystem (libarrow.so.1801.0.0)
arrow::fs::S3FileSystem::Make (libarrow.so.1801.0.0)
S3FileSystem___init__ (pyarrow/_s3fs.cpython-311-x86_64-linux-gnu.so)
```
This is caused by default construction of `S3ClientConfiguration` in
`ClientBuilder`.
On our machines (which aren't in AWS and have no IMDS running) this takes 6+
seconds.
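For anyone wanting to reproduce the timing, here is a minimal Python sketch of the environment-variable workaround mentioned above. The helper names (`disable_ec2_metadata`, `timed`, `main`) are made up for illustration, the credentials and endpoint are the placeholders from the snippet above, and I'm assuming the variable has to be set before the filesystem is constructed (setting it before importing `pyarrow` is the safest option).

```python
import os
import time


def disable_ec2_metadata(env=os.environ):
    """Set the variable the AWS SDK checks before doing EC2 metadata lookups.

    This is process-wide, which is exactly the drawback described above.
    """
    env["AWS_EC2_METADATA_DISABLED"] = "true"


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def main():
    # Hypothetical driver: requires pyarrow built with S3 support.
    disable_ec2_metadata()
    from pyarrow import fs  # imported late so the variable is already set

    _, elapsed = timed(
        fs.S3FileSystem,
        access_key="key",
        secret_key="secret",
        endpoint_override="https://my.appliance.uri",
        region="region",
    )
    print(f"S3FileSystem construction took {elapsed:.2f}s")
```

Calling `main()` with and without the `disable_ec2_metadata()` line should show the difference on a machine outside AWS.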
A workaround in the C++ code is something like the following:
```cpp
#ifdef ARROW_S3_HAS_S3CLIENT_CONFIGURATION
  Aws::S3::S3ClientConfiguration
      client_config_{Aws::Client::ClientConfigurationInitValues{false}};
#else
  Aws::Client::ClientConfiguration
      client_config_{Aws::Client::ClientConfigurationInitValues{false}};
#endif
```
which disables the IMDS lookup during configuration construction.
Some additional work would be required to add back the IMDS lookup of the
region when it is not otherwise specified.
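The behavior I would expect is roughly the following, written as a Python sketch rather than actual Arrow code. `resolve_region` and `imds_lookup` are hypothetical names, and the fallback chain is simplified (the real SDK also consults environment variables and the configuration profile before IMDS).

```python
from typing import Callable, Optional


def resolve_region(
    explicit_region: Optional[str],
    imds_lookup: Callable[[], Optional[str]],
    default: str = "us-east-1",
) -> str:
    """Pick a region, touching IMDS only when no region was given.

    imds_lookup stands in for the (potentially slow) metadata request.
    """
    if explicit_region:
        # Region supplied by the caller: no network lookup at all.
        return explicit_region
    # No region supplied: fall back to the metadata service, then a default.
    return imds_lookup() or default
```

The point is only that the metadata request is never issued when `region` is passed explicitly.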
### Component(s)
C++