mderoy opened a new issue, #44695:
URL: https://github.com/apache/arrow/issues/44695
### Describe the bug, including details regarding any error messages, version, and platform.
Our application runs many instances that use Arrow in parallel on Linux.
Occasionally we get a SIGPIPE signal from our calls to the S3 filesystem
(for both reads and writes). We believe these come from the S3 API that is
embedded in Arrow. Since Arrow is multithreaded, handling these signals
around the scope of our Arrow API calls is ineffective.
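
For context, a process-wide workaround that does not depend on wrapping individual calls can be sketched as follows. This is a minimal sketch, assuming a POSIX platform; the helper names IgnoreSigpipeProcessWide and WriteToClosedPipeErrno are hypothetical and are not part of Arrow or the AWS SDK. It relies on signal dispositions being shared by every thread in the process, so it also covers threads started by libraries.

```cpp
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process.
// Signal dispositions are per-process, so this applies to all threads,
// including ones created later by third-party libraries.
// Returns true on success.
bool IgnoreSigpipeProcessWide() {
  struct sigaction action = {};
  action.sa_handler = SIG_IGN;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGPIPE, &action, nullptr) == 0;
}

// Hypothetical demonstration helper: writes one byte to a pipe whose read
// end has been closed and returns the errno reported by write(), or 0 if
// the write unexpectedly succeeded. With SIGPIPE ignored, write() fails
// with EPIPE instead of the signal terminating the process.
int WriteToClosedPipeErrno() {
  int fds[2];
  if (pipe(fds) != 0) return errno;
  close(fds[0]);  // No reader left: the next write triggers SIGPIPE/EPIPE.
  const char byte = 'x';
  errno = 0;
  ssize_t written = write(fds[1], &byte, 1);
  int saved_errno = (written == -1) ? errno : 0;
  close(fds[1]);
  return saved_errno;
}
```

Calling IgnoreSigpipeProcessWide() once at startup, before any S3 filesystem is created, would be the intended use. The trade-off is that it changes behavior for the entire process, which may not be acceptable for applications that rely on the default SIGPIPE behavior elsewhere.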
The AWS SDK exposes an option to install its own SIGPIPE handler,
SDKOptions::httpOptions.installSigPipeHandler:
https://github.com/aws/aws-sdk-cpp/blob/a154acd5893e2b2c913844e312648149d12a12d1/src/aws-cpp-sdk-core/include/aws/core/Aws.h#L103
but Arrow has no way to set it. S3GlobalOptions only exposes the S3 log
level and the number of event loop threads; the rest of the SDK options are
kept private within Arrow.
Arrow needs a way to pass this option (and probably other AWS options), so I'm
opening this bug report.
If the community is okay with adding the installSigPipeHandler option to
S3GlobalOptions, I'd be happy to contribute. That said, I wonder why we don't
open the floodgates and let users set S3 options directly.
### Component(s)
C++