fjetter opened a new issue, #40279:
URL: https://github.com/apache/arrow/issues/40279
### Describe the bug, including details regarding any error messages, version, and platform.
Deserializing a pickled `S3FileSystem` instance is surprisingly slow:
```python
import pickle

import boto3
from pyarrow.fs import S3FileSystem

# Going via boto is not strictly necessary, but setting all the keys and
# tokens up front avoids one HTTP request during init.
session = boto3.session.Session()
credentials = session.get_credentials()
fs = S3FileSystem(
    secret_key=credentials.secret_key,
    access_key=credentials.access_key,
    region="us-east-2",
    session_token=credentials.token,
)
# Note: this can also be seen with a plain S3FileSystem(), but that issues
# one HTTP request, and I want to emphasize the slow JSON parser; see below.
```
```python
%timeit pickle.loads(pickle.dumps(fs))
```
This takes `1.01 ms ± 153 µs per loop` on my machine.
Looking at a py-spy profile shows that most of the time is spent in some
internal JSON parsing. Is there a way to avoid this?
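
Since the `%timeit` line above measures `pickle.dumps` and `pickle.loads` together, it may help to time the two halves separately and outside IPython. The following is only a minimal sketch using the standard-library `timeit` module; `time_pickle` is a hypothetical helper name (not part of pyarrow), and the dictionary is a stand-in so the script runs without S3 credentials.

```python
import pickle
import timeit


def time_pickle(obj, number=1000, repeat=5):
    """Return (dumps_seconds, loads_seconds) per call for `obj`.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    data = pickle.dumps(obj)  # serialize once so loads is timed in isolation
    dumps_timer = timeit.Timer(lambda: pickle.dumps(obj))
    loads_timer = timeit.Timer(lambda: pickle.loads(data))
    # Use the minimum over several repeats to reduce noise from other processes.
    dumps_s = min(dumps_timer.repeat(repeat=repeat, number=number)) / number
    loads_s = min(loads_timer.repeat(repeat=repeat, number=number)) / number
    return dumps_s, loads_s


if __name__ == "__main__":
    # Stand-in object; swap in the `fs` instance from the snippet above.
    payload = {"region": "us-east-2"}
    dumps_s, loads_s = time_pickle(payload)
    print(f"dumps: {dumps_s * 1e6:.1f} µs, loads: {loads_s * 1e6:.1f} µs")
```

Passing the `fs` object instead of the stand-in should show whether the cost sits on the serializing or the deserializing side.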

### Component(s)
Python