ovcharenko opened a new issue, #34097:
URL: https://github.com/apache/arrow/issues/34097
### Describe the bug, including details regarding any error messages, version, and platform.
After upgrading to PyArrow version 11.0.0, we started seeing errors like the
following from time to time (about 10% of the time):
`python[10868]: segfault at 10 ip 00007f1f6a9c3cb7 sp 00007fffdba28b50 error
4 in libarrow.so.1100[7f1f69655000+18d3000]`
To reproduce the error, we ran the following script a dozen times:
```python
import concurrent.futures
from collections import defaultdict
from timeit import timeit

import pyarrow as pa


def main():
    def convert(attempt):
        # Build a one-row, one-column table from the given integer.
        return pa.table([pa.array([attempt])], names=["int"])

    collected_data = defaultdict(pa.Table)
    with concurrent.futures.ThreadPoolExecutor() as pool_executor:
        attempts = range(100)
        # Create the tables on worker threads and collect them in order.
        for index, data in zip(attempts, pool_executor.map(convert, attempts)):
            collected_data[index] = data


if __name__ == "__main__":
    timeit("main()", globals=locals(), number=1000)
```
With PyArrow version 10.0.1, the script keeps running fine in a loop:
```
# while true; do python test.py; echo $?; done
0
0
0
0
0
0
0
...
```
With version 11.0.0, however, it crashes at random.
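
To put a number on how often the crash occurs, the shell loop above could be replaced by a small driver that tallies exit codes. This is only a minimal sketch and not something we ran for this report: `count_exit_codes` is a hypothetical helper, `test.py` is assumed to be the script above, and the run count is arbitrary.

```python
import subprocess
import sys
from collections import Counter


def count_exit_codes(command, runs):
    """Run `command` `runs` times and tally the exit codes (hypothetical helper)."""
    codes = Counter()
    for _ in range(runs):
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes[completed.returncode] += 1
    return codes


if __name__ == "__main__":
    # Assumption: "test.py" is the reproduction script; 20 runs is arbitrary.
    results = count_exit_codes([sys.executable, "test.py"], runs=20)
    for code, count in sorted(results.items()):
        print(f"exit code {code}: {count} run(s)")
```

On POSIX systems `subprocess` reports a child killed by a signal as a negative return code, so a segfault (SIGSEGV, signal 11) would be counted under `-11` rather than `0`.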
### Component(s)
Python