ovcharenko opened a new issue, #34097:
URL: https://github.com/apache/arrow/issues/34097

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   After upgrading to PyArrow 11.0.0, we started seeing intermittent segfaults (about 10% of the time):
   `python[10868]: segfault at 10 ip 00007f1f6a9c3cb7 sp 00007fffdba28b50 error 4 in libarrow.so.1100[7f1f69655000+18d3000]`
   
   To reproduce the error, we ran the following script a dozen times:
   ```python
   import concurrent.futures
   from collections import defaultdict
   from timeit import timeit
   
   import pyarrow as pa
   
   
   def main():
       def convert(attempt):
           return pa.table([pa.array([attempt])], names=["int"])
   
       collected_data = defaultdict(pa.Table)
   
       with concurrent.futures.ThreadPoolExecutor() as pool_executor:
           attempts = range(100)
   
           for index, data in zip(attempts, pool_executor.map(convert, attempts)):
               collected_data[index] = data
   
   
   if __name__ == "__main__":
       timeit("main()", globals=locals(), number=1000)
   ```
   
   With PyArrow 10.0.1, the script keeps running fine in a loop:
   
   ```
   # while true; do python test.py; echo $?; done
   0
   0
   0
   0
   0
   0
   0
   ...
   ```
   
   With version 11.0.0, however, it crashes at random.
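
To put a number on how often the crash occurs, the shell loop above could be replaced by a small harness that tallies exit statuses. This is only a sketch: `run_repeatedly` is a hypothetical helper name, not part of PyArrow, and it relies on the POSIX convention that `subprocess` reports a signal-killed child with a negative return code.

```python
import signal
import subprocess
import sys
from collections import Counter


def run_repeatedly(cmd, runs):
    """Run `cmd` `runs` times and tally how each run ended (hypothetical helper)."""
    outcomes = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode < 0:
            # On POSIX, a negative return code means the child was killed
            # by that signal number, e.g. SIGSEGV for a segfault.
            outcomes[signal.Signals(-result.returncode).name] += 1
        else:
            outcomes[f"exit {result.returncode}"] += 1
    return outcomes
```

Assuming the reproduction script is saved as `test.py`, as in the loop above, calling `run_repeatedly([sys.executable, "test.py"], runs=20)` returns a `Counter` in which segfaulting runs would be counted under the `SIGSEGV` key.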
   
   ### Component(s)
   
   Python

