erikhansenwong opened a new issue, #46224: URL: https://github.com/apache/arrow/issues/46224
### Describe the bug, including details regarding any error messages, version, and platform.

On some calls to `Table.join_asof` my python process becomes unresponsive and is using zero cpu. It appears to be a thread deadlock or something similar.

I have created an example that causes the deadlock with high probability on my laptop. Here are the details of my setup:

- Python 3.12.7
- pyarrow==19.0.1
- numpy==2.2.4
- pandas==2.2.3
- Ubuntu 22.04.5
- CPU: 13th Gen Intel(R) Core(TM) i9-13980HX

I was also able to produce the deadlock on a colleague's Mac laptop with Apple silicon using this example, so I assume it won't make a big difference what hardware it runs on.

On my laptop this always gets deadlocked before the 300th iteration

```python
import numpy as np
import pandas as pd
import pyarrow as pa

n_left = 100
n_right = 200_000
left_start = pd.Timestamp("2025-04-07T07:45:55", tz="UTC")
right_start = pd.Timestamp("2025-04-07T00:00:00", tz="UTC")
time_end = pd.Timestamp("2025-04-07T12:05:59", tz="UTC")
tolerance_nanos = 60 * 1_000_000_000

np.random.seed(0)


def get_timestamps(start, end, n):
    seconds = (end - start).total_seconds()
    td = np.random.uniform(0, 1, n)
    td *= np.random.choice([0, 1], n)
    td *= seconds / td.sum()
    td = td.cumsum()
    return start + pd.to_timedelta(td, "seconds")


left_schema = pa.schema([pa.field("timestamp", pa.timestamp("ns", "UTC"))])
right_schema = pa.schema(
    [
        pa.field("timestamp", pa.timestamp("ns", "UTC")),
        pa.field("value", pa.float64()),
    ]
)

left = pa.table(
    {"timestamp": get_timestamps(left_start, time_end, n_left)},
    schema=left_schema,
)
right = pa.table(
    {
        "timestamp": get_timestamps(right_start, time_end, n_right),
        "value": np.random.normal(100, 5, n_right),
    },
    schema=right_schema,
)

for i in range(1000):
    print(f"{i:>5} | {pd.Timestamp.now()}")
    left.join_asof(
        right,
        on="timestamp",
        by=[],
        tolerance=tolerance_nanos,
    )
```

### Component(s)

Python

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
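
When the process goes quiet with zero CPU, it can help to record where the Python side is blocked at that moment. The following is a minimal sketch, not part of the original report, using the standard library's `faulthandler` module. The helper name `dump_stacks_if_stuck` and the timeout values are hypothetical. It only shows Python-level frames. Threads created inside Arrow's native code would need a native debugger to inspect.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def dump_stacks_if_stuck(timeout_s, file=None):
    """Dump all Python thread tracebacks if the block runs longer than timeout_s.

    Hypothetical helper for illustration. `file` must be a real file object
    with a file descriptor; it defaults to sys.stderr at call time.
    """
    target = sys.stderr if file is None else file
    # Arm a watchdog: after timeout_s seconds, faulthandler writes the
    # tracebacks of all Python threads to `target` without killing the process.
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=target, exit=False)
    try:
        yield
    finally:
        # Disarm the watchdog once the block finishes (or raises).
        faulthandler.cancel_dump_traceback_later()
```

In the reproduction loop above, the `left.join_asof(...)` call could be placed inside `with dump_stacks_if_stuck(30):` so that an iteration stalled for more than 30 seconds prints a traceback showing the call it is blocked in.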