Jayclifford345 opened a new issue, #37852:
URL: https://github.com/apache/arrow/issues/37852
### Describe the bug, including details regarding any error messages, version, and platform.
We are seeing a Python kernel crash in a VS Code notebook, which is triggered
by the call:
`flight_client.do_get()`
We are trying to return a dataset with 165 columns using PyArrow Flight. The
same call succeeds for queries that return fewer columns. When we request all
165 columns, the kernel crashes with the following log:
```
14:43:41.160 [info] Handle Execution of Cells 16 for ~\OneDrive - Vertical Aerospace Group Ltd\Documents\Test Data Access\vertical-data\examples.ipynb
14:43:41.166 [info] Kernel acknowledged execution of cell 16 @ 1695390221162
14:43:43.635 [info] End cell 16 execution @ 1695390223632, started @ 1695390221162, elapsed time = 2.47s
14:44:45.415 [error] Disposing session as kernel process died ExitCode: 3221225477, Reason:
14:44:45.415 [info] Dispose Kernel process 12772.
14:44:45.473 [info] End cell 16 execution @ undefined, started @ 1695390223632, elapsed time = -1695390223.632s
```
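For reference, the exit code in the log above can be read as a Windows NTSTATUS value. The short sketch below (the helper name `format_ntstatus` is purely illustrative) shows that 3221225477 is 0xC0000005, which Windows defines as STATUS_ACCESS_VIOLATION. That suggests the process died on an invalid memory access rather than on a Python exception.

```python
def format_ntstatus(exit_code: int) -> str:
    """Render a Windows process exit code as an 8-digit hex NTSTATUS value."""
    # Mask to 32 bits so that codes reported as negative signed ints also work.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Exit code taken from the kernel log above.
print(format_ntstatus(3221225477))  # prints 0xC0000005
```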
The issue appears to occur predominantly with PyArrow running natively on
Windows. Running natively on Windows is a requirement for us, as this code
will eventually be interfaced with MATLAB.
We would like to understand why the crash occurs at
`flight_client.do_get()` rather than while the dataset is being read. Here is
the function for completeness:
```python
# Module-level imports this method relies on.
import json

from pyarrow.flight import FlightCallOptions, Ticket


def query(self, query, language="sql", mode="all", database=None, **kwargs):
    """
    Query data from InfluxDB.

    :param query: The query string.
    :type query: str
    :param language: The query language; "sql" or "influxql" (default is "sql").
    :type language: str
    :param mode: The mode of fetching data (all, pandas, chunk, reader, schema).
    :type mode: str
    :param database: The database to query from. If not provided, uses
        the database provided during initialization.
    :type database: str
    :param kwargs: Additional arguments for the query.
    :return: The queried data.
    """
    if database is None:
        database = self._database

    try:
        # Create an authorization header
        headers = [(b"authorization", f"Bearer {self._token}".encode('utf-8'))]
        _options = FlightCallOptions(headers=headers, **kwargs)
        ticket_data = {"database": database, "sql_query": query,
                       "query_type": language}
        ticket = Ticket(json.dumps(ticket_data).encode('utf-8'))
        # The kernel crash is observed at this call.
        flight_reader = self._flight_client.do_get(ticket, _options)

        mode_func = {
            "all": flight_reader.read_all,
            "pandas": flight_reader.read_pandas,
            "chunk": lambda: flight_reader,
            "reader": flight_reader.to_reader,
            "schema": lambda: flight_reader.schema
        }.get(mode, flight_reader.read_all)

        return mode_func() if callable(mode_func) else mode_func
    except Exception as e:
        raise e
```
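To help narrow down where the crash happens, one option is Python's standard `faulthandler` module, which can dump the Python traceback of each thread when the interpreter hits a fatal fault. The sketch below is only an assumption about how this could be set up in the notebook before calling `query()`. The helper name `enable_crash_log` and the log file name are hypothetical, and the dump goes to a file because the kernel's own output may be lost when the process dies.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path=None):
    """Enable faulthandler and direct its traceback dump to a file.

    Returns the open file object; it must stay open for as long as
    faulthandler is enabled.
    """
    if path is None:
        # Hypothetical default location for the crash log.
        path = os.path.join(tempfile.gettempdir(), "flight_crash.log")
    log_file = open(path, "w")
    # Dump the tracebacks of all threads on a fatal error.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_log()` in the first notebook cell and then re-running the failing query should leave the Python-level stack in the log file, provided the fault is one that `faulthandler` intercepts.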
Many thanks in advance for any help. Please let us know if you need any more
information.
### Component(s)
Python