Jayclifford345 opened a new issue, #37852:
URL: https://github.com/apache/arrow/issues/37852

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   We are currently facing a Python kernel crash within a VS Code notebook, 
which is caused by:
   `flight_client.do_get()`
   
   We are trying to return a dataset that has 165 columns using 
PyArrow Flight. This works for queries that return fewer columns. 
If we return all 165 columns, we are met with the following crash:
   ```
   14:43:41.160 [info] Handle Execution of Cells 16 for ~\OneDrive - Vertical Aerospace Group Ltd\Documents\Test Data Access\vertical-data\examples.ipynb
   14:43:41.166 [info] Kernel acknowledged execution of cell 16 @ 1695390221162
   14:43:43.635 [info] End cell 16 execution @ 1695390223632, started @ 1695390221162, elapsed time = 2.47s
   14:44:45.415 [error] Disposing session as kernel process died ExitCode: 3221225477, Reason: 
   14:44:45.415 [info] Dispose Kernel process 12772.
   14:44:45.473 [info] End cell 16 execution @ undefined, started @ 1695390223632, elapsed time = -1695390223.632s
   ```
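As a side note on the log above, the exit code can be decoded as a Windows NTSTATUS value. The snippet below is only a minimal sketch: the lookup table is a small, non-exhaustive assumption added for illustration, and it says nothing about which library triggered the fault.

```python
# Exit code copied from the kernel log above.
exit_code = 3221225477

# Windows reports unhandled native exceptions as unsigned 32-bit NTSTATUS values.
ntstatus = f"0x{exit_code & 0xFFFFFFFF:08X}"

# Small, non-exhaustive lookup table (illustrative only).
KNOWN_NTSTATUS = {
    "0xC0000005": "STATUS_ACCESS_VIOLATION",
    "0xC00000FD": "STATUS_STACK_OVERFLOW",
}

print(ntstatus, KNOWN_NTSTATUS.get(ntstatus, "unknown"))
```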
   This issue appears to occur predominantly with PyArrow running natively on 
Windows. This functionality is required, as we will eventually need to 
interface with MATLAB.
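
Because the kernel dies without a Python traceback, one thing that may help narrow this down is the standard-library `faulthandler` module. The sketch below is a suggestion under assumptions, not something we have verified against this crash: the helper name `enable_native_traceback` and the log file name are hypothetical, and the traceback is written to a real file because a notebook's `stderr` may not expose a file descriptor.

```python
import faulthandler


def enable_native_traceback(path="flight_crash.log"):
    """Dump Python tracebacks to `path` if the process crashes natively."""
    # Hypothetical helper: a real file is used because the notebook's
    # stderr replacement may not provide the file descriptor that
    # faulthandler needs.
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    # The caller must keep a reference so the file stays open.
    return log_file


# Usage, before running the failing cell:
# crash_log = enable_native_traceback()
```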
   
   We would like to understand why the crash occurs at 
`flight_client.do_get()` rather than when the dataset is returned. Here is 
the function for completeness:
   ```python
       def query(self, query, language="sql", mode="all",
                 database=None, **kwargs):
           """
           Query data from InfluxDB.
   
           :param query: The query string.
           :type query: str
           :param language: The query language; "sql" or "influxql" (default is
               "sql").
           :type language: str
           :param mode: The mode of fetching data (all, pandas, chunk, reader,
               schema).
           :type mode: str
           :param database: The database to query from. If not provided, uses
               the database provided during initialization.
           :type database: str
           :param kwargs: Additional arguments for the query.
           :return: The queried data.
           """
           
   
           if database is None:
               database = self._database
           
           try:
               # Create an authorization header
               headers = [
                   (b"authorization", f"Bearer {self._token}".encode('utf-8'))
               ]
               _options = FlightCallOptions(headers=headers, **kwargs)
               ticket_data = {
                   "database": database,
                   "sql_query": query,
                   "query_type": language,
               }
               ticket = Ticket(json.dumps(ticket_data).encode('utf-8'))
               flight_reader = self._flight_client.do_get(ticket, _options)
   
               mode_func = {
                   "all": flight_reader.read_all,
                   "pandas": flight_reader.read_pandas,
                   "chunk": lambda: flight_reader,
                   "reader": flight_reader.to_reader,
                   "schema": lambda: flight_reader.schema
               }.get(mode, flight_reader.read_all)
   
               return mode_func() if callable(mode_func) else mode_func
           except Exception as e:
               raise e
   ```
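To help separate a failure in `do_get()` from a failure while reading the stream, the reader returned by the function above (with `mode="chunk"`) could be drained one chunk at a time instead of through `read_all()`. This is a rough sketch only: `read_batches_with_progress` is a hypothetical helper, and it assumes the reader exposes `read_chunk()`, which raises `StopIteration` at the end of the stream and returns chunks whose `data` attribute is a record batch or `None`.

```python
def read_batches_with_progress(flight_reader, log=print):
    """Read a Flight stream chunk by chunk, logging each record batch."""
    batches = []
    index = 0
    while True:
        try:
            chunk = flight_reader.read_chunk()
        except StopIteration:
            # End of stream.
            break
        if chunk.data is None:
            # Metadata-only chunk; nothing to collect.
            continue
        log(f"batch {index}: {chunk.data.num_rows} rows, "
            f"{chunk.data.num_columns} columns")
        batches.append(chunk.data)
        index += 1
    return batches
```

If the last logged batch index is stable across runs, that would suggest the crash happens while a particular batch is being read rather than in the initial call.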
   
   Many thanks in advance for any help that can be provided. Please let us know 
if you need any more information.
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
