2sls opened a new issue, #2656:
URL: https://github.com/apache/arrow-adbc/issues/2656
### What feature or improvement would you like to see?
## Add Server-Side Cursor/Streaming Support to `adbc_driver_postgresql.dbapi`
Maybe I'm missing something, but I'm hitting memory issues with large
PostgreSQL queries in `adbc_driver_postgresql.dbapi` because it fetches the full
result set upfront. PostgreSQL supports server-side cursors (`DECLARE CURSOR`,
`FETCH`) for chunked retrieval, but the driver doesn't use them natively. I'd
like streaming or cursor support so large results can be processed as Arrow
tables in chunks, without resorting to `LIMIT`/`OFFSET`.
Current behavior:
```python
import adbc_driver_postgresql.dbapi

with adbc_driver_postgresql.dbapi.connect(uri) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM huge_table")
    chunk = cursor.fetchmany(1000)  # Still loads all rows into memory first
```
Request: add a `chunk_size` parameter or a `fetch_arrow_stream()` method so
results can be fetched incrementally, e.g.:
```python
cursor.execute("SELECT * FROM huge_table", chunk_size=1000)
for chunk in cursor.fetch_arrow_stream():
    print(chunk.num_rows)  # 1000 rows at a time
```
This could leverage `libpq` cursors or single-row mode under the hood. The only
workaround I see today is issuing the SQL cursor statements manually (see the
sketch below), but that's clunky.
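For reference, here is roughly what that manual workaround looks like with the
current driver. This is a minimal, untested sketch: I'm assuming
`DECLARE`/`FETCH`/`CLOSE` statements pass through the driver unchanged, that the
connection is in a transaction (the DBAPI default, no autocommit), and that
`fetch_arrow_table()` returns only the rows from the last `FETCH`. `uri`,
`huge_table`, and `process()` are placeholders.
```python
import adbc_driver_postgresql.dbapi

with adbc_driver_postgresql.dbapi.connect(uri) as conn:
    cursor = conn.cursor()
    # Declare a server-side cursor and pull rows from it in fixed-size chunks.
    cursor.execute("DECLARE big_cursor CURSOR FOR SELECT * FROM huge_table")
    while True:
        cursor.execute("FETCH FORWARD 1000 FROM big_cursor")
        chunk = cursor.fetch_arrow_table()  # Arrow table for this chunk only
        if chunk.num_rows == 0:
            break
        process(chunk)  # placeholder for whatever per-chunk work is needed
    cursor.execute("CLOSE big_cursor")
    conn.commit()
```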
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]