jonnor opened a new issue, #3201: URL: https://github.com/apache/arrow-adbc/issues/3201
### What feature or improvement would you like to see?

Hi, thank you for the work on the ADBC library, and specifically the Python support for Postgres. I have tested it over the last few weeks, and for fetching large dataframes (time-series in our case) it is much faster than psycopg with pandas: between 5x and 10x the throughput, with lower CPU usage on both the client and the database side. So that is very promising.

I am now trying to integrate it into an existing Python web application, which uses psycopg2 as the database driver, SQLAlchemy for connection management, and gevent for concurrency. I would need the efficient ADBC queries to be integrated with that system somehow.

Right now, proper integration is not really possible, since the driver does its own IO with blocking reads. The parts using ADBC would have to duplicate connection management (annoying but manageable), and the blocking IO prevents concurrency with gevent (which severely reduces performance, voiding the main motivation for using the project).

So I was wondering if it would be possible to expose functions/classes that take data in the PostgreSQL binary format and convert it into Arrow tables. That way, one could use the existing database driver for IO and avoid conflicts with connection management and concurrency. Most PostgreSQL drivers support this now: for example, there is `copy_expert()` in psycopg2, `cursor.copy()` in psycopg3, and `copy_from_query()` in asyncpg.

I believe this would greatly ease the integration of ADBC into existing codebases, which might in turn increase adoption. I am aware that this approach _might_ leave some performance on the table, and that is acceptable in my case. I am pretty sure it will continue to be much faster than approaches that use serialization.
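To make the request concrete, here is a rough, stdlib-only sketch of the kind of input such a function would receive. It splits a PostgreSQL binary `COPY` payload into rows of raw field bytes and decodes two columns by hand. The names `parse_pgcopy_binary`, `to_columns`, and `build_sample` are hypothetical and are not an existing ADBC or driver API; the sample payload is hand-built and the `(int8, float8)` schema is an assumption. A real implementation would produce Arrow arrays rather than Python lists.

```python
import struct

# 11-byte signature that starts every PostgreSQL binary COPY stream.
PGCOPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def parse_pgcopy_binary(data: bytes) -> list:
    """Split a binary COPY payload into rows of raw field bytes (None for NULL).

    Hypothetical helper for illustration only.
    """
    if not data.startswith(PGCOPY_SIGNATURE):
        raise ValueError("not a PostgreSQL binary COPY stream")
    pos = len(PGCOPY_SIGNATURE)
    # Header: 32-bit flags field, then 32-bit length of the header extension.
    _flags, ext_len = struct.unpack_from("!iI", data, pos)
    pos += 8 + ext_len

    rows = []
    while True:
        # Each tuple starts with a 16-bit field count; -1 is the file trailer.
        (field_count,) = struct.unpack_from("!h", data, pos)
        pos += 2
        if field_count == -1:
            break
        row = []
        for _ in range(field_count):
            # Each field is a 32-bit byte length (-1 means NULL) plus the data.
            (length,) = struct.unpack_from("!i", data, pos)
            pos += 4
            if length == -1:
                row.append(None)
            else:
                row.append(data[pos:pos + length])
                pos += length
        rows.append(row)
    return rows


def to_columns(rows: list) -> dict:
    """Decode an assumed (int8, float8) schema into two column lists."""
    ids = [struct.unpack("!q", r[0])[0] for r in rows]
    values = [None if r[1] is None else struct.unpack("!d", r[1])[0] for r in rows]
    return {"id": ids, "value": values}


def build_sample() -> bytes:
    """Hand-encode two rows, (1, 20.5) and (2, NULL), so no database is needed."""
    out = PGCOPY_SIGNATURE + struct.pack("!iI", 0, 0)
    out += struct.pack("!hiqid", 2, 8, 1, 8, 20.5)
    out += struct.pack("!hiqi", 2, 8, 2, -1)
    out += struct.pack("!h", -1)
    return out


columns = to_columns(parse_pgcopy_binary(build_sample()))
```

With psycopg2, the input bytes would presumably come from something like `cursor.copy_expert("COPY (SELECT ...) TO STDOUT (FORMAT BINARY)", buffer)` with an `io.BytesIO` buffer, so the existing driver keeps control of the connection and the IO.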
