sfc-gh-jramizares opened a new issue, #3169:
URL: https://github.com/apache/arrow-adbc/issues/3169

   ### What would you like help with?
   
   I have a customer who is experiencing slow data transfer from Snowflake to 
Power BI using ODBC driver version 3.2.2, despite the query execution in 
Snowflake performing well. They are exploring the new ADBC driver recommended 
by Microsoft to potentially improve performance. However, the issue remains. 
Fetching large result sets (e.g., 160M rows) via ADBC is still slow due to 
small default fetch batch sizes (~10K rows). 
   
   They noticed that the extraction happens in cursor mode, in chunks: data 
is fetched in small batches (10,000 rows each), and it took 20 minutes to fetch 
160M rows.
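
   For a rough sense of scale, the figures above work out as follows. This is 
only a back-of-the-envelope sketch: it assumes every batch holds exactly 10,000 
rows and that batches are fetched strictly one after another, neither of which 
has been confirmed.

```python
# Back-of-the-envelope throughput from the figures reported above.
# Assumptions (not confirmed): constant batch size, strictly sequential fetches.
TOTAL_ROWS = 160_000_000   # rows in the result set
BATCH_ROWS = 10_000        # observed rows per fetch batch
ELAPSED_S = 20 * 60        # reported wall-clock time, in seconds

batches = TOTAL_ROWS // BATCH_ROWS          # number of fetch round trips
rows_per_second = TOTAL_ROWS / ELAPSED_S    # effective end-to-end throughput
ms_per_batch = ELAPSED_S / batches * 1000   # average time spent per batch

print(f"batches: {batches:,}")
print(f"rows/s: {rows_per_second:,.0f}")
print(f"ms per batch: {ms_per_batch:.0f}")
```

   Under those assumptions this is about 16,000 batches at roughly 75 ms each, 
so per-batch overhead may matter more than raw bandwidth.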
   
   As such, the customer has the following questions. Kindly provide any 
guidance you can.
   1. What are the best practices for using the ADBC driver with Power BI?
   2. Are there any configurations to optimize the ADBC driver further?
   3. The environment variable needed for the ADBC driver (http_proxy) is 
generic and impacts other applications. Is there a way to rename this? Can we 
implement an ADBC-specific proxy configuration to avoid conflicts with other 
connectors?
   4. What can be done to maximize throughput (batching, parallelism, or other 
configurations)?
   5. Is there any way to configure the chunk size, or to disable chunking 
altogether if possible?
   6. Are there any Snowflake-provided benchmarks or statistics on data load 
timing we can reference? 
   7. Are there any statistics on data load timing from the Snowflake team 
that can be used as a reference for benchmarking?
   


