davlee1972 opened a new issue, #1847:
URL: https://github.com/apache/arrow-adbc/issues/1847

   ### What happened?
   
   I'm trying to load 98 million rows from a set of CSV files (covering a 5-year period), but only 95 to 96 million rows are getting inserted into Snowflake using adbc_ingest. The missing data is distributed fairly randomly, at roughly ~16k records per day.
   
   I tried passing adbc_ingest() both a PyArrow Table and RecordBatches. In both cases rows are being dropped.
   
   Here's a screenshot of my notebook code:
   
![image](https://github.com/apache/arrow-adbc/assets/24494353/a65d3207-d65e-4d72-aa4d-47ddf909e645)
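   For readers who can't view the image, the call pattern being described is roughly the following. This is a minimal sketch, not the exact notebook code: the file name, connection URI, and table name are placeholders, and pyarrow.csv is assumed as the CSV reader.

   ```python
   import pyarrow.csv as pv
   import adbc_driver_snowflake.dbapi

   # Read the CSVs into an Arrow table (~98M rows total across all files).
   # The file name here is a placeholder, not the actual notebook value.
   table = pv.read_csv("trades.csv")

   with adbc_driver_snowflake.dbapi.connect("snowflake://...") as conn:
       with conn.cursor() as cur:
           # mode="append" loads into an existing table; per the issue,
           # record batches were also tried in place of the Table.
           cur.adbc_ingest("MY_TABLE", table, mode="append")
       conn.commit()
   ```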
   
   The odd thing is that sometimes it inserts 95 million rows and other times it inserts 96 million rows. The total number of inserted rows matches what I'm seeing in Snowflake's logs if I add up all the rows created by the COPY INTO SQL commands.
   
   It looks like we're not sending all the batches across the wire.
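
   One way to confirm this from the client side (a sketch building on the snippet above; the table name is hypothetical) is to compare the Arrow table's row count against a COUNT(*) after the ingest:

   ```python
   # "table" is the pyarrow.Table passed to adbc_ingest above.
   expected = table.num_rows

   cur.execute('SELECT COUNT(*) FROM "MY_TABLE"')
   (actual,) = cur.fetchone()
   print(f"expected {expected:,}, got {actual:,} ({expected - actual:,} missing)")
   ```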
   
   ### How can we reproduce the bug?
   
   _No response_
   
   ### Environment/Setup
   
   _No response_

