lelandroling opened a new issue, #1571:
URL: https://github.com/apache/iceberg-python/issues/1571

   ### Question
   
   Checking through the GitHub issues, I noticed very few examples, and I did 
see the open requests for improved documentation. I understand that I can use 
MERGE INTO via PySpark. My specific goal is to avoid the large overhead of 
PySpark, but if that's the solution... ok. Before I walk down that path, I'm 
trying to understand what the use case looks like for ```.overwrite()``` and 
```overwrite_filter```.
   
   ```
   from pyiceberg.expressions import And, EqualTo, Or

   conditions = []
   for row in values:
       # One equality predicate per key column, combined with And for the row
       row_condition = And(*[EqualTo(k, v) for k, v in zip(newKeys, row)])
       conditions.append(row_condition)

   # Combine all row conditions into a single Or
   filter_condition = Or(*conditions)
   ```
   
   I'm using this code to build filter_condition and then passing it as 
overwrite_filter. What I've noticed is that with around 1000 records I hit a 
maximum recursion error. My assumption is that I'm not structuring 
filter_condition correctly, or that the process can't handle this right now 
and I should move to MERGE INTO and PySpark.
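
   In case it helps frame the question, here is the kind of restructuring I'm 
picturing (a sketch only -- the catalog/table names, key columns, sample data, 
and batch size are all assumptions): process the rows in chunks, so each 
```.overwrite()``` call carries a shallow filter instead of one 1000-way Or.
   
   ```
   from functools import reduce

   import pyarrow as pa
   from pyiceberg.catalog import load_catalog
   from pyiceberg.expressions import And, EqualTo, Or

   # Hypothetical setup: catalog name, table name, and columns are placeholders
   catalog = load_catalog("default")
   table = catalog.load_table("db.events")

   newKeys = ["id", "region"]                  # composite key columns (assumed)
   values = [(1, "us"), (2, "eu"), (3, "us")]  # incoming key tuples (assumed)
   data = pa.table({
       "id": [1, 2, 3],
       "region": ["us", "eu", "us"],
       "amount": [10, 20, 30],
   })

   CHUNK = 100  # assumed batch size; keeps each expression tree shallow

   for start in range(0, len(values), CHUNK):
       chunk = values[start:start + CHUNK]
       # reduce() folds pairwise, so a one-row chunk still yields a valid expression
       row_filters = [
           reduce(And, (EqualTo(k, v) for k, v in zip(newKeys, row)))
           for row in chunk
       ]
       chunk_filter = reduce(Or, row_filters)
       # Replace only the rows matched by this chunk's filter with this chunk's data
       table.overwrite(data.slice(start, CHUNK), overwrite_filter=chunk_filter)
   ```
   
   Each call is its own commit, so chunking trades one snapshot for several; I 
don't know whether that trade-off is acceptable or if there's a better way to 
express this.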

