Re: [I] Issue during Upsert [iceberg-python]

2025-05-09 Thread via GitHub
kevinjqliu commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2867391759 @deepika094 would be great to post the stacktrace to help debug further. Let's create a new issue since the problem seems to be different from the current one -- This i

Re: [I] Issue during Upsert [iceberg-python]

2025-05-08 Thread via GitHub
deepika094 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2862418320 Hi @kevinjqliu, yes, I am using 0.9.1. I tried batching as well. If I run without batching, it just keeps running for hours. If I batch it into 1000 rows it w

Re: [I] Issue during Upsert [iceberg-python]

2025-05-02 Thread via GitHub
kevinjqliu commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2847410229 @deepika094 a couple of questions: 1. Are you using the latest version, `0.9.1`? We added a few fixes for upserts in the new release. 2. The comparison operation was mo

Re: [I] Issue during Upsert [iceberg-python]

2025-05-02 Thread via GitHub
deepika094 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2847371843 Hey guys, I still have a similar issue. I have around 5 million rows for a given day. I ran the process once and it inserted the data. But let's say I want to rerun the proces

Re: [I] Issue during Upsert [iceberg-python]

2025-03-31 Thread via GitHub
Fokko closed issue #1759: Issue during Upsert URL: https://github.com/apache/iceberg-python/issues/1759 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [I] Issue during Upsert [iceberg-python]

2025-03-12 Thread via GitHub
kevinjqliu commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2718968518 @mattmartin14 it's based on the scale of the input. For insert filters, it depends on the output of `create_match_filter` and the table schema. Essentially any time `bind

Re: [I] Issue during Upsert [iceberg-python]

2025-03-12 Thread via GitHub
mattmartin14 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2718975527 What if, instead of using a filter for the insert rows, we do a pyarrow compute left anti-join to identify the rows needing to be inserted? That might help avoid
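
The anti-join idea in the comment above can be illustrated with a minimal sketch. Plain Python stands in for pyarrow here so the example has no dependencies; `rows_to_insert` and the sample rows are hypothetical and are not part of pyiceberg or pyarrow.

```python
from typing import Any

Row = dict[str, Any]


def rows_to_insert(source: list[Row], target: list[Row], join_cols: list[str]) -> list[Row]:
    """Left anti-join: keep the source rows whose key is absent from the target."""
    # Collect the composite keys that already exist in the target, once.
    existing = {tuple(row[c] for c in join_cols) for row in target}
    # One hash lookup per source row; no per-row filter expression is built.
    return [row for row in source if tuple(row[c] for c in join_cols) not in existing]


# Hypothetical data: id=2 already exists in the target, id=3 does not.
target = [
    {"id": 1, "day": "2025-03-01", "v": 10},
    {"id": 2, "day": "2025-03-01", "v": 20},
]
source = [
    {"id": 2, "day": "2025-03-01", "v": 21},
    {"id": 3, "day": "2025-03-01", "v": 30},
]

new_rows = rows_to_insert(source, target, ["id", "day"])
print(new_rows)
```

Because the keys are compared through a set, the work grows linearly with the number of rows and no expression tree has to be walked.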

Re: [I] Issue during Upsert [iceberg-python]

2025-03-11 Thread via GitHub
mattmartin14 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2715872005 Hey @kevinjqliu, from my original testing, insert filters were not affected by this problem. It was only the overwrite filters that were an issue. Has somethin

Re: [I] Issue during Upsert [iceberg-python]

2025-03-11 Thread via GitHub
kevinjqliu commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2715325319 Thanks everyone. I think this is a more generic issue with `bind` and the visitors, which I opened #1785 to track. I believe this issue is showing up in `upsert` in

Re: [I] Issue during Upsert [iceberg-python]

2025-03-10 Thread via GitHub
mattmartin14 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2712180183 I encountered this issue when writing the upsert method. If your join column is a single column, then the function scales fine and won't hit a recursion limit. I stresse
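
Why a composite key can hit the recursion limit while a single key does not can be sketched with toy expression classes. This is an illustration under stated assumptions, not pyiceberg's implementation: `Eq`, `In`, `And`, `Or`, `match_filter` and `depth` are hypothetical stand-ins, and the sketch assumes a single key column becomes one flat membership test while a composite key becomes a left-nested chain of binary `Or` nodes, one per row.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Eq:
    col: str
    value: object


@dataclass(frozen=True)
class In:
    col: str
    values: tuple


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def match_filter(rows, join_cols):
    """Build a toy match predicate for the given key columns."""
    if len(join_cols) == 1:
        # Assumed single-column case: one flat node, however many rows there are.
        col = join_cols[0]
        return In(col, tuple(row[col] for row in rows))
    # Assumed composite case: And(...) per row, then chained with binary Or.
    per_row = [reduce(And, [Eq(c, row[c]) for c in join_cols]) for row in rows]
    return reduce(Or, per_row)


def depth(expr):
    """Depth of the tree, computed recursively like a simple visitor would."""
    if isinstance(expr, (Eq, In)):
        return 1
    return 1 + max(depth(expr.left), depth(expr.right))


rows = [{"id": i, "day": "2025-03-01"} for i in range(50)]
print(depth(match_filter(rows, ["id"])))         # 1
print(depth(match_filter(rows, ["id", "day"])))  # 51
```

Under these assumptions the depth for a two-column key is the row count plus one, so 10,000 rows would give a tree 10,001 levels deep, well past CPython's default recursion limit of 1000 for any recursive traversal.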

Re: [I] Issue during Upsert [iceberg-python]

2025-03-10 Thread via GitHub
smaheshwar-pltr commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2712089897 Have a test that I think repros this in (draft) https://github.com/apache/iceberg-python/pull/1783. Also, (draft) https://github.com/apache/iceberg-python/pull/

Re: [I] Issue during Upsert [iceberg-python]

2025-03-10 Thread via GitHub
Fokko commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2711738238 Thanks everyone, for raising this. The number of columns that you upsert on can cause problems since each column will duplicate the complexity of the predicate. As a in

Re: [I] Issue during Upsert [iceberg-python]

2025-03-08 Thread via GitHub
heman026 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2702980670 ### Full stack trace of the issue: **Exception has occurred: RecursionError** > maximum recursion depth exceeded > AttributeError: '_thread._local' object has no

Re: [I] Issue during Upsert [iceberg-python]

2025-03-07 Thread via GitHub
suarez-agu commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2706773605 I also have the same issue trying to upsert 10k rows. Starting from an empty Table btw

Re: [I] Issue during Upsert [iceberg-python]

2025-03-05 Thread via GitHub
heman026 commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2700632648 > can you share more on what code you ran? specifically what is `table`, `pyarrow_table`, and `join_cols` ``` iceberg_table = 'test.table1' table = catalog.load_

Re: [I] Issue during Upsert [iceberg-python]

2025-03-04 Thread via GitHub
kevinjqliu commented on issue #1759: URL: https://github.com/apache/iceberg-python/issues/1759#issuecomment-2698005954 can you share more on what code you ran? specifically what is `table`, `pyarrow_table`, and `join_cols`

[I] Issue during Upsert [iceberg-python]

2025-03-04 Thread via GitHub
heman026 opened a new issue, #1759: URL: https://github.com/apache/iceberg-python/issues/1759 ### Question I am getting the following error when using table.upsert() to update data. > Exception has occurred: RecursionError > maximum recursion depth exceeded > AttributeErro