jayceslesar commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-3033347997
@Fokko @kevinjqliu do you think it's worth setting up a roadmap for what
should be candidates for rolling wheels from rust? Would really help focus
efforts on lacking part
koenvo commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-309213
Totally agree. Let's start exploring the iceberg-rust codebase
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub
jayceslesar commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-3033171165
> Honestly, I think it would be a better use of community resources to
invest more in the iceberg-rust/datafusion path so that the bulk of this logic
can be moved out
corleyma commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-3032961475
I think @Anton-Tarazi's original point -- creating a bunch of (Python
object) filter expressions for every row in a large dataframe is going to be
slow, and we do that befor
Fokko commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-3031132431
Hey @koenvo thanks for raising this discussion. Nothing is set in stone, so
there are always possibilities to optimize, and I agree, we started with rough
building blocks.
koenvo commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-3026872310
This aligns well with the discussion here:
https://github.com/apache/iceberg-python/issues/2138#issuecomment-2997190853
While there have been improvements to `upsert` -
Anton-Tarazi commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-3016154418
Being able to provide a "hint" seems like a decent workaround, but then we
have to rely on the user providing the correct filter, otherwise the upsert
won't work properl
jayceslesar commented on issue #2159:
URL:
https://github.com/apache/iceberg-python/issues/2159#issuecomment-3016142200
Haha I was just looking at this last week -- I wonder if it would make sense
for a user to be able to supply their own filter into the util? That was what
enabled me to s
Anton-Tarazi opened a new issue, #2159:
URL: https://github.com/apache/iceberg-python/issues/2159
### Feature Request / Improvement
Upserting large dataframes (tens of millions of rows) is unusably slow due
to creating a massive `BooleanExpress