mgmarino commented on PR #12129:
URL: https://github.com/apache/iceberg/pull/12129#issuecomment-2660113119
Ok, I actually found some things to help me out; documenting them here for
later (I can't look at this right now).
There's a test that explicitly removes data from the cache of the bloc
mgmarino commented on PR #12129:
URL: https://github.com/apache/iceberg/pull/12129#issuecomment-2660002476
Hi @Fokko, yes, it could be that the underlying issue is the same (i.e.
Spark moving the SerializedTable to disk and closing the IO that is still in
use).
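
For readers of this thread, here is a minimal sketch of the failure mode described above. It is plain Python and uses neither Spark nor Iceberg. Every name in it (`FakeIO`, `FakeSerializedTable`, `evict_to_disk`, `read_after_eviction`) is hypothetical and invented for illustration. The assumption, taken from the comment above, is that evicting the broadcast copy closes an IO that a running task still holds.

```python
class FakeIO:
    """Hypothetical stand-in for a table's file IO; not an Iceberg class."""

    def __init__(self):
        self.closed = False

    def read(self, path):
        # Any read after close() fails, as a closed client would.
        if self.closed:
            raise RuntimeError(f"IO already closed, cannot read {path}")
        return f"contents of {path}"

    def close(self):
        self.closed = True


class FakeSerializedTable:
    """Hypothetical stand-in for a broadcast table that owns its IO."""

    def __init__(self):
        self._io = FakeIO()

    def io(self):
        return self._io

    def close(self):
        # Assumed problematic step: closing the table also closes the IO.
        self._io.close()


def evict_to_disk(table):
    """Simulates the in-memory copy being dropped and closed on eviction."""
    table.close()


def read_after_eviction():
    table = FakeSerializedTable()
    io = table.io()                  # a running task keeps this reference
    first = io.read("data-file-1")   # succeeds while the IO is open
    evict_to_disk(table)             # memory pressure triggers eviction
    try:
        io.read("data-file-2")
        second = "ok"
    except RuntimeError as err:
        second = str(err)
    return first, second
```

In this sketch the first read succeeds and the second fails, because the task still holds the same IO object that the eviction path closed.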
I will see if I can f
Fokko commented on PR #12129:
URL: https://github.com/apache/iceberg/pull/12129#issuecomment-2659989552
Thanks for raising this @mgmarino. I think this is related to another issue
I fixed recently: https://github.com/apache/iceberg/pull/11858
Would it be possible to add a test to illu
mgmarino commented on PR #12129:
URL: https://github.com/apache/iceberg/pull/12129#issuecomment-2659943896
@nastra should I ask on the dev mailing list for feedback?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and u
mgmarino commented on PR #12129:
URL: https://github.com/apache/iceberg/pull/12129#issuecomment-2630199667
@aokolnychyi Any chance you'll be able to give some feedback on this? Thanks!
mgmarino commented on PR #12129:
URL: https://github.com/apache/iceberg/pull/12129#issuecomment-2621728020
I am happy to get input here as to whether or not this is the correct way to
solve this issue and am happy to adapt as necessary. Thanks!
This effectively reverts: #8924
mgmarino opened a new pull request, #12129:
URL: https://github.com/apache/iceberg/pull/12129
This is to fix: #12046
To summarize, the issue is that Spark can remove broadcast variables from
memory and persist them to disk when memory needs to be freed. In the case
that