gaborkaszab commented on PR #16207:
URL: https://github.com/apache/iceberg/pull/16207#issuecomment-4387564159
Thanks again for the explanation @grantatspothero !
Let's see if @amogh-jahagirdar also chimes in as said on the dev@ list. I'm
not saying that this enhancement is not needed but would be great to see wider
community consensus.
In the meantime, I think logically the fix is 2 parts:
1) Not to cache snapshotLog on the client side if we are in REFS mode
2) Lazily load snapshotLog if requested similarly to snapshots
I think this PR only takes care of the second part, but we still put the
entire snapshotLog into TableMetadata if sent by the server. Or in other words,
if the REST server returns a reduced snapshotLog in REFS mode, than your
problem is already solved without this PR, if it sends the full snapshotLog
then this PR won't solve the caching overhead issues for you.
Take a look at [this
change](https://github.com/gaborkaszab/iceberg/commit/8ce07d5d8d00fceeb7f952cd3fa2ce2738e74752)
I've put on top of yours: I changed the reference REST catalog that we use for
REST catalog tests to still return a full snapshotLog as before this PR, and I
kept all the other changes you had. The test I introduced to check the length
of the snapshotLog expects reduced length, but fails.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]