amogh-jahagirdar commented on PR #11525: URL: https://github.com/apache/iceberg/pull/11525#issuecomment-2472056218
> Just as a gut comment, if we just compressed them shouldn't we get almost all the benefits we are looking for? They are just a bunch of strings so the binary representation of all of them should be pretty compressible.

It's true the broadcast would be compressed via `spark.broadcast.compress`, but I think the concern is more that when we load the broadcast map variable on the executor side, we ultimately need to decompress the chunks. So the goal of relativization was to minimize how much memory the map's in-memory representation takes up after decompression. Let me know if that makes sense.
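To make the distinction concrete, here is a rough sketch (not the PR's actual code) of why compression and relativization address different costs. The table location, file names, and the `relativize` helper are all hypothetical, and `zlib` over a pickled list is only a stand-in for whatever codec and serializer Spark actually uses for broadcasts.

```python
import pickle
import zlib

# Hypothetical table location and file paths, for illustration only.
TABLE_LOCATION = "s3://bucket/warehouse/db/table"
paths = [f"{TABLE_LOCATION}/data/part-{i:05d}.parquet" for i in range(1000)]


def relativize(path, prefix):
    """Drop the shared prefix if present; otherwise keep the path unchanged."""
    return path[len(prefix):] if path.startswith(prefix) else path


relative_paths = [relativize(p, TABLE_LOCATION) for p in paths]

# Serialized size with compression (zlib as a stand-in codec): this is
# roughly what compression buys while the data is in transit.
compressed_absolute = len(zlib.compress(pickle.dumps(paths)))
compressed_relative = len(zlib.compress(pickle.dumps(relative_paths)))

# Character count after decompression: every string is materialized in
# full, so the shared prefix is paid for once per path unless it was
# stripped beforehand.
chars_absolute = sum(len(p) for p in paths)
chars_relative = sum(len(p) for p in relative_paths)

print("compressed bytes (absolute, relative):", compressed_absolute, compressed_relative)
print("decompressed chars (absolute, relative):", chars_absolute, chars_relative)
```

In this toy setup the decompressed character count drops by exactly the prefix length times the number of paths. That saving is the part compression alone cannot provide once the map is loaded on the executor.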