pvary commented on PR #6043: URL: https://github.com/apache/iceberg/pull/6043#issuecomment-1290016237
When we were developing updates for Hive tables, the first version of the ACID implementation stored only the updated data, which is very similar to the partial updates suggested here. With ACID V2 we moved away from this solution, because query planning became a serious issue: we had to make sure that every reader read all of the related update files. When all or most of the data is updated, these files can become large. The decision was to move to tombstones, which gave more predictable performance across the board.
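
To illustrate the read-path difference described above, here is a minimal Python sketch. It is not Hive or Iceberg code: the in-memory "files", the record ids, and the function names `read_with_partial_updates` and `read_with_tombstones` are all hypothetical, chosen only to show why a reader of partial updates has to visit every related update file, while a reader of tombstones only filters out deleted records.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, object]


def read_with_partial_updates(
    base: Dict[int, Row], update_files: List[Dict[int, Row]]
) -> Dict[int, Row]:
    """Partial-update style: each update file holds only the changed columns.

    The reader must open every update file, in commit order, and merge the
    changed columns into the base rows. Skipping a file gives wrong results.
    """
    result = {row_id: dict(row) for row_id, row in base.items()}
    for update_file in update_files:
        for row_id, changed_columns in update_file.items():
            result[row_id].update(changed_columns)
    return result


def read_with_tombstones(
    data_files: List[List[Tuple[int, Row]]], tombstones: Set[int]
) -> List[Row]:
    """Tombstone style: an update is a delete marker plus a complete new row.

    Every stored row is already complete, so the reader only drops the
    records whose (hypothetical) record id has been tombstoned.
    """
    return [
        row
        for data_file in data_files
        for record_id, row in data_file
        if record_id not in tombstones
    ]


# Partial updates: two update files touch row 1, and both must be applied.
base = {
    1: {"id": 1, "name": "a", "qty": 10},
    2: {"id": 2, "name": "b", "qty": 20},
}
update_files = [
    {1: {"qty": 11}},
    {1: {"name": "A"}, 2: {"qty": 0}},
]
merged = read_with_partial_updates(base, update_files)

# Tombstones: the old versions (records 100 and 101) are marked deleted,
# and the second data file carries the complete new rows.
data_files = [
    [(100, {"id": 1, "name": "a", "qty": 10}), (101, {"id": 2, "name": "b", "qty": 20})],
    [(200, {"id": 1, "name": "A", "qty": 11}), (201, {"id": 2, "name": "b", "qty": 0})],
]
live = read_with_tombstones(data_files, tombstones={100, 101})
```

In this toy model both readers arrive at the same final rows. The difference is in the work: the partial-update reader's cost grows with the number and size of the update files, whereas the tombstone reader does a membership check per record.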