JanKaul commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1913116304
Hi @szehon-ho, thanks for trying to move the process of reaching consensus along. To be honest, I don't know how the community normally reaches consensus on these kinds of topics, and I still have the feeling that we are lacking feedback from some key stakeholders. Until now I was holding off on the PRs to get more feedback, but it seems like creating the PRs is the right way to move the proposal along. In any case, it might be good to bring this up at a community sync. I would argue for 2 PRs, one for the view metadata and one for the table metadata.

#### Regarding the open questions

1. Question: I agree that we mostly have a consensus there.
2. Question: I get the impression from the Google doc that people would prefer Option 1; you also voted for 1.
3. Question: Because of its versatility, I would really argue for Option 3.

#### Regarding your 2cents

1. I totally agree, just using the storage table pointer is a cleaner solution.
2. The important information here is the set of snapshot-ids of the source tables (base tables) corresponding to the last refresh operation of the materialized view. The materialized view requires this information to determine whether the precomputed data is still fresh. At the end of a refresh operation, the materialized view stores the snapshot-ids of its source tables as its lineage information. Later on, it can check whether the stored snapshot-ids are still equal to the current snapshot-ids of the source tables. If they're not equal, it knows the data changed and its precomputed data needs to be updated.
3. As in the last point of https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?pli=1#heading=h.m5kli4l5q7ui, we agreed to leave the refresh strategy to the query engine. If available, an incremental refresh is always preferable to a full refresh.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
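The freshness check described in point 2 of the 2cents list can be sketched roughly as follows. This is a minimal Python illustration, not Iceberg's API: the function names `record_lineage` and `is_fresh`, the table identifiers, and the snapshot-id values are all hypothetical, and the lineage is modeled as a plain dict from table identifier to snapshot-id.

```python
from typing import Dict, Optional


def record_lineage(current_snapshots: Dict[str, int]) -> Dict[str, int]:
    """Hypothetical helper: at the end of a refresh, copy the source tables'
    current snapshot-ids so they can be stored as lineage information."""
    return dict(current_snapshots)


def is_fresh(lineage: Dict[str, int],
             current_snapshots: Dict[str, Optional[int]]) -> bool:
    """Hypothetical helper: the precomputed data counts as fresh only if every
    source table is still at the snapshot-id recorded at the last refresh."""
    return all(
        current_snapshots.get(table) == snapshot_id
        for table, snapshot_id in lineage.items()
    )


# Made-up source tables and snapshot-ids, for illustration only.
source_snapshots = {"db.orders": 101, "db.customers": 57}

# End of a refresh: remember which snapshots the result was computed from.
lineage = record_lineage(source_snapshots)
fresh_before_write = is_fresh(lineage, source_snapshots)

# A later commit to one source table produces a new snapshot-id.
source_snapshots["db.orders"] = 102
fresh_after_write = is_fresh(lineage, source_snapshots)

print(fresh_before_write, fresh_after_write)
```

What happens once the check fails (full or incremental refresh) is deliberately left out here, since point 3 leaves the refresh strategy to the query engine.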
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org