Re: [I] [Proposal] Iceberg Materialized View Spec [iceberg]

via GitHub Sat, 27 Jan 2024 03:01:00 -0800


JanKaul commented on issue #6420:
URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1913116304

Hi @szehon-ho, thanks for trying to move the process of reaching consensus
along. To be honest, I don't know how the community normally reaches consensus
on these kinds of topics. But I still have the feeling that we are lacking the
feedback from some key stakeholders. Until now I have been holding off on the
PRs in order to get more feedback first. But it seems like creating the PRs is
the right way to move the proposal along.

In any case it might be good to bring this up at a community sync.

I would argue for 2 PRs, one for the view metadata and one for the table
metadata.

#### Regarding the open questions

1. Question: I agree that we mostly have a consensus there
2. Question: I get the impression from the Google Doc that people would
prefer Option 1, and you also voted for Option 1
3. Question: Because of its versatility I would really argue for Option 3

#### Regarding your 2 cents

1. I totally agree, just using the storage table pointer is a cleaner
solution.
2. The important information here is the snapshot-ids of the source tables
(base tables) corresponding to the last refresh operation of the materialized
view. The materialized view requires this information to determine if the
precomputed data is still fresh. At the end of a refresh operation the
materialized view stores the snapshot-ids of its source tables as its lineage
information. Later on, it can check if the stored snapshot-ids are still equal
to the current snapshot-ids of the source tables. If they're not equal, it
knows the data changed and its precomputed data needs to be updated.
3. As stated in the last point of
https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?pli=1#heading=h.m5kli4l5q7ui,
we agreed to leave the refresh strategy to the query engine. If available, an
incremental refresh is always preferable to a full refresh.
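
To make the freshness check from point 2 a bit more concrete, here is a rough
sketch in Python. It is only an illustration and not part of the proposal: the
function names (`record_refresh`, `stale_sources`, `is_fresh`) and the plain
dict from table name to snapshot-id are assumptions made up for this example,
not the spec's metadata layout or any Iceberg API.

```python
from typing import Dict, List, Mapping, Optional


def record_refresh(current_snapshot_ids: Mapping[str, int]) -> Dict[str, int]:
    """Capture the source-table snapshot-ids at the end of a refresh.

    Hypothetical helper: the returned dict stands in for the lineage
    information stored with the materialized view.
    """
    return dict(current_snapshot_ids)


def stale_sources(
    lineage: Mapping[str, int],
    current_snapshot_ids: Mapping[str, Optional[int]],
) -> List[str]:
    """Return the source tables whose current snapshot-id differs from the lineage."""
    return sorted(
        table
        for table, snapshot_id in lineage.items()
        if current_snapshot_ids.get(table) != snapshot_id
    )


def is_fresh(
    lineage: Mapping[str, int],
    current_snapshot_ids: Mapping[str, Optional[int]],
) -> bool:
    """The precomputed data is fresh only if no source table has moved on."""
    return not stale_sources(lineage, current_snapshot_ids)


# Example with made-up table names and snapshot-ids.
lineage = record_refresh({"orders": 10, "customers": 7})
print(is_fresh(lineage, {"orders": 10, "customers": 7}))       # True
print(stale_sources(lineage, {"orders": 11, "customers": 7}))  # ['orders']
```

What happens once a source is found to be stale (incremental or full refresh)
is deliberately left out, since that is up to the query engine as per point 3.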

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org
