Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Runtian Liu
The previous table compared the complexity of full repair and MV repair when reconciling one dataset with another. In production, we typically use a replication factor of 3 in one datacenter. This means full repair involves 3n rows, while MV repair involves comparing 6n rows (base + MV). Below is a
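The row counts quoted in this message can be reproduced with a small sketch. It assumes a single view holding as many rows as the base table (n per replica) and one datacenter; the function name and parameters are hypothetical and not part of the CEP.

```python
def rows_involved(n_rows: int, replication_factor: int = 3, include_view: bool = False) -> int:
    """Rows touched when reconciling every replica of a table.

    Assumption: the view, if included, has the same row count as the base table.
    """
    tables = 2 if include_view else 1  # base only, or base + MV
    return n_rows * replication_factor * tables


n = 1_000_000
full_repair = rows_involved(n)                    # 3n rows at RF=3
mv_repair = rows_involved(n, include_view=True)   # 6n rows (base + MV)
print(full_repair, mv_repair)
```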

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Jon Haddad
> They are not two unordered sets, but rather two sets ordered by different keys. I think this is a distinction without a difference. Merkle tree repair works because the ordering of the data is mostly the same across nodes. On Thu, May 15, 2025 at 9:27 AM Runtian Liu wrote: > > what we're try
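The point about ordering can be illustrated with a toy sketch. It is not Cassandra's repair code: it hashes fixed key ranges (the leaves of a Merkle tree) and compares them, and all names, the key space of 64, and the row layout are assumptions made for the example.

```python
import hashlib


def leaf_hashes(rows, key, num_leaves=8, key_space=64):
    """Bucket rows by key range and hash each bucket (toy Merkle leaves)."""
    width = key_space // num_leaves
    buckets = [[] for _ in range(num_leaves)]
    for row in rows:
        buckets[key(row) // width].append(row)
    hashes = []
    for bucket in buckets:
        h = hashlib.sha256()
        for row in sorted(bucket):
            h.update(repr(row).encode())
        hashes.append(h.hexdigest())
    return hashes


def mismatched_leaves(a, b):
    """Indices of leaves whose hashes differ; only these ranges need streaming."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Hypothetical rows: (id, value), both in range(64).
replica_a = [(i, (i * 37) % 64) for i in range(64)]
replica_b = [r for r in replica_a if r[0] != 10]  # one row missing

by_id = lambda r: r[0]
by_value = lambda r: r[1]

# Same ordering key on both sides: a single leaf differs.
same_key = mismatched_leaves(leaf_hashes(replica_a, by_id),
                             leaf_hashes(replica_b, by_id))

# Identical rows, but one side is bucketed by a different key (as a view would be):
# every leaf differs even though no data is missing.
different_key = mismatched_leaves(leaf_hashes(replica_a, by_id),
                                  leaf_hashes(replica_a, by_value))
print(same_key, len(different_key))
```

In this toy setup the same-key comparison narrows the difference to one range, while the differently keyed comparison flags every range, which is why a naive leaf-by-leaf comparison does not carry over to base table versus view.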

Re: [VOTE] CEP-46: Finish Transient Replication/Witnesses

2025-05-15 Thread Ariel Weisberg
Hi, With 15 binding +1s and no -1s the vote passes. Thanks! Ariel On Tue, May 13, 2025, at 3:48 AM, Mick Semb Wever wrote: >> The vote will be open for 72 hours. A vote passes if there are at least 3 binding +1s and no binding vetoes. > +1

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Runtian Liu
> what we're trying to achieve here is comparing two massive unordered sets. They are not two unordered sets, but rather two sets ordered by different keys. This means that when building Merkle trees for the base table and the materialized view (MV), we need to use different strategies to ensure t

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Josh McKenzie
> I think in order to address this, the view should be propagated to the base > replicas *after* it's accepted by all or a majority of base replicas. This is > where I think mutation tracking could probably help. Yeah, the idea of "don't reflect in the MV until you hit the CL the user requested

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Paulo Motta
> I think requiring a rebuild is a deal breaker for most teams. In most instances it would also mean expanding the cluster to handle the additional disk requirements. It turns an inconsistency problem into a major operational headache that can take weeks to resolve. Agreed. The rebuild wou

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Jon Haddad
I think requiring a rebuild is a deal breaker for most teams. In most instances it would also mean expanding the cluster to handle the additional disk requirements. It turns an inconsistency problem into a major operational headache that can take weeks to resolve. On Thu, May 15, 2025 a

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Paulo Motta
> There are bi-directional entropy issues with MVs: either orphaned view data or missing view data. That's why you need a "bi-directional ETL" to make sure the two agree with each other. While normal repair would resolve the "missing data in MV" case, it wouldn't resolve the "data in MV that'

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-15 Thread Josh McKenzie
There are bi-directional entropy issues with MVs: either orphaned view data or missing view data. That's why you need a "bi-directional ETL" to make sure the two agree with each other. While normal repair would resolve the "missing data in MV" case, it wouldn't resolve the "data in MV that
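The two directions of entropy described here, missing view data and orphaned view data, can be sketched as a pair of set differences. This is only an illustration under assumptions: the base rows are hypothetical (id, email) pairs, the view is assumed to re-key them by email, and none of the names come from Cassandra.

```python
def expected_view(base_rows):
    """Derive the view rows implied by the base table (hypothetical view keyed by email)."""
    return {(email, id_) for id_, email in base_rows}


def view_entropy(base_rows, view_rows):
    """Compare the view against the base table in both directions."""
    expected = expected_view(base_rows)
    actual = set(view_rows)
    return {
        "missing": expected - actual,    # in base, absent from the view
        "orphaned": actual - expected,   # in the view, no matching base row
    }


base = [(1, "a@example.com"), (2, "b@example.com")]
view = [("a@example.com", 1), ("c@example.com", 3)]

result = view_entropy(base, view)
print(result["missing"])
print(result["orphaned"])
```

A one-directional check that only looks for base rows absent from the view would find the first set and never see the second.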