jerryshao commented on PR #10696: URL: https://github.com/apache/gravitino/pull/10696#issuecomment-4251602985
> @jerryshao > > I have updated and replied to the comments. Still, I have the following concerns: > > * We use version check to fulfill strong consistency for user/role, but still use eventual consistency for id-name mapping, ownership relationship. It does not look so elegant, as it will always introduce 2 extra DB queries compared to the previous one, and finally gain an eventual consistency. Theoretically speaking, the performance will definitely degrade. Is that truly acceptable? I asked myself several times, "Why not also make the user/role eventually consistent, too ?" In this way, tiny changes to the table definition and a good performance will be realized (all use db-version + pull internally). It would be preferable if you could give me some advice on this point. That is why it hinders me to add id-mapping cache. I think user/role eventual-consistency is acceptable to me. The reason why we chose this solution is affected by Polaris implementation. This is a simple solution to achieve strong consistency. If compared to eventual-consistency solution, the performance doesn't decrease a lot. I think this is simpler than other broadcasting solutions. For entity-cache, we have two scenarios, one is external datasource, another is managed entity by Gravitino alone. For the entities that both combine the entity from datasource and gravitino. We can use try the underlying source as SSOT to evict the cache. For the entities that are managed by Gravitino only. The polaris solution LGTM. > * I have introduced a new table named `entity_change_log` to gain fine-grained recording of each ENTITY change. I'm not 100% sure whether it's the best option, as it will add a new table again. It's not very friendly for future refactoring. > Other options include adding `updated_at` for each table, employing `update-time` in the audit information, and using `current_version`/`last_version`, which are also not very desirable. If the owner and name-id mapping cache is still unclear, we can move to the next step task. Currently, we don't have a clear answer whether 2 additional DB queries is expensive or not. We should implement the phase one and do more tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
