yuqi1129 commented on PR #10696:
URL: https://github.com/apache/gravitino/pull/10696#issuecomment-4251722751

   > > @jerryshao
   > > I have updated and replied to the comments. Still, I have the following 
concerns:
   > > 
   > > * We use version check to fulfill strong consistency for user/role, but 
still use eventual consistency for id-name mapping, ownership relationship. It 
does not look so elegant, as it will always introduce 2 extra DB queries 
compared to the previous one, and finally gain an eventual consistency. 
Theoretically speaking,  the performance will definitely degrade. Is that truly 
acceptable? I asked myself several times, "Why not also make the user/role 
eventually consistent, too ?" In this way, tiny changes to the table definition 
and a good performance will be realized (all use db-version + pull internally). 
 It would be preferable if you could give me some advice on this point. That is 
why it hinders me to add id-mapping cache.
   > 
   > I think user/role eventual-consistency is acceptable to me. The reason why 
we chose this solution is affected by Polaris implementation. This is a simple 
solution to achieve strong consistency. If compared to eventual-consistency 
solution, the performance doesn't decrease a lot. I think this is simpler than 
other broadcasting solutions.
   > 
   > For entity-cache, we have two scenarios, one is external datasource, 
another is managed entity by Gravitino alone.
   > 
   > For the entities that both combine the entity from datasource and 
gravitino. We can use try the underlying source as SSOT to evict the cache.
   > 
   > For the entities that are managed by Gravitino only. We can leverage 
Polaris solution.
   > 
   > > * I have introduced a new table named `entity_change_log` to gain 
fine-grained recording of each ENTITY change. I'm not 100% sure whether it's 
the best option, as it will add a new table again. It's not very friendly for 
future refactoring.
   > >   Other options include adding `updated_at` for each table, employing 
`update-time` in the audit information, and using 
`current_version`/`last_version`, which are also not very desirable.
   > 
   > If the owner and name-id mapping cache is still unclear, we can move to 
the next step task. Currently, we don't have a clear answer whether 2 
additional DB queries is expensive or not. We should implement the phase one 
and do more tests.
   
   @yuqi1129 
   Thanks for your insightful point. I will be coding according to the current 
document and do enough testing before merging. 
   If an extra 2 DB query is not acceptable for performance testing, I will 
make the user/role/group also eventually consistent to reduce db queries.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to