pingtimeout commented on PR #1339: URL: https://github.com/apache/polaris/pull/1339#issuecomment-2815122531
Thanks for the help @ben-manes. I went back to the drawing board and have a better understanding of the kind of issues that can happen. I have reimplemented the `EntityCache` using a plain `Cache` instance as well as a Guava `BiMap` to hold the bidirectional mapping between entities id and name. I also used a single `ReentrantReadWriteLock` to guard all modifications to the cache. This implementation : * passes all the unit tests as well as the concurrency tests * only issues the strict minimum number of calls to the database * does not expose any internal desynchronization between the cache and the name-to-id mapping I am going to push the code so that a CI run fully validates it. In the meantime, I started working on a better version that would use a `Striped<ReadWriteLock>` as you did with the `IndexedCache`. But the main issue is that with such a lock, we get a false sense of security. Because entities can be "locked" twice: once using the id and once using the name as the key. This issue happens in the `EntityCache` as well as in my current code. A possible workaround that I can think of is to: 1. Acquire a striped lock using the same value for all entities retrieved by name (say `0L`) 2. Check if an entity with that name is in cache 3. If not, load it from the database and delegate the "add to cache" logic to the same logic that does this by the entity ID 4. This will, in turn, grab another striped lock based on the entity ID and do the rest as usual I need to double check (and test) that no deadlock can happen with this double locking sequence. I will post updates here... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
