[ https://issues.apache.org/jira/browse/GEODE-8553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204416#comment-17204416 ]
ASF GitHub Bot commented on GEODE-8553:
---------------------------------------

gaussianrecurrence opened a new pull request #660:
URL: https://github.com/apache/geode-native/pull/660

   - ThinClientLocatorHelper uses a mutex called m_locatorLock, which is held for the entire scope of every public member function of the class.
   - The problem is that all of those functions perform network communication, so in the event of networking issues all the application threads end up inter-locked.
   - Also, the only purpose of this mutex is to prevent the locators list from being read while it is being updated.
   - Given all of the above, the class has been refactored so that every time the locators list has to be accessed, a copy of it is created, and the mutex is owned only while that copy is made.
   - For the update case, a new locators list is composed, and it is only at the end of the update function that the mutex is locked and the locators list is swapped with the new one.
   - This change ensures that the time the mutex is held is minimal while the functionality remains intact.

> Reduce ThinClientLocatorHelper lock time
> ----------------------------------------
>
>                 Key: GEODE-8553
>                 URL: https://issues.apache.org/jira/browse/GEODE-8553
>             Project: Geode
>          Issue Type: Improvement
>          Components: native client
>    Affects Versions: 1.12.0, 1.13.0
>            Reporter: Mario Salazar de Torres
>            Priority: Major
>
> During my troubleshooting, I've seen that locking m_locatorLock for the whole
> scope of the class member functions might cause some inter-locks.
> The problem here, and in many other places of the NC, is that networking
> operations are performed while a mutex is locked. Therefore, if *thread A*
> takes longer than expected to perform its network operation, it might block
> another thread that does not require any resource held by *thread A*. Hence,
> the inter-lock.
> This improvement is the first in a series on lock scope reduction in NC code
> that deals with networking.
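
The copy-under-lock and swap-at-end approach described in the pull request can be illustrated with a minimal sketch. This is not the actual geode-native code: the class name LocatorHelperSketch, the member locators_, and both method names are hypothetical, and locators are reduced to plain strings. Only the mutex name m_locatorLock comes from the description above.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ThinClientLocatorHelper; only the name
// m_locatorLock is taken from the pull request description.
class LocatorHelperSketch {
 public:
  explicit LocatorHelperSketch(std::vector<std::string> locators)
      : locators_(std::move(locators)) {}

  // The mutex is owned only while the copy is made. Callers then do
  // their (possibly slow) network work on the returned copy.
  std::vector<std::string> getLocators() const {
    std::lock_guard<std::mutex> guard(m_locatorLock);
    return locators_;
  }

  // The new list is composed without holding the mutex; the mutex is
  // locked only at the end, for the swap.
  void updateLocators(const std::vector<std::string>& discovered) {
    std::vector<std::string> fresh;
    fresh.reserve(discovered.size());
    for (const auto& host : discovered) {
      // Placeholder for the network-bound work of a real update.
      if (!host.empty()) fresh.push_back(host);
    }
    std::lock_guard<std::mutex> guard(m_locatorLock);
    locators_.swap(fresh);
  }

 private:
  mutable std::mutex m_locatorLock;
  std::vector<std::string> locators_;
};
```

With this shape, a thread stuck on a slow locator works on its own snapshot, so other threads wait at most for the duration of a vector copy or swap. The trade-off is that a snapshot may be stale by the time it is used.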