[ https://issues.apache.org/jira/browse/GEODE-8553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206174#comment-17206174 ]
ASF GitHub Bot commented on GEODE-8553:
---------------------------------------

gaussianrecurrence edited a comment on pull request #660:
URL: https://github.com/apache/geode-native/pull/660#issuecomment-702736992

Hi again,

Regarding the test proposal, I have an idea I would like to share with you to get your opinion on the matter.

The purpose of this PR is to ensure that, while one thread is connecting to a locator or performing a request, other threads accessing the pool locator **don't get blocked**. Therefore, what I thought is to:

- Introduce a new abstract class **ConnectorFactory**, which basically creates a new **Connector**.
- Create a **ConnectorFactory** implementation called **TcpConnFactory**, which basically creates **TcpConn**/**TcpSslConn**. This would also help reduce duplicated code, as connection creation is currently duplicated in both **TcrConnection** and **ThinClientLocatorHelper**.
- PoolFactory would be responsible for setting the ConnectorFactory instance on the created pool, so a method to set the ConnectorFactory instance would be added there.
- ConnectorFactory should then be accessible through the Pool.
- Create a Google Mock for ConnectorFactory.
- Create a Google Mock for Connector.

**P.S.** While looking for a solution for the test proposal, I think I noticed something: "Is it possible that many of the factory classes existing in the NC should instead be builders?"

OK, so with all the above scaffolding in place, the idea would be to create a ThinClientLocatorHelper test suite in which multiple threads perform requests. With the ability to control what the connections return and how much time a connection operation takes, you can check that while one thread is waiting for an operation to complete, the other thread can return immediately.
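The scaffolding and the test idea described above can be sketched in plain C++ as follows. This is only a minimal illustration under stated assumptions, not the geode-native implementation: the `Connector` and `ConnectorFactory` interfaces are reduced to one method each, `FakeConnector`, `FakeConnectorFactory` and `LocatorHelperSketch` are hypothetical names, a hand-written fake stands in for the Google Mock, and the locator address is just an example value. The point it shows is that the helper copies its locator list under the mutex and performs the (simulated) network operation outside of it.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, heavily reduced interfaces (assumption: the real ones are richer).
class Connector {
 public:
  virtual ~Connector() = default;
  virtual std::string request(const std::string& payload) = 0;
};

class ConnectorFactory {
 public:
  virtual ~ConnectorFactory() = default;
  virtual std::unique_ptr<Connector> create(const std::string& endpoint) = 0;
};

// Hand-written fake standing in for a Google Mock: only "update" requests are slow.
class FakeConnector : public Connector {
 public:
  explicit FakeConnector(std::chrono::milliseconds updateDelay)
      : updateDelay_(updateDelay) {}
  std::string request(const std::string& payload) override {
    if (payload == "update") std::this_thread::sleep_for(updateDelay_);
    return "reply:" + payload;
  }

 private:
  std::chrono::milliseconds updateDelay_;
};

class FakeConnectorFactory : public ConnectorFactory {
 public:
  explicit FakeConnectorFactory(std::chrono::milliseconds updateDelay)
      : updateDelay_(updateDelay) {}
  std::unique_ptr<Connector> create(const std::string&) override {
    return std::unique_ptr<Connector>(new FakeConnector(updateDelay_));
  }

 private:
  std::chrono::milliseconds updateDelay_;
};

// Hypothetical stand-in for ThinClientLocatorHelper with a reduced lock scope.
class LocatorHelperSketch {
 public:
  LocatorHelperSketch(std::shared_ptr<ConnectorFactory> factory,
                      std::vector<std::string> locators)
      : factory_(std::move(factory)), locators_(std::move(locators)) {}

  std::string send(const std::string& payload) {
    std::vector<std::string> locators;
    {
      // Hold the mutex only long enough to copy the shared state.
      std::lock_guard<std::mutex> guard(mutex_);
      locators = locators_;
    }
    // The simulated network operation runs without holding the mutex.
    auto connector = factory_->create(locators.front());
    return connector->request(payload);
  }

 private:
  std::shared_ptr<ConnectorFactory> factory_;
  std::mutex mutex_;
  std::vector<std::string> locators_;
};

// Returns true if a fast request completes while a slow one is still in flight.
bool fastRequestIsNotBlockedBySlowOne() {
  auto factory =
      std::make_shared<FakeConnectorFactory>(std::chrono::milliseconds(1000));
  LocatorHelperSketch helper(factory,
                             std::vector<std::string>{"locator-1:10334"});

  auto slow = std::async(std::launch::async,
                         [&helper] { return helper.send("update"); });
  // Give the slow request time to start before issuing the fast one.
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  auto fast = std::async(std::launch::async,
                         [&helper] { return helper.send("getAllServers"); });

  const bool ready = fast.wait_for(std::chrono::milliseconds(500)) ==
                     std::future_status::ready;
  slow.get();  // Let the slow request finish before the helper is destroyed.
  return ready;
}
```

If `send` held the mutex for its whole body instead, the fast request would have to wait for the slow one and the check would be expected to fail, which is the behaviour such a test would be meant to catch.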
A pseudo-code example would be the following:

**testGetAllServerWhileUpdating**

    createCache();
    createPoolFactoryWithCustomConnectorFactory();
    updateT = createUpdateThread(connector_sleep_time=10s);
    getAllServersT = createGetAllServersThread(connector_sleep_time=0s);
    ASSERT_EQ(getAllServersT.wait_for(1s), std::future_status::ready);

---

So what do you think about it? What I don't know is where I should put the test; for that I might need a hand from your side :)

BR
/Mario

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Reduce ThinClientLocatorHelper lock time
> ----------------------------------------
>
> Key: GEODE-8553
> URL: https://issues.apache.org/jira/browse/GEODE-8553
> Project: Geode
> Issue Type: Improvement
> Components: native client
> Affects Versions: 1.12.0, 1.13.0
> Reporter: Mario Salazar de Torres
> Assignee: Mario Salazar de Torres
> Priority: Major
> Labels: pull-request-available
>
> While troubleshooting, I've seen that locking m_locatorLock for the whole
> scope of the class member functions might cause some inter-locks.
> The problem, here and in many other places in the NC, is that networking
> operations are performed while a mutex is locked. Therefore, if *thread A*
> takes longer than expected to perform its network operation, it might block
> another thread which does not require any resource held by *thread A*.
> Hence, the inter-lock.
> This improvement is the first in a series on reducing lock scope in the
> networking code of the NC.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)