Alberto, I can only speak to the WAN question in your email. Setting conserve-sockets=true was (or is) a limitation on serial WAN, but I just ran a few tests, and they are not deadlocking. It's been a while since I've tried serial WAN with conserve-sockets=true, but I'm pretty sure a test with several servers in each site and a multi-threaded client doing puts would cause the deadlock. That is not happening in my tests. We would need far more than a few simple tests to show that it doesn't deadlock in other scenarios, though.
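For anyone who wants to try something similar, the client side of such a test could look roughly like the sketch below. It is not the test I ran. PutLoadSketch and runPuts are hypothetical names, and a ConcurrentHashMap stands in for the client region so the code runs on its own; in a real run you would pass the put method of a client region whose server-side region is attached to a serial gateway sender. The timeout is only a crude hang detector: a false result suggests a possible deadlock, while a true result proves nothing about other scenarios.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of Geode.
class PutLoadSketch {

  // Starts `threads` workers that each perform `putsPerThread` puts through `put`.
  // Returns true if all workers finished within the timeout, false otherwise.
  static boolean runPuts(BiConsumer<String, String> put, int threads, int putsPerThread,
      long timeoutSeconds) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      final int id = t;
      pool.execute(() -> {
        for (int i = 0; i < putsPerThread; i++) {
          // Keys are unique per thread and iteration.
          put.accept("key-" + id + "-" + i, "value-" + i);
        }
      });
    }
    pool.shutdown(); // accept no new work, let queued workers finish
    try {
      boolean finished = pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
      if (!finished) {
        pool.shutdownNow(); // workers appear stuck; interrupt them
      }
      return finished;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      pool.shutdownNow();
      return false;
    }
  }

  public static void main(String[] args) {
    // Assumption: the map is a stand-in for a client region's put operation.
    Map<String, String> standIn = new ConcurrentHashMap<>();
    boolean finished = runPuts(standIn::put, 8, 1000, 30);
    System.out.println("finished=" + finished + " entries=" + standIn.size());
  }
}
```

The put operation is passed in as a function so the same driver can be pointed at different regions and run once with each conserve-sockets value.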
Barry

________________________________
From: Alberto Gomez <alberto.go...@est.tech>
Sent: Friday, April 8, 2022 4:17 AM
To: dev@geode.apache.org <dev@geode.apache.org>; u...@geode.apache.org <u...@geode.apache.org>
Subject: On conserve-sockets=true with WAN and/or transactions - Follow-up on April's Geode Community Meeting

Hi,

Following up on the discussion we had yesterday in the Apache Geode Community meeting around the "Reflections on conserve-sockets setting in Apache Geode" topic, I'd like to post here some questions that could not be fully answered during the meeting:

The Geode documentation states the following about conserve-sockets and WAN deployments in [1]:
"WAN deployments increase the messaging demands on a Geode system. To avoid hangs related to WAN messaging, always set `conserve-sockets=false` for Geode members that participate in a WAN deployment."

It also states the following about conserve-sockets and transactions in [2]:
"When you have transactions operating on EMPTY, NORMAL or PARTITION regions, make sure that conserve-sockets is set to false to avoid distributed deadlocks."

Searching the Geode tests, the only test case related to deadlocks with conserve-sockets=true that I have found is:
https://github.com/apache/geode/blob/41eb49989f25607acfcbf9ac5afe3d4c0721bb35/geode-wan/src/distributedTest/java/org/apache/geode/internal/cache/wan/serial/SerialGatewaySenderDistributedDeadlockDUnitTest.java#L176

According to the comments in the test, it always causes a distributed deadlock, and it is commented out. Nevertheless, the test case is actually NOT commented out and, in fact, if you execute it, you see it passing without any failure/deadlock.

And here are the questions:

Could it be that deadlocks with conserve-sockets=true and WAN and/or transactions over partitioned regions were a legacy issue that has already been fixed?

Otherwise, could someone please provide some more information about why these deadlocks could happen? It would be great if there were test cases that showcase this possibility.

It looks like a big limitation of Geode that you are forced to set conserve-sockets to false (with the implications this has on resource usage) when you are using WAN replication and/or transactions on partitioned regions.

Could it be that there are other elements (for example, also using CacheListeners, as Anthony Baker pointed out) that would increase the risk of hitting a distributed deadlock?

Thanks in advance,

Alberto

[1]: https://geode.apache.org/docs/guide/114/managing/monitor_tune/sockets_and_gateways.html
[2]: https://geode.apache.org/docs/guide/114/managing/monitor_tune/performance_controls_controlling_socket_use.html