I would vote for fixing the tests to use Gradle's normal forking. If we are going to invest time and effort, let's invest in an option that can reduce our dependencies.

________________________________
From: Jacob Barrett <jabarr...@vmware.com>
Sent: Tuesday, June 30, 2020 11:30 AM
To: dev@geode.apache.org <dev@geode.apache.org>
Subject: Us vs Docker vs Gradle vs JUnit

All,

We are in a bit of a pickle. As you may recall, a few years back, in an effort to both stabilize and parallelize the integration, distributed, and other integration/system-like tests, we started using Docker. Many of the tests reused the same ports for services, which caused them to fail or interact with each other when run in parallel. By using Docker to isolate each test we put a bandage on that issue. The plugin overrides Gradle’s default forked runner by starting the runners in Docker containers and marshaling the execution parameters to those Dockerized runners.

The Docker test plugin is effectively unmaintained. The author seems content to keep it compatible with Gradle 4. We forked it to work with Gradle 5 and to fix various other issues we have hit over the years. We have shared patches in the past with little luck in having them merged, and it is still only compatible with Gradle 4.8 at best. I spent some time trying to port it to Gradle 6, but it’s going to be a larger undertaking given that Gradle 6 is fully Java modules compatible. They added new members throughout to handle modules in addition to class paths.

Long story short, because our tests can’t be parallelized without a container system, we are stuck. We can’t go to JUnit 5 without updating the Docker plugin (potentially minor changes). We can’t go to Gradle 6 without updating the Docker plugin (potentially huge changes). Being stuck is not a good place.

I see two paths out of this:

1) We buckle down and fix the tests so they can run in parallel via the normal forking mechanism of Gradle. I know some effort has been expended on this by using our new rules for starting servers. We would need to go further.

2) Fully invest in the Docker plugin. We would need to fork this off as a fully maintained sub-project of Geode. We would need to add to it support for both Gradle 6 and JUnit 5.

My money is on fixing the tests.

It is clear, at least from my exhaustive searching, that nobody in the Gradle and JUnit communities is isolating their tests with containers. They are creating containers to host services for system-level testing; see the Testcontainers project. The tests themselves run in the local kernel space (not in a container).

We made this push in the C++ and .NET tests, a much smaller set of tests, and it works great. The framework takes care to create clusters that do not interact with each other on the same host. Some things in Geode make this harder than others, like the http service not supporting ephemeral port selection and gfsh not providing machine-readable output about ephemeral port selections. We use port knocking to prevent the OS from assigning the port ephemerally to another process. The framework knocks on (opens and then closes) all the ports it needs for the server/locator services and starts them explicitly on those ports. Because of port recycling rules in the OS, another ephemeral port request won’t get those ports for some time after they are closed. It's not perfect, but it works. Fixing Geode to support ephemeral port selection and a better reporting mechanism for those port choices would be ideal. Also, we only start the services necessary for the test; for example, we don’t start the http ports if they aren’t going to be used.

I would love some feedback and thoughts on this issue. Does anyone else see a different path forward?

-Jake
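
P.S. For anyone unfamiliar with the port knocking idea described above, here is a minimal sketch of how it might look in Java. This is an illustration only, not the code from the C++/.NET test framework: the class name `PortKnocker` and the method `knock` are hypothetical. It assumes the OS will not immediately hand recently closed ephemeral ports to another requester, which, as noted above, is not a guarantee.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
public class PortKnocker {

  /**
   * Opens `count` OS-assigned (ephemeral) ports, records their numbers,
   * then closes them all. All sockets are held open until every port has
   * been recorded so that the returned ports are distinct.
   */
  public static List<Integer> knock(int count) throws IOException {
    List<ServerSocket> sockets = new ArrayList<>();
    List<Integer> ports = new ArrayList<>();
    try {
      for (int i = 0; i < count; i++) {
        // Port 0 asks the OS to pick a free ephemeral port.
        ServerSocket socket = new ServerSocket(0);
        sockets.add(socket);
        ports.add(socket.getLocalPort());
      }
    } finally {
      // Release the ports so the services under test can bind them explicitly.
      for (ServerSocket socket : sockets) {
        socket.close();
      }
    }
    return ports;
  }

  public static void main(String[] args) throws IOException {
    // Example: reserve one port each for a locator, a server, and an HTTP service.
    List<Integer> ports = knock(3);
    System.out.println("Knocked ports: " + ports);
  }
}
```

A test would call `knock` once per cluster and then pass the returned ports explicitly when starting the locator and servers. There is still a small window between closing a port and the service binding it, so this reduces collisions rather than eliminating them.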