It feels like we should first choose the right resources/tools, ones suited to 
the task at hand that help achieve the expected result (testing that is easier 
to develop, run, monitor and report), and then invest in them once, even if 
that means adding new tools/subroutines to the product.

E.g.:
Best suited for the above requirement:
Runtime environment - Containers (?)
Testing framework - JUnit
Build tools - Gradle
Reporting/logging - (?)
Managing/Monitoring - (?)

-Anil.


On 6/30/20, 1:21 PM, "Donal Evans" <doev...@vmware.com> wrote:

    +1 for fixing the tests. It'll be a lot of work, but it'll only be a lot of 
work once, as opposed to taking on maintenance of our own custom Docker plugin, 
which will be an ongoing effort and not at all immune from getting broken again 
at some point in the future.
    ________________________________
    From: Jinmei Liao <jil...@vmware.com>
    Sent: Tuesday, June 30, 2020 12:28 PM
    To: dev@geode.apache.org <dev@geode.apache.org>
    Subject: Re: Us vs Docker vs Gradle vs JUnit

    I would vote for fixing the tests to use Gradle's normal forking. If we are 
going to invest time and effort, let's invest in an option that can reduce our 
dependencies.
    ________________________________
    From: Jacob Barrett <jabarr...@vmware.com>
    Sent: Tuesday, June 30, 2020 11:30 AM
    To: dev@geode.apache.org <dev@geode.apache.org>
    Subject: Us vs Docker vs Gradle vs JUnit

    All,

    We are in a bit of a pickle. As you may recall, a few years back, in an 
effort to both stabilize and parallelize the integration, distributed and other 
integration/system-like tests, we started using Docker. Many of the tests 
reused the same ports for services, which caused them to fail or interact with 
each other when run in parallel. By using Docker to isolate each test we put a 
bandage on that issue. The Docker test plugin overrides Gradle’s default forked 
runner by starting the runners in Docker containers and marshaling the 
execution parameters to those Dockerized runners.

    The Docker test plugin is effectively unmaintained. The author seems 
content to keep it compatible with Gradle 4. We forked it to work with 
Gradle 5 and to fix various other issues we have hit over the years. We have 
shared patches in the past with little luck in having them merged, and it is 
still only compatible with Gradle 4.8 at best. I spent some time trying to port 
it to Gradle 6, but it's going to be a larger undertaking given that Gradle 6 is 
fully compatible with Java modules. They added new members throughout to handle 
modules in addition to class paths.

    Long story short, because our tests can’t be parallelized without a 
container system, we are stuck. We can’t go to JUnit 5 without updating the 
Docker plugin (potentially minor changes). We can’t go to Gradle 6 without 
updating the Docker plugin (potentially huge changes). Being stuck is not a 
good place to be. I see two paths out of this:

    1) We buckle down and fix the tests so they can run in parallel via the 
normal forking mechanism of Gradle. I know some effort has already been 
expended on this by using our new rules for starting servers. We would need to 
go further.

    2) Fully invest in the Docker plugin. We would need to fork this off as a 
fully maintained sub-project of Geode. We would need to add support to it for 
both Gradle 6 and JUnit 5.

    My money is on fixing the tests. It is clear, at least from my exhaustive 
searching, that nobody in the Gradle and JUnit communities is isolating their 
tests with containers. They are creating containers to host services for 
system-level testing; see the Testcontainers project. The tests themselves run 
in the local kernel space (not in a container).

    We made this push in the C++ and .NET tests, a much smaller set of tests, 
and it works great. The framework takes care to create clusters that do not 
interact with each other on the same host. Some things in Geode make this 
harder than others, like the HTTP service not supporting ephemeral port 
selection, and gfsh not providing machine-readable output about ephemeral port 
selections. We use port knocking to prevent the OS from assigning the port 
ephemerally to another process. The framework knocks on (opens and then closes) 
all the ports it needs for the server/locator services and starts them 
explicitly on those ports. Because of port recycling rules in the OS, another 
ephemeral port request won’t get those ports for some time after they are 
closed. It's not perfect but it works. Fixing Geode to support ephemeral port 
selection and to provide a better reporting mechanism for those port choices 
would be ideal. Also, we only start the services necessary for the test; for 
example, we don’t open the HTTP ports if they aren’t going to be used.
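
    To make the port knocking idea concrete, here is a minimal sketch in Java. 
It is not the actual C++/.NET framework code; the class name PortKnocker and 
the method knock are hypothetical, made up for illustration. It asks the OS for 
ephemeral ports by binding to port 0, holds all of them open at the same time 
so they are distinct, records the numbers, and then closes them so the 
server/locator can be started explicitly on those ports. As noted above, this 
relies on the OS not handing recently closed ports out again right away, so it 
is best-effort rather than a guarantee.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class PortKnocker {

    // Opens `count` ephemeral ports, records them, then closes them all.
    static List<Integer> knock(int count) throws IOException {
        List<ServerSocket> sockets = new ArrayList<>();
        List<Integer> ports = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                // Port 0 asks the OS to pick a free ephemeral port.
                ServerSocket socket = new ServerSocket(0);
                sockets.add(socket);
                ports.add(socket.getLocalPort());
            }
        } finally {
            // Close only after every port has been chosen, so the OS
            // cannot hand the same port out twice within one call.
            for (ServerSocket socket : sockets) {
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // Best effort: nothing useful to do if close fails.
                }
            }
        }
        return ports;
    }

    public static void main(String[] args) throws IOException {
        // Example: one port each for a locator, a server and an HTTP service.
        List<Integer> ports = knock(3);
        System.out.println("Knocked ports: " + ports);
    }
}
```

    A test would then pass the returned ports explicitly to whatever starts 
the locator and server. There is still a small window between closing a port 
and starting the service on it, which is why proper ephemeral port support in 
Geode would be the better fix.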

    I would love some feedback and thoughts on this issue. Does anyone else see 
a different path forward?

    -Jake





