+1 after chatting with Mick who clarified the picture for me. Thx Mick.

On 12/4/21 20:32, Brandon Williams wrote:
> While I'm certain we could push through all these tests to support
> parallelism, I think it will end up requiring continual work since
> there is a class of tests that won't always work under concurrency,
> but also that won't be immediately obvious until the damage is done.
>
> I'm +1 on punting to docker to parallelize.
>
> On Mon, Apr 12, 2021 at 1:17 PM Mick Semb Wever <m...@apache.org> wrote:
>> Cassandra's build.xml supports parallel test runners. This
>> functionality is available through `-Dtest.runners` and the
>> `testparallel` ant macro.
>>
>> It's always been there, but hasn't been active recently since both
>> ci-cassandra and circleci call testclasslist instead of test.
>>
>> Recently testclasslist was updated to enable multiple runners too.
>> Since then we witnessed a lot more test failures… The distributed
>> in-jvm tests just don't work with parallel runners, and currently they
>> need `-Dtest.runners=1` specified to work. There are also plenty of
>> flaky tests: those that use fixed ports (StorageServiceServerTest) or
>> byteman (e.g. BMUnitRunner), and those that touch conf files on disk.
>>
>> From here, I can see two ways forward, a) fix everything to be
>> parallel ready or b) remove test.runners and parallelise with docker
>> instead.
>>
>> All in all, I think it is kinda odd to do (a) when docker is readily
>> available, especially on the CI servers where we are concerned about
>> build times.
>>
>> For (b), an example patch that removes everything related to
>> 'testparallel' and 'test.runners' from the build.xml is here:
>> https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/16587-2/trunk
>>
>> Then replacing 'ant task parallelism' with docker containers would be
>> done something like this:
>> https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk
>> (this is just a quick PoC, aimed at the ci-cassandra agents that have
>> 4 cores and 16gb ram available to each executor, but I imagine instead
>> something that spawns a number of containers based on system
>> resources, like we currently do with get-cores and get-mem). Also
>> worth noting the overhead here: compared with the ant approach, docker
>> builds everything in each container from scratch, but this too can be
>> improved easily enough.
>>
>> What are folks' opinions?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
>
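
For what it's worth, the "spawn a number of containers based on system
resources" idea from Mick's mail could be sketched roughly as below. This
is only an illustration, not what the PoC does: the per-container budget
(2 cores, 4 GB), the function names, and all test class names other than
StorageServiceServerTest are made-up placeholders.

```python
def container_count(total_cores, total_mem_gb,
                    cores_per_container=2, mem_gb_per_container=4):
    """How many containers fit on the host; never fewer than one.

    The per-container defaults are assumptions for illustration only.
    """
    by_cores = total_cores // cores_per_container
    by_mem = total_mem_gb // mem_gb_per_container
    return max(1, min(by_cores, by_mem))


def split_tests(test_classes, n):
    """Round-robin the test classes into n lists, one per container."""
    return [test_classes[i::n] for i in range(n)]


if __name__ == "__main__":
    # Host figures taken from the mail: 4 cores and 16 GB per executor.
    n = container_count(total_cores=4, total_mem_gb=16)
    # Placeholder class list; a real run would read it from the build.
    tests = ["StorageServiceServerTest", "PlaceholderATest",
             "PlaceholderBTest", "PlaceholderCTest", "PlaceholderDTest"]
    for index, chunk in enumerate(split_tests(tests, n)):
        print(f"container {index}: {chunk}")
```

Each container would then run its share of the list serially, so tests
that rely on fixed ports or on conf files on disk no longer compete with
each other inside one filesystem and network namespace.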
