+1 after chatting with Mick who clarified the picture for me. Thx Mick.

On 12/4/21 20:32, Brandon Williams wrote:
> While I'm certain we could push through all these tests to support
> parallelism, I think it will end up requiring continual work since
> there is a class of tests that won't always work under concurrency,
> but also that won't be immediately obvious until the damage is done.
>
> I'm +1 on punting to docker to parallelize.
>
> On Mon, Apr 12, 2021 at 1:17 PM Mick Semb Wever <m...@apache.org> wrote:
>> Cassandra's build.xml supports parallel test runners. This
>> functionality is available through `-Dtest.runners` and the
>> `testparallel` ant macro.
>>
>> It's always been there, but hasn't been active recently since both
>> ci-cassandra and circleci call testclasslist instead of test.
>>
>> Recently testclasslist was updated to enable multiple runners too.
>> Since then we witnessed a lot more test failures… The distributed
>> in-jvm tests just don't work with parallel runners, and currently they
>> need `-Dtest.runners=1` specified to work. And plenty of flakies where
>> tests use fixed ports (StorageServiceServerTest), byteman (eg
>> BMUnitRunner), and around conf files on disk.
>>
>> From here, I can see two ways forward, a) fix everything to be
>> parallel ready or b) remove test.runners and parallelise with docker
>> instead.
>>
>> All in all, I think this is kinda odd to do (a) when docker is readily
>> available, especially on the CI servers where we are concerned about
>> build times.
>>
>> For (b)… to remove everything related to 'testparallel' and
>> 'test.runners' from the build.xml an example patch is here:
>> https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/16587-2/trunk
>>
>> Then replacing 'ant task parallelism' with docker containers would be
>> done something like this:
>> https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk
>> (this is just a quick PoC, aimed at the ci-cassandra agents that have
>> 4 cores and 16gb ram available to each executor, but I imagine instead
>> something that spawns a number of containers based on system
>> resources, like we currently do with get-cores and get-mem). Also
>> worth noting the overhead here, compared with the ant approach, docker
>> builds everything in each container from scratch, but this too can be
>> improved easily enough.
>>
>> What are folks' opinions?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
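
For what it's worth, here is a rough sketch of the "spawn a number of
containers based on system resources" idea, in Python. It is not taken
from the PoC branch: the function names, the per-container core and
memory budgets, and the round-robin split are all assumptions made up
for illustration. It only shows the arithmetic, taking the smaller of
the CPU-bound and memory-bound container counts and then dealing the
test classes out across that many containers.

```python
def container_count(cores, mem_gb, cores_per_container, mem_gb_per_container):
    """Number of test containers a host can carry (hypothetical helper).

    The per-container budgets are assumptions chosen by the caller,
    not values taken from the cassandra-builds scripts.
    """
    by_cpu = cores // cores_per_container
    by_mem = mem_gb // mem_gb_per_container
    # Whichever resource runs out first sets the limit; always run at least one.
    return max(1, min(by_cpu, by_mem))


def split_tests(test_classes, n):
    """Deal test classes round-robin into n lists, one per container."""
    return [test_classes[i::n] for i in range(n)]


if __name__ == "__main__":
    # A 4 core / 16 GB executor with an assumed budget of
    # 2 cores and 4 GB per container.
    n = container_count(4, 16, cores_per_container=2, mem_gb_per_container=4)
    for index, chunk in enumerate(split_tests(["A", "B", "C", "D", "E"], n)):
        print(index, chunk)
```

Each container then runs its own list in isolation, so tests with fixed
ports or on-disk conf files no longer collide with each other.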