On Sat, Feb 7, 2015 at 12:05 AM, Aymeric Augustin <aymeric.augus...@polytechnique.org> wrote:

> Hello,
>
> As the test suite is growing, it’s getting slower. I’ve tried to make it
> faster by running tests in parallel.
>
> The current state of my experiment is here:
> https://github.com/django/django/pull/4063
>
> I’m distributing the tests to workers with the multiprocessing module.
> While the idea is simple, the unittest APIs make its implementation painful.
>
> ** Results **
>
> Without the patch:
>
> Ran 9016 tests in 350.610s
> ./runtests.py 355,86s user 20,48s system 92% cpu 6:48,23 total
>
> With the patch:
>
> Ran 9016 tests in 125.778s
> ./runtests.py --parallel 512,31s user 29,92s system 300% cpu 3:00,73 total
>
> Since it takes almost one minute to create databases, parallelization
> makes the execution of tests go from 6 minutes to 2 minutes.
>
> This isn’t bad, but the x3 speedup is a bit disappointing given that I
> have 4 physical / 8 logical cores. Perhaps the IPC is expensive.
>
> Does anyone have insights about scaling with multiprocessing?
>

Are you resource locking on anything else? e.g., is disk access becoming the bottleneck? Even memory throughput could potentially be a bottleneck - are you hitting disk cache?

> ** Limitations **
>
> This technique works well with in-memory SQLite databases. Each process
> gets its own copy of the database in its memory space.
>
> It fails with on-disk SQLite databases. SQLite can’t cope with this level
> of concurrency. It times out while attempting to lock the database.
>
> It fails with PostgreSQL (and, I assume, other databases) because tests
> collide, for instance when they attempt to load the same fixture.
>

I've thought about this problem in the past (but never done anything about it) - my thought was to use multiple test databases, so you have isolation. Yes, this means you need to do more manual setup (createdb test_database_1; createdb test_database_2; etc), but it means you don't have any collision problems multiprocessing an on-disk database.

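To make the one-database-per-worker idea concrete, here is a minimal sketch. It is not the code from the pull request, and every name in it (`shard`, `database_path`, `run_shard`, `run_parallel`) is hypothetical. On-disk SQLite files stand in for databases you would otherwise create with createdb, and "running a test" is reduced to loading one fixture row and counting the tests in the shard.

```python
import multiprocessing
import os
import sqlite3
import tempfile


def shard(items, workers):
    """Split items round-robin into one list per worker."""
    return [items[i::workers] for i in range(workers)]


def database_path(directory, worker_id):
    """Each worker gets its own file: test_database_1, test_database_2, ..."""
    return os.path.join(directory, "test_database_%d.sqlite3" % worker_id)


def run_shard(job):
    """Stand-in for running one shard of tests against a private database."""
    directory, worker_id, names = job
    connection = sqlite3.connect(database_path(directory, worker_id))
    try:
        # Every worker loads the same fixture row. In a shared database the
        # second insert of id=1 would collide; here each worker writes to
        # its own file, so there is nothing to collide with or lock on.
        connection.execute(
            "CREATE TABLE fixture (id INTEGER PRIMARY KEY, name TEXT)"
        )
        connection.execute("INSERT INTO fixture VALUES (1, 'shared fixture')")
        connection.commit()
    finally:
        connection.close()
    # A real runner would return test results; this returns a count only.
    return worker_id, len(names)


def run_parallel(names, workers):
    """Run every shard in its own process; return sorted (worker_id, count)."""
    with tempfile.TemporaryDirectory() as directory:
        jobs = [
            (directory, worker_id, part)
            for worker_id, part in enumerate(shard(names, workers), start=1)
        ]
        with multiprocessing.Pool(processes=workers) as pool:
            return sorted(pool.map(run_shard, jobs))
```

Calling `run_parallel(test_names, 4)` from under an `if __name__ == "__main__":` guard would start four processes, each with its own database file. Round-robin sharding is just the simplest split; it takes no account of how long individual tests run.
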
> ** Next steps **
>
> At this point, the patch improves the common use case of running
> `./runtests.py` locally to check a database-independent change, and little
> else.
>
> Do you think it would be useful to include it in Django anyway? Do you
> have concerns about the implementation? Charitably, I’ll say that “it
> works”…
>

It's definitely worth pursuing. Faster test suite == double plus good. Multiprocessing would seem to be an obvious approach, too.

My only "concern" relates to end-of-test reporting - how are you reporting test success/failure? Do you get a single coherent test report at the end? Do you get progress reporting, or just "subprocess 1 has completed; 5 failures, 200 passes" at the end of a subprocess?

My interest here isn't strictly about Django - it's about tooling, and integration of a parallelized test suite with IDEs, or tools like Cricket.

> Releasing it separately as a custom test runner may be more appropriate.
>

For me, it depends on how much code we're talking about, and how invasive you need to be on the core APIs. If the multiprocessing bit is fairly minor (and I suspect it probably is), but you need to make a bunch of invasive changes to the test infrastructure, then you might as well include it in Django's core as a utility. However, if the whole thing can stand alone with minimal (or no) internal modifications - or modifications that make sense in a general sense - then you might as well release it as a standalone package.

Yours,
Russ Magee %-)

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.