On Sat, Feb 7, 2015 at 12:05 AM, Aymeric Augustin <
aymeric.augus...@polytechnique.org> wrote:

> Hello,
>
> As the test suite is growing, it’s getting slower. I’ve tried to make it
> faster by running tests in parallel.
>
> The current state of my experiment is here:
> https://github.com/django/django/pull/4063
>
> I’m distributing the tests to workers with the multiprocessing module.
> While the idea is simple, the unittest APIs make its implementation painful.
>
> ** Results **
>
> Without the patch:
>
> Ran 9016 tests in 350.610s
> ./runtests.py  355,86s user 20,48s system 92% cpu 6:48,23 total
>
> With the patch
>
> Ran 9016 tests in 125.778s
> ./runtests.py --parallel  512,31s user 29,92s system 300% cpu 3:00,73 total
>
> Since it takes almost one minute to create databases, parallelization
> makes the execution of tests go from 6 minutes to 2 minutes.
>
> This isn’t bad, but the x3 speedup is a bit disappointing given that I
> have 4 physical / 8 logical cores. Perhaps the IPC is expensive.
>
> Does anyone have insights about scaling with multiprocessing?
>

Are you resource locking on anything else? e.g., is disk access becoming
the bottleneck? Even memory throughput could potentially be a bottleneck -
are you hitting disk cache?
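
As a rough way to read the figures you quoted, here is a small sketch that computes the test-phase speedup, the growth in user CPU time, and the parallel fraction implied by Amdahl's law. Amdahl's law is my own choice of model here, not something the patch uses, and the worker count of 4 (your physical cores) is an assumption; the function names are purely illustrative.

```python
# Figures copied from the quoted ./runtests.py output.
SERIAL_TEST_SECONDS = 350.610
PARALLEL_TEST_SECONDS = 125.778
SERIAL_USER_CPU_SECONDS = 355.86
PARALLEL_USER_CPU_SECONDS = 512.31
ASSUMED_WORKERS = 4  # assumption: one worker per physical core


def speedup(serial_seconds, parallel_seconds):
    """Ratio of serial to parallel duration."""
    return serial_seconds / parallel_seconds


def amdahl_parallel_fraction(observed_speedup, workers):
    """Solve S = 1 / ((1 - p) + p / n) for the parallel fraction p."""
    return (1 - 1 / observed_speedup) / (1 - 1 / workers)


if __name__ == "__main__":
    s = speedup(SERIAL_TEST_SECONDS, PARALLEL_TEST_SECONDS)
    cpu_growth = PARALLEL_USER_CPU_SECONDS / SERIAL_USER_CPU_SECONDS
    p = amdahl_parallel_fraction(s, ASSUMED_WORKERS)
    print("test-phase speedup: %.2f" % s)
    print("user CPU time growth: %.2f" % cpu_growth)
    print("implied parallel fraction: %.2f" % p)
```

If the user CPU time grows noticeably between the two runs, the workers are doing extra work (or waiting on something) rather than just splitting the original work, which would point at overhead rather than a lack of cores.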


> ** Limitations **
>
> This technique works well with in-memory SQLite databases. Each process
> gets its own copy of the database in its memory space.
>
> It fails with on-disk SQLite databases. SQLite can’t cope with this level
> of concurrency. It times out while attempting to lock the database.
>
> It fails with PostgreSQL (and, I assume, other databases) because tests
> collide, for instance when they attempt to load the same fixture.
>

I've thought about this problem in the past (but never done anything about
it). My thought was to use multiple test databases, so you have isolation.
Yes, this means you need to do more manual setup (createdb test_database_1;
createdb test_database_2; etc.), but it means you don't have any collision
problems multiprocessing an on-disk database.
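
To make the naming scheme concrete, here is a minimal sketch that generates one database name per worker and the matching createdb commands. The helper names are hypothetical, and how each worker would actually be pointed at its own database is left out, since that depends on how the patch sets up connections.

```python
def worker_database_names(base_name, worker_count):
    """Return one database name per worker, numbered from 1."""
    return ["{}_{}".format(base_name, index)
            for index in range(1, worker_count + 1)]


def createdb_commands(database_names):
    """Return the shell commands for the manual setup step."""
    return ["createdb {}".format(name) for name in database_names]


if __name__ == "__main__":
    # Hypothetical setup for two workers.
    for command in createdb_commands(worker_database_names("test_database", 2)):
        print(command)
```

Each worker would then use only the database matching its own index, so no two processes ever load fixtures into the same database.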


> ** Next steps **
>
> At this point, the patch improves the common use case of running
> `./runtests.py` locally to check a database-independent change, and little
> else.
>
> Do you think it would be useful to include it in Django anyway? Do you
> have concerns about the implementation? Charitably, I’ll say that “it
> works”…
>

It's definitely worth pursuing. Faster test suite == double plus good.
Multiprocessing would seem to be an obvious approach, too.

My only "concern" relates to end-of-test reporting - how are you reporting
test success/failure? Do you get a single coherent test report at the end?
Do you get progress reporting, or just "subprocess 1 has completed; 5
failures, 200 passes" at the end of a subprocess? My interest here isn't
strictly about Django - it's about tooling, and integration of a
parallelized test suite with IDEs, or tools like Cricket.

> Releasing it separately as a custom test runner may be more appropriate.
>

For me, it depends on how much code we're talking about, and how invasive
you need to be on the core APIs. If the multiprocessing bit is fairly minor
(and I suspect it probably is), but you need to make a bunch of invasive
changes to the test infrastructure, then you might as well include it in
Django's core as a utility. However, if the whole thing can stand alone
with minimal (or no) internal modifications - or modifications that make
sense in a general sense - then you might as well release it as a standalone
package.

Yours,
Russ Magee %-)
