I'm on board with merging this without Oracle support. If I understand correctly, it means we just have to run the Oracle tests non-parallel for now, so we're not losing anything, right?

Andrew

On Sun, Aug 30, 2015 at 1:19 PM, Aymeric Augustin <aymeric.augus...@polytechnique.org> wrote:

> Hello,
>
> I polished my test parallelization patch a bit. I'd like to merge it before
> 1.9 alpha. I rewrote history heavily to make the changes easier to review:
> https://github.com/django/django/pull/4761
>
> Eventually I settled for the approach I dismissed in a previous email: run
> tests in workers, pass "events" back to the master process, feed them to the
> master test runner. I depend on the `tblib` package to make tracebacks
> pickleable.
>
> The patch is still missing support for Oracle, despite significant help I
> received from Shai and others. I'm proposing to merge it anyway because I
> don't want to delay it indefinitely. It's possible to implement the missing
> methods on Oracle at any point in the future.
>
> I used various techniques to make tests safe for concurrent execution on a
> case by case basis. When possible, I moved all filesystem writes to a
> temporary directory. When not possible, I serialized execution of conflicting
> test cases by locking a file.
>
> If you have concerns about adding this feature to Django or about the
> implementation -- which isn't the most beautiful code I've written -- now is a
> good time to bring them up. I’d like to merge the patch in one or two weeks.
>
> Also it would be interesting to try parallelization on test suites other than
> Django's own. Just apply my PR and run `django-admin test --parallel` or
> `django-admin test --parallel-num=N`. (Unfortunately I suspect few projects
> are compatible with Django's development version and have a large test
> suite.)
>
> Thanks!
>
> --
> Aymeric.
>
> > On 22 Feb 2015, at 16:29, Aymeric Augustin <aymeric.augus...@polytechnique.org> wrote:
> >
> > **tl;dr** I can run the full test suite in 85 seconds on SQLite, a 4.8x speedup.
> >
> > Hello,
> >
> > Since I last wrote about this project, I improved parallelization by:
> >
> > - reworking the IPC to avoid exchanging tracebacks
> > - implementing database duplication for SQLite, PostgreSQL and MySQL
> >
> > The code is still rough. Several options of runtests.py don't work with
> > parallelization.
> >
> > Initially performance was disappointing. I couldn’t max out my cores during
> > the whole run. With some basic monitoring I noticed that the CPU load
> > plummeted when there were spikes of disk writes. This led me to believe I was
> > disk I/O bound even with an in-memory database and an SSD.
> >
> > So I started optimizing disk I/O, which means doing less I/O and doing it only
> > in a RAM-mounted temporary directory. Writing to RAM instead of writing to
> > disk helps a lot. It’s as simple as creating a RAMdisk (that depends on your
> > OS) and pointing the TMPDIR environment variable to the RAMdisk.
> >
> > Unfortunately, i18n and migrations management commands write in the
> > application directories. I haven't found (yet) a way to point them to a
> > temporary directory instead.
> >
> > My 2012 MacBook Pro with a 2.3 GHz Intel Core i7 (4 cores, 8 threads) takes:
> >
> > - 30 seconds for creating the two databases
> >
> > - 240 seconds to run the actual tests in a single process
> > - 72 seconds in 4 processes (3.3x faster)
> > - 60 seconds in 6 processes (4x faster)
> > - 55 seconds in 8 processes (4.4x faster)
> >
> > That looks quite close to what my hardware can do. Hyperthreading doesn't help
> > as much as multiple cores when it comes to running multiple processes and the
> > synchronization costs increase with the number of processes.
> >
> > Creating the database accounts for more than a third of the total runtime. I'm
> > not sure how much time is spent in the migrations framework and how much doing
> > the table creations. --keepdb helps but it requires an on-disk database.
> > An on-RAMdisk database would most likely be the best option. I haven't tried it
> > yet. It should work with at least SQLite and PostgreSQL (using tablespaces).
> >
> > If you want to help, I'd be interested in:
> >
> > - reports of whether parallelization works for test suites other than Django's
> >   own -- apply my pull request and run `django-admin test --parallel` or
> >   `django-admin test --parallel-num=N`
> > - a patch implementing database duplication on Oracle
> >
> > Let me know if you have questions or concerns.
> >
> > --
> > Aymeric.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/F7BA7C0F-3214-42E2-AAD2-B90B62EC07E2%40polytechnique.org.
> For more options, visit https://groups.google.com/d/optout.

--
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-developers/CAFwN1upDg_-KLWB5sBVo-4OVE1shXT58_aXjg5D5AsFgC5TQyg%40mail.gmail.com.
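
For readers unfamiliar with the technique mentioned in the quoted message ("I serialized execution of conflicting test cases by locking a file"), here is a minimal sketch of how such a lock might look. It is not taken from the pull request. It assumes a Unix system (the `fcntl` module is not available on Windows), and the names `LOCK_PATH` and `exclusive_lock` are hypothetical, chosen only for illustration.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock location: any path visible to every worker process would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "conflicting_tests.lock")


@contextmanager
def exclusive_lock(path=LOCK_PATH, blocking=True):
    """Hold an exclusive advisory lock on `path` while the block runs."""
    flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
    # Mode "a" creates the file if needed and never truncates it.
    with open(path, "a") as lock_file:
        # Waits until the lock is free; with LOCK_NB it raises
        # BlockingIOError instead of waiting.
        fcntl.flock(lock_file, flags)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A test case that writes to a shared location could wrap the conflicting part in `with exclusive_lock(): ...`, so that only one worker at a time runs it while the other tests keep running in parallel. The lock is advisory, which means it only protects code that also takes the same lock.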