Howdy, I'd like to share some findings I made recently with regards to deferred constraint checks during test runs. There's a bit of background here (https://code.djangoproject.com/ticket/3615) and here(https://code.djangoproject.com/ticket/11665), but this is the gist of what's going on now:
As you all know, when the Django test suite runs it executes each test within a DB transaction, which is opened during setUp and rolled back during tearDown. This works great for testing, but there is a problem, which Ramiro brought up the other day on IRC and I observed as well. The problem is that by default in most of our backends, a transaction runs with constraint checks deferred. This means that foreign keys are *not* checked for integrity until a transaction is committed. But here's the rub: we never commit a transaction during tests, we roll it back instead. Therefore, none of the queries against the DB are being checked for integrity. (The exception is MySQL InnoDB, which I'll get to in a sec.) As part of the work I was doing recently to disable and re-enable constraint checks during fixture loading, I realized that we were handling this sort of incorrectly across our backends. Postgresql in fact does provide a facility for enabling constraint checks: SET CONSTRAINT CHECKS ALL IMMEDIATE. This statement causes Postgresql to behave as it normally would, when it was executing queries outside of a transaction. So I tried this out on Postgresql, and what I got when I executed the test suite was: Ran 3947 tests in 570.399s FAILED (failures=53, errors=233, skipped=16, expected failures=2) Interestingly, this looks familiar. Here's the result of the test suite on MySQL with Innodb: Ran 3948 tests in 5171.412s FAILED (failures=67, errors=213, skipped=21, expected failures=2) Pretty darn close. Anyhow, as I started to dig a bit deeper I realized something I considered to be quite important: MySQL, which has sort of been dismissed as being unable to handle constraint checks like a "real" database has in fact been running like a real database all along. The bugs that have accumulated during MySQL tests are in fact real bugs. When Postgresql is configured with constraint checks set to immediate -- and as a result behaves as it would in production -- these very same bugs are revealed. The good news is there are really just a few minor problems triggering all these errors, and most of them are bugs in the test suite rather than production code. As a matter of fact, with a bit of hacking yesterday and today, I managed to get most of this under control. As of right now, when running the test suite in a fixed branch I have, I get: Postgresql: Ran 3993 tests in 946.555s FAILED (failures=3, errors=6, skipped=17, expected failures=3) MySQL: Ran 3994 tests in 4857.777s FAILED (failures=8, errors=7, skipped=25, expected failures=3) In other words, we're in striking distance from getting all tests to pass on all backends, and I uncovered a handful of issues (again, mostly bad test code) which were hidden from sight because constraint checks weren't being run. Right now I've got everything on an experimental git branch, which is still somewhat a work in progress: https://github.com/jsdalton/django/tree/experimental_check_constraints_fixed . As I said, I was mostly hacking away to kill the bugs and cut through the fog, so at this point I don't think what i have is ready to be proposed as a patch. Note that this is a branch off of the work I did on #3615 to fix fixture loading, which is sort of an important underlying piece that had to come together to be able to flush everything out. In brief, here are the issues I uncovered, running with Postgresql. * Content types get flushed from the DB after each test. However, they remain cached in the manager. When constraint checks are working, certain tests were failing because the a Content type wasn't being created in the DB in get_for_model(). Fix was to clear the contenttype cache at the start of every test run. * Some deletion tests were erroring because constraint checks needed to be temporarily disabled during batch deletes. * Big sneaky one that I pulled most of my hair out on until I finally found it...this bad boy: https://github.com/jsdalton/django/commit/b452248a0d9a025c7aa387091fa6d6b1f51113df . This is the reason why there are hundreds of errors on MySQL when the test suite is run on master right now. Basically the monkey patched routers were not being restored properly. As a result, the 'other' database was no longer being used. Tests were failing all over the place for every test that ran after this one. Fixing this single type caused about 200 tests to start passing again in MySQL, and Postgresql as well with constraint checks immediate. The above are the issues I was able to patch up. Other fixes might be desired on the above, but again my goal was just to get tests passing again. There were some additional issues that came up that I was not able to patch so easily. Here's what's behind the remaining issues that are not causing tests to fail. * A bit hard to explain but this test: regressiontests.multiple_database.tests.RouterTestCase.test_m2m_managers and a few others like it don't work right. This is the code that's causing a problem: pro = Book.objects.using('other').create(pk=1, title="Pro Django", published=datetime.date(2008, 12, 16)) marty = Person.objects.using('other').create(pk=1, name="Marty Alchin") pro.authors = [marty] authors is a join. The problem is that the join database is in the default db, but the records created are in 'other'. The join record raises an integrity error because the records it's trying to relate to aren't in its db. I don't know if this is a bug or shouldn't work in the first place, but I didn't have an easy fix, since I'm not familiar with this part of the code base. This is a great example of an issue that has been obscured until now and probably requires a ticket and a fix of some kind from someone who is familiar with it. * Something funky is going on with a few of the serializer_regress tests. That's a big mess of a test structure and I didn't have it in me to try to sort through what was going on there. Could be a real bug, probably just a test code problem. * assertNumQueries is responsible for the rest of the failures. This is a bit of a chin scratcher for me at the moment. The failures are purely an accounting issue: the new constraint check disable/enable code runs two extra queries when it is executed, thus there are 2 more queries than expected. The problem here is that only a certain backends (right now, SQLite is missing them) support this, and as a result there are inconsistent query counts based on the backend you are using. I'm sure there's a way around it but it's not really my call. Either find a way to exclude certain queries from the count or handle it in test code. * There were a handful of other MySQL specific errors relating to collation mix ups and a few things that at some point I might try to look at. Anyhow, that's pretty much a detailed list of what i uncovered. Where to go from here? These are my recommendations, though obviously a core dev or two need to step in here and make the real decisions. * The fixture loading and constraint checking code might need a bit of cleanup and tweaking here and there but is for the most part ready. Getting this committed and in place is a good first step. * A decision needs to be made about whether constraint checks should be set to immediate at the start of every test. MySQL InnoDB already does this, and Oracle and Postgresql are capable of doing it now by explicitly setting this in fixtureSetUp(). SQLite cannot (and will not, as far as I can tell) support this. In other words, without a new creative solution SQLite will continue to allow bad data to be entered during testing, as it does presently. MySQL MyISAM is in the same boat. I am going to come out very strongly in favor of this. It's required to make tests be more realistic, as Ramiro also discovered and pointed out separately from me. A big concern that he brought up over IRC is that it could cause some user tests to break, if these tests had previously been inserting bad data during testing. * Assuming the above goes forward, that means we have test failures in the main test suites. Some of them do not have an immediate fixes. The fixes I put together in my hacking may not fly or may require independent discussion before implementing. The other issues I couldn't fix need work. My recommendation -- though I don't know if this is unacceptable -- would be to commit the change that enables immediate constraint checks and then file blocker-level tickets on the issues that are prevented from passing. It seems like people will need that code in place in order to be able to exercise the code properly and get to the root of those issues. I guess an alternative would be a new branch but I know we're on SVN and that can be unpleasant. Anyhow, that's about it. I think there are probably a number of other tickets floating around out there (Ramiro pointed out a few to me) that relate to all this mess. Again the good news is I didn't uncover anything horrific, I think it's mostly an issue of cleaning up under the covers a bit and moving forward with improved realism in DB testing. Also great news is that we are definitely in striking distance of all backends passing the entire test suite, for real. As someone whose company uses MySQL in production this would be pretty awesome for me, especially since it would help continue to address some other MySQL issues that have been hard to tackle because running those tests is such a nightmare. Next week I have a ton of other work I need to attend to (the kind I get paid to do that is :) ) so i won't have a lot of time I can spend on this. But I hope I've done at least a passing job at uncovering the issue and presenting it clearly enough that you guys can all agree on a clear path forward. I know Ramiro has been pretty deeply involved in exploring this issue as well, so he might have some perspective or opinion of his own he'd like to share. If i flubbed any of the above or anyone requires clarification please do let me know. Apologies for the lengthy post but hopefully it was all pertinent to the issue at hand. Cheers, Jim Dalton -- You received this message because you are subscribed to the Google Groups "Django developers" group. To view this discussion on the web visit https://groups.google.com/d/msg/django-developers/-/kA7LoSB12MIJ. To post to this group, send email to django-developers@googlegroups.com. To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.