Re: MySQL data loss possibility with concurrent ManyToManyField saves
On Monday, March 21, 2016 at 13:35:19 UTC+1, Aymeric Augustin wrote:
>
> The problem is to determine what “safe” means here :-) I’m afraid this
> won’t be possible in general (at least not in a remotely efficient fashion)
> because Django inevitably depends on the guarantees of the underlying
> database, and those guarantees are weak at lower isolation levels.
>

Sure, some of it is left to interpretation. There are some tricky cases, like two conflicting updates happening at the same time. Would "safe" mean that they should both succeed, even if there would be a lost update? Or would it mean that one of them should fail? But here, we have "two non-conflicting updates (no-ops, even) cause data to be lost". I bet no one would call this safe.

The guarantees offered by the database may not be perfect, but my point is that those guarantees are used badly in Django. Consistent read is a "feature": it can be used to make things "safer", but in that bug it's simply used incorrectly because of false assumptions. I reckon that given those same guarantees, and knowing how they may differ from backend to backend, they can be used to get a "safer" behavior than this.

That's also why I don't think lowering the isolation level is a good idea. The higher the isolation level, the more "guarantees" there are to build upon. This bug is an unfortunate case where an assumption that is true at a lower isolation level no longer holds at a higher one. That doesn't mean that the assumption was safe to make in the first place.

I applied the patch proposed by Shai (thanks!) in the ticket, which replaces the SELECT with a SELECT FOR UPDATE. This effectively fixes the bug. I think this is the right kind of fix: fix the code, rather than change the isolation level because doing so happens to fix the code by chance.

-- 
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at https://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/08fed447-1ad9-4e38-a23c-e170d255cb07%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
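
For readers following the thread, the interleaving discussed above can be sketched with a toy model. This is a minimal illustration in plain Python, not MySQL and not Django's code: `ToyTable`, `ToyTransaction`, `save_tags` and `run` are hypothetical names, and the model assumes (as the thread describes) that plain reads come from a snapshot fixed at the transaction's first read, while writes and locking reads act on the latest committed rows.

```python
import itertools


class ToyTable:
    """Committed rows, keyed by an auto-incrementing id (toy model)."""

    def __init__(self, values):
        self._ids = itertools.count(1)
        self.rows = {next(self._ids): value for value in values}

    def insert(self, value):
        row_id = next(self._ids)
        self.rows[row_id] = value
        return row_id


class ToyTransaction:
    """Assumed semantics: snapshot reads, but writes hit current rows."""

    def __init__(self, table):
        self.table = table
        self.snapshot = None      # fixed at the first plain read
        self.deleted_ids = set()  # rows this transaction deleted
        self.inserted = {}        # rows this transaction inserted

    def select(self, for_update=False):
        if for_update:
            # Locking read: look at the latest committed rows.
            return set(self.table.rows.values())
        if self.snapshot is None:
            self.snapshot = dict(self.table.rows)
        # Snapshot plus own inserts, minus own deletes.
        visible = {**self.snapshot, **self.inserted}
        return {v for i, v in visible.items() if i not in self.deleted_ids}

    def delete_all(self):
        # Deletes act on current rows, not on the snapshot.
        self.deleted_ids.update(self.table.rows)
        self.table.rows.clear()

    def insert(self, value):
        self.inserted[self.table.insert(value)] = value


def save_tags(tx, wanted, for_update=False):
    """The DELETE-SELECT-INSERT sequence, in toy form."""
    tx.delete_all()
    existing = tx.select(for_update=for_update)
    for value in sorted(wanted - existing):
        tx.insert(value)


def run(for_update):
    table = ToyTable(["tag1"])
    a, b = ToyTransaction(table), ToyTransaction(table)
    a.select()
    b.select()  # both sessions read first, fixing their snapshots
    save_tags(a, {"tag1"}, for_update)  # session A runs and commits
    save_tags(b, {"tag1"}, for_update)  # session B resumes after A
    return set(table.rows.values())


print(run(for_update=False))
print(run(for_update=True))
```

Under these assumptions, `run(for_update=False)` ends with an empty table even though both sessions wanted to keep `tag1`: session B deletes the row A re-inserted, then its snapshot read still shows the original row, so it skips the insert. With `run(for_update=True)` the locking read sees that nothing is left and the row is re-inserted.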
Re: MySQL data loss possibility with concurrent ManyToManyField saves
On Tuesday, March 22, 2016 at 00:11:31 UTC+1, Shai Berger wrote:
> I disagree. The ORM cannot keep you safe against MySql's REPEATABLE READ.
>
>> Incidentally, to prove my point,
>> this has been changed in Django 1.9 and data-loss doesn't happen anymore,
>> in that same default isolation level.
>
> That indeed seems to support your point, but it is far from proving it.

I'm unsure that your relentless rant against MySQL is useful in this topic. You have to acknowledge that the data-loss behavior, as far as the bug that started this topic is concerned, is fixed in 1.9 by making better queries against the database. It is also fixed by your SELECT FOR UPDATE patch.

The point I'm trying to make is not that the ORM should magically be perfect against all backends, however broken they may be, but that it should guarantee the integrity of data as far as an intelligent use of the backend allows. Using SELECT FOR UPDATE, or whatever is done in 1.9, is a smarter use of the MySQL backend.

What are you trying to prove here? That MySQL is broken? That may or may not be the case (probably not), but this is quite off-topic. Even if it doesn't adhere to the ISO/ANSI standard for isolation level behavior, there's not a lot you can do. MySQL is by far the most popular open-source DBMS. I hate it as much as you do and would heartily prefer to use PostgreSQL, but that's a fact.

There may be grounds for complaining on the MySQL bugtracker, but I'm not sure this would go anywhere. They probably won't care: given their popularity, they basically are the de facto SQL standard. And even if they care, they would be unwilling to break compatibility by introducing changes to how isolation levels work. And even if they do change the isolation levels, it would take some time for users to get that new MySQL version. In all cases, Django would still need fixing.

> The reason for that, and the major point you seem to be missing, is that
> Django 1.8's DELETE-SELECT-INSERT dance for updating M2M's is not reserved
> to Django's internals; you can be quite certain that there are a *lot* of
> similar examples in user code. And that user code, making assumptions that
> hold everywhere except with MRR, are not buggy. It is MySql's documented
> behavior that is insensible.

You're missing the point. First, that same "dance" in user code is nowhere near as widely used as the one in Django internals. As a user, I've never used explicit Django transactions at all. Have I saved models? Sure. Loads.

Second, the onus is on Django's ORM to get its moves right. It's on me to get mine right. I know that. I accept that. As a user, that's probably the most important reason why I'm using an ORM: I don't know a lot about database transaction caveats, and therefore I trust the ORM to do transactions right and not eat my data. If someday I need to do tricky things and handle transactions myself, I will take a hard look at how transactions work on my database backend, and if I get it wrong, it will be my fault.

Maybe a note on Django's "Database transactions" page would be helpful to warn users about the limitations of MySQL?

> I can reproduce this. But frankly, I cannot understand it. Can you explain
> what happens here? Point to some documentation? The way I understand it,
> the DELETE from session B either attempts to delete the pre-existing record
> (in which case, when it resumes after the lock, it should report "0 records
> affected" -- that record has been already deleted) or the one inserted by
> session A (in which case, the second SELECT in session B should retrieve
> no records). In either case, the SELECT in session B should not be finding
> any record which has already been deleted, because all deletions have either
> been committed (and we are in READ COMMITTED) or have been done in its own
> transaction. What is going on here?
>
> The point of the above question, besides academic interest, is that this
> behavior is also suspect of being a bug. And if it is, Django should try
> to work around it and help users work around it, but not base its strategic
> decisions on its existence.

Sorry, I have no idea why this example behaves like that. I tried it, thinking "that could break, maybe", and it did, and that's it.

> Again: The burden of "finding operation sequences" on Django is relatively
> small -- most ORM operations are single queries.

Great, then maybe this bug is one of the last of its kind :) And a manual search for others would be within reach: find the "maybe broken" multiple-query operations, test them concurrently, then optimize or fix them if need be.
Re: MySQL data loss possibility with concurrent ManyToManyField saves
On Tuesday, March 22, 2016 at 23:33:10 UTC+1, Shai Berger wrote:
> It is Django's job to try, as best as it can, to fulfill these expectations.

How could I disagree? Of course, if there is one single sensible, obvious default that would help 99.9 % of users, like your webserver analogy, it should be set.

> MySql's REPEATABLE READ is unsuitable for use as a default by Django.

This is where things are not as obvious... MySQL's REPEATABLE READ may have its flaws, and it may cause recurring bugs because that level is a bit awry, with reads and writes that don't work the same way, but all in all, it IS a higher isolation level than READ COMMITTED. It DOES provide the REPEATABLE READ guarantee (for reads) that READ COMMITTED doesn't have.

For each "DELETE-SELECT-INSERT" bug that happens "because of" REPEATABLE READ, how many, I don't know, "SELECT-SELECT-again-and-not-have-the-same-results" bugs are prevented by it? That would be hard to say for sure.

For that user-transaction behavior, I'm in favor of a documentation fix. On the "Transaction" documentation page, have a note that reads something like "Each backend can be configured in multiple isolation levels, and the isolation guarantees differ by isolation level and also between backends. In particular, MySQL's REPEATABLE READ is weird". Maybe offer a way to select the desired isolation level in code, even for each transaction. Caveat emptor. After all, apart from SERIALIZABLE, transactions will never be perfect.

But I still want to insist on the fact that the bug discussed in the ticket is quite independent of the choice of the default isolation level. Sure, setting a lower default isolation level fixes it coincidentally, but it can also be fixed (in two different ways, even) with a better use of the same isolation level. It shouldn't by itself justify changing the default isolation level.
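
As a point of reference for the suggestion above to make the isolation level selectable: Django releases later than the ones discussed in this thread (2.0 and up, as far as I know) accept an `isolation_level` entry in the MySQL backend's `OPTIONS`. A minimal settings sketch, where the database name and user are hypothetical placeholders:

```python
# settings.py fragment (sketch). NAME and USER are hypothetical placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "example_db",
        "USER": "example_user",
        "OPTIONS": {
            # Accepted values include "read committed" and "repeatable read".
            "isolation_level": "read committed",
        },
    }
}
```

This sets the level for the whole connection; it is not the per-transaction choice floated in the message above.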
Requesting comments about bug "set_language with i18n_patterns"
Hi,

I posted a bug almost a month ago, about set_language and i18n_patterns failing in some cases when used together. It didn't gather a single comment and is still in the Unreviewed state, which seems most unusual for this well-kept bugtracker :)

https://code.djangoproject.com/ticket/26556

Do any of you have any comment or need further info?

Cheers,
Hugo.