On 19 Feb, 02:31, Carl Meyer <c...@oddbird.net> wrote:
> On 02/18/2013 02:27 PM, Aymeric Augustin wrote:
> > Problem #1: Is it worth re-executing the connection setup at the beginning
> > of every request?
> >
> > The connection setup varies widely between backends:
> > - SQLite: none
> > - PostgreSQL: https://github.com/django/django/blob/master/django/db/backends/postg...
> > - MySQL: https://github.com/django/django/blob/master/django/db/backends/mysql...
> > - Oracle: https://github.com/django/django/blob/master/django/db/backends/oracl...
> >
> > The current version of the patch repeats it every time. In theory, this
> > isn't necessary. Doing it only once would be more simple.
> >
> > It could be backwards incompatible, for instance, if a developer changes the
> > connection's time zone. But we can document to set CONN_MAX_AGE = 0 to
> > restore the previous behavior in such cases.
>
> It seems silly to re-run queries per-request that were only ever
> intended to be per-connection setup. So I favor the latter. I think this
> change will require prominent discussion in the
> potentially-backwards-incompatible section of the release notes regardless.
>
> (This could form an argument for keeping the built-in default 0, and
> just setting it to a higher number in the project template. But I don't
> like polluting the default project template, and I think a majority of
> existing upgrading projects will benefit from this without problems, so
> I don't actually want to do that.)

Maybe we need another setting for what to do at request start. It seems
fairly likely that users run SET SEARCH_PATH in middleware to support
multitenant setups, for example, and expect that setting to go away when the
connection is closed after the request. Any other SET is a likely candidate
for problems in PostgreSQL, and I am sure other databases have their own
equivalents of session state, too. (In this case, running RESET ALL and then
the connection setup again is the right thing to do in PostgreSQL.)

It would be nice to support this use case, but for now it is enough to
document this change clearly in the release notes and point out that if you
have such requirements, you should set max_age to 0. More features can always
be added later on.

> > Problem #2: How can Django handle situations where the connection to the
> > database is lost?
> >
> > Currently, with MySQL, Django pings the database server every time it
> > creates a cursor, and reconnects if that fails. This isn't a good practice
> > and this behavior should be removed:
> > https://code.djangoproject.com/ticket/15119
> >
> > Other backends don't have an equivalent behavior. If a connection was
> > opened, Django assumes it works. Worse, if the connection breaks, Django
> > fails to close it, and keeps the broken connection instead of opening a
> > new one: https://code.djangoproject.com/ticket/15802
> >
> > Thus, persistent connections can't make things worse :) but it'd be nice to
> > improve Django in this area, consistently across all backends.
> >
> > I have considered four possibilities:
> >
> > (1) Do nothing. At worst, the connection will be closed after max_age and
> >     then reopened. The worker will return 500s in the meantime. This is the
> >     current implementation.
> >
> > (2) "Ping" the connection at the beginning of each request, and if it
> >     doesn't work, re-open it. As explained above, this isn't a good practice.
> >     Note that if Django repeats the connection setup before each request, it
> >     can take this opportunity to check that the connection works and
> >     reconnect otherwise. But I'm not convinced I should keep this behavior.
> >
> > (3) Catch database exceptions in all appropriate places, and if the
> >     exception says that the connection is broken, reconnect. In theory this
> >     is the best solution, but it's complicated to implement. I haven't found
> >     a conclusive way to identify error conditions that warrant a
> >     reconnection.
> >
> > (4) Close all database connections when a request returns a 500. It's a bad
> >     idea because it ties the HTTP and database layers. It could also hide
> >     problems.
>
> I'd be inclined to go for (1), with the intent of moving gradually
> towards (3) as specific detectable error conditions that happen in real
> life and do warrant closing the connection and opening a new one are
> brought to our attention. Unfortunately handling those cases is likely
> to require parsing error messages, as pretty much anything related to
> DBAPI drivers and error conditions does :/
>
> I tried to dig for the origins of the current MySQL behavior to see if
> that would illuminate such a case, but that code goes way back into the
> mists of ancient history (specifically, merger of the magic-removal
> branch), beyond which the gaze of "git annotate" cannot penetrate.
>
> Option (4) is very bad IMO, and (2) is not much better.

I hope this discussion is about what to do at request finish/start time. I am
very strongly opposed to anything where Django suddenly changes connections
underneath you. At request finish/start this is OK (you should expect new
connections then anyway), but otherwise, if you get a broken connection, it
isn't Django's business to swap the connection underneath you.
There is a reasonable expectation that while you are using a single
connections[alias], in a script for example, the underlying connection stays
the same the whole time. Otherwise SET somevar in PostgreSQL could break, for
example.

Now, one could argue that SET somevar should not be used with Django's
connections. But this is very, very limiting for some real-world use cases
(multitenancy and SET SEARCH_PATH for one). There is no way to actually
enforce such a limitation, so it would be documentation only. In addition,
the result is that very rarely you get a weird (potentially data-corrupting)
problem because your connection was swapped at the wrong moment. That is
nearly impossible to debug (especially if the swap is not logged either).

If connection swapping is still wanted, then there must at least be a way to
tell Django NOT to swap connections unless told to do so.

I think a good approach would be to mark the connection potentially broken on
errors in queries, and then in request_finished check for this potentially
broken flag. If the flag is set, then and only then run ping() / SELECT 1.
So, this is a slight modification of option (3) where one can mark the
connection potentially broken liberally, but the connection is swapped only
when the ping fails, and only in request_finished. For most requests there
should be no overhead, as errors in queries are rare.

BTW, regarding the remark above in Aymeric's post that persistent connections
can't make things worse: I don't believe this. Persistent connections will
keep the broken connection from request to request, whereas at least on
PostgreSQL a broken connection is currently closed correctly at request
finish.

 - Anssi
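
The "mark as potentially broken, ping only at request finish" idea proposed above could look roughly like the following. This is a minimal sketch and not Django's implementation: `FlaggedConnection`, `maybe_broken`, and `request_finished()` are hypothetical names chosen for illustration, and SQLite stands in for a real server connection.

```python
import sqlite3


class FlaggedConnection:
    """Hypothetical wrapper around a DB-API connection (not a Django class)."""

    def __init__(self, connect):
        self._connect = connect    # zero-argument factory returning a connection
        self.connection = None     # opened lazily, reused across requests
        self.maybe_broken = False  # set liberally whenever a query fails

    def execute(self, sql, params=()):
        if self.connection is None:
            self.connection = self._connect()
        try:
            cursor = self.connection.cursor()
            cursor.execute(sql, params)
            return cursor
        except Exception:
            # Only mark the connection; never swap it in the middle of a request.
            self.maybe_broken = True
            raise

    def request_finished(self):
        """Ping only if a query failed; drop the connection only if the ping fails."""
        flagged, self.maybe_broken = self.maybe_broken, False
        if not flagged or self.connection is None:
            return  # the common case: no query errors, so no overhead
        try:
            self.connection.cursor().execute("SELECT 1")
        except Exception:
            try:
                self.connection.close()
            except Exception:
                pass  # the connection is already unusable
            self.connection = None  # the next execute() opens a fresh one


# Usage: an ordinary query error does not cause a reconnect.
conn = FlaggedConnection(lambda: sqlite3.connect(":memory:"))
conn.execute("SELECT 1")
original = conn.connection

try:
    conn.execute("SELECT * FROM no_such_table")
except sqlite3.OperationalError:
    pass
conn.request_finished()  # the ping succeeds, so the same connection is kept
```

Because the flag is only inspected in `request_finished()`, session state such as SET SEARCH_PATH cannot be lost to a silent reconnect halfway through a request or script.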