On 17 Feb, 13:24, Aymeric Augustin <aymeric.augus...@polytechnique.org> wrote:
> **tl;dr** I believe that persistent database connections are a good idea.
> Please prove me wrong :)
>
> --------------------
>
> Since I didn't know why the idea of adding a connection pooler to Django was
> rejected, I did some research before replying to the cx_Oracle SessionPooling
> thread.
>
> The best explanation I've found is from Russell:
>
> > To clarify -- we've historically been opposed to adding connection
> > pooling to Django for the same reason that we don't include a web
> > server in Django -- the capability already exists in third party
> > tools, and they're in a position to do a much better job at it than us
> > because it's their sole focus. Django doesn't have to be the whole
> > stack.
>
> All the discussions boil down to this argument, and the only ticket on the
> topic is short on details: https://code.djangoproject.com/ticket/11798
>
> --------------------
>
> The connection pools for Django I've looked at replace "open a connection"
> with "take a connection from the pool" and "close a connection" with "return
> the connection to the pool". This isn't "real" connection pooling: each
> worker holds a connection for the entire duration of each request, regardless
> of whether it has an open transaction or not.
>
> This requires as many connections as workers, and thus is essentially
> equivalent to persistent database connections, except that connections can be
> rotated among workers.
>
> Persistent connections would eliminate the overhead of creating a connection
> (IIRC ~50ms/req), which is the most annoying symptom, without incurring the
> complexity of a "real" pooler.
>
> They would be a win for small and medium websites that don't manage their
> database transactions manually and where the complexity of maintaining an
> external connection pooler isn't justified.
> Besides, when Django's transaction middleware is enabled, each request is
> wrapped in a single transaction, which reserves a connection. In this case, a
> connection pooler won't perform better than persistent connections.
>
> Obviously, large websites should use an external pooler to multiplex their
> hundreds of connections from workers into tens of connections to their
> database and manage their transactions manually. I don't believe persistent
> connections to the pooler would hurt in this scenario, but if they do, this
> could be optional.
>
> --------------------
>
> AFAICT there are three things to take care of before reusing a connection:
>
> 1) restore a pristine transaction state: transaction.rollback() should do;
>
> 2) reset all connection settings: the foundation was laid in #19274;
>
> 3) check if the connection is still alive, and re-open it otherwise:
>    - for psycopg2: "SELECT 1";
>    - for MySQL and Oracle: connection.ping().
>
> Some have argued that persistent connections tie the lifetime of database
> connections to the lifetime of workers, but it's easy to store the creation
> timestamp and re-open the connection if it exceeds a given max-age.
>
> So -- did I miss something?
I am not yet convinced that poolers implemented inside Django core are necessary. A major reason for doing #19274 was to allow somewhat easy creation of 3rd party connection poolers.

I don't see transactional connection pooling as something that forces including connection pools in Django. A transactional pooler implementation should be possible outside Django: on every execute, check which connection to use. When inside a transaction, use the connection tied to the transaction; otherwise take a free connection from the pool and use that. The big problem is that Django's transaction handling doesn't actually know when the connection is inside a transaction. Fix this, and doing transactional poolers external to Django will be possible. (Tying the connection to transaction managed blocks could work, but then what to do for queries outside any transaction managed block?)

Instead of implementing poolers inside Django, would it be better to aim for DBWrapper subclassing (as done in #19274)? The subclassing approach has some nice properties; for example, one could implement a "rewrite to prepared statements" feature (basically, some queries would get automatically converted to use prepared statements at execution time). This setup should result in nice speedups for some use cases.

Another implementation idea is to have the DB settings contain a 'POOLER' entry. By default this entry is empty, but when defined it points to a class that has a (very limited) pooler API: lend_connection(), release_connection() and close_all_connections() (the last one is needed for test teardown). And then the connections themselves could have .reset(), .ping() and so on. This is simple and should also be extensible.

It seems SQLAlchemy has a mature pooling implementation. So, yet another approach is to see if SQLAlchemy's pooling implementation could be reused (maybe in conjunction with the above 'POOLER' idea).

I also do believe that persistent database connections are a good idea.
I don't yet believe the implementation must be in Django core...

 - Anssi