Suggested Oracle double-fetch fix (was Re: Can we drop Oracle 9 support?)

Shai Berger Thu, 24 Nov 2011 14:01:05 -0800

On Thursday 24 November 2011, I wrote:
> On Tuesday 22 November 2011, Anssi Kääriäinen wrote:
> > I can do a patch for this if needed. Are you (or somebody else)
> > working on this?
> 
> While you're at it, [...]


While we're all at it, I would like to bring to your attention another 
suggested fix for Oracle. The patch attached here tells the backend that if a 
fetch operation retrieved less than the requested array size, then no further 
fetch attempts are necessary.

The fix is motivated by our DBA's finding, that many single-row-selects are 
followed by double fetches. When I checked the code, I realized that with the 
exception of aggregates, this is true for all single-row selects: whether it 
originates from qset.get(...) or from qset[j], the QuerySet code adjusts the 
SQL and then calls fetchmany in a loop until no results are returned. Since 
neither cx_oracle nor the Oracle client library take it upon themselves to 
stop such requests, they go all the way to the database.

The Python DB-API[0] says on fetchmany:
 """
    The method should try to fetch as many rows as indicated by the size
    parameter. If this is not possible due to the specified number of rows not
    being available, fewer rows may be returned.
"""
I interpret that as "no rows should be fetched after a call failed to fill the 
buffer". Under a loose-enough transaction isolation level, rows may "come into 
being" in mid-transaction, but I don't think it is reasonable to expect them 
to do so in mid-statement.

That being said, the problem that this patch fixes is a little hard to test 
for; it essentially requires mocking the database library. Also, I'm not sure 
this is the right fix; I don't know if other backends have the same issue. If 
they do, it might be better to make qset[j] just use fetchone instead of 
fetchmany, like the aggregates do; and it might be worthwhile to also change 
the qset.get() logic (in a more subtle way -- we don't want to lose the check 
for multiple records returned).

As for the patch, it is only "tested" in the sense that one Django-1.2 based 
application continues to run correctly with (essentially) it; I don't(yet) 
have an Oracle+Trunk testing environment.

Looking forward to your comments,

Shai.


[0] http://www.python.org/dev/peps/pep-0249/

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Index: django/db/backends/oracle/base.py
===================================================================
--- django/db/backends/oracle/base.py	(revision 17145)
+++ django/db/backends/oracle/base.py	(working copy)
@@ -643,6 +643,7 @@
     charset = 'utf-8'
 
     def __init__(self, connection):
+        self._all_fetched = False
         self.cursor = connection.cursor()
         # Necessary to retrieve decimal values without rounding error.
         self.cursor.numbersAsStrings = True
@@ -664,6 +665,7 @@
         return [p.smart_str for p in params]
 
     def execute(self, query, params=None):
+        self._all_fetched = False
         if params is None:
             params = []
         else:
@@ -688,6 +690,7 @@
             raise utils.DatabaseError, utils.DatabaseError(*tuple(e)), sys.exc_info()[2]
 
     def executemany(self, query, params=None):
+        self._all_fetched = False
         try:
             args = [(':arg%d' % i) for i in range(len(params[0]))]
         except (IndexError, TypeError):
@@ -720,10 +723,15 @@
         return _rowfactory(row, self.cursor)
 
     def fetchmany(self, size=None):
+        if self._all_fetched:
+            return DatabaseFeatures.empty_fetchmany_value
         if size is None:
             size = self.arraysize
-        return tuple([_rowfactory(r, self.cursor)
-                      for r in self.cursor.fetchmany(size)])
+        rows = tuple(_rowfactory(r, self.cursor)
+                     for r in self.cursor.fetchmany(size))
+        if len(rows) < size:
+            self._all_fetched = True
+        return rows
 
     def fetchall(self):
         return tuple([_rowfactory(r, self.cursor)

Suggested Oracle double-fetch fix (was Re: Can we drop Oracle 9 support?)

Reply via email to