Hey there! The current code assumes that .get() will likely match one result which should be the case most of the time and limit the number of possible results to prevent catastrophic matches
Your patch has the side effect of performing an additional COUNT query for every single QuerySet.get() call. It would also likely only perform better under certain circumstances (e.g. large columns retrieved, possibility of using index only scan on COUNT). For these reasons I don't think this is a good idea. In summary QuerySet.get is currently optimized for a correct usage of its API while limiting nefarious side effects of misuses and this approach would optimize for the uncommon case. Cheers, Simon Le jeudi 2 janvier 2020 08:43:47 UTC-5, Anudeep Samaiya a écrit : > > Hi everyone, > > Happy New Year!! > > Ok so I found that Querset.get() is very slow for large datasets when > multiple objects exists in very big numbers. I did following changes in my > local copy of django code and it improved the performance for very large > data sets significantly (like in a blink of second). Didn't had any obvious > effects for a table with like 10K records or so. I don't have proper stats > to prove the performance. > > So what was the issue? > Queryset.get() raises two exceptions > 1. DoesNotExist > 2. MultipleObjectsFound > > In case Multiple objects are found, Querset.get() raises an error with how > many objects are found. To do this it was evaluating query to find length > by iterating over the queryset which was creating a bottle-neck. For small > datasets this was not abovious but for large datasets with more than 1 > million recors this was slow. > > So Instead I tried changing the method of counting using the > Queryset.count(). If count == 1, only then evaluated the query by calling > Querset._fetch_all(). The results were much than before. > > So do you think this is right way? Should I raise a pr for the patch? > > > diff --git a/django/db/models/query.py b/django/db/models/query.py > index 38c1358..e442384 100644 > --- a/django/db/models/query.py > +++ b/django/db/models/query.py > @@ -420,8 +420,9 @@ class QuerySet: > if not clone.query.select_for_update or connections[clone.db]. > features.supports_select_for_update_with_limit: > limit = MAX_GET_RESULTS > clone.query.set_limits(high=limit) > - num = len(clone) > + num = clone.count() > if num == 1: > + clone._fetch_all() > return clone._result_cache[0] > if not num: > raise self.model.DoesNotExist( > > > Thanks > > Anudeep Samaiya > -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/9b9e1842-b888-4ff9-a1b7-1d2d0ebc5b8b%40googlegroups.com.