Hey there!

The current code assumes that .get() will usually match exactly one
result, which should be the case most of the time, and it limits the
number of rows fetched to prevent catastrophic matches.
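
As a rough illustration of that behaviour, here is a minimal sketch that
uses the standard-library sqlite3 module rather than Django's ORM. The
person table, make_demo_db() and get_by_name() are hypothetical names
made up for this example; MAX_GET_RESULTS mirrors the constant in the
diff below, and the exception names mirror the ones Django raises.

```python
import sqlite3

# Same name as the constant in Django's query.py; 21 is its value there.
MAX_GET_RESULTS = 21


class DoesNotExist(Exception):
    pass


class MultipleObjectsReturned(Exception):
    pass


def make_demo_db():
    """Hypothetical table: 1000 rows named 'dup' and one named 'unique'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO person (name) VALUES (?)",
        [("dup",)] * 1000 + [("unique",)],
    )
    return conn


def get_by_name(conn, name):
    """Return the single matching row, fetching at most MAX_GET_RESULTS rows."""
    rows = conn.execute(
        "SELECT id, name FROM person WHERE name = ? LIMIT ?",
        (name, MAX_GET_RESULTS),
    ).fetchall()
    num = len(rows)
    if num == 1:
        return rows[0]
    if not num:
        raise DoesNotExist(f"no person named {name!r}")
    # Hitting the limit means only a lower bound on the count is known.
    found = num if num < MAX_GET_RESULTS else f"more than {MAX_GET_RESULTS - 1}"
    raise MultipleObjectsReturned(f"get() returned {found} rows")
```

Even with 1000 duplicates, the misuse case reads only 21 rows before
raising, and the common case costs a single query.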

Your patch has the side effect of performing an additional COUNT query
for every single QuerySet.get() call. It would also likely only perform
better under certain circumstances (e.g. large columns being retrieved,
or the possibility of using an index-only scan for the COUNT). For these
reasons I don't think this is a good idea.
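
To make the extra query concrete, here is a small sketch that counts
the statements each strategy issues in the common single-match case. It
again uses sqlite3 instead of the ORM, and CountingConnection, make_db(),
get_fetch_only() and get_count_first() are hypothetical helpers written
only for this illustration, not Django APIs.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts the statements sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def make_db():
    """Build a one-row table, then wrap it so only lookups are counted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO person (name) VALUES ('unique')")
    return CountingConnection(conn)


def get_fetch_only(db, name, limit=21):
    """Current strategy: one limited SELECT, then count the rows in Python."""
    rows = db.execute(
        "SELECT id, name FROM person WHERE name = ? LIMIT ?", (name, limit)
    ).fetchall()
    if len(rows) != 1:
        raise LookupError(f"expected 1 row, fetched {len(rows)}")
    return rows[0]


def get_count_first(db, name):
    """Patched strategy: COUNT first, then fetch only if exactly one matches."""
    (num,) = db.execute(
        "SELECT COUNT(*) FROM person WHERE name = ?", (name,)
    ).fetchone()
    if num != 1:
        raise LookupError(f"expected 1 row, counted {num}")
    return db.execute(
        "SELECT id, name FROM person WHERE name = ?", (name,)
    ).fetchone()
```

With one matching row, get_fetch_only() leaves db.queries at 1 while
get_count_first() leaves it at 2, so every correct get() call pays for
the second round trip.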

In summary, QuerySet.get() is currently optimized for correct usage of
its API while limiting the nefarious side effects of misuse, whereas
this approach would optimize for the uncommon case.

Cheers,
Simon

On Thursday, January 2, 2020 at 08:43:47 UTC-5, Anudeep Samaiya wrote:
>
> Hi everyone,
>
> Happy New Year!!
>
> Ok, so I found that QuerySet.get() is very slow on large datasets when a 
> very large number of matching objects exists. I made the following changes 
> in my local copy of the Django code, and it improved performance on very 
> large datasets significantly (results came back in the blink of an eye). 
> It didn't have any obvious effect on a table with 10K records or so. I 
> don't have proper stats to prove the performance.
>
> So what was the issue?
> QuerySet.get() can raise two exceptions:
> 1. DoesNotExist
> 2. MultipleObjectsReturned
>
> When multiple objects are found, QuerySet.get() raises an error stating 
> how many objects were found. To do this, it evaluated the query and found 
> the length by iterating over the queryset, which created a bottleneck. For 
> small datasets this was not obvious, but for large datasets with more than 
> 1 million records it was slow.
>
> So instead I tried changing the counting method to use QuerySet.count(). 
> Only if count == 1 is the query then evaluated by calling 
> QuerySet._fetch_all(). The results were much faster than before.
>
> So do you think this is the right way? Should I open a PR for the patch?
>
>
> diff --git a/django/db/models/query.py b/django/db/models/query.py
> index 38c1358..e442384 100644
> --- a/django/db/models/query.py
> +++ b/django/db/models/query.py
> @@ -420,8 +420,9 @@ class QuerySet:
>          if not clone.query.select_for_update or connections[clone.db].features.supports_select_for_update_with_limit:
>              limit = MAX_GET_RESULTS
>              clone.query.set_limits(high=limit)
> -        num = len(clone)
> +        num = clone.count()
>          if num == 1:
> +            clone._fetch_all()
>              return clone._result_cache[0]
>          if not num:
>              raise self.model.DoesNotExist(
>
>
> Thanks
>
> Anudeep Samaiya
>
