#33003: Micro-optimisation for QuerySet._chain
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn Knight
Type: Cleanup/optimization | Status: assigned
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Whilst working on #28455 I noticed that `_chain` accepts `kwargs`, but
that appears to only be to facilitate pickling, with
`Prefetch.__getstate__` making use of the functionality. Thus most of the
time, kwargs is empty and there's no reason to touch/update the
`__dict__`. And it's faster not to do so. Note that in comparison with the
benefits of #28455 this is absolutely peanuts, but still, it's unrelated
to that ticket so here we are:
{{{
In [1]: class A:
   ...:     def __init__(self, **kwargs):
   ...:         self.__dict__.update(**kwargs)
In [2]: a = A(a=1, b=2, c=3, d=4)
In [3]: empty = {}
In [4]: full = {'d': 3, 'c': 1}
In [5]: %timeit a.__dict__.update(empty)
91.2 ns ± 1.25 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [6]: %timeit a.__dict__.update(full)
133 ns ± 0.92 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
}}}
If we just ''check'' the dictionary isn't empty (in `_chain` that's
`kwargs`):
{{{
In [7]: %timeit if empty: a.__dict__.update(empty)
19 ns ± 0.377 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [8]: %timeit if full: a.__dict__.update(full)
156 ns ± 1.33 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
}}}
You can see that paying the basic cost of an `if` (for me, that appears to be
`20ns` for an empty dict, and more like `6ns` for an interned/singleton
value like `1` or `True` or `''` as a baseline) is worthwhile, and having
the pickling of a Prefetch be `20ns` slower isn't too bad, given it's not
exactly the most common use case...
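For anyone who wants to reproduce the comparison outside IPython, here is a rough sketch using the stdlib `timeit` module. The class `A` mirrors the one above; the helpers `unguarded`, `guarded` and `bench` are names made up for this illustration and are not part of Django. Because each measurement goes through a Python function call, the absolute numbers will be higher than the `%timeit` figures above; only the relative difference between the two variants is of interest.

```python
import timeit


class A:
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)


def unguarded(obj, kwargs):
    # Always calls dict.update, even when kwargs is empty.
    obj.__dict__.update(kwargs)


def guarded(obj, kwargs):
    # Only calls dict.update when there is something to copy.
    if kwargs:
        obj.__dict__.update(kwargs)


def bench(func, obj, kwargs, number=100_000):
    # Best of five repeats, reported in nanoseconds per call.
    timer = timeit.Timer(lambda: func(obj, kwargs))
    return min(timer.repeat(repeat=5, number=number)) / number * 1e9


if __name__ == "__main__":
    a = A(a=1, b=2, c=3, d=4)
    for label, kwargs in (("empty", {}), ("full", {"d": 3, "c": 1})):
        print(label, "unguarded:", round(bench(unguarded, a, kwargs), 1), "ns")
        print(label, "guarded:  ", round(bench(guarded, a, kwargs), 1), "ns")
```

Taking the minimum of several repeats is just one way of damping noise; results will vary by machine and Python version.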
The difference is small but can be made more obvious by checking the
cProfile timings over many (''many'') iterations; here's the before across
100 users each with a group and permission:
{{{
In [2]: %prun -stime -l_chain for _ in range(1000): tuple(User.objects.prefetch_related('groups', 'user_permissions', 'groups__permissions'))
         48095003 function calls (46202003 primitive calls) in 34.811 seconds

   Ordered by: internal time
   List reduced from 384 to 1 due to restriction <'_chain'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   312000    4.070    0.000    8.791    0.000 query.py:1325(_chain)

In [3]: %timeit -n1000 -r10 tuple(User.objects.prefetch_related('groups', 'user_permissions', 'groups__permissions'))
18.3 ms ± 391 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)
}}}
And after applying the change as follows:
{{{
def _chain(...):
    ...
    if kwargs:
        obj.__dict__.update(kwargs)
    ...
}}}
we end up with:
{{{
In [2]: %prun -stime -l_chain for _ in range(1000): tuple(User.objects.prefetch_related('groups', 'user_permissions', 'groups__permissions'))
         47779003 function calls (45886003 primitive calls) in 33.220 seconds

   Ordered by: internal time
   List reduced from 384 to 1 due to restriction <'_chain'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   312000    0.267    0.000    5.644    0.000 query.py:1325(_chain)

In [3]: %timeit -n1000 -r10 tuple(User.objects.prefetch_related('groups', 'user_permissions', 'groups__permissions'))
18.1 ms ± 289 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)
}}}
I have to assume that the overall timing differences measured by cProfile
and timeit are both just standard fluctuations, despite the drop in
`tottime` that cProfile reports for `_chain` there.
Branch/PR to follow. It won't be a shock if the diff is 3 lines long,
given the change is shown above.
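To show the shape of the change in isolation, here is a minimal sketch on a toy class. `TinyQuerySet`, its `filters` attribute and its `filter()` method are hypothetical stand-ins invented for this example; they are not Django's `QuerySet`, and the real `_chain` body is not reproduced here. The only part taken from the ticket is the `if kwargs:` guard around the `__dict__` update.

```python
class TinyQuerySet:
    """Hypothetical stand-in used only to illustrate the guard."""

    def __init__(self, filters=()):
        self.filters = tuple(filters)
        self._result_cache = None

    def _clone(self):
        # Build a fresh object carrying over the current state.
        return self.__class__(self.filters)

    def _chain(self, **kwargs):
        obj = self._clone()
        # Common case: no kwargs, so skip touching __dict__ entirely.
        if kwargs:
            obj.__dict__.update(kwargs)
        return obj

    def filter(self, condition):
        # Typical caller: chains without passing any kwargs.
        obj = self._chain()
        obj.filters += (condition,)
        return obj


if __name__ == "__main__":
    qs = TinyQuerySet()
    print(qs.filter("a=1").filters)
    # Rare caller: overrides attributes on the copy via kwargs.
    print(qs._chain(_result_cache=[])._result_cache)
```

Behaviour is the same with or without the guard, since updating a dict with an empty mapping is a no-op; the guard only avoids the method call in the common case.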
--
Ticket URL: <https://code.djangoproject.com/ticket/33003>