Prefetch and avoiding huge IN clauses

2015-09-25 Thread Erik Cederstrand
Hi devs,

When prefetching related items for a queryset returning a large amount of 
items, the generated SQL can be quite inefficient. Here's an example:

class Category(models.Model):
type = models.PositiveIntegerField(db_index=True)


class Item(models.Model):
   category = models.ForeignKey(Category, related_name='%(class)ss')


If I have 50.000 categories of type=5, then 
"Category.objects.filter(type=5).prefetch_related('items')" will generate 
an SQL on the Item table with an IN clause containing the 50.000 Category 
ID's. This is because get_prefetch_queryset() on the Manager classes in 
django.db.models.related all do this:

def get_prefetch_queryset(self, instances, queryset=None):
[...]
query = {'%s__in' % self.query_field_name: instances}
queryset = queryset._next_is_sticky().filter(**query)
[...]


While this is great when instances is a short list, we can hope to do 
better than that when instances is large. A much more efficient query in 
the above case would be queryset._next_is_sticky().filter(category__type=5). 
We just need to make sure this alternative query will fetch (at least) the 
same items as the query with the IN clause would.

I thought I could use Prefetch('items', 
queryset=Item.objects.filter(category__type=5))to accomplish this, but 
while the custom queryset *is* used, get_prefetch_queryset() still adds the 
IN clause unconditionally. This is A) redundant B) sends a huge SQL string 
over the wire for the database to process, and C) seriously messes up the 
database query planner, often generating an inefficient execution plan.

I would like to fix this problem. The easiest way seems to be to let 
Prefetch() tell get_prefetch_queryset() to skip the IN clause. 
django.db.models.prefetch_one_level() is in charge of passing the Prefetch 
queryset to get_prefetch_queryset(), so one option would be to add a 
skip_in_clause attribute to the Prefetch model which prefetch_one_level() 
would pass to get_prefetch_queryset(). The latter could then do:

def get_prefetch_queryset(self, instances, queryset=None, skip_in_clause=
False):
[...]
if not skip_in_clause:
query = {'%s__in' % self.query_field_name: instances}
queryset = queryset._next_is_sticky().filter(**query)
[...]


In my preliminary testing on real data, this:

Category.objects.filter(type=5).prefetch_related(Prefetch('items', queryset=
Item.objects.filter(category__type=5), skip_in_clause=True))

is about 20x faster than the current implementation when the query returns 
50.000 categories. The improvement would be greater on larger datasets.

Any comments? If needed, I can provide a pull request with a suggested 
implementation. Apart from the Prefetch class and prefetch_one_level(), 
there are 4 instances of get_prefetch_queryset() in 
django.db.models.related that would need to be changed.


Thanks,
Erik

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/8fe6b600-854c-491e-9acf-e13399d1eafb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Skip IN clause on prefetch queries?

2015-09-25 Thread Erik Cederstrand
Hi devs,

My lengthy email to this list about improving performance of 
prefetch_related() seems to have disappeared. Instead, I created a ticket 
motivating my need for Prefetch() to be able to tell the prefetched to 
trust the queryset provided by Prefetch() and not generate a huge and 
unnecessary IN clause:

https://code.djangoproject.com/ticket/25464

I'd appreciate any feedback so I can improve the patch!

Thanks,
Erik

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/5a536f45-8fad-42aa-9e5e-1aae0db16fab%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Skip IN clause on prefetch queries?

2015-09-25 Thread Tim Graham
Regarding the "missing message", it's there now. If you are a first-time 
poster, your mail is held in a moderation queue.

On Friday, September 25, 2015 at 6:35:34 AM UTC-4, Erik Cederstrand wrote:
>
> Hi devs,
>
> My lengthy email to this list about improving performance of 
> prefetch_related() seems to have disappeared. Instead, I created a ticket 
> motivating my need for Prefetch() to be able to tell the prefetched to 
> trust the queryset provided by Prefetch() and not generate a huge and 
> unnecessary IN clause:
>
> https://code.djangoproject.com/ticket/25464
>
> I'd appreciate any feedback so I can improve the patch!
>
> Thanks,
> Erik
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/33e1e6c7-9507-4f81-bfff-47da2ea63607%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


simplifying the install page

2015-09-25 Thread Tim Graham
The install page mentions several different ways to install Django, from 
pip install (recommended), to `setup.py install`, to symlinking the Django 
checkout in your site-packages. Do you see any reason to keep the latter 
methods instead of recommending pip all the time? 

https://docs.djangoproject.com/en/1.8/topics/install/

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/1b837a88-1b55-4452-ab5f-94ca53a24c86%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: simplifying the install page

2015-09-25 Thread Daniele Procida
On Fri, Sep 25, 2015, Tim Graham  wrote:

>The install page mentions several different ways to install Django, from 
>pip install (recommended), to `setup.py install`, to symlinking the Django 
>checkout in your site-packages. Do you see any reason to keep the latter 
>methods instead of recommending pip all the time? 
>
>https://docs.djangoproject.com/en/1.8/topics/install/

I can't see any good reason to keep symlink method, but maybe I am missing 
something.

I'd prefer to see the use of virtualenv built-in to the process from the very 
start, and a quickstart version of the instructions as the first thing that 
people see:

virtualenv django-virtualenv
source django-virtualenv/bin/activate
pip install django

I know that's not so simple, and that Windows users need to do something a bit 
different, but at least seeing something that simple near the top of the 
document would be a promise that installation *can* be that easy.

Daniele

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/20150925170726.2021833171%40mail.wservices.ch.
For more options, visit https://groups.google.com/d/optout.


Re: simplifying the install page

2015-09-25 Thread Collin Anderson
Yeah, I bet we could get rid of the entire "Installing an official
release manually" section, as I assume we don't actually want to
recommend that. Also, the "Installing the development version" section
outlines a more manual way already.

It makes sense to me to tell people to "install virtualenv" rather
than to "install pip".

We could might want to link to virtualenv's instructions for
installing itself. They show a way of downloading and using virtualenv
without root, which some beginners (I'm thinking college students)
might not have.
https://virtualenv.pypa.io/en/latest/installation.html

It's worth noting too, that although newer versions of python include
pip and venv, some distros have decided not to install pip and venv by
default.

On Fri, Sep 25, 2015 at 1:07 PM, Daniele Procida  wrote:
> On Fri, Sep 25, 2015, Tim Graham  wrote:
>
>>The install page mentions several different ways to install Django, from
>>pip install (recommended), to `setup.py install`, to symlinking the Django
>>checkout in your site-packages. Do you see any reason to keep the latter
>>methods instead of recommending pip all the time?
>>
>>https://docs.djangoproject.com/en/1.8/topics/install/
>
> I can't see any good reason to keep symlink method, but maybe I am missing 
> something.
>
> I'd prefer to see the use of virtualenv built-in to the process from the very 
> start, and a quickstart version of the instructions as the first thing that 
> people see:
>
> virtualenv django-virtualenv
> source django-virtualenv/bin/activate
> pip install django
>
> I know that's not so simple, and that Windows users need to do something a 
> bit different, but at least seeing something that simple near the top of the 
> document would be a promise that installation *can* be that easy.
>
> Daniele
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Django developers  (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/20150925170726.2021833171%40mail.wservices.ch.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAFO84S7ejDMLkkNZvX0j9ubLPStuVP%3D%3DiWNajb6f78P5e-ANow%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Some love for admindocs

2015-09-25 Thread Tim Graham
It might be interesting to fork it and develop your vision as a separate 
project so you can iterate more quickly. At some point we could evaluate 
whether the existence of an actively maintained external package would 
allow us to deprecate the version in Django itself (as has been previously 
proposed but rejected because it doesn't have a big maintenance overhead 
and some people like it).

Here's that discussion: 
https://groups.google.com/d/topic/django-developers/0-qFyCPuSRs/discussion

On Sunday, September 20, 2015 at 8:39:29 PM UTC-4, Žan Anderle wrote:
>
> Hi folks!
>
> I've been thinking about admindocs lately and that they would really 
> deserve more attention than they currently get. It's a quite useful feature 
> and I think a very underrated one.
>
> They were initially there to provide documentation for 'front-end people', 
> when working on templates. While this may still be useful, we've already 
> discussed that nowadays it's more common to use admindocs as internal 
> documentation for developers. I think I'd be great if this position was 
> reaffirmed inside the Django project. Some of the ways we could do that:
>
> 1. Have that position more clearly stated in the documentation for 
> admindocs
> 2. Mention admindocs in other parts of the Django documentation. For 
> example, it seems that following FK relations is often a lot easier with 
> admindocs. Maybe that should be mentioned somewhere?
> 3. Mention admindocs in the tutorial. Not sure about this one, since it 
> could just add unnecessary weight. But it might also make life easier for 
> some people.
> 4. Add some possible new features:
>
>- documenting management commands 
>- having some kind of README per app. Tim Baxter provided an 
> example/possible 
>implementation 
>
> 
>  
>of this on the irc channel. 
>
> Anyway, I'd love to hear your thoughts.
>
> Best,
> Žan
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/316ccf29-743e-4ed7-8096-9d5af172413e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: simplifying the install page

2015-09-25 Thread Tim Graham
Thanks for the feedback. Here's a proposal: 
https://github.com/django/django/pull/5360

On Friday, September 25, 2015 at 1:30:15 PM UTC-4, Collin Anderson wrote:
>
> Yeah, I bet we could get rid of the entire "Installing an official 
> release manually" section, as I assume we don't actually want to 
> recommend that. Also, the "Installing the development version" section 
> outlines a more manual way already. 
>
> It makes sense to me to tell people to "install virtualenv" rather 
> than to "install pip". 
>
> We could might want to link to virtualenv's instructions for 
> installing itself. They show a way of downloading and using virtualenv 
> without root, which some beginners (I'm thinking college students) 
> might not have. 
> https://virtualenv.pypa.io/en/latest/installation.html 
>
> It's worth noting too, that although newer versions of python include 
> pip and venv, some distros have decided not to install pip and venv by 
> default. 
>
> On Fri, Sep 25, 2015 at 1:07 PM, Daniele Procida  > wrote: 
> > On Fri, Sep 25, 2015, Tim Graham > 
> wrote: 
> > 
> >>The install page mentions several different ways to install Django, from 
> >>pip install (recommended), to `setup.py install`, to symlinking the 
> Django 
> >>checkout in your site-packages. Do you see any reason to keep the latter 
> >>methods instead of recommending pip all the time? 
> >> 
> >>https://docs.djangoproject.com/en/1.8/topics/install/ 
> > 
> > I can't see any good reason to keep symlink method, but maybe I am 
> missing something. 
> > 
> > I'd prefer to see the use of virtualenv built-in to the process from the 
> very start, and a quickstart version of the instructions as the first thing 
> that people see: 
> > 
> > virtualenv django-virtualenv 
> > source django-virtualenv/bin/activate 
> > pip install django 
> > 
> > I know that's not so simple, and that Windows users need to do something 
> a bit different, but at least seeing something that simple near the top of 
> the document would be a promise that installation *can* be that easy. 
> > 
> > Daniele 
> > 
> > -- 
> > You received this message because you are subscribed to the Google 
> Groups "Django developers  (Contributions to Django itself)" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> an email to django-develop...@googlegroups.com . 
> > To post to this group, send email to django-d...@googlegroups.com 
> . 
> > Visit this group at http://groups.google.com/group/django-developers. 
> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/20150925170726.2021833171%40mail.wservices.ch.
>  
>
> > For more options, visit https://groups.google.com/d/optout. 
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/6c1dbb99-9b64-4a30-98e2-8e1fb50a6afc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Making the admin compatible with CSP

2015-09-25 Thread Gavin Wahl
I'm very interested in getting this into 1.10. I can devote some time to it 
to help.

When I looked at it before, based on the time I had available, it didn't 
seem feasible for me to remove every single inline script. Especially with 
form widgets that include templated javascript. Instead I was looking at 
the two ways to whitelist scripts with CSP, namely script-nonce and script 
hash sources. The disadvantage with either of these approaches is that they 
need to be integrated with the middleware adding the CSP header, to 
communicate the current page nonce or the list of hashes. script-nonces 
also totally destroy caching, because each response has to have a unique 
nonce that's referenced by each inline script. 

Ideally django admin would just be compatible with whatever CSP header the 
user wants, without any specific integration, so removing all inline 
scripts and styles is certainly preferable if you have the time.

>  Oh, btw please do not handwrite JSON in templates, 

Absolutely, the view should build a data structure representing the data to 
be encoded as JSON rather than templating it.

>  which then only needs to go through the autoescape filter I think

This is actually incorrect.