>
> With an appropriate index, SQL databases should be able to find rows 
> matching a condition on two columns about as quickly as if the condition 
> was only on one column, even in a table containing million of objects.
>
Databases can of course find the rows with appropriate indexes, but the 
problem is that GenericForeignKeys are precisely not such appropriate ones. 

> I'd like to see a benchmark before rejecting this idea on performance 
> grounds. Could you point us to some data supporting your assertion that 
> this isn't scalable?
>
To prove the increased performance I actually did some benchmark. I used 
the script in the attachment. It needs a model that has only one field 
called text. It randomly generates 10,000 users and 1,000,000 objects. Then 
it assigns to each user respectively 1,000 object permissions of randomly 
selected previously created objects. At least it checks for each user 
respectively 1,000 randomly selected objects again. That's what I got:

For the first version of my contrib.auth patch:
Summary:
granted permissions in 1 day, 10:41:07.523402
checked permissions in 22:48:44.662974

For a module using GenericForeignKeys (I used django-guardian because it 
seemed to be the most used one):
granted permissions in 11:31:12.216808
checked permissions <pending>

I don't have the results for checking permissions with django-guardian yet 
because it took ages to check them. Each single check for a object 
permission took about 4 to 5 seconds! Now think about how long it would 
take to check for all users (10,000*1,000*4secs)! I think that is out of 
all proportion to my suggested solution.
This happens because of the lack of a appropriate index as you mentioned 
before. So the database has to do a full table scan and that's definitely 
not scalable at all.
By the way, the reason for the GenericForeignKeys being faster in creating 
them is that the database does not have to create the indexes (i.e. the 
foreign keys). 

> I'm -1 on a contrib app (django.contrib.auth) injecting a dynamic field 
> into every model.
>
> (Yes, Django does that for the primary key, to avoid declaring it in every 
> model. But it's core, not a contrib app.)
>
I totally agree with this but I don't see another way at the moment. 

> What's the definition of "when the auth app is initialized"? Wouldn't your 
> solution required that django.contrib.auth be loaded before all other apps?
>
Unfortunately it does but I have a open mind about better implementations.
>
> With the solution that I prefer, this doesn't seem absolutely necessary, 
> the following statements are simple enough:
>
> ObjectPermission.objects.create(user=user, object=instance)
> ObjectPermission.objects.get(user=user, object=instance).delete()
>
I think that's matter of taste ;-) 

Your need to add ‘object_permissions’ to a model is an incorrect approach 
> in my opinion as it would require any old apps to undergo a schema 
> migration. Your generic relations concerns might be valid however I would 
> like to mention Django guardian does use generic FK but so does built in 
> django auth – so I think this approach is just fine.

You don't have to add this field manually, auth app does that. And django 
auth does not use generic relationships, it only makes use of the 
ContentType field to comfortably store the type assigned to a permission. 

Best regards,
Moritz

>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-developers/-/fp-Bu8sgUq0J.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

#!/usr/bin/env python -u

import string
import random
from datetime import datetime

NUM_OBJECTS=1000000
NUM_USERS=10000
NUM_PERMS_PER_USER=1000

def random_string(length=25, chars=string.ascii_letters+string.digits):
    return ''.join(random.choice(chars) for i in range(length))

class Benchmark(object):

    def __init__(self, User, TestModel, perm='testapp.add_testmodel', grant_perm=lambda user, perm, obj: ''):
        self.time_grant=0
        self.time_check=0
        self._user=User
        self._model=TestModel
        self._perm=perm
        self._grant_perm=grant_perm

    def create_users(self, num=NUM_USERS):
        print 'Creating users:',
        self._user.objects.bulk_create(self._user(username=random_string()) for i in range(num))
        print 'done'

    def create_objects(self, num=NUM_OBJECTS):
        print 'Creating objects:',
        self._model.objects.bulk_create(self._model(text=random_string(50)) for i in xrange(num))
        print 'done'

    def grant_permissions(self, num_perms_per_user=NUM_PERMS_PER_USER, num_objects=NUM_OBJECTS):
        print 'Grant permissions to users:',
        k=1
        t1=datetime.now()
        for u in self._user.objects.all():
            print k,
            for i in xrange(num_perms_per_user):
                j=random.randrange(1,num_objects)
                self._grant_perm(u, self._perm, self._model.objects.get(pk=j))
            k+=1
        t2=datetime.now()
        self.time_grant=t2-t1
        print 'done in %s' % self.time_grant

    def check_permissions(self, num_perms_per_user=NUM_PERMS_PER_USER, num_objects=NUM_OBJECTS):
        print 'Check permissions:',
        k=1
        t1=datetime.now()
        for u in self._user.objects.all():
            print k,
            for i in xrange(num_perms_per_user):
                j=random.randrange(1,num_objects)
                u.has_perm(self._perm, self._model.objects.get(pk=j))
            k+=1
        t2=datetime.now()
        self.time_check=t2-t1
        print 'done in %s' % self.time_check
    
    def benchmark(self):
        self.create_users()
        self.create_objects()
        self.grant_permissions()
        self.check_permissions()
        print 'Summary:'
        print 'granted permissions in %s' % self.time_grant
        print 'checked permissions in %s' % self.time_check

Reply via email to