> > With an appropriate index, SQL databases should be able to find rows > matching a condition on two columns about as quickly as if the condition > was only on one column, even in a table containing million of objects. > Databases can of course find the rows with appropriate indexes, but the problem is that GenericForeignKeys are precisely not such appropriate ones.
> I'd like to see a benchmark before rejecting this idea on performance > grounds. Could you point us to some data supporting your assertion that > this isn't scalable? > To prove the increased performance I actually did some benchmark. I used the script in the attachment. It needs a model that has only one field called text. It randomly generates 10,000 users and 1,000,000 objects. Then it assigns to each user respectively 1,000 object permissions of randomly selected previously created objects. At least it checks for each user respectively 1,000 randomly selected objects again. That's what I got: For the first version of my contrib.auth patch: Summary: granted permissions in 1 day, 10:41:07.523402 checked permissions in 22:48:44.662974 For a module using GenericForeignKeys (I used django-guardian because it seemed to be the most used one): granted permissions in 11:31:12.216808 checked permissions <pending> I don't have the results for checking permissions with django-guardian yet because it took ages to check them. Each single check for a object permission took about 4 to 5 seconds! Now think about how long it would take to check for all users (10,000*1,000*4secs)! I think that is out of all proportion to my suggested solution. This happens because of the lack of a appropriate index as you mentioned before. So the database has to do a full table scan and that's definitely not scalable at all. By the way, the reason for the GenericForeignKeys being faster in creating them is that the database does not have to create the indexes (i.e. the foreign keys). > I'm -1 on a contrib app (django.contrib.auth) injecting a dynamic field > into every model. > > (Yes, Django does that for the primary key, to avoid declaring it in every > model. But it's core, not a contrib app.) > I totally agree with this but I don't see another way at the moment. > What's the definition of "when the auth app is initialized"? Wouldn't your > solution required that django.contrib.auth be loaded before all other apps? > Unfortunately it does but I have a open mind about better implementations. > > With the solution that I prefer, this doesn't seem absolutely necessary, > the following statements are simple enough: > > ObjectPermission.objects.create(user=user, object=instance) > ObjectPermission.objects.get(user=user, object=instance).delete() > I think that's matter of taste ;-) Your need to add ‘object_permissions’ to a model is an incorrect approach > in my opinion as it would require any old apps to undergo a schema > migration. Your generic relations concerns might be valid however I would > like to mention Django guardian does use generic FK but so does built in > django auth – so I think this approach is just fine. You don't have to add this field manually, auth app does that. And django auth does not use generic relationships, it only makes use of the ContentType field to comfortably store the type assigned to a permission. Best regards, Moritz > -- You received this message because you are subscribed to the Google Groups "Django developers" group. To view this discussion on the web visit https://groups.google.com/d/msg/django-developers/-/fp-Bu8sgUq0J. To post to this group, send email to django-developers@googlegroups.com. To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
#!/usr/bin/env python -u import string import random from datetime import datetime NUM_OBJECTS=1000000 NUM_USERS=10000 NUM_PERMS_PER_USER=1000 def random_string(length=25, chars=string.ascii_letters+string.digits): return ''.join(random.choice(chars) for i in range(length)) class Benchmark(object): def __init__(self, User, TestModel, perm='testapp.add_testmodel', grant_perm=lambda user, perm, obj: ''): self.time_grant=0 self.time_check=0 self._user=User self._model=TestModel self._perm=perm self._grant_perm=grant_perm def create_users(self, num=NUM_USERS): print 'Creating users:', self._user.objects.bulk_create(self._user(username=random_string()) for i in range(num)) print 'done' def create_objects(self, num=NUM_OBJECTS): print 'Creating objects:', self._model.objects.bulk_create(self._model(text=random_string(50)) for i in xrange(num)) print 'done' def grant_permissions(self, num_perms_per_user=NUM_PERMS_PER_USER, num_objects=NUM_OBJECTS): print 'Grant permissions to users:', k=1 t1=datetime.now() for u in self._user.objects.all(): print k, for i in xrange(num_perms_per_user): j=random.randrange(1,num_objects) self._grant_perm(u, self._perm, self._model.objects.get(pk=j)) k+=1 t2=datetime.now() self.time_grant=t2-t1 print 'done in %s' % self.time_grant def check_permissions(self, num_perms_per_user=NUM_PERMS_PER_USER, num_objects=NUM_OBJECTS): print 'Check permissions:', k=1 t1=datetime.now() for u in self._user.objects.all(): print k, for i in xrange(num_perms_per_user): j=random.randrange(1,num_objects) u.has_perm(self._perm, self._model.objects.get(pk=j)) k+=1 t2=datetime.now() self.time_check=t2-t1 print 'done in %s' % self.time_check def benchmark(self): self.create_users() self.create_objects() self.grant_permissions() self.check_permissions() print 'Summary:' print 'granted permissions in %s' % self.time_grant print 'checked permissions in %s' % self.time_check