Rather than watch the "inherit from User" thread go round and round, maybe I should give people something more concrete to think about.
This is a follow-up to the mail I sent late on Friday. It describes the
area where we need API additions or some kind of semi-major change to
incorporate model inheritance, and where some developer feedback would
be a good idea. I am only talking about the API here, not the
implementation (since you will notice there is no patch attached to the
email).

Model inheritance (as I've implemented it) comes in two varieties,
differentiated by the way they store the data at the db level and by
the way you use them at the Python level.

-----------------------
1. Abstract Base class
-----------------------

One use-case for subclassing is to use the base class as a place to
store common fields and functionality. It is purely a "factor out the
common stuff" holder and you don't ever intend to use the base class on
its own. In the terminology of other languages (e.g. C++, Java), the
base class is an abstract class.

At the database level, it makes sense to store the fields from the base
class in the same table as the fields from the derived class. You will
not be instantiating the base class (or querying it) on its own, so we
can view the subclassed model as a flattened version of the parent +
child. For example,

    class Thing(models.Model):
        name = models.CharField(...)

    class Animal(Thing):
        genus = models.CharField(...)

    class Toy(Thing):
        manufacturer = models.CharField(...)

would generate two database tables (Thing does not get its own table).
The Animal table would have "name" and "genus" columns (along with the
standard extras like "id") and Toy would have "name" and "manufacturer"
columns.

We need a way to tell Django that "Thing" is an abstract class. I would
propose

    class Thing(models.Model):
        name = ...

        class Meta:
            is_abstract = True

Points To Ponder
================

(1) What notation to use to declare a class as abstract? I've thrown
out Meta.is_abstract -- any preference for alternatives?

(2) Any strong reason not to include this case?
It's only for "advanced use", since there are a few ways to shoot
yourself in the foot (e.g. declare a class abstract, create the tables,
remove the abstract declaration, watch code explode). However, it will
be useful in some cases. [I have some scripts to help with converting
non-abstract inheritance to abstract and vice-versa at the database
level, too.]

---------------------------
2. "Pythonic" Inheritance
---------------------------

The traditional Python inheritance model allows one to create instances
of the base class and to work with instances of derived classes as
though they were instances of the parent. Extending this to Django's
querying model, we should be able to run queries against the parent
class:

    Thing.objects.filter(name="horse")

in the above example and, through the magic of duck typing, have the
right sort of object returned.

Amazingly enough, this is also supported in my code. It naturally
follows the multi-table model of doing inheritance (in the above
classes, Animal would have a foreign key reference to the Thing table,
similarly for Toy). Multiple inheritance works as well.

In order to make duck typing work without doing tons of extra database
queries, it is necessary to have a mapping between each row and the
type of object it ultimately represents. So a row in the Thing table
for an Animal object would say "I am an Animal" (we already know it's a
Thing, because it's in the Thing table). The reverse direction (a row
in the Animal table is also a Thing) is easy because we have the model
description at the Python level and the foreign key constraint at the
database level (the latter being more of a theoretical construct if
you're using SQLite, but that's not a showstopper).

When thinking about this, realise that it works smoothly for
multi-layer inheritance: it only takes two queries to get all the data
for any object, even if you start by querying the top table in the
hierarchy.
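To make the row-to-type mapping a little more tangible, here is a rough sketch in plain Python and sqlite3 rather than Django. Everything in it is an assumption made for illustration: the table layout, the "derived_type" text column (standing in for the downward reference, whatever form it takes in the real patch), the CHILDREN mapping and the get_thing() helper are all hypothetical. It only handles a single level of inheritance, but it shows the two-query pattern: one query on the parent table, then one on the child table named by the type marker.

```python
import sqlite3

# Hypothetical schema: "derived_type" is an invented name for the column
# that records the most-derived type of each row in the parent table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        derived_type TEXT NOT NULL
    );
    CREATE TABLE animal (
        thing_id INTEGER PRIMARY KEY REFERENCES thing(id),
        genus TEXT NOT NULL
    );
    CREATE TABLE toy (
        thing_id INTEGER PRIMARY KEY REFERENCES thing(id),
        manufacturer TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO thing VALUES (1, 'horse', 'animal')")
conn.execute("INSERT INTO animal VALUES (1, 'Equus')")
conn.execute("INSERT INTO thing VALUES (2, 'yo-yo', 'toy')")
conn.execute("INSERT INTO toy VALUES (2, 'Acme')")
conn.execute("INSERT INTO thing VALUES (3, 'rock', 'thing')")


class Thing:
    def __init__(self, name):
        self.name = name


class Animal(Thing):
    def __init__(self, name, genus):
        super().__init__(name)
        self.genus = genus


class Toy(Thing):
    def __init__(self, name, manufacturer):
        super().__init__(name)
        self.manufacturer = manufacturer


# Hypothetical registry: type marker -> (class, child table, extra column).
CHILDREN = {
    "animal": (Animal, "animal", "genus"),
    "toy": (Toy, "toy", "manufacturer"),
}


def get_thing(name):
    """Query the parent table by name and return the most-derived object."""
    # Query 1: the parent row, including the downward type marker.
    row = conn.execute(
        "SELECT id, name, derived_type FROM thing WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    pk, thing_name, derived_type = row
    if derived_type == "thing":
        return Thing(thing_name)
    cls, table, column = CHILDREN[derived_type]
    # Query 2: the child row. The table and column names come from the
    # fixed CHILDREN mapping above, never from user input.
    (extra,) = conn.execute(
        f"SELECT {column} FROM {table} WHERE thing_id = ?", (pk,)
    ).fetchone()
    return cls(thing_name, extra)
```

Calling get_thing("horse") in this sketch hands back an Animal even though the lookup started at the thing table, which is the behaviour described above for Thing.objects.filter(name="horse"). A deeper hierarchy would need the second query to join the intermediate tables; that is not shown here.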
I have currently implemented this "downwards" reference as an extra
column in each table that is a foreign key to the ContentType table (to
head off one question: a GenericForeignKey field adds nothing here).
Every single table would need this column because you never know when
somebody is going to subclass that model and every query needs to
retrieve that value as well.

The alternative is to have a separate table that has columns for
content_type, pk_value, derived_content_type that performs the same
function. The drawback of this is that every single query needs to do a
join with this second table, because we need to know the derived class
in order to create the right type of object back in Python land (a
query on the Thing table for something that is ultimately an Animal
should return an Animal instance; that is how Python works and we don't
want to go and introduce C++-style casts).

Points To Ponder
================

(3) I think having the downward reference column (the one that
specifies the type of the most-derived object) as a column on each
table is the right approach. Anybody have strong arguments for the
other approach (a separate table)?

    - we can ship a script that helps with conversion from existing
      tables to the new structure. There is no strict requirement on
      what this new column is called -- it can be configured on a
      per-model basis so that we don't restrict anybody's particular
      column choices.

-----------------------
3. What you don't get
-----------------------

I am avoiding PostgreSQL's table inheritance feature. It is not a
standard feature on databases, so we have to do inheritance inside
Django anyway and having to maintain both the Django version and the
PostgreSQL-specific version will lead to errors (we already have enough
per-database specific stuff to discourage anybody from wanting more at
a whim).

I am not implementing the "everything in one table" storage model.
It is not universally applicable (you can't inherit from third-party
models, for a start) and I have software engineering objections to it.
It also doesn't add much value at this point in time. If somebody wants
that later, that is for later.

I am not doing anything view-based. Most databases compute views on
demand, so they don't add any real performance value and they hide some
complexity (the complexity is already hidden from the Python user
anyway, but hiding it from the developer trying to fix core bugs is not
a good plan).

All of the above features can be worked on by others if they like (free
and open source and all that).

If we can get the query SQL generation stuff sorted out next week when
a few of us are in one place, fixing the model inheritance patch to
work with that is all that remains to do before I can submit it.
Currently I do get bitten by some query generation bugs, so it's not an
either/or situation.

Malcolm