Abstract
------------------------------------------------------------------------------
A database migration helper has been one of the longest-standing feature
requests in Django. Though Django has an excellent database creation helper,
when faced with schema design changes, developers have to resort to either
writing raw SQL and manually performing the migrations, or using third-party
apps like South[1] and Nashvegas[2].

Clearly Django will benefit from having a database migration helper as an
integral part of its codebase.

From [3], the consensus seems to be on building a framework like Ruby on
Rails ActiveRecord Migrations[4], which will essentially emit Python code
after inspecting user models and the current state of the database. The
generated Python code will then be fed to a 'migrations API' that will
actually handle the task of migration. This is the approach followed by South
(as opposed to Nashvegas's approach of generating raw SQL migration files).
This ensures modularity, one of the trademarks of Django. Third-party
developers can create their own inspection and ORM versioning tools, provided
the inspection tool emits Python code conforming to our new migrations API.

To sum up, the complete migrations framework will need, at the highest level:
1. A migrations API that accepts Python code and actually performs the
   migrations.
2. An inspection tool that generates the appropriate Python code after
   inspecting models and the current state of the database.
3. A versioning tool to keep track of migrations. This will allow 'backward'
   migrations.
4. Glue code to tie the above three together.
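
To make item 3 more concrete, here is a minimal sketch of how a versioning
tool could track applied migrations and walk them in either direction. The
names Migration, MigrationHistory and migrate_to are hypothetical and exist
only for illustration; they are not South's or Django's API, and a Python set
stands in for the database.

```python
class Migration:
    """Hypothetical migration: a name plus forwards/backwards callables."""

    def __init__(self, name, forwards, backwards):
        self.name = name
        self.forwards = forwards
        self.backwards = backwards


class MigrationHistory:
    """Tracks how many migrations of an ordered list have been applied."""

    def __init__(self, migrations):
        self.migrations = list(migrations)
        self.applied = 0  # number of migrations applied so far

    def migrate_to(self, target):
        """Move to the state where exactly `target` migrations are applied."""
        if not 0 <= target <= len(self.migrations):
            raise ValueError("no such migration state: %r" % (target,))
        # Forward migrations run in order ...
        while self.applied < target:
            self.migrations[self.applied].forwards()
            self.applied += 1
        # ... and backward migrations run in reverse order.
        while self.applied > target:
            self.migrations[self.applied - 1].backwards()
            self.applied -= 1


# A set of table names stands in for the real database here.
tables = set()
history = MigrationHistory([
    Migration("0001_author",
              lambda: tables.add("myapp_author"),
              lambda: tables.discard("myapp_author")),
    Migration("0002_book",
              lambda: tables.add("myapp_book"),
              lambda: tables.discard("myapp_book")),
])
history.migrate_to(2)  # apply both migrations
history.migrate_to(1)  # roll the second one back
```

A real versioning tool would persist the applied state in the database rather
than in memory, but the forward/backward bookkeeping would be similar.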


Implementation plan
------------------------------------------------------------------------------
Before discussing the implementation plan for the migrations framework, I
would like to digress for a moment and discuss the final state of the
migrations framework when it will be implemented.

For the user, syncing and migrating databases will consist of issuing the
syncdb command and a new 'migrate' command. syncdb will have to be rewritten,
and the new migrate command will have to be written.

South's syncdb:
class Command(NoArgsCommand):
    def handle_noargs(self, migrate_all=False, **options):
        ...
        apps_needing_sync = []
        apps_migrated = []
        for app in models.get_apps():
            app_label = get_app_label(app)
            if migrate_all:
                apps_needing_sync.append(app_label)
            else:
                try:
                    migrations = migration.Migrations(app_label)
                except NoMigrations:
                    # It needs syncing
                    apps_needing_sync.append(app_label)
                else:
                    # This is a migrated app, leave it
                    apps_migrated.append(app_label)
        verbosity = int(options.get('verbosity', 0))
        # Run the original syncdb procedure for apps_needing_sync
        # If migrate is passed as a parameter, run migrate command for rest

The above code is from South's override of the syncdb command. It basically
divides INSTALLED_APPS into apps that have a migration history, which will be
handled by the migrations framework, and those that do not, which will be
handled by Django's syncdb. South expects users to manually run a
'schemamigration --initial' command for every app they want to be handled by
South's migration framework.

If migrations become a core part of Django, every user app will have a
migrations folder (module) under it, created at the time of issuing
django-admin.py startapp. Thus, by modifying the startapp command to create a
migrations module for every app it creates, we will be able to use South's
syncdb code as is and will also save the user from issuing
schemamigration --initial for all their apps.
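
As a rough sketch of what the modified startapp could do, the helper below
creates a migrations package containing a blank initial migration. The
function name create_migrations_module and the file name 0001_initial.py are
assumptions made for illustration; the actual startapp change has not been
designed yet.

```python
import tempfile
from pathlib import Path

# Text of a blank initial migration, in the style South generates.
BLANK_MIGRATION = '''\
class Migration(SchemaMigration):
    def forwards(self, orm):
        pass

    def backwards(self, orm):
        pass

    models = {}
    complete_apps = [%(app_label)r]
'''


def create_migrations_module(app_dir, app_label):
    """Create <app_dir>/migrations/ with __init__.py and a blank migration."""
    migrations_dir = Path(app_dir) / "migrations"
    migrations_dir.mkdir(parents=True, exist_ok=True)
    (migrations_dir / "__init__.py").touch()
    initial = migrations_dir / "0001_initial.py"
    if not initial.exists():  # never overwrite an existing history
        initial.write_text(BLANK_MIGRATION % {"app_label": app_label})
    return initial


# Try it out in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = create_migrations_module(Path(tmp) / "myapp", "myapp")
    created = sorted(p.name for p in path.parent.iterdir())
    text = path.read_text()
```

With this in place, every new app starts with a migration history, so the
"does this app have migrations?" check in syncdb would always succeed for
user apps.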

Now that we have a guaranteed migration history for every user app, the
migrate command will also be more or less a copy of South's migrate command.

Coming back to the migrations API, there are three fundamental operations
that can be performed during a migration:
1. Creation of a new model.
2. Alteration in an existing model.
3. Deletion of an existing model.
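
To illustrate how an inspection tool might sort changes into these three
operations, here is a minimal sketch that compares two 'frozen' model
dictionaries in the style South stores in its migration files. The function
diff_models and the 'myapp.publisher' model are hypothetical and only serve
the example.

```python
def diff_models(old, new):
    """Classify model changes between two frozen model dictionaries.

    Returns three sorted lists of model keys: (created, altered, deleted).
    """
    old_keys, new_keys = set(old), set(new)
    created = sorted(new_keys - old_keys)
    deleted = sorted(old_keys - new_keys)
    # A model present on both sides is altered if its definition differs.
    altered = sorted(k for k in old_keys & new_keys if old[k] != new[k])
    return created, altered, deleted


CHAR_FIELD = "django.db.models.fields.CharField"

old_state = {
    "myapp.author": {"name": (CHAR_FIELD, [], {"max_length": "100"})},
    "myapp.book": {"title": (CHAR_FIELD, [], {"max_length": "100"})},
}
new_state = {
    # max_length changed, so this model is altered
    "myapp.author": {"name": (CHAR_FIELD, [], {"max_length": "200"})},
    # new model (hypothetical); 'myapp.book' has been removed
    "myapp.publisher": {"name": (CHAR_FIELD, [], {"max_length": "100"})},
}

created, altered, deleted = diff_models(old_state, new_state)
```

A real inspection tool would also have to break an altered model down into
added, changed and removed fields, which this sketch does not attempt.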

As much as I would have liked to use the code of Django's creation API for
creating and destroying models, we cannot. The reason is that Django's
creation API uses its inspection tools to generate *SQL*, which is then fed
directly to cursor.execute. What we need is a migrations API that consumes
the *Python* code generated by the inspection tool. Moreover,
deprecating/removing Django's creation API in order to use the new migrations
API everywhere would give rise to performance issues, since time would be
wasted generating Python code and then converting that Python to SQL for
Django's core apps, which will never have migrations anyway.

The creation API and the code that depends on it (syncdb, sql,
django.test.simple and django.contrib.gis.db.backends) will be left as is.

Therefore much of the code for our new migrations API will come from South.
For models:
class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=100)
    author = models.ManyToManyField('Author')

the migration file created under the user app's migrations module will look
like:
class Migration(SchemaMigration):
    def forwards(self, orm):
        # Adding model 'Book'
        db.create_table('myapp_book', (
            ('id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
            ('title', self.gf('django.db.models.fields.CharField')(max_length=100)),
        ))
        db.send_create_signal('myapp', ['Book'])

        # Adding M2M table for field author on 'Book'
        db.create_table('myapp_book_author', (
            ('id', models.AutoField(verbose_name='ID', primary_key=True, auto_created=True)),
            ('book', models.ForeignKey(orm['myapp.book'], null=False)),
            ('author', models.ForeignKey(orm['myapp.author'], null=False))
        ))
        db.create_unique('myapp_book_author', ['book_id', 'author_id'])

        # Adding model 'Author'
        db.create_table('myapp_author', (
            ('id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
            ('name', self.gf('django.db.models.fields.CharField')(max_length=100)),
        ))
        db.send_create_signal('myapp', ['Author'])

    def backwards(self, orm):
        # Deleting model 'Book'
        db.delete_table('myapp_book')

        # Removing M2M table for field author on 'Book'
        db.delete_table('myapp_book_author')

        # Deleting model 'Author'
        db.delete_table('myapp_author')

    models = {
        'myapp.author': {
            'Meta': {'object_name': 'Author'},
            'id': ('django.db.models.fields.AutoField', [], {'primary_key': 'True'}),
            'name': ('django.db.models.fields.CharField', [], {'max_length': '100'})
        },
        'myapp.book': {
            'Meta': {'object_name': 'Book'},
            'author': ('django.db.models.fields.related.ManyToManyField', [], {'to': "orm['myapp.Author']", 'symmetrical': 'False'}),
            'id': ('django.db.models.fields.AutoField', [], {'primary_key': 'True'}),
            'title': ('django.db.models.fields.CharField', [], {'max_length': '100'})
        }
    }
    complete_apps = ['myapp']

given that the initial (blank) migration created at app creation time is:
class Migration(SchemaMigration):
    def forwards(self, orm):
        pass
    def backwards(self, orm):
        pass
    models = {}
    complete_apps = ['myapp']

The above code snippets have been generated by South. Thus we already have
the functionality for create_table, create_unique, delete_table etc., as well
as the inspection routines. These have to be integrated into Django's
codebase (at django.db.migrations perhaps?)
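
To show the general shape of such an API, here is a toy sketch on top of the
standard library's sqlite3 module. It is not South's implementation: the
class name SchemaEditor and the table_names helper are hypothetical, and
columns are given as raw SQL type strings, whereas South's create_table takes
Django field instances and handles backend differences.

```python
import sqlite3


class SchemaEditor:
    """Toy migrations API: turns Python calls into SQLite DDL statements."""

    def __init__(self, connection):
        self.connection = connection

    def create_table(self, name, columns):
        """Create a table from a list of (column_name, sql_type) pairs."""
        cols = ", ".join('"%s" %s' % (col, sql_type)
                         for col, sql_type in columns)
        self.connection.execute('CREATE TABLE "%s" (%s)' % (name, cols))

    def delete_table(self, name):
        """Drop a table."""
        self.connection.execute('DROP TABLE "%s"' % name)

    def table_names(self):
        """Return the names of all tables, sorted alphabetically."""
        rows = self.connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' ORDER BY name")
        return [row[0] for row in rows]


db = SchemaEditor(sqlite3.connect(":memory:"))

# Roughly what a forwards() method would do ...
db.create_table("myapp_author", [("id", "integer PRIMARY KEY"),
                                 ("name", "varchar(100)")])
db.create_table("myapp_book", [("id", "integer PRIMARY KEY"),
                               ("title", "varchar(100)")])
after_forwards = db.table_names()

# ... and part of the matching backwards().
db.delete_table("myapp_book")
after_backwards = db.table_names()
```

The backend-specific extensions in the schedule would override methods like
these where PostgreSQL, MySQL or SQLite need different SQL.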


Schedule and Goal
------------------------------------------------------------------------------
Week 1    : Discussion on API design and overriding django-admin startapp
Week 2-3  : Developing the base migration API
Week 4    : Developing migration extensions and overrides for PostgreSQL
Week 5    : Developing migration extensions and overrides for MySQL
Week 6    : Developing migration extensions and overrides for SQLite
Week 7    : Developing the inspection tools
Week 8    : Developing the ORM versioning tools and glue code
Week 9-10 : Writing tests/documentation
Week 11-12: Buffer weeks for the unexpected, Oracle DB? and
            django.contrib.gis.db.backends?

Note: Work on Oracle and GIS may not be possible as part of GSoC.

I will personally consider my project to be successful if I have created and
tested at least the base API + PostgreSQL extension and inspection + version
tools.


About me and my inspiration for the project
------------------------------------------------------------------------------
I am Kushagra Sinha, a pre-final year student at Institute of Technology
(about to be converted to an Indian Institute of Technology),
Banaras Hindu University, Varanasi, India.

I can be reached at:
Gmail: sinha.kushagra
Alternative email: kush [at] j4nu5.com
IRC: Nick j4nu5 on #django-dev and #django
Twitter: @j4nu5
github: j4nu5

I was happily using PHP for nearly all of my webdev work since my high school
days (CakePHP being my framework of choice) when I was introduced to Django
a year and a half ago. Comparing Django with CakePHP (which is Ruby on Rails
inspired), I felt more attached to Django's philosophy than RoR's "hidden
magic" approach. I have been in love ever since :)

Last year I had an internship at MobStac[5] (BusinessWorld magazine India's
hottest young startup[6]). Their stack is on Django+MySQL. I was involved in
a heavy database migration that involved their analytics platform. Since they
had not been using a migrations framework, the situation looked grim.
Fortunately, South came to the rescue and we were able to carry out the
migration, but it left everyone a little frustrated and clearly in want of a
migrations framework built into Django itself.


Experience
------------------------------------------------------------------------------
I have experience working on a high-pressure database migration through my
internship, as stated before. I am also familiar with Django's contribution
guidelines and have written a couple of patches[7]. One patch has been
accepted and the second got blocked by 1.4's feature freeze.
My other projects can be seen on my github[8].


[1] http://south.aeracode.org/
[2] https://github.com/paltman/nashvegas/
[3] https://code.djangoproject.com/wiki/SchemaEvolution
[4] http://api.rubyonrails.org/classes/ActiveRecord/Migration.html
[5] http://mobstac.com/
[6] http://blog.mobstac.com/blog/2011/06/businessworld-declares-mobstac-indias-hottest-young-startup/
[7] https://code.djangoproject.com/query?owner=~j4nu5
[8] https://github.com/j4nu5

-- 
Kushagra Sinha
B. Tech. Part III (Pre-final year)
Indian Institute of Technology
Varanasi
Contact: +91 9415 125 215
