Proposal: Schema migration/evolution backend

Andrew Godwin Fri, 28 May 2010 04:07:48 -0700

Hi all,

As was perhaps inevitable, I'm proposing implementing part of a schema migration backend in Django. I'm not sure if this is a 1.3 thing (it may well be 1.4, perhaps with some implemented in time for 1.3 but not exposed), but it's something I'd like to get started in this release cycle. To make it clear, I'm happy to make all these changes - I'm alre

Firstly, let me make it clear that this is not a proposal to merge South into core. Quite the opposite, in fact: the idea is to keep the option open to have different migration frontends available, and in fact to make the implementation of them much easier.

Secondly, this particular proposal is something that Russ and I have preliminarily agreed on here at the sprints - however, I'd really like people to suggest changes and point out things we may have missed. Implementing this essentially consists of drawing a line around how much of a migrations framework we'll implement in Django, and I'm only mostly sure we have it in the right place.

The first part of the proposal is pretty uncontroversial, and it's to implement schema-changing operations on the backends. Specifically, the proposed new operations are:


 - add_table
 - delete_table
 - rename_table
 - add_column
 - rename_column
 - alter_column
 - delete_column
 - add_primary_key
 - delete_primary_key
 - add_unique
 - delete_unique
 - add_index
 - delete_index

Some of these operations are already mostly implemented (add_table, add_index, etc.) in backends' creation modules, but they'll need a bit of rearranging and separating into a full public API. I also plan to modify them to take model names and field names, instead of table names and column names, so the API is exclusively using the Django model layer to represent changes (there's a possibility that some changes make sense for schemaless databases as well, specifically renames, so it's best not to tie it directly to relational databases).

(Additionally, this means that if someone has specified the table name or column name directly using something like Meta: db_table then we'll need to have those as either extra arguments to the function or as marked strings - e.g. mark_raw('auth_users'))
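The marked-string option could look roughly like the sketch below. None of this exists yet: RawName, resolve_table and the model_tables mapping are names made up for illustration, and only the mark_raw spelling comes from the note above.

```python
# Hypothetical sketch of "marked strings": nothing here is an existing
# Django API. RawName and resolve_table are illustrative names only.

class RawName(str):
    """A string that is already a database-level name (table or column)."""


def mark_raw(name):
    # Wrap the name so later code can tell it apart from a model name.
    return RawName(name)


def resolve_table(name, model_tables):
    """Return the table name for `name`.

    `model_tables` is an assumed mapping of "app.Model" -> table name,
    standing in for whatever lookup the model layer would provide.
    """
    if isinstance(name, RawName):
        return str(name)
    return model_tables[name]


tables = {"auth.User": "auth_user"}
from_model = resolve_table("auth.User", tables)
from_raw = resolve_table(mark_raw("auth_users"), tables)
```

Here a plain string is treated as a model name and looked up, while a marked string is passed through untouched.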

I expect this will take a while and be quite fiddly, but we have the codebases of django-evolution and South to draw on for the modification code, so there's not much new discovery and backend-specific bugfixing to be done.

Additionally, unlike in current South, backends will always be able to generate SQL for operations, but it won't necessarily be runnable (things like index names can only be resolved at runtime, so you'd get code like "DROP INDEX <<User.username-index>> ON users;"). We feel this is a pretty good tradeoff: we can still actually work with things like index names (they're basically nondeterministic), while also satisfying people who like to read the SQL that's going to be run.
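As a rough illustration of the placeholder idea, here is a minimal sketch. The function names and the lookup callable are assumptions for the example; only the "<<...>>" syntax and the DROP INDEX statement come from the text above.

```python
import re

# Matches placeholders such as <<User.username-index>>.
PLACEHOLDER_RE = re.compile(r"<<(.+?)>>")


def drop_index_sql(model_name, field_name, table_name):
    # Readable SQL with the index name left as a placeholder, because the
    # real name can only be discovered from the database at runtime.
    return "DROP INDEX <<%s.%s-index>> ON %s;" % (
        model_name, field_name, table_name)


def is_runnable(sql):
    # SQL that still contains a placeholder cannot be sent to the database.
    return PLACEHOLDER_RE.search(sql) is None


def resolve_placeholders(sql, lookup):
    # `lookup` is an assumed callable mapping a placeholder key to the
    # real name found by introspecting the database.
    return PLACEHOLDER_RE.sub(lambda match: lookup(match.group(1)), sql)


readable = drop_index_sql("User", "username", "users")
real_names = {"User.username-index": "users_username_idx"}
runnable = resolve_placeholders(readable, lambda key: real_names[key])
```

The readable form is what you'd show to someone inspecting the migration. The resolved form is what would actually be executed.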

The second part is to implement migration tracking, dependency resolving and basic running into Django. There will be a core contract migrations have to follow:


 - Migrations are per-application.
 - Applications have a directory (usually appname/migrations/, but configurable via a setting like SOUTH_MIGRATION_MODULES does now), which contains zero or more .sql and .py files.
 - Inside an application, migrations are implicitly ordered by name (by string sort, so "0001_initial" is before "0002_second", and "alpha" is before "beta", but "11_foo" is not after "2_bar").
 - Migrations are uniquely identified by the combination of their app label and their migration name.
 - There will be a table, probably "migration_history" or similar, which records which migrations have been applied, and when.
 - Django will ship with a "migrate" command, which will work out what migrations to run, and run them. There will be an automatic mode which runs dependencies, and a manual mode where you say if you'd like to run each migration (it will tell you about ones that are missing dependencies, but you're not allowed to run those).
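To make the ordering rule concrete, here is a tiny sketch. The order_migrations helper is a hypothetical name, and the file names are the ones used in the list above.

```python
def order_migrations(names):
    # Plain string comparison, not numeric: this is why zero-padded
    # prefixes such as "0001_" are the safe naming convention.
    return sorted(names)


names = ["0002_second", "beta", "0001_initial", "2_bar", "alpha", "11_foo"]
ordered = order_migrations(names)
# "11_foo" ends up before "2_bar", because "1" sorts before "2" as a string.
```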

As for the migration files themselves, the idea is to provide a very basic interface that means that apps (and Django itself, potentially) can ship migrations that have no dependencies, but that still allows third-party tools like South to exist that will provide ORM access and autogeneration.

.py migration files will be normal Python modules, and should have a "migrate" callable, which will get called with three arguments: a connection/operations instance, much like 'south.db.db'; reverse, a boolean saying if the migration should run backwards (supporting this will be entirely optional, and some migrations will just raise an error); and dry_run, which indicates if the migration should just run through and check there are no obvious calling errors, which is useful for catching errors on MySQL before the SQL gets sent to the database.

The files can also optionally have a __depends__ variable in scope, which should be an iterable of (app_label, migration_name) or (app_label, migration_name, reverse_dependency) tuples - this is used to calculate the dependencies for this migration.
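One way the dependency calculation could work is a depth-first topological sort, sketched below. This is an assumption about the resolver, not a settled design: it handles only the two-tuple form of __depends__, it ignores reverse_dependency entirely, and resolve_order is a made-up name.

```python
def resolve_order(migrations):
    """Return migration keys in an order that satisfies all dependencies.

    `migrations` maps (app_label, migration_name) to a list of
    (app_label, migration_name) tuples taken from __depends__.
    """
    deps = dict((key, set(value)) for key, value in migrations.items())

    # Implicit rule: inside an app, each migration follows the previous
    # one in string-sort order.
    by_app = {}
    for app_label, name in migrations:
        by_app.setdefault(app_label, []).append(name)
    for app_label, names in by_app.items():
        names.sort()
        for previous, current in zip(names, names[1:]):
            deps[(app_label, current)].add((app_label, previous))

    ordered = []
    state = {}

    def visit(key):
        if state.get(key) == "done":
            return
        if state.get(key) == "visiting":
            raise ValueError("circular dependency at %s.%s" % key)
        if key not in deps:
            raise ValueError("missing dependency %s.%s" % key)
        state[key] = "visiting"
        for dependency in sorted(deps[key]):
            visit(dependency)
        state[key] = "done"
        ordered.append(key)

    for key in sorted(deps):
        visit(key)
    return ordered


plan = resolve_order({
    ("blog", "0001_initial"): [("users", "0002_profile")],
    ("users", "0001_initial"): [],
    ("users", "0002_profile"): [],
})
```

The "blog" migration sorts first by app label, but it is run last because it depends on a "users" migration.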

The migration name of a .py file is simply the filename with the extension removed.

.sql migration files will just be loaded and run as-is. Dependencies can be declared with comments at the top of the file (such as "-- depends: auth 0002_remove_username"). Because SQL is database-dependent, filenames can be of two formats: "migration_name.sql" or "migration_name.backend_name.sql". Django will attempt to run the database-specific one first, and then fall back to the 'generic' one.
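A minimal sketch of both rules for .sql files follows. The helper names and the exact comment grammar are assumptions; only the "-- depends:" prefix and the two filename formats come from the paragraph above.

```python
import re

# Assumed grammar: "-- depends: <app_label> <migration_name>".
DEPENDS_RE = re.compile(r"^--\s*depends:\s*(\S+)\s+(\S+)\s*$")


def parse_sql_depends(sql_text):
    """Collect dependencies from the comment block at the top of the file."""
    dependencies = []
    for line in sql_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("--"):
            break  # the first real statement ends the header
        match = DEPENDS_RE.match(stripped)
        if match:
            dependencies.append((match.group(1), match.group(2)))
    return dependencies


def pick_sql_file(available, migration_name, backend_name):
    """Prefer the backend-specific file, then fall back to the generic one."""
    specific = "%s.%s.sql" % (migration_name, backend_name)
    generic = "%s.sql" % migration_name
    if specific in available:
        return specific
    if generic in available:
        return generic
    return None


header = "-- depends: auth 0002_remove_username\nALTER TABLE blog_post ADD COLUMN x int;\n"
found = parse_sql_depends(header)
files = ["0003_add_x.sql", "0003_add_x.mysql.sql"]
chosen_mysql = pick_sql_file(files, "0003_add_x", "mysql")
chosen_sqlite = pick_sql_file(files, "0003_add_x", "sqlite3")
```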

The idea behind all of this is to allow reusable apps to ship with migrations using an engine or generator of their choosing, and still have them interact correctly with everything else. For example, Django might ship a migration like this:


  from django.db import models

  __depends__ = [("auth", "0002_remove_username")]

  def migrate(connection, reverse, dry_run):
      if reverse:
          connection.ops.add_column("auth.User", "username",
                                    models.CharField(max_length=100))
      else:
          connection.ops.delete_column("auth.User", "username")

And a future South might make migrations like this (I'm not proposing this as the future, it needs improvement, but it's an example):


  from south.v3 import SchemaMigration

  __depends__ = [("auth", "0002_remove_username")]

  class migrate(SchemaMigration):
      def forwards(self, db, orm):
          db.delete_column("auth.User", "username")

      def backwards(self, db, orm):
          db.add_column("auth.User", "username",
                        self.gf("django.db.models.fields.CharField")(max_length=100))

Here, the SchemaMigration class' constructor is the thing that takes (connection, reverse, dry_run), and then delegates to the appropriate methods and uses a few wrappers.
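To show how a class can satisfy the "migrate callable" contract, here is a self-contained sketch. It is not South's code: the base class below, the FakeOps and FakeConnection stand-ins and the orm=None placeholder are all assumptions made so the example runs on its own.

```python
class SchemaMigration(object):
    """Hypothetical base class: calling the class *is* running the migration."""

    def __init__(self, connection, reverse, dry_run):
        self.dry_run = dry_run
        db = connection.ops
        orm = None  # placeholder; a real frontend would build ORM access here
        if reverse:
            self.backwards(db, orm)
        else:
            self.forwards(db, orm)

    def forwards(self, db, orm):
        raise NotImplementedError

    def backwards(self, db, orm):
        raise NotImplementedError("this migration cannot be reversed")


class FakeOps(object):
    """Stand-in for connection.ops that just records what was called."""

    def __init__(self):
        self.calls = []

    def delete_column(self, model, field):
        self.calls.append(("delete_column", model, field))

    def add_column(self, model, field, field_instance):
        self.calls.append(("add_column", model, field))


class FakeConnection(object):
    def __init__(self):
        self.ops = FakeOps()


class migrate(SchemaMigration):
    def forwards(self, db, orm):
        db.delete_column("auth.User", "username")

    def backwards(self, db, orm):
        db.add_column("auth.User", "username", "<field instance>")


connection = FakeConnection()
migrate(connection, False, False)  # forwards
migrate(connection, True, False)   # backwards
```

Because the runner only ever calls migrate(connection, reverse, dry_run), it doesn't need to know whether "migrate" is a function or a class.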

That's the proposal, then. The grounding idea is to provide a consistent framework for migrations to run in, and absorb all the parts that really should be done once and done well (backend-specific implementations, dependency resolvers, etc.). The backend changes obviously have to go into core, while the tracking, dependency resolution and management commands should, I propose, go into a "django.contrib.migrations".

There's the issue of MultiDB, as always, but my proposal for that is to allow some mechanism to select the migrations directory per database alias (be that in a router or a setting), and then have a --database option on migrate - there's already going to be a way to provide directories that aren't appname/migrations/, so this won't be too much of an addition. That allows people who are separating tables to have entirely separate migration sets for each database, and people who are sharding, etc. to have them all pointing at the same set.
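The per-alias selection could be as simple as a nested lookup with a default, as in this sketch. The overrides structure and the migrations_dir_for name are purely illustrative, since the actual mechanism (router or setting) is still undecided.

```python
def migrations_dir_for(app_label, database_alias, overrides):
    """Pick the migrations directory for an app on a given database alias.

    `overrides` is an assumed mapping: {alias: {app_label: directory}}.
    Anything not listed falls back to the usual appname/migrations.
    """
    per_alias = overrides.get(database_alias, {})
    return per_alias.get(app_label, "%s/migrations" % app_label)


# Separating tables: the "analytics" database gets its own migration set.
overrides = {"analytics": {"stats": "stats/migrations_analytics"}}
separate = migrations_dir_for("stats", "analytics", overrides)

# Sharding: aliases with no override all share the same default set.
shard_one = migrations_dir_for("stats", "shard1", overrides)
shard_two = migrations_dir_for("stats", "shard2", overrides)
```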

Criticisms, changes, and observations are very welcome. This is the kind of thing I really want to be done, and be done right first time.


Andrew
