Hi all,

As perhaps was inevitable, I'm proposing implementing part of a schema migration backend in Django. I'm not sure if this is a 1.3 thing (it may well be 1.4, perhaps with some implemented in time for 1.3 but not exposed), but it's something I'd like to get started in this release cycle. To make it clear, I'm happy to make all these changes - I'm alre

Firstly, let me make it clear that this is not a proposal to merge South into core. Quite the opposite, in fact: the idea is to keep the option open to have different migration frontends available, and to make implementing them much easier.

Secondly, this particular proposal is something that Russ and I have preliminarily agreed on here at the sprints - however, I'd really like people to suggest changes and point out things we may have missed. Implementing this essentially consists of drawing a line at how much of a migrations framework we'll implement in Django, and I'm only mostly sure we have it in the right place.

The first part of the proposal is pretty uncontroversial, and it's to implement schema-changing operations on the backends. Specifically, the proposed new operations are:

 - add_table
 - delete_table
 - rename_table
 - add_column
 - rename_column
 - alter_column
 - delete_column
 - add_primary_key
 - delete_primary_key
 - add_unique
 - delete_unique
 - add_index
 - delete_index

Some of these operations are already mostly implemented (add_table, add_index, etc.) in backends' creation modules, but they'll need a bit of rearranging and separating into a full public API. I also plan to modify them to take model names and field names, instead of table names and column names, so the API is exclusively using the Django model layer to represent changes (there's a possibility that some changes make sense for schemaless databases as well, specifically renames, so it's best not to tie it directly to relational databases).

(Additionally, this means that if someone has specified the table name or column name directly, using something like db_table in the model's Meta, then we'll need to accept those either as extra arguments to the function or as marked strings - e.g. mark_raw('auth_users').)

I expect this will take a while and be quite fiddly, but we have the codebases of django-evolution and South to draw on for the modification code, so there's not much new discovery and backend-specific bugfixing to be done.

Additionally, unlike in current South, backends will always be able to generate SQL for operations, but it won't necessarily be runnable (things like index names can only be resolved at runtime, so you'd get code like "DROP INDEX <<User.username-index>> ON users;"). We feel this is a pretty good tradeoff between being able to actually work with things like index names (they're basically nondeterministic) and satisfying people who like to read the SQL that's going to be run.
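As a rough sketch of how such placeholders might be filled in at runtime: the <<...>> syntax is taken from the example above, but resolve_placeholders, the lookup dictionary and the index name in it are all made up for illustration and are not part of any existing API.

```python
import re

# Matches placeholders of the form <<User.username-index>>.
PLACEHOLDER = re.compile(r"<<([^<>]+)>>")


def resolve_placeholders(sql, lookup):
    """Replace each <<key>> in sql with lookup[key].

    Raises KeyError if a placeholder has no runtime value.
    """
    def substitute(match):
        return lookup[match.group(1)]
    return PLACEHOLDER.sub(substitute, sql)


# Assumption: a real backend would fill this in by introspecting the
# live database; the index name here is invented.
runtime_names = {"User.username-index": "auth_user_username_6f2a"}

sql = "DROP INDEX <<User.username-index>> ON users;"
resolved = resolve_placeholders(sql, runtime_names)
```

The unresolved string is what you'd show to someone who wants to read the SQL; the resolved one is what would actually be sent to the database.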

The second part is to implement migration tracking, dependency resolving and basic running into Django. There will be a core contract migrations have to follow:

 - Migrations are per-application.
 - Applications have a directory (usually appname/migrations/, but configurable via a setting like SOUTH_MIGRATION_MODULES does now), which contains zero or more .sql and .py files.
 - Inside an application, migrations are implicitly ordered by name (by string sort, so "0001_initial" is before "0002_second", and "alpha" is before "beta", but "2_bar" is not before "11_foo").
 - Migrations are uniquely identified by the combination of their app label and their migration name.
 - There will be a table, probably "migration_history" or similar, which records which migrations have been applied, and when.
 - Django will ship with a "migrate" command, which will work out what migrations to run, and run them. There will be an automatic mode which runs dependencies, and a manual mode where you say whether you'd like to run each migration (it will tell you about ones that are missing dependencies, but you won't be allowed to run them).
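The implicit name ordering is easy to check in plain Python. This is only an illustration of string sorting, not proposed code; the migration names are the ones used as examples above. Note that a string sort compares character by character, so "11_foo" lands ahead of "2_bar", which is why zero-padded prefixes are the safe choice.

```python
# Migration names as they might be found in one app's migrations directory.
names = ["0002_second", "beta", "2_bar", "0001_initial", "11_foo", "alpha"]

# Plain string sort, as described in the contract above.
ordered = sorted(names)

# Digits sort before letters, and "1" sorts before "2", so the
# unpadded "11_foo" comes before "2_bar".
eleven_first = ordered.index("11_foo") < ordered.index("2_bar")
```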

As for the migration files themselves, the idea is to provide a very basic interface that means that apps (and Django itself, potentially) can ship migrations that have no dependencies, but that still allows third-party tools like South to exist that will provide ORM access and autogeneration.

.py migration files will be normal Python modules, and should have a "migrate" callable, which will get called with three arguments:

 - a connection/operations instance, much like 'south.db.db';
 - reverse, a boolean saying if the migration should run backwards (supporting this will be entirely optional, and some migrations will just raise an error);
 - dry_run, which indicates if the migration should just run through and check there are no obvious calling errors (useful for catching errors on MySQL before the SQL gets sent to the database).

The files can also optionally have a __depends__ variable in scope, which should be an iterable of (app_label, migration_name) or (app_label, migration_name, reverse_dependency) tuples - this is used to calculate the dependencies for this migration.
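To make the dependency calculation concrete, here is a minimal sketch of a resolver, assuming migrations have already been loaded into a dict keyed by (app_label, migration_name). The function name resolve_order and the "polls" and "blog" apps are hypothetical, and the optional third element of a dependency tuple (reverse_dependency) is simply ignored here.

```python
def resolve_order(migrations):
    """Return (app_label, name) keys in an order that satisfies both the
    implicit per-app name ordering and the explicit dependencies.

    `migrations` maps (app_label, name) -> iterable of dependency tuples.
    """
    # Explicit dependencies; only the first two tuple elements are used.
    graph = dict(
        (key, [tuple(dep[:2]) for dep in deps])
        for key, deps in migrations.items()
    )

    # Implicit dependency on the previous migration (by string sort)
    # within the same app.
    by_app = {}
    for app, name in migrations:
        by_app.setdefault(app, []).append(name)
    for app, names in by_app.items():
        names.sort()
        for previous, current in zip(names, names[1:]):
            graph[(app, current)].append((app, previous))

    order, done, in_progress = [], set(), set()

    def visit(key):
        if key in done:
            return
        if key not in graph:
            raise LookupError("missing dependency: %s.%s" % key)
        if key in in_progress:
            raise ValueError("circular dependency at %s.%s" % key)
        in_progress.add(key)
        for dep in graph[key]:
            visit(dep)
        in_progress.discard(key)
        done.add(key)
        order.append(key)

    # Depth-first walk; sorting the keys keeps the result deterministic.
    for key in sorted(graph):
        visit(key)
    return order


migrations = {
    ("auth", "0001_initial"): [],
    ("auth", "0002_remove_username"): [],
    ("blog", "0001_initial"): [("polls", "0001_initial")],
    ("polls", "0001_initial"): [("auth", "0002_remove_username")],
}
order = resolve_order(migrations)
```

A real implementation would also have to decide what "missing dependency" means for the manual mode; here it is just an exception.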

The migration name of a .py file is simply the filename with the extension removed.

.sql migration files will just be loaded and run as-is. Dependencies can be declared with comments at the top of the file (such as "-- depends: auth 0002_remove_username"). Because SQL is database-dependent, filenames can be of two formats: "migration_name.sql" or "migration_name.backend_name.sql". Django will attempt to run the database-specific one first, and then fall back to the 'generic' one.
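A small sketch of the two .sql rules, the backend-specific fallback and the dependency comments, might look like the following. Both helper names are hypothetical, the exact comment grammar is an assumption based on the single example above, and only comment lines at the top of the file are scanned.

```python
import re

# Assumed grammar: "-- depends: <app_label> <migration_name>"
DEPENDS = re.compile(r"^--\s*depends:\s*(\S+)\s+(\S+)\s*$")


def pick_sql_file(available, migration_name, backend_name):
    """Prefer migration_name.backend_name.sql, then the generic file.

    `available` is a collection of filenames; returns None if neither exists.
    """
    candidates = (
        "%s.%s.sql" % (migration_name, backend_name),
        "%s.sql" % migration_name,
    )
    for candidate in candidates:
        if candidate in available:
            return candidate
    return None


def parse_sql_depends(sql_text):
    """Collect (app_label, migration_name) pairs from leading comments."""
    depends = []
    for line in sql_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("--"):
            break  # first real statement: the header is over
        match = DEPENDS.match(stripped)
        if match:
            depends.append((match.group(1), match.group(2)))
    return depends


files = set(["0003_drop.sql", "0003_drop.postgresql.sql"])
chosen_pg = pick_sql_file(files, "0003_drop", "postgresql")
chosen_mysql = pick_sql_file(files, "0003_drop", "mysql")
deps = parse_sql_depends(
    "-- depends: auth 0002_remove_username\n"
    "ALTER TABLE users DROP COLUMN username;\n"
)
```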

The idea behind all of this is to allow reusable apps to ship with migrations using an engine or generator of their choosing, and still have them interact correctly with everything else. For example, Django might ship a migration like this:

  from django.db import models
  __depends__ = [("auth", "0002_remove_username")]
  def migrate(connection, reverse, dry_run):
      if reverse:
          connection.ops.add_column("auth.User", "username", models.CharField(max_length=100))
      else:
          connection.ops.delete_column("auth.User", "username")

And a future South might make migrations like this (I'm not proposing this as the future, it needs improvement, but it's an example):

  from south.v3 import SchemaMigration
  __depends__ = [("auth", "0002_remove_username")]
  class migrate(SchemaMigration):
      def forwards(self, db, orm):
          db.delete_column("auth.User", "username")
      def backwards(self, db, orm):
          db.add_column("auth.User", "username", self.gf("django.db.models.fields.CharField")(max_length=100))

Here, the SchemaMigration class' constructor is the thing that takes (connection, reverse, dry_run), and then delegates to the appropriate methods and uses a few wrappers.
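To illustrate the delegation, here is a self-contained sketch of what such a base class could look like. It is not South code: the SchemaMigration body, RecordingOps and FakeConnection are stand-ins invented for this example, the orm argument is left as None, and the field definition is passed as a plain string so nothing needs to be imported.

```python
class SchemaMigration(object):
    """Hypothetical base class: instantiating it runs the migration."""

    def __init__(self, connection, reverse, dry_run):
        self.dry_run = dry_run
        db = connection.ops
        orm = None  # a real tool would build its frozen ORM here
        if reverse:
            self.backwards(db, orm)
        else:
            self.forwards(db, orm)

    def forwards(self, db, orm):
        raise NotImplementedError

    def backwards(self, db, orm):
        raise NotImplementedError("this migration cannot be reversed")


class RecordingOps(object):
    """Stand-in for the proposed connection.ops; it only records calls."""

    def __init__(self):
        self.calls = []

    def delete_column(self, model, field):
        self.calls.append(("delete_column", model, field))

    def add_column(self, model, field, definition):
        self.calls.append(("add_column", model, field, definition))


class FakeConnection(object):
    def __init__(self):
        self.ops = RecordingOps()


class migrate(SchemaMigration):
    def forwards(self, db, orm):
        db.delete_column("auth.User", "username")

    def backwards(self, db, orm):
        db.add_column("auth.User", "username", "CharField(max_length=100)")


# The runner only knows it has a callable taking three arguments.
connection = FakeConnection()
migrate(connection, reverse=False, dry_run=False)
migrate(connection, reverse=True, dry_run=False)
```

Because a class is callable, the runner can treat this "migrate" exactly like the plain function in the first example.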

That's the proposal, then. The grounding idea is to provide a consistent framework for migrations to run in, and absorb all the parts that really should be done once and done well (backend-specific implementations, dependency resolvers, etc.). The backend changes obviously have to go into core, while the tracking, dependency resolution and management commands should, I propose, go into a "django.contrib.migrations".

There's the issue of MultiDB, as always, but my proposal for that is to allow some mechanism to select the migrations directory per database alias (be that in a router or a setting), and then have a --database option on migrate - there's already going to be a way to provide directories that aren't appname/migrations/, so this won't be too much of an addition. That allows people who are separating tables to have entirely separate migration sets for each database, and people who are sharding, etc. to have them all pointing at the same set.
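One possible shape for the per-alias selection, purely as a sketch: the MIGRATION_DIRS mapping and the migration_dir helper are hypothetical names, not existing Django settings or functions, and the same lookup could just as well live in a router.

```python
# Hypothetical setting: database alias -> {app_label: migrations directory}.
MIGRATION_DIRS = {
    # Separate tables per database: separate migration sets.
    "legacy": {"blog": "blog/migrations_legacy"},
    # Shards: every alias points at the same set.
    "shard1": {"blog": "blog/migrations"},
    "shard2": {"blog": "blog/migrations"},
}


def migration_dir(app_label, alias="default", dirs=MIGRATION_DIRS):
    """Return the migrations directory for an app on a given database alias,
    falling back to the usual appname/migrations/ location."""
    return dirs.get(alias, {}).get(app_label, "%s/migrations" % app_label)


legacy_dir = migration_dir("blog", "legacy")
shard_dirs = [migration_dir("blog", alias) for alias in ("shard1", "shard2")]
default_dir = migration_dir("auth")
```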

Criticisms, changes, and observations are very welcome. This is the kind of thing I really want to be done, and be done right first time.

Andrew
