Hi all,
As perhaps was inevitable, I'm proposing implementing part of a schema
migration backend in Django. I'm not sure if this is a 1.3 thing (it may
well be 1.4, perhaps with some implemented in time for 1.3 but not
exposed), but it's something I'd like to get started in this release
cycle. To make it clear, I'm happy to make all these changes - I'm alre
Firstly, let me make it clear that this is not a proposal to merge South
into core. Quite the opposite, in fact: the idea is to keep the option
open to have different migration frontends available, and to make
implementing them much easier.
Secondly, this particular proposal is something that Russ and I have
preliminarily agreed on here at the sprints - however, I'd really like
people to suggest changes and things we may have missed. Implementing
this essentially consists of drawing a line of how much of a migrations
framework we'll implement in Django, and I'm only mostly sure we have it
in the right place.
The first part of the proposal is pretty uncontroversial, and it's to
implement schema-changing operations on the backends. Specifically, the
proposed new operations are:
- add_table
- delete_table
- rename_table
- add_column
- rename_column
- alter_column
- delete_column
- add_primary_key
- delete_primary_key
- add_unique
- delete_unique
- add_index
- delete_index
Some of these operations are already mostly implemented (add_table,
add_index, etc.) in backends' creation modules, but they'll need a bit
of rearranging and separating into a full public API. I also plan to
modify them to take model names and field names, instead of table names
and column names, so the API is exclusively using the Django model layer
to represent changes (there's a possibility that some changes make sense
for schemaless databases as well, specifically renames, so it's best not
to tie it directly to relational databases).
(Additionally, this means that if someone has specified the table name
or column name directly, using something like db_table in the model's
Meta, then we'll need to accept those either as extra arguments to the
function or as marked strings, e.g. mark_raw('auth_users').)
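
As a rough sketch of the marked-string option, assuming mark_raw returns
a str subclass (RawName, resolve_table_name and the model_tables lookup
are hypothetical names for illustration, not a settled API):

```python
class RawName(str):
    """A string flagged as a literal table/column name (hypothetical)."""


def mark_raw(name):
    # Wrap the name so the backend knows it is not a model label.
    return RawName(name)


def resolve_table_name(name, model_tables):
    """Return the database table for a model label or a raw name.

    model_tables stands in for a lookup through the model layer,
    mapping labels such as "auth.User" to table names.
    """
    if isinstance(name, RawName):
        return str(name)
    return model_tables[name]
```

The point of the subclass is that existing calls keep passing plain
strings, and only explicitly marked names bypass the model lookup.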
I expect this will take a while and be quite fiddly, but we have the
codebases of django-evolution and South to draw on for the modification
code, so there's not much new discovery and backend-specific bugfixing
to be done.
Additionally, unlike in current South, backends will always be able to
generate SQL for operations, but it won't necessarily be runnable
(things like index names can only be resolved at runtime, so you'd get
code like "DROP INDEX <<User.username-index>> ON users;"). We feel this
is a pretty good tradeoff between being able to actually work with
things like index names (they're basically nondeterministic) and
satisfying people who like to read the SQL that's going to be run.
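
To illustrate the placeholder idea, here is a minimal sketch, assuming
the <<...>> markers are filled in from a dict of names resolved at
runtime (resolve_sql, is_runnable and the index name used below are
made up for the example):

```python
import re

# Matches markers such as <<User.username-index>>.
PLACEHOLDER = re.compile(r"<<(.+?)>>")


def is_runnable(sql):
    """True if the SQL contains no unresolved placeholders."""
    return PLACEHOLDER.search(sql) is None


def resolve_sql(sql, lookup):
    """Replace each <<name>> marker with lookup[name].

    lookup stands in for whatever the backend discovers at runtime,
    e.g. by introspecting the database for the real index name.
    """
    def substitute(match):
        return lookup[match.group(1)]
    return PLACEHOLDER.sub(substitute, sql)
```

For example, resolve_sql("DROP INDEX <<User.username-index>> ON users;",
{"User.username-index": "users_username_idx"}) would give a statement
that can actually be sent to the database.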
The second part is to implement migration tracking, dependency resolving
and basic running into Django. There will be a core contract migrations
have to follow:
- Migrations are per-application
- Applications have a directory (usually appname/migrations/, but
configurable via a setting, as SOUTH_MIGRATION_MODULES is now), which
contains zero or more .sql and .py files
- Inside an application, migrations are implicitly ordered by name (by
string sort, so "0001_initial" is before "0002_second", and "alpha" is
before "beta", but "2_bar" is not before "11_foo", since "1" sorts
before "2").
- Migrations are uniquely identified by the combination of their app
label and their migration name.
- There will be a table, probably "migration_history" or similar,
which records which migrations have been applied, and when.
- Django will ship with a "migrate" command, which will work out what
migrations to run, and run them. There will be an automatic mode which
runs dependencies, and a manual mode where you say if you'd like to run
each migration (it will tell you about migrations with missing
dependencies, but won't let you run them).
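
A minimal sketch of the "work out what to run" step under this
contract, assuming the history table has already been read into a set
(pending_migrations and its arguments are hypothetical, and cross-app
dependencies are ignored here):

```python
def pending_migrations(available, applied):
    """Return unapplied (app_label, name) pairs, ordered by app then name.

    available maps app labels to the migration names found on disk;
    applied is a set of (app_label, name) pairs standing in for the
    rows of the history table.
    """
    pending = []
    for app_label in sorted(available):
        # Plain string sort, as the contract specifies.
        for name in sorted(available[app_label]):
            if (app_label, name) not in applied:
                pending.append((app_label, name))
    return pending
```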
As for the migration files themselves, the idea is to provide a very
basic interface that means that apps (and Django itself, potentially)
can ship migrations that have no dependencies, but that still allows
third-party tools like South to exist that will provide ORM access and
autogeneration.
.py migration files will be normal Python modules, and should have a
"migrate" callable, which will get called with three arguments:
- a connection/operations instance, much like 'south.db.db'
- reverse, a boolean saying if the migration should run backwards
(supporting this is entirely optional, and some migrations will just
raise an error)
- dry_run, which indicates if the migration should just run through and
check there are no obvious calling errors, which is useful for catching
errors on MySQL before the SQL gets sent to the database
The files can also optionally have a __depends__ variable in scope,
which should be an iterable of (app_label, migration_name) or
(app_label, migration_name, reverse_dependency) tuples - this is used to
calculate the dependencies for this migration.
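
For a feel of what the resolver has to do, here is a sketch using a
depth-first topological sort over the two-tuple form only (the
three-tuple form and the implicit by-name ordering within an app are
left out; resolve_order is a hypothetical name):

```python
def resolve_order(depends):
    """Order migrations so that every dependency runs first.

    depends maps each (app_label, migration_name) key to the list of
    keys it depends on, as collected from __depends__.
    """
    ordered, done, in_progress = [], set(), set()

    def visit(key):
        if key in done:
            return
        if key in in_progress:
            raise ValueError("circular dependency at %s.%s" % key)
        if key not in depends:
            raise KeyError("missing dependency %s.%s" % key)
        in_progress.add(key)
        for dep in depends[key]:
            visit(dep)
        in_progress.discard(key)
        done.add(key)
        ordered.append(key)

    # Visit in sorted order so the result is deterministic.
    for key in sorted(depends):
        visit(key)
    return ordered
```

A missing dependency raises an error here; the manual mode of "migrate"
would presumably report it instead.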
The migration name of a .py file is simply the filename with the
extension removed.
.sql migration files will just be loaded and run as-is. Dependencies can
be declared with comments at the top of the file (such as "-- depends:
auth 0002_remove_username"). Because SQL is database-dependent,
filenames can be of two formats: "migration_name.sql" or
"migration_name.backend_name.sql". Django will attempt to run the
database-specific one first, and then fall back to the 'generic' one.
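
The lookup and the dependency comments could be handled roughly like
this (a sketch only; pick_sql_file and parse_depends are hypothetical
names, and the comment syntax is assumed to be exactly the
"-- depends: app_label migration_name" form shown above):

```python
import os


def pick_sql_file(directory, migration_name, backend_name):
    """Return the path of the SQL file to run, or None if neither exists."""
    candidates = [
        "%s.%s.sql" % (migration_name, backend_name),  # database-specific
        "%s.sql" % migration_name,                     # generic fallback
    ]
    for filename in candidates:
        path = os.path.join(directory, filename)
        if os.path.isfile(path):
            return path
    return None


def parse_depends(sql_text):
    """Collect (app_label, migration_name) pairs from leading comments."""
    depends = []
    for line in sql_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are allowed in the header
        if not stripped.startswith("--"):
            break  # first real SQL line ends the header
        body = stripped[2:].strip()
        if body.startswith("depends:"):
            app_label, name = body[len("depends:"):].split()
            depends.append((app_label, name))
    return depends
```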
The idea behind all of this is to allow reusable apps to ship with
migrations using an engine or generator of their choosing, and still
have them interact correctly with everything else. For example, Django
might ship a migration like this:
    from django.db import models

    __depends__ = [("auth", "0002_remove_username")]

    def migrate(connection, reverse, dry_run):
        if reverse:
            connection.ops.add_column("auth.User", "username",
                models.CharField(max_length=100))
        else:
            connection.ops.delete_column("auth.User", "username")
And a future South might make migrations like this (I'm not proposing
this as the future, it needs improvement, but it's an example):
    from south.v3 import SchemaMigration

    __depends__ = [("auth", "0002_remove_username")]

    class migrate(SchemaMigration):
        def forwards(self, db, orm):
            db.delete_column("auth.User", "username")
        def backwards(self, db, orm):
            db.add_column("auth.User", "username",
                self.gf("django.db.models.fields.CharField")(max_length=100))
Here, the SchemaMigration class' constructor is the thing that takes
(connection, reverse, dry_run), and then delegates to the appropriate
methods and uses a few wrappers.
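
A bare-bones sketch of how such a base class could satisfy the
"migrate" contract (this is not South code; the orm placeholder and the
way dry_run is merely stored are assumptions made for illustration):

```python
class SchemaMigration:
    """Hypothetical base class; instantiating it runs the migration."""

    def __init__(self, connection, reverse, dry_run):
        self.dry_run = dry_run  # stored only; a real wrapper would act on it
        db = connection.ops     # the schema operations object
        orm = None              # placeholder for an ORM wrapper
        if reverse:
            self.backwards(db, orm)
        else:
            self.forwards(db, orm)

    def forwards(self, db, orm):
        raise NotImplementedError

    def backwards(self, db, orm):
        raise NotImplementedError("this migration cannot be reversed")
```

Because the class itself is the "migrate" callable, calling
migrate(connection, reverse, dry_run) constructs an instance and runs
the right method as a side effect.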
That's the proposal, then. The grounding idea is to provide a consistent
framework for migrations to run in, and absorb all the parts that really
should be done once and done well (backend-specific implementations,
dependency resolvers, etc.). The backend changes obviously have to go
into core, while the tracking, dependency resolution and management
commands should, I propose, go into a "django.contrib.migrations".
There's the issue of MultiDB, as always, but my proposal for that is to
allow some mechanism to select the migrations directory per database
alias (be that in a router or a setting), and then have a --database
option on migrate - there's already going to be a way to provide
directories that aren't appname/migrations/, so this won't be too much
of an addition. That allows people who are separating tables to have
entirely separate migration sets for each database, and people who are
sharding, etc. to have them all pointing at the same set.
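
One possible shape for that selection mechanism, sketched as a plain
lookup (the overrides mapping is a hypothetical stand-in for a setting
or router hook, and using the app label as the package name is a
simplification):

```python
def migrations_module(app_label, database, overrides):
    """Return the dotted module path holding migrations for app and alias.

    overrides maps (app_label, database_alias) pairs to module paths;
    anything not listed falls back to "<app_label>.migrations".
    """
    return overrides.get((app_label, database), "%s.migrations" % app_label)
```

Sharded setups would leave overrides empty so every alias shares one
set, while setups that split tables across databases would list a
separate module per alias.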
Criticisms, changes, and observations are very welcome. This is the kind
of thing I really want to be done, and done right the first time.
Andrew