Less than a week remains before the student application deadline. Could someone please comment on the revised proposal (quoted below)? Thanks a lot.
On Monday, 26 March 2012 01:29:35 UTC+5:30, j4nu5 wrote:
>
> Here is a revised proposal.
>
> Abstract
> ------------------------------------------------------------------------------
> A database migration helper has been one of the most long standing feature
> requests in Django. Though Django has an excellent database creation helper,
> when faced with schema design changes, developers have to resort to either
> writing raw SQL and manually performing the migrations, or using third party
> apps like South[1] and Nashvegas[2].
>
> [1] http://south.aeracode.org/
> [2] https://github.com/paltman/nashvegas/
>
> Clearly Django will benefit from having a database migration helper as an
> integral part of its codebase.
>
> From the summary on django-developers mailing list[3], the task of building a
> migrations framework will involve:
> 1. Add a db.backends module to provide an abstract interface to migration
>    primitives (add column, add index, rename column, rename table, and so on).
> 2. Add a contrib app that performs the high level accounting of "has migration
>    X been applied", and management commands to "apply all outstanding
>    migrations"
> 3. Provide an API that allows end users to define raw-SQL migrations, or
>    native Python migrations using the backend primitives.
> 4. Leave the hard task of determining dependencies, introspection of database
>    models and so on to the toolset contributed by the broader community.
>
> [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>
> I would like to work on the 1st step as part of this year's GSoC.
>
>
> Implementation plan
> ------------------------------------------------------------------------------
> The idea is to have a CRUD interface to database schema (with some additional
> utility functions for indexing etc.)
> with functions like:
> * create_table
> * rename_table
> * delete_table
> * add_column
> and so on, which will have the *explicit* names of the table/column to be
> modified as their parameters. It will be the responsibility of the higher level
> API caller (will not be undertaken as part of GSoC) to translate model/field
> names to explicit table/column names. These functions will be directly
> responsible for modifying the schema, and any interaction with the database
> schema will take place by calling these functions. Most of these functions
> will come from South.
>
> These API functions will also have a "dry-run" or test mode, in which they
> will output raw SQL representation of the migration or display errors if they
> occur. This will be useful in:
> 1. The MySQL backend. MySQL does not have transaction support for schema
>    modification and hence the migrations will be run in a dry run mode first
>    so that any errors can be captured before altering the schema.
> 2. The django-admin commands sql and sqlall that return the SQL (for creation
>    and indexing) for an app. They will capture the SQL returned from the API
>    running in dry run mode.
>
> As for the future of the current Django creation API, it will have to be
> refactored (not under GSoC) to make use of the 'create' part of our new CRUD
> interface, for consistency purposes.
>
> The GeoDjango backends will also have to be refactored to use the new API.
> Since they build upon the base code in db.backends, db.backends will have to
> be refactored first.
>
> Last year xtrqt had written, documented and tested code for at least the
> SQLite backend[4]. As per Andrew's suggestion, I would not be relying too much
> on that code but some parts can still be salvaged.
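
To make the implementation plan a little more concrete, here is a rough sketch of how primitives with explicit table/column names and a dry-run mode could fit together. Everything in it is an assumption made for illustration: the class name SchemaEditor, the dry_run flag, the collected_sql list and the helper functions are hypothetical and are not taken from Django or South. It runs against an in-memory SQLite database through Python's standard sqlite3 module only so that it is self-contained, and its dry run merely collects the SQL; it does not attempt to detect errors ahead of time.

```python
import sqlite3


def quote(name):
    """Quote an identifier for SQLite, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


class SchemaEditor:
    """Hypothetical migration primitives; not Django's or South's actual API."""

    def __init__(self, connection, dry_run=False):
        self.connection = connection
        self.dry_run = dry_run
        self.collected_sql = []  # every statement, whether executed or not

    def _run(self, sql):
        # Record the statement; only touch the database outside dry-run mode.
        self.collected_sql.append(sql)
        if not self.dry_run:
            self.connection.execute(sql)

    def create_table(self, table, columns):
        # columns: list of (column_name, column_type_sql) pairs
        cols = ", ".join(f"{quote(name)} {type_sql}" for name, type_sql in columns)
        self._run(f"CREATE TABLE {quote(table)} ({cols})")

    def rename_table(self, old, new):
        self._run(f"ALTER TABLE {quote(old)} RENAME TO {quote(new)}")

    def add_column(self, table, column, type_sql):
        self._run(
            f"ALTER TABLE {quote(table)} ADD COLUMN {quote(column)} {type_sql}"
        )

    def delete_table(self, table):
        self._run(f"DROP TABLE {quote(table)}")


def table_names(connection):
    """Return the names of all tables currently in the SQLite database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")

# Dry run: the SQL is collected for inspection, the schema is left untouched.
preview = SchemaEditor(conn, dry_run=True)
preview.create_table(
    "blog_entry", [("id", "integer PRIMARY KEY"), ("title", "varchar(100)")]
)
preview.rename_table("blog_entry", "blog_post")
print("\n".join(preview.collected_sql))
print(table_names(conn))  # still empty after the dry run

# Real run: the same primitives now modify the schema.
editor = SchemaEditor(conn)
editor.create_table(
    "blog_entry", [("id", "integer PRIMARY KEY"), ("title", "varchar(100)")]
)
editor.add_column("blog_entry", "body", "text")
editor.rename_table("blog_entry", "blog_post")
print(table_names(conn))
```

In this sketch, something like the sql and sqlall commands would read collected_sql from a dry-run editor, while a higher-level tool would translate model/field names to table/column names before calling the primitives. Per-backend differences (MySQL, PostgreSQL, SQLite) would presumably live in subclasses that override individual methods.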
>
> [4] https://groups.google.com/forum/?fromgroups#!searchin/django-developers/xtrqt/django-developers/pSICNJBJRy8/Hl7frp-O-dMJ
>
>
> Schedule and Goal
> ------------------------------------------------------------------------------
> Week 1     : Discussion on API design and writing tests
> Week 2-3   : Developing the base migration API
> Week 4     : Developing extensions and overrides for PostgreSQL
> Week 5-6   : Developing extensions and overrides for MySQL
> Week 7-8.5 : Developing extensions and overrides for SQLite (may be shorter or
>              longer (by 0.5 week) depending on how much of xtrqt's code is
>              considered acceptable)
> Week 8.5-10: Writing documentation and leftover regression tests, if any
> Week 11-12 : Buffer weeks for the unexpected
>
> I will consider my project to be successful when we have working, tested and
> documented migration primitives for Postgres, MySQL and SQLite. If we can
> develop a working fork of South to use these primitives, that will be a strong
> indicator of the project's success.
>
>
> About me and my inspiration for the project
> ------------------------------------------------------------------------------
> I am Kushagra Sinha, a pre-final year student at Institute of Technology
> (about to be converted to an Indian Institute of Technology),
> Banaras Hindu University, Varanasi, India.
>
> I can be reached at:
> Gmail: sinha.kushagra
> Alternative email: kush [at] j4nu5 [dot] com
> IRC: Nick j4nu5 on #django-dev and #django
> Twitter: @j4nu5
> github: j4nu5
>
> I was happily using PHP for nearly all of my webdev work since my high school
> days (CakePHP being my framework of choice) till I was introduced to Django
> a year and a half ago. Comparing Django with CakePHP (which is Ruby on Rails
> inspired) I felt more attached to Django's philosophy than RoR's "hidden
> magic" approach.
> I have been in love ever since :)
>
> Last year I had an internship at MobStac[5] (BusinessWorld magazine India's
> hottest young startup[6]). Their stack is on Django+MySQL. I was involved in
> a heavy database migration that involved their analytics platform. Since they
> had not been using a migrations framework, the situation looked grim.
> Fortunately, South came to the rescue and we were able to carry out the
> migration but it left everyone a little frustrated and clearly in want of a
> migrations framework built within Django itself.
>
> [5] http://mobstac.com/
> [6] http://blog.mobstac.com/blog/2011/06/businessworld-declares-mobstac-indias-hottest-young-startup/
>
>
> Experience
> ------------------------------------------------------------------------------
> I have experience working in a high voltage database migration through my
> internship as stated before. I am also familiar with Django's contribution
> guidelines and have written a couple of patches[7]. One patch has been
> accepted and the second got blocked by 1.4's feature freeze.
> My other projects can be seen on my github[8]
>
> [7] https://code.djangoproject.com/query?owner=~j4nu5
> [8] https://github.com/j4nu5
>
> On Mon, Mar 19, 2012 at 5:03 AM, Russell Keith-Magee <russ...@keith-magee.com> wrote:
>
>>
>> On 18/03/2012, at 7:38 PM, Kushagra Sinha wrote:
>>
>> > Abstract
>> > ------------------------------------------------------------------------------
>> > A database migration helper has been one of the most long standing feature
>> > requests in Django. Though Django has an excellent database creation helper,
>> > when faced with schema design changes, developers have to resort to either
>> > writing raw SQL and manually performing the migrations, or using third party
>> > apps like South[1] and Nashvegas[2].
>> >
>> > Clearly Django will benefit from having a database migration helper as an
>> > integral part of its codebase.
>> >
>> > From [3], the consensus seems to be on building a Ruby on Rails ActiveRecord
>> > Migrations[4] like framework, which will essentially emit python code after
>> > inspecting user models and current state of the database.
>>
>> Check the edit dates on that wiki -- most of the content on that page is
>> historical, reflecting discussions that were happening over 3 years ago.
>> There have been many more recent discussions.
>>
>> The "current consensus" (at least, the consensus of what the core team is
>> likely to accept) is better reflected by the GSoC project that was
>> accepted, but not completed last year. I posted to Django-developers about
>> this a week or so ago [1]; there were some follow up conversations in that
>> thread, too [2].
>>
>> [1] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>> [2] http://groups.google.com/group/django-developers/msg/2f287e5e3dc9f459
>>
>> > The python code
>> > generated will then be fed to a 'migrations API' that will actually handle the
>> > task of migration. This is the approach followed by South (as opposed to
>> > Nashvegas's approach of generating raw SQL migration files). This ensures
>> > modularity, one of the trademarks of Django.
>>
>> I don't think you're going to be able to ignore raw SQL migrations quite
>> that easily. Just like the ORM isn't able to express every query, there
>> will be migrations that you can't express in any schema migration
>> abstraction. Raw SQL migrations will always need to be an option (even if
>> they're feature limited).
>>
>> > Third party developers can create
>> > their own inspection and ORM versioning tools, provided the inspection tool
>> > emits python code conforming to our new migrations API.
>> >
>> > To sum up, the complete migrations framework will need, at the highest level:
>> > 1. A migrations API that accepts python code and actually performs the
>> > migrations.
>>
>> This is certainly needed.
>> I'm a little concerned by your phrasing of an "API that accepts python
>> code", though. An API is something that Python code can invoke, not the
>> other way around. We're looking for django.db.backends.migration as an
>> analog of django.db.backends.creation, not a code consuming utility library.
>>
>> > 2. An inspection tool that generates the appropriate python code after
>> > inspecting models and current state of database.
>>
>> The current consensus is that this shouldn't be Django's domain -- at
>> least, not in the first instance. It might be appropriate to expose an API
>> to extract the current model state in a Pythonic form, but a fully-fledged,
>> user accessible "tool".
>>
>> > 3. A versioning tool to keep track of migrations. This will allow 'backward'
>> > migrations.
>>
>> If backward migrations is the only reason to have a versioning tool, then
>> I'd argue you don't need versioning.
>>
>> However, that's not the only reason to have versioning, is it :-)
>>
>> > South's syncdb:
>> > class Command(NoArgsCommand):
>> >     def handle_noargs(self, migrate_all=False, **options):
>>
>> As a guide for the future -- large wads of code like this aren't very
>> compelling as part of a proposal unless you're trying to demonstrate
>> something specific. In this case, you're just duplicating some of South's
>> internals -- "I'm going to take South's lead" is all you really needed to
>> say.
>>
>> > If migrations become a core part of Django, every user app will have a
>> > migration folder(module) under it, created at the time of issuing
>> > django-admin.py startapp. Thus by modifying the startapp command to create a
>> > migrations module for every app it creates, we will be able to use South's
>> > syncdb code as is and will also save the user from issuing
>> > schemamigration --initial for all his/her apps.
>> >
>> > Now that we have a guaranteed migrations history for every user app, migrate
>> > command will also be more or less a copy of South's migrate command.
>>
>> What does this "history" look like? Are migrations named? Are they dated?
>> Numbered? How do you handle dependencies? Ordering? Collisions between
>> parallel development?
>>
>> *This* is the sort of thing a proposal should be elaborating.
>> >
>> > As much as I would have liked to use Django creation API's code for creating
>> > and destroying models, we cannot. The reason for this is Django's creation API
>> > uses its inspection tools to generate *SQL* which is then directly fed to
>> > cursor.execute. What we need is a migrations API which gobbles up *python*
>> > code generated by the inspection tool. Moreover deprecating/removing Django's
>> > creation API to use the new migrations API everywhere will give rise to
>> > performance issues since time will be wasted in generating python code and then
>> > converting python to SQL for Django's core apps which will never have
>> > migrations anyways.
>>
>> This sounds like a false economy to me. If we're talking about the core
>> pipeline for handling a HTTP request, then every method call and
>> abstraction counts. However, that's not what we're talking about. We're
>> talking about utilities used to synchronize the database. They're called by
>> manual invocation, infrequently, and *never* as part of the
>> request/response cycle.
>>
>> Yes, there will probably be a slowdown -- but we get the benefit of a
>> consistent interface to database creation. However, unless the slowdown to
>> syncdb is such that it becomes *seriously* observable -- e.g., turns syncdb
>> into a 1 minute operation, rather than a 1 second operation -- then you're
>> advocating for duplicating code paths in order to maintain a false economy.
>>
>> > The creation API and code that depends on it (syncdb, sql, django.test.simple
>> > and django.contrib.gis.db.backends) will be left as is.
>> >
>> > Therefore much of the code for our new migrations API will come from South.
>>
>> Again, the code snippet highlights nothing here. Anyone qualified to
>> review your proposal is at least familiar with South, so there's no need to
>> give a page long example of South's usage unless you're trying to say
>> something specific about South's API and usage.
>>
>> > Schedule and Goal
>> > ------------------------------------------------------------------------------
>> > Week 1    : Discussion on API design and overriding django-admin startapp
>> > Week 2-3  : Developing the base migration API
>> > Week 4    : Developing migration extensions and overrides for PostgreSQL
>> > Week 5    : Developing migration extensions and overrides for MySQL
>> > Week 6    : Developing migration extensions and overrides for SQLite
>> > Week 7    : Developing the inspection tools
>> > Week 8    : Developing the ORM versioning tools and glue code
>> > Week 9-10 : Writing tests/documentation
>> > Week 11-12: Buffer weeks for the unexpected, Oracle DB? and
>> >             django.contrib.gis.backends?
>> >
>>
>> Week 13 - profit.
>>
>> Seriously, this is a very unconvincing timetable. What are you basing
>> these estimates on?
>>
>> Some of the things that raise flags for me:
>>
>>  * What makes you think that MySQL, PostgreSQL and SQLite are all equally
>>    complex when it comes to migrations? SQLite doesn't let you rename a table.
>>    Tracking MySQL index changes is non-trivial.
>>
>>  * On what basis do you assert that "developing inspection tools" --
>>    presumably for all three databases covered in weeks 4-6 -- will take 1 week?
>>
>>  * If you're not working on tests until week 9-10, how do you plan to
>>    establish that the work you do in week 1 actually works?
>>
>> > Note: Work on Oracle and GIS may not be possible as part of GSoC
>> >
>> > I will personally consider my project to be successful if I have created and
>> > tested at least the base API + PostgreSQL extension and inspection + version
>> > tools.
>>
>> If that's the case, then why does your schedule say you're going to
>> complete MySQL and SQLite, and possibly Oracle as well?
>>
>> I can see that you're obviously enthused by this project, but as it
>> stands, I can't say this is a very compelling proposal.
>>
>>  * It ignores the most recent activity in the area (last year's GSoC, in
>>    particular)
>>
>>  * It is extremely light in detail on how some very big details (like
>>    your "versioning tools" will work)
>>
>>  * The proposed schedule reads more like a list of things you know you
>>    need to do, not a detailed work breakdown backed by realistic estimates.
>>
>> Thanks for taking the time to submit this proposal. I'd encourage you to
>> have a second swing at this. Read the recent discussions on the topic; take
>> a look at last year's GSoC proposal; and spend some time elaborating on the
>> details that I've highlighted.
>>
>> Yours,
>> Russ Magee %-)
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Django developers" group.
>> To post to this group, send email to django-developers@googlegroups.com.
>> To unsubscribe from this group, send email to
>> django-developers+unsubscr...@googlegroups.com.
>> For more options, visit this group at
>> http://groups.google.com/group/django-developers?hl=en.