Less than a week remains before the student application deadline. Could
someone please comment on the above revised proposal? Thanks a lot.

On Monday, 26 March 2012 01:29:35 UTC+5:30, j4nu5 wrote:
>
> Here is a revised proposal.
>
> Abstract
>
> ------------------------------------------------------------------------------
> A database migration helper has been one of the longest-standing feature
> requests in Django. Though Django has an excellent database creation helper,
> when faced with schema design changes, developers have to resort to either
> writing raw SQL and manually performing the migrations, or using third-party
> apps like South[1] and Nashvegas[2].
>
> [1] http://south.aeracode.org/
> [2] https://github.com/paltman/nashvegas/
>
> Clearly Django will benefit from having a database migration helper as an
> integral part of its codebase.
>
> From the summary on the django-developers mailing list[3], building a
> migrations framework will involve the following steps:
> 1. Add a db.backends module to provide an abstract interface to migration
>    primitives (add column, add index, rename column, rename table, and so
>    on).
> 2. Add a contrib app that performs the high-level accounting of "has
>    migration X been applied", and management commands to "apply all
>    outstanding migrations".
> 3. Provide an API that allows end users to define raw-SQL migrations, or
>    native Python migrations using the backend primitives.
> 4. Leave the hard task of determining dependencies, introspection of
>    database models and so on to the toolset contributed by the broader
>    community.
>
> [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>
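To make the accounting in step 2 a little more concrete, here is a rough
sketch using Python's sqlite3 module. Everything named here (the
applied_migrations table, is_applied, apply_outstanding, the migration names)
is made up for illustration; none of it is existing Django or South API, and
the real contrib app may look quite different.

```python
import sqlite3

# Hypothetical bookkeeping table; the name and columns are illustrative only.
BOOKKEEPING_DDL = """
CREATE TABLE IF NOT EXISTS applied_migrations (
    app  TEXT NOT NULL,
    name TEXT NOT NULL,
    PRIMARY KEY (app, name)
)
"""


def is_applied(conn, app, name):
    """Answer "has migration X been applied?" from the bookkeeping table."""
    row = conn.execute(
        "SELECT 1 FROM applied_migrations WHERE app = ? AND name = ?",
        (app, name),
    ).fetchone()
    return row is not None


def apply_outstanding(conn, app, migrations):
    """Run every not-yet-applied migration, in the order given.

    `migrations` is a list of (name, callable) pairs; each callable receives
    the connection and performs its schema change.
    """
    conn.execute(BOOKKEEPING_DDL)
    newly_applied = []
    for name, forwards in migrations:
        if is_applied(conn, app, name):
            continue
        forwards(conn)
        conn.execute(
            "INSERT INTO applied_migrations (app, name) VALUES (?, ?)",
            (app, name),
        )
        newly_applied.append(name)
    conn.commit()
    return newly_applied


def demo():
    conn = sqlite3.connect(":memory:")
    migrations = [
        ("0001_create_book",
         lambda c: c.execute("CREATE TABLE book (id INTEGER PRIMARY KEY)")),
        ("0002_add_title",
         lambda c: c.execute("ALTER TABLE book ADD COLUMN title TEXT")),
    ]
    first = apply_outstanding(conn, "library", migrations)
    second = apply_outstanding(conn, "library", migrations)
    return first, second
```

Calling apply_outstanding a second time with the same list does nothing,
because both names are already recorded in the bookkeeping table.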
> I would like to work on the 1st step as part of this year's GSoC.
>
>
> Implementation plan
>
> ------------------------------------------------------------------------------
> The idea is to have a CRUD interface to the database schema (with some
> additional utility functions for indexing etc.) with functions like:
> * create_table
> * rename_table
> * delete_table
> * add_column
> and so on, each of which will take the *explicit* names of the table/column
> to be modified as parameters. It will be the responsibility of the
> higher-level API caller (not undertaken as part of GSoC) to translate
> model/field names to explicit table/column names. These functions will be
> directly responsible for modifying the schema, and any interaction with the
> database schema will take place by calling these functions. Most of these
> functions will come from South.
>
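A minimal sketch of what such an interface might look like, run against an
in-memory SQLite database. The class name SchemaOperations, the quote helper
and the method signatures are my own placeholders, not the final API and not
South's; the point is only that every call receives explicit table/column
names and knows nothing about models.

```python
import sqlite3


def quote(name):
    """Quote an identifier with double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


class SchemaOperations:
    """Hypothetical primitives; all arguments are explicit table/column names."""

    def __init__(self, connection):
        self.connection = connection

    def create_table(self, table, columns):
        # `columns` is a list of (column_name, sql_type) pairs.
        body = ", ".join(f"{quote(col)} {sql_type}" for col, sql_type in columns)
        self.connection.execute(f"CREATE TABLE {quote(table)} ({body})")

    def rename_table(self, old_name, new_name):
        self.connection.execute(
            f"ALTER TABLE {quote(old_name)} RENAME TO {quote(new_name)}"
        )

    def delete_table(self, table):
        self.connection.execute(f"DROP TABLE {quote(table)}")

    def add_column(self, table, column, sql_type):
        self.connection.execute(
            f"ALTER TABLE {quote(table)} ADD COLUMN {quote(column)} {sql_type}"
        )


def column_names(connection, table):
    """Helper for the demo: list a table's columns using SQLite's PRAGMA."""
    rows = connection.execute(f"PRAGMA table_info({quote(table)})").fetchall()
    return [row[1] for row in rows]


def table_exists(connection, table):
    """Helper for the demo: look the table up in sqlite_master."""
    row = connection.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


def demo():
    conn = sqlite3.connect(":memory:")
    ops = SchemaOperations(conn)
    ops.create_table("library_book", [("id", "INTEGER PRIMARY KEY")])
    ops.add_column("library_book", "title", "TEXT")
    columns = column_names(conn, "library_book")
    ops.delete_table("library_book")
    return columns, table_exists(conn, "library_book")
```

A higher-level caller would be the one mapping, say, a Book model in a library
app to the table name "library_book" before calling these functions.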
> These API functions will also have a "dry-run" or test mode, in which they
> will output the raw SQL representation of the migration or display errors if
> they occur. This will be useful in:
> 1. The MySQL backend. MySQL does not have transaction support for schema
>    modification, and hence the migrations will be run in dry-run mode first
>    so that any errors can be captured before altering the schema.
> 2. The django-admin commands sql and sqlall, which return the SQL (for
>    creation and indexing) for an app. They will capture the SQL returned
>    from the API running in dry-run mode.
>
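The dry-run idea could be sketched roughly as below. SqlRunner and its
attributes are hypothetical names chosen for this example. Note that the
sketch only covers the "collect the SQL instead of running it" half; how
errors would be detected in dry-run mode without touching the schema is left
open here.

```python
import sqlite3


class SqlRunner:
    """Hypothetical helper: executes statements, or only records them."""

    def __init__(self, connection, dry_run=False):
        self.connection = connection
        self.dry_run = dry_run
        self.collected = []  # every statement passed to run(), in order

    def run(self, sql):
        self.collected.append(sql)
        if not self.dry_run:
            self.connection.execute(sql)


def add_column(runner, table, column, sql_type):
    # Explicit table/column names; identifier quoting omitted for brevity.
    runner.run(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")


def table_columns(connection, table):
    """Helper for the demo: list a table's columns using SQLite's PRAGMA."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY)")

    # Dry run: the statement is recorded but the schema is left alone.
    preview = SqlRunner(conn, dry_run=True)
    add_column(preview, "book", "title", "TEXT")
    columns_after_dry_run = table_columns(conn, "book")

    # Real run: the same call now alters the table.
    real = SqlRunner(conn)
    add_column(real, "book", "title", "TEXT")
    columns_after_real_run = table_columns(conn, "book")

    return preview.collected, columns_after_dry_run, columns_after_real_run
```

A command like sqlall could then print the collected statements from a runner
in dry-run mode rather than building the SQL through a separate code path.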
> As for the future of the current Django creation API, it will have to be
> refactored (not under GSoC) to make use of the 'create' part of our new CRUD
> interface, for consistency.
>
> The GeoDjango backends will also have to be refactored to use the new API.
> Since they build upon the base code in db.backends, db.backends will have
> to be refactored first.
>
> Last year xtrqt wrote, documented and tested code for at least the
> SQLite backend[4]. As per Andrew's suggestion, I will not rely too much
> on that code, but some parts can still be salvaged.
>
> [4] https://groups.google.com/forum/?fromgroups#!searchin/django-developers/xtrqt/django-developers/pSICNJBJRy8/Hl7frp-O-dMJ
>
>
> Schedule and Goal
>
> ------------------------------------------------------------------------------
> Week 1     : Discussion on API design and writing tests
> Week 2-3   : Developing the base migration API
> Week 4     : Developing extensions and overrides for PostgreSQL
> Week 5-6   : Developing extensions and overrides for MySQL
> Week 7-8.5 : Developing extensions and overrides for SQLite (may be shorter
>              or longer (by 0.5 week) depending on how much of xtrqt's code
>              is considered acceptable)
> Week 8.5-10: Writing documentation and leftover regression tests, if any
> Week 11-12 : Buffer weeks for the unexpected
>
> I will consider my project to be successful when we have working, tested and
> documented migration primitives for Postgres, MySQL and SQLite. If we can
> develop a working fork of South that uses these primitives, that will be a
> strong indicator of the project's success.
>
>
> About me and my inspiration for the project
>
> ------------------------------------------------------------------------------
> I am Kushagra Sinha, a pre-final year student at Institute of Technology
> (about to be converted to an Indian Institute of Technology),
> Banaras Hindu University, Varanasi, India.
>
> I can be reached at:
> Gmail: sinha.kushagra
> Alternative email: kush [at] j4nu5 [dot] com
> IRC: Nick j4nu5 on #django-dev and #django
> Twitter: @j4nu5
> github: j4nu5
>
> I had happily used PHP for nearly all of my webdev work since my high school
> days (CakePHP being my framework of choice) till I was introduced to Django
> a year and a half ago. Comparing Django with CakePHP (which is inspired by
> Ruby on Rails), I felt more attached to Django's philosophy than RoR's
> "hidden magic" approach. I have been in love ever since :)
>
> Last year I had an internship at MobStac[5] (named India's hottest young
> startup by BusinessWorld magazine[6]). Their stack is Django+MySQL. I was
> involved in a heavy database migration of their analytics platform. Since
> they had not been using a migrations framework, the situation looked grim.
> Fortunately, South came to the rescue and we were able to carry out the
> migration, but it left everyone a little frustrated and clearly in want of a
> migrations framework built into Django itself.
>
> [5] http://mobstac.com/
> [6] http://blog.mobstac.com/blog/2011/06/businessworld-declares-mobstac-indias-hottest-young-startup/
>
>
> Experience
>
> ------------------------------------------------------------------------------
> I have experience working on a high-pressure database migration through my
> internship, as stated before. I am also familiar with Django's contribution
> guidelines and have written a couple of patches[7]. One patch has been
> accepted and the second got blocked by 1.4's feature freeze.
> My other projects can be seen on my github[8].
>
> [7] https://code.djangoproject.com/query?owner=~j4nu5
> [8] https://github.com/j4nu5
>
> On Mon, Mar 19, 2012 at 5:03 AM, Russell Keith-Magee <
> russ...@keith-magee.com> wrote:
>
>>
>> On 18/03/2012, at 7:38 PM, Kushagra Sinha wrote:
>>
>> > Abstract
>> > 
>> > ------------------------------------------------------------------------------
>> > A database migration helper has been one of the most long standing feature
>> > requests in Django. Though Django has an excellent database creation helper,
>> > when faced with schema design changes, developers have to resort to either
>> > writing raw SQL and manually performing the migrations, or using third party
>> > apps like South[1] and Nashvegas[2].
>> >
>> > Clearly Django will benefit from having a database migration helper as an
>> > integral part of its codebase.
>> >
>> > From [3], the consensus seems to be on building a Ruby on Rails ActiveRecord
>> > Migrations[4] like framework, which will essentially emit python code after
>> > inspecting user models and current state of the database.
>>
>> Check the edit dates on that wiki -- most of the content on that page is 
>> historical, reflecting discussions that were happening over 3 years ago. 
>> There have been many more recent discussions.
>>
>> The "current consensus" (at least, the consensus of what the core team is 
>> likely to accept) is better reflected by the GSoC project that was 
>> accepted, but not completed last year. I posted to Django-developers about 
>> this a week or so ago [1]; there were some follow up conversations in that 
>> thread, too [2].
>>
>> [1] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>> [2] http://groups.google.com/group/django-developers/msg/2f287e5e3dc9f459
>>
>> > The python code
>> > generated will then be fed to a 'migrations API' that will actually handle the
>> > task of migration. This is the approach followed by South (as opposed to
>> > Nashvegas's approach of generating raw SQL migration files). This ensures
>> > modularity, one of the trademarks of Django.
>>
>> I don't think you're going to be able to ignore raw SQL migrations quite 
>> that easily. Just like the ORM isn't able to express every query, there 
>> will be migrations that you can't express in any schema migration 
>> abstraction. Raw SQL migrations will always need to be an option (even if 
>> they're feature limited).
>>
>> > Third party developers can create
>> > their own inspection and ORM versioning tools, provided the inspection tool
>> > emits python code conforming to our new migrations API.
>> >
>> > To sum up, the complete migrations framework will need, at the highest level:
>> > 1. A migrations API that accepts python code and actually performs the
>> >    migrations.
>>
>> This is certainly needed. I'm a little concerned by your phrasing of an 
>> "API that accepts python code", though. An API is something that Python 
>> code can invoke, not the other way around. We're looking for 
>> django.db.backends.migration as an analog of django.db.backends.creation, 
>> not a code consuming utility library.
>>
>> > 2. An inspection tool that generates the appropriate python code after
>> >    inspecting models and current state of database.
>>
>> The current consensus is that this shouldn't be Django's domain -- at 
>> least, not in the first instance. It might be appropriate to expose an API 
>> to extract the current model state in a Pythonic form, but not a
>> fully-fledged, user accessible "tool".
>>
>> > 3. A versioning tool to keep track of migrations. This will allow 'backward'
>> >    migrations.
>>
>> If backward migrations are the only reason to have a versioning tool, then
>> I'd argue you don't need versioning.
>>
>> However, that's not the only reason to have versioning, is it :-)
>>
>> > South's syncdb:
>> > class Command(NoArgsCommand):
>> >     def handle_noargs(self, migrate_all=False, **options):
>>
>> As a guide for the future -- large wads of code like this aren't very 
>> compelling as part of a proposal unless you're trying to demonstrate 
>> something specific. In this case, you're just duplicating some of South's 
>> internals -- "I'm going to take South's lead" is all you really needed to 
>> say.
>>
>> > If migrations become a core part of Django, every user app will have a
>> > migration folder(module) under it, created at the time of issuing
>> > django-admin.py startapp. Thus by modifying the startapp command to create a
>> > migrations module for every app it creates, we will be able to use South's
>> > syncdb code as is and will also save the user from issuing
>> > schemamigration --initial for all his/her apps.
>> >
>> > Now that we have a guaranteed migrations history for every user app, migrate
>> > command will also be more or less a copy of South's migrate command.
>>
>> What does this "history" look like? Are migrations named? Are they dated? 
>> Numbered? How do you handle dependencies? Ordering? Collisions between 
>> parallel development?
>>
>> *This* is the sort of thing a proposal should be elaborating.
>> >
>> > As much as I would have liked to use Django creation API's code for creating
>> > and destroying models, we cannot. The reason for this is Django's creation API
>> > uses its inspection tools to generate *SQL* which is then directly fed to
>> > cursor.execute. What we need is a migrations API which gobbles up *python*
>> > code generated by the inspection tool. Moreover deprecating/removing Django's
>> > creation API to use the new migrations API everywhere will give rise to
>> > performance issues since time will be wasted in generating python code and then
>> > converting python to SQL for Django's core apps which will never have
>> > migrations anyways.
>>
>> This sounds like a false economy to me. If we're talking about the core 
>> pipeline for handling a HTTP request, then every method call and 
>> abstraction counts. However, that's not what we're talking about. We're 
>> talking about utilities used to synchronize the database. They're called by 
>> manual invocation, infrequently, and *never* as part of the 
>> request/response cycle.
>>
>> Yes, there will probably be a slowdown -- but we get the benefit of a 
>> consistent interface to database creation. However, unless the slowdown to 
>> syncdb is such that it becomes *seriously* observable -- e.g., turns syncdb
>> into a 1 minute operation, rather than a 1 second operation -- then you're 
>> advocating for duplicating code paths in order to maintain a false economy.
>>
>> > The creation API and code that depends on it (syncdb, sql, django.test.simple
>> > and django.contrib.gis.db.backends) will be left as is.
>> >
>> > Therefore much of the code for our new migrations API will come from South.
>>
>> Again, the code snippet highlights nothing here. Anyone qualified to 
>> review your proposal is at least familiar with South, so there's no need to 
>> give a page long example of South's usage unless you're trying to say 
>> something specific about South's API and usage.
>>
>> > Schedule and Goal
>> > 
>> > ------------------------------------------------------------------------------
>> > Week 1    : Discussion on API design and overriding django-admin startapp
>> > Week 2-3  : Developing the base migration API
>> > Week 4    : Developing migration extensions and overrides for PostgreSQL
>> > Week 5    : Developing migration extensions and overrides for MySQL
>> > Week 6    : Developing migration extensions and overrides for SQLite
>> > Week 7    : Developing the inspection tools
>> > Week 8    : Developing the ORM versioning tools and glue code
>> > Week 9-10 : Writing tests/documentation
>> > Week 11-12: Buffer weeks for the unexpected, Oracle DB? and
>> >             django.contrib.gis.backends?
>> >
>>
>> Week 13 - profit.
>>
>> Seriously, this is a very unconvincing timetable. What are you basing 
>> these estimates on?
>>
>> Some of the things that raise flags for me:
>>
>>  * What makes you think that MySQL, PostgreSQL and SQLite are all equally 
>> complex when it comes to migrations? SQLite doesn't let you rename a table. 
>> Tracking MySQL index changes is non-trivial.
>>
>>  * On what basis do you assert that "developing inspection tools" --
>> presumably for all three databases covered in weeks 4-6 -- will take 1 week?
>>
>>  * If you're not working on tests until week 9-10, how do you plan to 
>> establish that the work you do in week 1 actually works?
>>
>> > Note: Work on Oracle and GIS may not be possible as part of GSoC
>> >
>> > I will personally consider my project to be successful if I have created and
>> > tested at least the base API + PostgreSQL extension and inspection + version
>> > tools.
>>
>> If that's the case, then why does your schedule say you're going to 
>> complete MySQL and SQLite, and possibly Oracle as well?
>>
>> I can see that you're obviously enthused by this project, but as it 
>> stands, I can't say this is a very compelling proposal.
>>
>>  * It ignores the most recent activity in the area (last year's GSoC, in 
>> particular)
>>
>>  * It is extremely light in detail on how some very big details (like
>> your "versioning tools") will work
>>
>>  * The proposed schedule reads more like a list of things you know you 
>> need to do, not a detailed work breakdown backed by realistic estimates.
>>
>> Thanks for taking the time to submit this proposal. I'd encourage you to 
>> have a second swing at this. Read the recent discussions on the topic; take 
>> a look at last year's GSoC proposal; and spend some time elaborating on the 
>> details that I've highlighted.
>>
>> Yours,
>> Russ Magee %-)
>>
>> --
>> You received this message because you are subscribed to the Google Groups 
>> "Django developers" group.
>> To post to this group, send email to django-developers@googlegroups.com.
>> To unsubscribe from this group, send email to 
>> django-developers+unsubscr...@googlegroups.com.
>> For more options, visit this group at 
>> http://groups.google.com/group/django-developers?hl=en.
>>
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-developers/-/nfJvnjObKKsJ.
