Setting database default values in migrations (postgres)

2014-10-30 Thread Peter Coles
Hey all,

I'm interested in getting database migrations to support setting 
database-level default values. I've heard talk of this being a hotly 
contested issue in the past, but I haven't been able to find the actual 
conversations, so far this 

 
is the best thread I've found and I've already asked about it in 
django-users 
.

Instead of just asking for this feature, I've gone ahead and created a new 
db backend that is almost exactly 
the `django.db.backends.postgresql_psycopg2` module, but with two key code 
blocks removed (that appear to be the logic to remove default values from 
the database).

Here's the repo: https://github.com/ringly/django-postgres-dbdefaults

and here it is in pypi: 
https://pypi.python.org/pypi/django-postgres-dbdefaults/0.0.1

I'd love to hear feedback/thoughts/concerns.

-Peter

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/5bf3f984-30fb-4adf-a992-616cbdedd864%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Setting database default values in migrations (postgres)

2014-10-30 Thread Peter Coles
Thanks for the quick response! I just tried datetimefields with 
`datetime.datetime.now` and `timezone.now` as defaults (and another one 
with null=True, just to see). They just evaluated the callables and set 
those as static default values, e.g., the code generated was this:

BEGIN;
ALTER TABLE "customers" ADD COLUMN "created_datetime_now" timestamp with 
time zone DEFAULT '2014-10-29 13:31:34.519110+00:00' NOT NULL;
ALTER TABLE "customers" ADD COLUMN "created_timezone_now" timestamp with 
time zone DEFAULT '2014-10-29 13:31:34.607064+00:00' NOT NULL;
ALTER TABLE "customers" ADD COLUMN "created_null_true" timestamp with time 
zone NULL;

COMMIT;

This isn't ideal, but doesn't look too far off from what the existing 
migration does anyways. For example, I tested adding a not null boolean 
column with a default value of false to a table that had pre-existing rows, 
and the SQL for it looks like this:

BEGIN;
ALTER TABLE "temp_table" ADD COLUMN "temp" boolean NOT NULL DEFAULT false;
ALTER TABLE "temp_table" ALTER COLUMN "temp" DROP DEFAULT;
COMMIT;

If I then do a select from that table, I see that all existing columns have 
taken on the default value of false for the "temp" column.


db=# SELECT "temp" FROM "temp_table" LIMIT 5;
 temp
--
 f
 f
 f
 f
 f
(5 rows)

Even if I change the add column line to be nullable (`ALTER TABLE 
"temp_table" ADD COLUMN "temp" boolean NULL DEFAULT false`), it still 
introduces the default value to existing rows.

Given how this works in practice, I'd either label that a bug in the 
current implementation or suggest that it's one more reason that putting 
defaults in the db isn't that big of a deal.

On the other hand I haven't played around with callables for default values 
much more than with timestamps, are there some that might actually raise 
exceptions?

I'll keep playing around with this and come back later when I have more 
input on using it in practice and probably find some lurking bug somewhere. 
Worst case, other people who really want this functionality could use this 
as a 3rd-party dependency. Regardless, I think the docs should be clearer 
about defaults not getting set in the db.




On Thursday, October 30, 2014 12:47:29 PM UTC-4, Andrew Godwin wrote:
>
> Hi Peter,
>
> The main reason this hasn't been done in the past was that Django has 
> never made use of database defaults, mostly for the reason described in 
> that thread - that Django is sufficiently expressive with its defaults (you 
> can pass in any callable) that we can't represent them in the database. The 
> only reason they're ever set at all is because we need them to add columns, 
> and even then I'd considered a mode where django adds all columns as NULL 
> and calls the default once per row rather than doing one call for the whole 
> set of rows.
>
> I'm not totally against the idea, but I'd want to see a convincing method 
> as to how you detect and deal with defaults that are not expressible in the 
> database (for example, what if someone does the very common 
> default=datetime.datetime.now). Obviously all calls through Django will 
> still work as intended, but other database users will have entirely the 
> wrong value in that field whenever they insert rows but no errors to tell 
> them about it (and if you don't have other database users than Django, why 
> do you need the defaults in there in the first place?)
>
> Andrew
>
> On Thu, Oct 30, 2014 at 8:47 AM, Peter Coles  > wrote:
>
>> Hey all,
>>
>> I'm interested in getting database migrations to support setting 
>> database-level default values. I've heard talk of this being a hotly 
>> contested issue in the past, but I haven't been able to find the actual 
>> conversations, so far this 
>> <https://groups.google.com/forum/#!searchin/django-developers/database$20defaults/django-developers/fHjzttZTkzc/oXbrpBa0dHAJ>
>>  
>> is the best thread I've found and I've already asked about it in 
>> django-users 
>> <https://groups.google.com/forum/#!topic/django-users/_N2mtxFTFjI>.
>>
>> Instead of just asking for this feature, I've gone ahead and created a 
>> new db backend that is almost exactly 
>> the `django.db.backends.postgresql_psycopg2` module, but with two key code 
>> blocks removed (that appear to be the logic to remove default values from 
>> the database).
>>
>> Here's the repo: https://github.com/ringly/django-postgres-dbdefaults
>>
>> and here it is in pypi: 
>> https://pypi.python.org/pypi/django-postgres-dbdefaults/0.0.1
>>

Re: Setting database default values in migrations (postgres)

2014-10-30 Thread Peter Coles
Interesting scenario. However, the same example seems entirely broken in 
the existing django migrations with `null=False` and a default too, right?

If so, maybe our current lens is a bit too myopic, and the UNIQUE scenario 
is a separate problem where model updates on existing tables can sometimes 
create invalid migrations?

Definitely worth adding into my README though. Thank you for the great 
input so far!


On Thursday, October 30, 2014 5:00:50 PM UTC-4, Andrew Godwin wrote:
>
> Well, the one big problem with callable defaults is people who use them 
> for UNIQUE fields (e.g. they're using uuid as the value or something). 
> These migrations will apply fine to empty tables - as there's no data to 
> cause non-unique issues with - but your solution would leave an entirely 
> invalid default in there.
>
> If this does happen, I'd probably want some way to declare what defaults 
> to keep. (South actually used to have this with a keep_default option on 
> the add_column method but it was kind of unmaintained)
>
> Andrew
>
> On Thu, Oct 30, 2014 at 1:45 PM, Peter Coles  > wrote:
>
>> Thanks for the quick response! I just tried datetimefields with 
>> `datetime.datetime.now` and `timezone.now` as defaults (and another one 
>> with null=True, just to see). They just evaluated the callables and set 
>> those as static default values, e.g., the code generated was this:
>>
>> BEGIN;
>> ALTER TABLE "customers" ADD COLUMN "created_datetime_now" timestamp with 
>> time zone DEFAULT '2014-10-29 13:31:34.519110+00:00' NOT NULL;
>> ALTER TABLE "customers" ADD COLUMN "created_timezone_now" timestamp with 
>> time zone DEFAULT '2014-10-29 13:31:34.607064+00:00' NOT NULL;
>> ALTER TABLE "customers" ADD COLUMN "created_null_true" timestamp with 
>> time zone NULL;
>>
>> COMMIT;
>>
>> This isn't ideal, but doesn't look too far off from what the existing 
>> migration does anyways. For example, I tested adding a not null boolean 
>> column with a default value of false to a table that had pre-existing rows, 
>> and the SQL for it looks like this:
>>
>> BEGIN;
>> ALTER TABLE "temp_table" ADD COLUMN "temp" boolean NOT NULL DEFAULT false;
>> ALTER TABLE "temp_table" ALTER COLUMN "temp" DROP DEFAULT;
>> COMMIT;
>>
>> If I then do a select from that table, I see that all existing columns 
>> have taken on the default value of false for the "temp" column.
>>
>>
>> db=# SELECT "temp" FROM "temp_table" LIMIT 5;
>>  temp
>> --
>>  f
>>  f
>>  f
>>  f
>>  f
>> (5 rows)
>>
>> Even if I change the add column line to be nullable (`ALTER TABLE 
>> "temp_table" ADD COLUMN "temp" boolean NULL DEFAULT false`), it still 
>> introduces the default value to existing rows.
>>
>> Given how this works in practice, I'd either label that a bug in the 
>> current implementation or suggest that it's one more reason that putting 
>> defaults in the db isn't that big of a deal.
>>
>> On the other hand I haven't played around with callables for default 
>> values much more than with timestamps, are there some that might actually 
>> raise exceptions?
>>
>> I'll keep playing around with this and come back later when I have more 
>> input on using it in practice and probably find some lurking bug somewhere. 
>> Worst case, other people who really want this functionality could use this 
>> as a 3rd-party dependency. Regardless, I think the docs should be clearer 
>> about defaults not getting set in the db.
>>
>>
>>
>>
>> On Thursday, October 30, 2014 12:47:29 PM UTC-4, Andrew Godwin wrote:
>>>
>>> Hi Peter,
>>>
>>> The main reason this hasn't been done in the past was that Django has 
>>> never made use of database defaults, mostly for the reason described in 
>>> that thread - that Django is sufficiently expressive with its defaults (you 
>>> can pass in any callable) that we can't represent them in the database. The 
>>> only reason they're ever set at all is because we need them to add columns, 
>>> and even then I'd considered a mode where django adds all columns as NULL 
>>> and calls the default once per row rather than doing one call for the whole 
>>> set of rows.
>>>
>>> I'm not totally against the idea, but I'd want to see a convincing 
>>> method as to how you