Adding support for postgres 10

2017-10-01 Thread Amir Reza Ghods
Postgres 10 is coming with some exciting features such as native partitioning. 
Would it be a good idea to plan for supporting these new features in Django?

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/35ada368-f57a-4adc-8d3e-a1846027b3b2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: bulk_create on Postgresql: on conflict do nothing / post_save signal

2017-10-01 Thread Дилян Палаузов

Hello,

fetching 3GB of existing records to only pass afterwards to 
bulk_create() some non-existent ones is not feasible.


I found a way to convince Postgresql to report which rows were not 
inserted on INSERT ON CONFLICT DO UPDATE:


Consider this:
CREATE TABLE t (
  id SERIAL PRIMARY KEY,
  name VARCHAR(10) NOT NULL UNIQUE,
  comment VARCHAR(10) NOT NULL);

And now the magic:

WITH
  to_be_inserted AS (
 SELECT 'name1' AS "name", 'comment1' as "comment" UNION ALL
 SELECT 'name4',   'comment4'  UNION ALL
 SELECT 'name5',   'comment5'),
  successfully_inserted AS (
  INSERT INTO t ("name", "comment" ) SELECT *
FROM to_be_inserted ON CONFLICT DO NOTHING RETURNING *)
SELECT s.id FROM to_be_inserted AS b
  LEFT JOIN successfully_inserted AS s ON (b.name = s.name AND 
b.comment = s.comment);


Returns a column "id" where for each record from to_be_inserted the id 
is NULL for already existing records, or the new identifier.


This way bulk_create() can be implemented, so that it sends post_save 
signal for all records created, forwards ON CONFLICT DO NOTHING to the 
database and returns only the objects from its input, which were 
actually created.


Looking at the existing code, my feeling is that this query does not fit 
anyhow in the current approaches, hence I will be very glad if somebody 
gets expired from this idea and implements it in Django.


Greetings
  Дилян

On 09/28/2017 07:20 PM, Tom Forbes wrote:
I've been in similar situations before, you can usually get away with 
using a single query to fetch existing records and only pass data that 
doesn't exist to bulk_create. This works great for a single identity 
column, but if you have multiple it gets messy.


It seems all supported databases offer at least ON CONFLICT IGNORE in 
some form or another, with pretty similar syntax.


On 28 Sep 2017 18:11, "Дилян Палаузов" > wrote:



Hello,

I want after a user request to be sure that certain objects are
stored in a Postgres database, even if before the request some of
the objects were there.

The only way I can do this with django, not talking about raw sql,
is with "for obj in objects: Model.objects.get_or_create(obj)".  It
works, but creates several INSERTs, and is hence suboptimal.

I cannot use bulk_create(), which squeezes all the INSERTs to a
single one, as it does not work, if any of the to-be-inserted rows
was already in the database.

In Postgresql this can be achieved by sending "INSERT ... ON
CONFLICT DO NOTHING".

I propose changing the interface of QuerySet.bulk_create to accept
one more parameter on_conflict, that can be a string e.g. "DO
NOTHING" or Q-objects (which could be used to implement ON CONFLICT
DO UPDATE SET ... WHERE ...

def bulk_create(self, objs, batch_size=None, on_conflict=None): ...

What are the arguments against or in favour?

The further, bulk_create() does not send post_save signal, because
it is difficult to implement with the standard backends, except with
postgresql.

I propose extending the implementation to send the signal:

https://code.djangoproject.com/ticket/28641#comment:1


when Postgresql is used.  I assume there a no users, who want to get
a (post_save) signal on save() but not on bulk_create().

Combining ON CONFLICT DO NOTHING with sending post_save gets however
nasty, as "INSERT ... ON CONFLICT DO NOTHING RETURNING id;" does not
return anything on unchanged rows, hence the system knows at the end
how much rows were changed, but not which, so it cannot determine
for which objects to send post_save.  At least I have not found a
way how to figure out which rows were inserted/not inserted.

However, this can be achieved by RETURNING * and then comparing the
returned objects to the sent objects, eventually making
bulk_create() return the objects actually inserted in the database.

These changes will allow a switch to a single INSERT on Postgresql.

Regards
   Дилян

-- 
You received this message because you are subscribed to the Google

Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to django-developers+unsubscr...@googlegroups.com
.
To post to this group, send email to
django-developers@googlegroups.com
.
Visit this group at
https://groups.google.com/group/django-developers
.
To view this discussion on the web visit

https://groups.google.com/d/msgid/django-developers/daa88462-c095-dfcc-2ce7-6d34f6bbc2f6%40aegee.org



non-concurrent QuerySet.get_or_create() on Postgresql

2017-10-01 Thread Дилян Палаузов

Hello,

currently get_or_create(params) is implemented (imprecisely) this way:

try:
  self.get(params)
except DoesNotExist:
  self._create_object_from_params(params)

This creates concurrency problem, as the object might get created by 
another thread, after get(params) threw an exception and before 
_create_object_from_params has started.


Luckily Postgresql lets combine the above statements into a single 
structured query.  This should be a performance gain as for the DB is 
faster to execute one than a two queries.


Consider this:

CREATE TABLE t (
  id SERIAL PRIMARY KEY,
  name VARCHAR(10) NOT NULL UNIQUE,
  comment VARCHAR(10));

And here comes the magic:

WITH
  to_be_inserted AS (SELECT 'nameD' as "name", 'comment13' as "comment"),
  just_inserted AS (
 INSERT INTO t (name, comment) SELECT * FROM to_be_inserted
WHERE NOT EXISTS(
 SELECT * FROM t WHERE t.name='nameD')
 RETURNING *)
SELECT *, FALSE as "created" FROM t WHERE t.name='nameD' UNION ALL 
SELECT *, TRUE AS "created" FROM just_inserted;


where "to_be_inserted contains" the values for the new object ('default' 
parameter of get_or_create) and 'nameB' is the criterion passed to get().


Regards
  Дилян

--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/f82e01e6-0bd0-1a44-957d-1e1544593af6%40aegee.org.
For more options, visit https://groups.google.com/d/optout.


Prevent DateRangeField overlap in model?

2017-10-01 Thread Brylie Christopher Oxley
Now that Django supports the DateRangeField, is there a 'Pythonic' way to 
prevent records from having overlapping date ranges?

# Hypothetical use case
One hypothetical use case would be a booking system, where you don't want 
people to book the same resource at the same time.

# Hypothetical example code


class Booking(models.model):
# The resource to be reserved
resource = models.ForeignKey('Resource')
# When to reserve the resource
date_range = models.DateRangeField()

class Meta:
unique_together = ('resource', 'date_range',)

This might have the effect of adding an exclusion constraint on the underlying 
table:
https://www.postgresql.org/docs/9.3/static/sql-createtable.html#SQL-CREATETABLE-EXCLUDE
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/38774F1F-C132-4F55-A72F-A4449C0A14D9%40amble.fi.
For more options, visit https://groups.google.com/d/optout.