Hi Alex,

On Sat, Sep 27, 2014 at 5:48 PM, Alex Gaynor <alex.gay...@gmail.com> wrote:
> Thanks for reaching out about this!

No problem. I'm not sure if you recall, but we actually met before, in
San Francisco during Waza 2013.

> From the docs, the syntax looks pretty reasonable, it would be nice to see
> what some examples look like though.

There are a couple of examples at the end of the sql-insert.html
INSERT reference page. There could probably stand to be more. There
isn't an example featuring the use of a WHERE clause, though.

The WHERE clause is a bit odd because It is only evaluated on a
conclusively visible, locked row version (and if the predicate isn't
satisfied, the row is still locked, unlike a regular UPDATE). So,
potentially, you could have the predicate satisfied, where the
command's MVCC snapshot would not see a row version that you'd expect
to be satisfied by it. It might just so happen that a later row
version - the latest one - did happen to be satisfied by the predicate
at the same time, because some other session updated the row in a way
that made that the case. This is kind of esoteric, but is a new thing.
Although there have always been similar odd things for regular
UPDATEs. It's also kind of a new thing that you can update something
that is committed, and exists, even though *no* version is visible to
the command's MVCC snapshot. You have to be using READ COMMITTED for
this to happen, though, and if you are, the next command will have a
new snapshot anyway, and so it's probably a pretty academic concern
for Django people in practice.

UPSERT is more or less about reconciling two different ideas of
reality: The snapshot isolation idea, and the physical, objective
reality of whether or not an index tuple exists in a unique index at
this time. Spookiness like this is inevitable, unless you throw an
error instead, which I strongly consider to be a cop-out. If you
wanted that, you'd have used a higher isolation level.

> Yes, I think so:
> https://docs.djangoproject.com/en/dev/ref/models/querysets/#update-or-create
> exists, but right now it's manually implemented using savepoints (IIRC) -- I
> think strictly speaking it isn't safe.

I recall talking to a client about a race condition within
get_or_create() back in my consulting days.

The rule of thumb with the subtransaction looping pattern is that you
have to have a unique index defined to determine when a duplicate is
inserted according to your particular idea of a duplicate (obviously),
and you must be prepared to retry. So typically you update, then you
insert. Loop until one of those two works. If you just try both once,
there are still race conditions.

What I think would be particularly interesting to hear is your level
of interest in the optional "WITHIN unique_index_name" clause. The
idea here is that you can name a unique index directly, and have that
explicitly be the place where the violation is expected, if anywhere
(there ought to be an expectation of having such a violation be in one
place, or else surprising things can happen).

Now, I'm sure the first thing you'll say, as others have is: "Can't it
be the name of some columns instead, so that Postgres figures out
which unique index I mean"? It turns out that that's quite ticklish in
certain edge cases (e.g. partial unique indexes with BEFORE triggers).
We might come up with a better way that's fully general, but I'm not
holding my breath.

> I think even under MySQL it doesn't use the native support for this feature,
> having it in PostgreSQL as well would be good motivation.

Cool.

-- 
Peter Geoghegan

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAM3SWZQx%3DR2HxdSMBNTM1XFY9phyj0kq4wysRaFykvPVqdULaA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to