#36489: OneToOneField + concurrent get_or_create results in wrong object in 
field
cache
-------------------------------------+-------------------------------------
     Reporter:  Brian Atkinson       |                    Owner:  (none)
         Type:  Bug                  |                   Status:  new
    Component:  Database layer       |                  Version:  5.2
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:  Accepted
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

 Thank you for your very detailed report. I understand this is a tricky
 issue to diagnose, particularly because it should be rare under normal
 circumstances, but it's hard to determine what should be the right thing
 to do in this case as the ORM doesn't maintain an identity map (e.g. like
 SQLAlchemy session for example) so it has to balance a delicate sequence
 of cache clearing.

 A naive approach could be the following but likely breaks usage of
 `select_related("parent").get_or_create(parent=parent)` by forcing an
 extra query as it systematically clears the cache

 {{{#!diff
 diff --git a/django/db/models/query.py b/django/db/models/query.py
 index 63ab4a873a..b03f318076 100644
 --- a/django/db/models/query.py
 +++ b/django/db/models/query.py
 @@ -18,6 +18,7 @@
      IntegrityError,
      NotSupportedError,
      connections,
 +    models,
      router,
      transaction,
  )
 @@ -984,9 +984,19 @@ def get_or_create(self, defaults=None, **kwargs):
                      return self.create(**params), True
              except IntegrityError:
                  try:
 -                    return self.get(**kwargs), False
 +                    obj = self.get(**kwargs)
                  except self.model.DoesNotExist:
                      pass
 +                else:
 +                    get_field = self.model._meta.get_field
 +                    for field_name, value in kwargs.items():
 +                        if not isinstance(value, models.Model):
 +                            continue
 +                        try:
 +                            field = get_field(field_name)
 +                        except exceptions.FieldDoesNotExist:
 +                            continue
 +                        if getattr(field, "one_to_one", False):
 +                            field.remote_field.delete_cached_value(value)
 +                    return obj, False
                  raise

      get_or_create.alters_data = True
 }}}

 while the ORM doesn't have a session wide concept of an identity map
 though it has one per queryset stored in the
 `QuerySet.known_related_objects: dict[Field, dict]` attribute so it might
 be usable for this purpose? It seems like it would be coherent with how
 ''many'' related managers work by maintaining affinity (e.g.
 `author.books.get_or_create`)

 {{{#!diff
 diff --git a/django/db/models/query.py b/django/db/models/query.py
 index 63ab4a873a..b03f318076 100644
 --- a/django/db/models/query.py
 +++ b/django/db/models/query.py
 @@ -18,6 +18,7 @@
      IntegrityError,
      NotSupportedError,
      connections,
 +    models,
      router,
      transaction,
  )
 @@ -973,18 +974,34 @@ def get_or_create(self, defaults=None, **kwargs):
          # The get() needs to be targeted at the write database in order
          # to avoid potential transaction consistency problems.
          self._for_write = True
 +        get_field = self.model._meta.get_field
 +        known_related_objects = {}
 +        for field_name, value in kwargs.items():
 +            if not isinstance(value, models.Model):
 +                continue
 +            try:
 +                field = get_field(field_name)
 +            except exceptions.FieldDoesNotExist:
 +                continue
 +            if field.remote_field:
 +                known_related_objects.setdefault(field, {})[value.pk] =
 value
 +        qs = self
 +        if known_related_objects:
 +            qs = qs._clone()
 +            for field, objects in known_related_objects.items():
 +                qs._known_related_objects.setdefault(field,
 {}).update(objects)
          try:
 -            return self.get(**kwargs), False
 +            return qs.get(**kwargs), False
          except self.model.DoesNotExist:
              params = self._extract_model_params(defaults, **kwargs)
              # Try to create an object using passed params.
              try:
                  with transaction.atomic(using=self.db):
                      params = dict(resolve_callables(params))
 -                    return self.create(**params), True
 +                    return qs.create(**params), True
              except IntegrityError:
                  try:
 -                    return self.get(**kwargs), False
 +                    return qs.get(**kwargs), False
                  except self.model.DoesNotExist:
                      pass
                  raise
 }}}

 it does bring a few questions though mainly should `update_or_create` be
 updated as well and to what extent? Should `defaults` and
 `create_defaults` also be considered? Only when creation actually takes
 place or systematically?
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36489#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/django-updates/01070197c7f85a0f-bb840615-b960-44b3-bdee-cf3d54c25446-000000%40eu-central-1.amazonses.com.

Reply via email to