#36318: Bad stack trace during rollback after bulk create
--------------------------------+--------------------------------------
Reporter: Gordon Wrigley | Owner: (none)
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 4.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+--------------------------------------
Changes (by Gordon Wrigley):
* summary: Bad stack trace during rollback => Bad stack trace during
rollback after bulk create
Old description:
> I'M REWRITING ALL OF THIS
>
> TL;DNR the value of `connection.rollback_exc` is being set in one celery
> task and then raised in another later celery task.
>
> I'm on 4.2.20 and I got this stack trace from a celery worker
>
> {{{
> UniqueViolation: duplicate key value violates unique constraint
> "retain_consumer_firm_id_email_hash_3f1266dc_uniq"
> DETAIL: Key (firm_id, email_hash)=(ac90f753-3f10-44f2-af43-b851371c5572,
> 863397f56c061afbaa0822c9c16ece2b55f2e972) already exists.
>
> File "django/db/backends/utils.py", line 89, in _execute
> return self.cursor.execute(sql, params)
>
> IntegrityError: duplicate key value violates unique constraint
> "retain_consumer_firm_id_email_hash_3f1266dc_uniq"
> DETAIL: Key (firm_id, email_hash)=(ac90f753-3f10-44f2-af43-b851371c5572,
> 863397f56c061afbaa0822c9c16ece2b55f2e972) already exists.
>
> File "data_import/import_from_csv.py", line 1692, in _consumer_from_row
> consumer.save()
> File "core/models/consumer.py", line 615, in save
> return super().save(*args, **kwargs)
> File "core/models/firm_user.py", line 197, in save
> super().save(update_fields=update_fields, **kwargs)
> File "common/models.py", line 900, in save
> super().save(*args, **kwargs)
> File "django/contrib/auth/base_user.py", line 76, in save
> super().save(*args, **kwargs)
> File "django/db/models/base.py", line 814, in save
> self.save_base(
> File "django/db/models/base.py", line 877, in save_base
> updated = self._save_table(
> File "django/db/models/base.py", line 1020, in _save_table
> results = self._do_insert(
> File "django/db/models/base.py", line 1061, in _do_insert
> return manager._insert(
> File "django/db/models/manager.py", line 87, in manager_method
> return getattr(self.get_queryset(), name)(*args, **kwargs)
> File "django/db/models/query.py", line 1805, in _insert
> return query.get_compiler(using=using).execute_sql(returning_fields)
> File "django/db/models/sql/compiler.py", line 1822, in execute_sql
> cursor.execute(sql, params)
> File "django/db/backends/utils.py", line 67, in execute
> return self._execute_with_wrappers(
> File "django/db/backends/utils.py", line 80, in _execute_with_wrappers
> return executor(sql, params, many, context)
> File "django/db/backends/utils.py", line 84, in _execute
> with self.db.wrap_database_errors:
> File "django/db/utils.py", line 91, in __exit__
> raise dj_exc_value.with_traceback(traceback) from exc_value
> File "django/db/backends/utils.py", line 89, in _execute
> return self.cursor.execute(sql, params)
>
> TransactionManagementError: An error occurred in the current transaction.
> You can't execute queries until the end of the 'atomic' block.
> File "celery/app/trace.py", line 760, in __protected_call__
> return self.run(*args, **kwargs)
> File "common/tasks.py", line 19, in wrapper
> return func(*args, **kwargs)
> File "contextlib.py", line 81, in inner
> return func(*args, **kwds)
> File "product_selection/tasks.py", line 129, in try_product_import
> product_import.save()
> File "common/models.py", line 900, in save
> super().save(*args, **kwargs)
> File "django/db/models/base.py", line 814, in save
> self.save_base(
> File "django/db/models/base.py", line 877, in save_base
> updated = self._save_table(
> File "django/db/models/base.py", line 990, in _save_table
> updated = self._do_update(
> File "django/db/models/base.py", line 1054, in _do_update
> return filtered._update(values) > 0
> File "django/db/models/query.py", line 1231, in _update
> return query.get_compiler(self.db).execute_sql(CURSOR)
> File "django/db/models/sql/compiler.py", line 1984, in execute_sql
> cursor = super().execute_sql(result_type)
> File "django/db/models/sql/compiler.py", line 1562, in execute_sql
> cursor.execute(sql, params)
> File "django/db/backends/utils.py", line 67, in execute
> return self._execute_with_wrappers(
> File "django/db/backends/utils.py", line 80, in _execute_with_wrappers
> return executor(sql, params, many, context)
> File "django/db/backends/utils.py", line 83, in _execute
> self.db.validate_no_broken_transaction()
> File "django/db/backends/base/base.py", line 531, in
> validate_no_broken_transaction
> raise TransactionManagementError(
> }}}
>
> There's two related problems with this. First the lines around `File
> "data_import/import_from_csv.py", line 1692, in _consumer_from_row` look
> like this
> {{{
> try:
> with transaction.atomic():
> consumer.save()
> if change_log:
> self.record_log(consumer, change_log)
> except IntegrityError:
> }}}
> So that has been well handled and the transaction is good again.
>
> Secondly there is no relationship between the `File
> "product_selection/tasks.py", line 129, in try_product_import` code path
> from the `TransactionManagementError` and the `File
> "data_import/import_from_csv.py", line 1692` in _consumer_from_row` code
> path from the `IntegrityError` and `UniqueViolation`, these are totally
> unrelated celery tasks.
>
> It looks for all intents as if the stack for the
> `IntegrityError`/`UniqueViolation` has been captured in one celery task
> and then raised in a different later celery task.
>
> Digging through the code and adding prints supports this view.
> 1. `connection.rollback_exc` is never cleared anywhere
> 2. `Atomic.__exit__` can set `connection.needs_rollback = True` when it
> wasn't true already and without setting `connection.rollback_exc` this
> creates the possibility for the observed behaviour.
>
> I'm not sure how critical it is to reproduction, but the actual error
> that killed the transaction looks like this
> {{{
> decimal.InvalidOperation: [<class 'decimal.ConversionSyntax'>]
> File "django/db/models/fields/__init__.py", line 1741, in to_python
> decimal_value = decimal.Decimal(value)
>
> django.core.exceptions.ValidationError: ['“3 3 3 2 1” value must be a
> decimal number.']
> File "product_selection/tasks.py", line 123, in try_product_import
> Product.objects.bulk_create(new_products, skip_save_logic=True)
> File "django/db/models/manager.py", line 87, in manager_method
> return getattr(self.get_queryset(), name)(*args, **kwargs)
> File "common/models.py", line 772, in bulk_create
> return super().bulk_create(*args, **kwargs)
> File "django/db/models/query.py", line 786, in bulk_create
> returned_columns = self._batched_insert(
> File "django/db/models/query.py", line 1831, in _batched_insert
> self._insert(
> File "django/db/models/query.py", line 1805, in _insert
> return query.get_compiler(using=using).execute_sql(returning_fields)
> File "django/db/models/sql/compiler.py", line 1821, in execute_sql
> for sql, params in self.as_sql():
> File "django/db/models/sql/compiler.py", line 1747, in as_sql
> self.prepare_value(field, self.pre_save_val(field, obj))
> File "django/db/models/sql/compiler.py", line 1686, in prepare_value
> return field.get_db_prep_save(value, connection=self.connection)
> File "django/db/models/fields/__init__.py", line 1760, in
> get_db_prep_save
> self.to_python(value), self.max_digits, self.decimal_places
> File "django/db/models/fields/__init__.py", line 1743, in to_python
> raise exceptions.ValidationError(
> }}}
>
> My understanding is this should have been captured into `rollback_exc`
> when `needs_rollback` was set to True
> That happens in `Atomic.__exit__` when `exc_type` is set and
> `connection.needs_rollback` is False
> I can't test this on 5.2 but `Atomic.__exit__` looks unchanged.
New description:
The value of `connection.rollback_exc` is being set in one transaction and
then raised in a different later, unrelated transaction.
with these models
{{{
class Firm(Model):
name = models.CharField(max_length=255, unique=True)
class Lender(Model):
name = models.CharField(max_length=255, unique=True)
override_svr = models.DecimalField(max_digits=6, decimal_places=2)
}}}
this code
{{{
def bob():
try:
with transaction.atomic():
Firm.objects.create(name="bob")
Firm.objects.create(name="bob")
except Exception as e:
pass # safely ignoring e
def fred():
with transaction.atomic():
try:
l = Lender(name="fred", override_svr = "bobit")
Lender.objects.bulk_create([l])
except Exception as e:
pass # unsafely ignoring e
Lender.objects.count() # will fail because we're in a broken
transaction
def main():
try:
bob()
Lender.objects.count() # demonstrate we can make queries
fred()
except Exception as e:
import traceback
traceback.print_exc()
main()
}}}
yields this exception
{{{
Traceback (most recent call last):
django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.UniqueViolation: duplicate key value violates unique
constraint "retain_firm_name_key"
DETAIL: Key (name)=(bob) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<ipython-input-6-729c3e87d98c>", line 9, in bob
Firm.objects.create(name="bob")
django/db/models/manager.py", line 87, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
django/db/models/query.py", line 658, in create
obj.save(force_insert=True, using=self.db)
django/db/models/base.py", line 814, in save
self.save_base(
django/db/models/base.py", line 877, in save_base
updated = self._save_table(
django/db/models/base.py", line 1020, in _save_table
results = self._do_insert(
django/db/models/base.py", line 1061, in _do_insert
return manager._insert(
django/db/models/manager.py", line 87, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
django/db/models/query.py", line 1805, in _insert
return query.get_compiler(using=using).execute_sql(returning_fields)
django/db/models/sql/compiler.py", line 1822, in execute_sql
cursor.execute(sql, params)
django/db/backends/utils.py", line 102, in execute
return super().execute(sql, params)
django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(
django/db/backends/utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
django/db/backends/utils.py", line 84, in _execute
with self.db.wrap_database_errors:
django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: duplicate key value violates unique
constraint "retain_firm_name_key"
DETAIL: Key (name)=(bob) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<ipython-input-6-729c3e87d98c>", line 27, in main
fred()
File "<ipython-input-6-729c3e87d98c>", line 21, in fred
Lender.objects.count() # will fail because we're in a broken
transaction
django/db/models/manager.py", line 87, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
django/db/models/query.py", line 608, in count
return self.query.get_count(using=self.db)
django/db/models/sql/query.py", line 568, in get_count
return obj.get_aggregation(using, {"__count": Count("*")})["__count"]
django/db/models/sql/query.py", line 554, in get_aggregation
result = compiler.execute_sql(SINGLE)
django/db/models/sql/compiler.py", line 1562, in execute_sql
cursor.execute(sql, params)
django/db/backends/utils.py", line 102, in execute
return super().execute(sql, params)
django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(
django/db/backends/utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
django/db/backends/utils.py", line 83, in _execute
self.db.validate_no_broken_transaction()
django/db/backends/base/base.py", line 531, in
validate_no_broken_transaction
raise TransactionManagementError(
django.db.transaction.TransactionManagementError: An error occurred in the
current transaction. You can't execute queries until the end of the
'atomic' block.
}}}
I do not mind that this exceptioned, the problem is the stack trace it
produced is leaking stuff between different call paths. In my case it was
between unrelated celery tasks.
What's important to note here is the TransactionManagementError is from
the fred function but the IntegrityError it shows as the cause is from the
bob function. What we should be seeing instead is
{{{
django.core.exceptions.ValidationError: ['“bobit” value must be a decimal
number.']
}}}
My understanding is that this validation error should have been captured
into `rollback_exc` when `needs_rollback` was set to True
`needs_rollback` is getting set in `Atomic.__exit__` when `exc_type` is
set and `connection.needs_rollback` is False
Alternatively maybe there's some error handling missing in bulk_create.
Although even if that's the case it seems bad that
`Atomic.__exit__` can turn on `needs_rollback` without setting
`rollback_exc`. Also perhaps `rollback_exc` should be getting cleared when
`needs_rollback` is cleared.
I can't test this on 5.2 but `Atomic.__exit__` looks unchanged.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/36318#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion visit
https://groups.google.com/d/msgid/django-updates/01070196254b61b8-260b8173-b324-4292-b1cb-56bcd91b92bd-000000%40eu-central-1.amazonses.com.