#36430: bulk_batch_size() special-cases single field on SQLite according to outdated limit
-------------------------------------+-------------------------------------
     Reporter:  Jacob Walls          |          Owner:  (none)
         Type:  Cleanup/optimization |         Status:  new
    Component:  Database layer       |        Version:  dev
                (models, ORM)        |
     Severity:  Normal               |     Resolution:
     Keywords:                       |   Triage Stage:  Unreviewed
       SQLITE_MAX_COMPOUND_SELECT,   |
       bulk_create                   |
    Has patch:  0                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
New description:
On SQLite, `bulk_batch_size()`
[https://github.com/django/django/blob/1a744343999c9646912cee76ba0a2fa6ef5e6240/django/db/backends/sqlite3/operations.py#L49
special-cases] a field list of length 1 and applies the
`SQLITE_MAX_COMPOUND_SELECT` limit to arrive at a value of 500.

I think this must date from before bulk inserts used `VALUES` syntax, which
became available in SQLite in 2012 (see
[https://github.com/laravel/framework/issues/25262#issuecomment-414836191
discussion in Laravel]); presumably a multi-row insert was emulated with a
compound `SELECT ... UNION ALL` back then, which is exactly what
`SQLITE_MAX_COMPOUND_SELECT` limits.

When the list of fields has more than one entry, we go through the `elif`
branch and arrive at much higher limits. I'm pretty sure this shows that the
limit of 500 for `fields=["pk"]` is overly protective and can just be
removed.
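
Roughly, the special case in question looks like this (my paraphrase of the
linked `operations.py`, not the verbatim source):

{{{#!py
# Paraphrase of django/db/backends/sqlite3/operations.py, not verbatim.
def bulk_batch_size(self, fields, objs):
    if len(fields) == 1:
        # The special case this ticket is about: caps single-field batches
        # at 500, the default SQLITE_MAX_COMPOUND_SELECT.
        return 500
    elif len(fields) > 1:
        # Multi-field batches divide the (much larger) parameter limit.
        return self.connection.features.max_query_params // len(fields)
    else:
        return len(objs)
}}}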

I don't have a unit test to provide, but you can play with changing the limit
from 500 to 501 and see that `test_large_delete` still passes (no trouble
selecting 501 objects). (You can also adjust `test_max_batch_size()` to
provide a sanity check; a rough version is in the comment below.)
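
To be concrete, removing the special case could look something like the
sketch below (not a proposed patch; it just folds the single-field case into
the parameter-based computation):

{{{#!py
# Sketch only: bulk_batch_size() without the single-field special case.
def bulk_batch_size(self, fields, objs):
    if fields:
        # Every field count, including 1, divides max_query_params.
        return self.connection.features.max_query_params // len(fields)
    return len(objs)
}}}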

(I found this while trying to
[https://github.com/django/django/pull/19502#discussion_r2118612666
refactor] a different call site away from `max_query_params` in the hopes of
just calling `bulk_batch_size()`, until I saw how overly protective it was.)

I think it would be good to resolve this before anyone invests effort in
[https://github.com/django/django/pull/19427/files#r2062643293 reading the
dynamic limit] for this potentially irrelevant param.
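
(As an aside on "reading the dynamic limit": on Python 3.11+ the runtime
values can be read straight off a connection, as in the illustrative snippet
below. This is not taken from the linked PR; it just shows which limits are
involved.)

{{{#!py
import sqlite3

# Illustrative only; Connection.getlimit() requires Python 3.11+.
conn = sqlite3.connect(":memory:")
# Terms allowed in a compound SELECT; the compile-time default is 500.
print(conn.getlimit(sqlite3.SQLITE_LIMIT_COMPOUND_SELECT))
# Bound parameters per statement; commonly 32766 or higher on recent SQLite.
print(conn.getlimit(sqlite3.SQLITE_LIMIT_VARIABLE_NUMBER))
}}}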
--
Comment (by Jacob Walls):
Here's a rough test after all, to demo the issue: adjust
`test_max_batch_size` like this:
{{{#!diff
diff --git a/tests/bulk_create/tests.py b/tests/bulk_create/tests.py
index d590a292de..fd30ca48f7 100644
--- a/tests/bulk_create/tests.py
+++ b/tests/bulk_create/tests.py
@@ -296,7 +296,7 @@ class BulkCreateTests(TestCase):
     @skipUnlessDBFeature("has_bulk_insert")
     def test_max_batch_size(self):
         objs = [Country(name=f"Country {i}") for i in range(1000)]
-        fields = ["name", "iso_two_letter", "description"]
+        fields = ["pk"]
         max_batch_size = connection.ops.bulk_batch_size(fields, objs)
         with self.assertNumQueries(ceil(len(objs) / max_batch_size)):
             Country.objects.bulk_create(objs)
}}}
----
The test fails because the run is *more* efficient than the special-cased
limit predicts: `bulk_batch_size(["pk"], objs)` returns 500, so the test
expects ceil(1000 / 500) = 2 queries, but the whole insert fits into a single
statement:
{{{#!py
======================================================================
FAIL: test_max_batch_size
(bulk_create.tests.BulkCreateTests.test_max_batch_size)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/.../django/tests/bulk_create/tests.py", line 301, in
test_max_batch_size
with self.assertNumQueries(ceil(len(objs) / max_batch_size)):
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 1 != 2 : 1 queries executed, 2 expected
Captured queries were:
1. INSERT INTO "bulk_create_country" ("name", "iso_two_letter",
"description") VALUES ('Country 0', '', ''), ('Country 1', '', ''),
('Country 2', '', ''), ('Country 3', '', ...
----------------------------------------------------------------------
Ran 2 tests in 0.025s
FAILED (failures=1)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36430#comment:1>