#36430: bulk_batch_size() special-cases single field on SQLite according to
outdated limit
-------------------------------------+-------------------------------------
     Reporter:  Jacob Walls          |                    Owner:  (none)
         Type:                       |                   Status:  new
  Cleanup/optimization               |
    Component:  Database layer       |                  Version:  dev
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
  SQLITE_MAX_COMPOUND_SELECT,        |  Unreviewed
  bulk_create                        |
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------

Description:

 On SQLite, `bulk_batch_size()`
 [https://github.com/django/django/blob/1a744343999c9646912cee76ba0a2fa6ef5e6240/django/db/backends/sqlite3/operations.py#L49
 special-cases] a field list of length 1 and applies the
 `SQLITE_MAX_COMPOUND_SELECT` limit to arrive at a value of 500.
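
 For reference, the special-casing looks roughly like this (paraphrased
 from the linked `operations.py`, not an exact quote of the source):

 {{{#!python
 def bulk_batch_size(self, fields, objs):
     # Single-field case: capped at 500, the compile-time default of
     # SQLITE_MAX_COMPOUND_SELECT, dating from the UNION ALL insert era.
     if len(fields) == 1:
         return 500
     # Multi-field case: limited only by how many bound parameters a
     # single statement may carry.
     elif len(fields) > 1:
         return self.connection.features.max_query_params // len(fields)
     return len(objs)
 }}}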

 I think this must date from before bulk inserts used `VALUES` syntax,
 which became available in SQLite in 2012 (3.7.11); see the
 [https://github.com/laravel/framework/issues/25262#issuecomment-414836191
 discussion in Laravel].
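
 A quick standalone check (plain stdlib `sqlite3`, outside Django; the
 table and names are made up for illustration) that a multi-row `VALUES`
 insert is not subject to the compound-SELECT limit:

 {{{#!python
 import sqlite3

 # 501 rows would exceed the default SQLITE_MAX_COMPOUND_SELECT (500)
 # under the old UNION ALL insert strategy, but a single multi-row
 # VALUES insert (SQLite >= 3.7.11) handles it fine.
 conn = sqlite3.connect(":memory:")
 conn.execute("CREATE TABLE country (name TEXT)")
 names = [f"Country {i}" for i in range(501)]
 placeholders = ", ".join(["(?)"] * len(names))
 conn.execute(f"INSERT INTO country (name) VALUES {placeholders}", names)
 print(conn.execute("SELECT COUNT(*) FROM country").fetchone()[0])  # 501
 }}}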

 When the list of fields exceeds 1, we go through the `elif` branch and
 arrive at much higher limits. I'm pretty sure this shows that the limit of
 500 for `fields=["pk"]` is overly protective and can just be removed.
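
 To put rough numbers on the asymmetry (a sketch only: the 32766 figure
 is an assumption that `max_query_params` matches the modern
 SQLITE_LIMIT_VARIABLE_NUMBER default, not something stated in this
 ticket):

 {{{#!python
 MAX_QUERY_PARAMS = 32766  # assumed value, for illustration

 def batch_size(num_fields):
     if num_fields == 1:
         return 500  # the SQLITE_MAX_COMPOUND_SELECT special case
     return MAX_QUERY_PARAMS // num_fields

 print(batch_size(1))  # 500   <- capped by the special case
 print(batch_size(2))  # 16383
 print(batch_size(3))  # 10922 <- far larger, despite 3x the parameters
 }}}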

 I don't have a unit test to provide, but you can play with changing the
 limit from 500 to 501 and see that `test_large_delete` still passes (no
 trouble selecting 501 objects). (You can also adjust
 `test_max_batch_size()` to provide a similar demonstration; see the
 comment below.)

 (I found this while trying to
 [https://github.com/django/django/pull/19502#discussion_r2118612666
 refactor] a different call site away from `max_query_params` in the hopes
 of just calling `bulk_batch_size()` until I saw how overly protective it
 was.)

 I think this would be a good idea to resolve before anyone invests effort
 in [https://github.com/django/django/pull/19427/files#r2062643293 reading
 the dynamic limit] for this potentially irrelevant param.
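
 For context, the "dynamic limit" can already be read from the stdlib on
 Python 3.11+; a sketch (not Django code, and the printed values depend
 on the local SQLite build):

 {{{#!python
 import sqlite3

 # Python 3.11+ wraps sqlite3_limit(); values vary by SQLite build.
 conn = sqlite3.connect(":memory:")
 print(conn.getlimit(sqlite3.SQLITE_LIMIT_VARIABLE_NUMBER))  # e.g. 32766
 print(conn.getlimit(sqlite3.SQLITE_LIMIT_COMPOUND_SELECT))  # e.g. 500
 }}}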

--
Comment (by Jacob Walls):

 Here's a rough test after all to demo it; just adjust
 `test_max_batch_size` like this:


 {{{#!diff
 diff --git a/tests/bulk_create/tests.py b/tests/bulk_create/tests.py
 index d590a292de..fd30ca48f7 100644
 --- a/tests/bulk_create/tests.py
 +++ b/tests/bulk_create/tests.py
 @@ -296,7 +296,7 @@ class BulkCreateTests(TestCase):
      @skipUnlessDBFeature("has_bulk_insert")
      def test_max_batch_size(self):
          objs = [Country(name=f"Country {i}") for i in range(1000)]
 -        fields = ["name", "iso_two_letter", "description"]
 +        fields = ["pk"]
          max_batch_size = connection.ops.bulk_batch_size(fields, objs)
          with self.assertNumQueries(ceil(len(objs) / max_batch_size)):
              Country.objects.bulk_create(objs)
 }}}

 ----
 The test fails because the insert was *more* efficient than the
 prediction: `bulk_create()` batches on the three fields it actually
 inserts, so all 1000 rows fit in a single INSERT, while
 `bulk_batch_size(["pk"], ...)` predicted batches of 500:
 {{{#!py
 ======================================================================
 FAIL: test_max_batch_size (bulk_create.tests.BulkCreateTests.test_max_batch_size)
 ----------------------------------------------------------------------
 Traceback (most recent call last):
   File "/Users/.../django/tests/bulk_create/tests.py", line 301, in
 test_max_batch_size
     with self.assertNumQueries(ceil(len(objs) / max_batch_size)):
          ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 AssertionError: 1 != 2 : 1 queries executed, 2 expected
 Captured queries were:
 1. INSERT INTO "bulk_create_country" ("name", "iso_two_letter", "description") VALUES ('Country 0', '', ''), ('Country 1', '', ''), ('Country 2', '', ''), ('Country 3', '', ...

 ----------------------------------------------------------------------
 Ran 2 tests in 0.025s

 FAILED (failures=1)
 }}}
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36430#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
