#35904: Speed up fixture loading by adding options bulk insert/create
-----------------------------------+--------------------------------------
Reporter: JorisBenschop | Owner: (none)
Type: New feature | Status: new
Component: Testing framework | Version: 5.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------------+--------------------------------------
Description changed by JorisBenschop:
Old description:
> As per [https://forum.djangoproject.com/t/feature-proposal-faster-
> fixture-loading-via-loaddata-command/36972 this forum discussion], I
> have created a patch to improve load times for the loaddata command under
> some circumstances.
>
> Currently the “loaddata” management command uses the obj.save() method
> for each deserialized object within a fixture. This function first tries
> an UPDATE statement and, if that fails, tries an INSERT statement. By
> using the --force_insert a reduction of 50% of queries is achieved.
>
> A second option is to use bulk_create for insertion of multiple records.
> This improves insertion speed by (n-1/n), or ~99% for insertion of 100
> records.
>
> These options are not meant to cover each use case, and therefore are set
> to optional.
>
> Benchmark results
> ===============
> test is to insert 1000 records from a single fixture
> current: 0.116s
> with --force_insert: 0.066s
> with --bulk_create: 0.01s
>
> test is to insert 10000 records from a single fixture
> current: 1.07s
> with --force_insert: 0.39s
> with --bulk_create: 0.010s
New description:
As per [https://forum.djangoproject.com/t/feature-proposal-faster-fixture-loading-via-loaddata-command/36972 this forum discussion], I
have created a patch to improve load times for the loaddata command under
some circumstances.

Currently the “loaddata” management command calls obj.save() for each
deserialized object within a fixture. This method first tries an UPDATE
statement and, if no row was updated, falls back to an INSERT statement.
With the --force_insert option the UPDATE is skipped, which halves the
number of queries.
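
The difference in statement counts can be illustrated outside Django. The following is a minimal sketch using Python's standard sqlite3 module, not the code from the patch. The `article` table and the `load_rows` helper are hypothetical and only mimic the UPDATE-then-INSERT pattern described above.

```python
import sqlite3


def load_rows(rows, force_insert):
    """Load rows into an empty in-memory table.

    Returns (statements_issued, rows_stored).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
    issued = 0
    for pk, title in rows:
        if not force_insert:
            # Default path: try an UPDATE first.
            cur = conn.execute(
                "UPDATE article SET title = ? WHERE id = ?", (title, pk)
            )
            issued += 1
            if cur.rowcount > 0:
                continue  # the row already existed, no INSERT needed
        # Either forced, or the UPDATE matched no row: fall back to INSERT.
        conn.execute(
            "INSERT INTO article (id, title) VALUES (?, ?)", (pk, title)
        )
        issued += 1
    stored = conn.execute("SELECT COUNT(*) FROM article").fetchone()[0]
    conn.close()
    return issued, stored


rows = [(i, f"title {i}") for i in range(1, 101)]
print(load_rows(rows, force_insert=False))  # (200, 100)
print(load_rows(rows, force_insert=True))   # (100, 100)
```

On an empty table every UPDATE matches nothing, so the default path issues two statements per row and the forced path issues one.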

A second option, --bulk_create, uses bulk_create to insert multiple
records at once. This reduces the number of INSERT statements by (n-1)/n,
or 99% when inserting 100 records.
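
As a rough illustration of where (n-1)/n comes from: inserting n rows one at a time takes n INSERT statements, while one multi-row INSERT takes a single statement. The sketch below uses the standard sqlite3 module with a hypothetical `article` table and hypothetical helper names. It is not how Django's bulk_create is implemented, which may also split large inserts into several batches.

```python
import sqlite3


def fresh_connection():
    """Return an in-memory database with an empty article table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
    return conn


def insert_one_by_one(conn, rows):
    """Issue one INSERT per row; returns the number of statements issued."""
    for row in rows:
        conn.execute("INSERT INTO article (id, title) VALUES (?, ?)", row)
    return len(rows)


def insert_in_bulk(conn, rows):
    """Issue a single multi-row INSERT; returns the number of statements issued.

    Assumes rows is non-empty and small enough for the database's
    bound-parameter limit.
    """
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    params = [value for row in rows for value in row]
    conn.execute(
        f"INSERT INTO article (id, title) VALUES {placeholders}", params
    )
    return 1


def statement_reduction(n):
    """Fraction of INSERT statements saved when n rows share one statement."""
    return (n - 1) / n


rows = [(i, f"title {i}") for i in range(1, 101)]
single = insert_one_by_one(fresh_connection(), rows)
bulk = insert_in_bulk(fresh_connection(), rows)
print(single, bulk, statement_reduction(len(rows)))  # 100 1 0.99
```

Note that this counts statements only. The reduction in wall-clock time is usually smaller, because the database still has to write every row.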

These options are not meant to cover every use case and are therefore
optional.

Benchmark results
=================

Test 1: insert 1000 records from a single fixture (using the Article
model on SQLite)
current: 0.116s
with --force_insert: 0.066s
with --bulk_create: 0.010s

Test 2: insert 10000 records from a single fixture
current: 1.07s
with --force_insert: 0.39s
with --bulk_create: 0.104s

I expect larger models to show an even more significant improvement.
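
For reference, the relative time savings implied by the benchmark results above can be worked out with a few lines of Python. This is only arithmetic on the reported numbers; the `reduction_percent` helper is a hypothetical name introduced for the example.

```python
# Timings in seconds, copied from the benchmark results above.
timings = {
    1000: {"current": 0.116, "--force_insert": 0.066, "--bulk_create": 0.010},
    10000: {"current": 1.07, "--force_insert": 0.39, "--bulk_create": 0.104},
}


def reduction_percent(baseline, measured):
    """Relative reduction in load time compared to the baseline, in percent."""
    return (baseline - measured) / baseline * 100


for records, result in timings.items():
    baseline = result["current"]
    for option in ("--force_insert", "--bulk_create"):
        saved = reduction_percent(baseline, result[option])
        print(f"{records} records, {option}: {saved:.0f}% less time")
```

This gives roughly 43% and 91% for 1000 records, and 64% and 90% for 10000 records.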
--
Ticket URL: <https://code.djangoproject.com/ticket/35904#comment:8>