#34107: Django ORM queries do not pick up indexes made on a key in JSONField in
Postgres
-------------------------------------+-------------------------------------
     Reporter:  Tadek Teleżyński     |                    Owner:  nobody
         Type:  Bug                  |                   Status:  new
    Component:  Database layer       |                  Version:  3.2
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:  json, jsonfield,     |             Triage Stage:
  postgres, index                    |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by Tadek Teleżyński:

Old description:

> Hey folks!
>
> Been developing in Django for the past 5 years, excited to post my first
> ticket :)
> Thank you from the bottom of my heart for developing and maintaining this
> amazing framework.
> If you feel like the below is more of a feature request than a bug, feel
> free to recategorize it.
>
> Consider a following setup (assuming Postgres database, version 13.8):
> {{{
> class Example(models.Model):
>     data = models.JSONField(null=False, blank=True, default=dict)
> }}}
> And the following migration adding two indexes (note the difference in
> "->" and "->>") on data__category key:
>
> {{{
> class Migration(migrations.Migration):
>
>     dependencies = [
>         (...),
>     ]
>     atomic = False
>     operations = [
>         migrations.RunSQL(
>             "CREATE INDEX CONCURRENTLY CategoryIndex_default_cast ON
> app_example USING BTREE((data->>'category'));",
>             reverse_sql="DROP INDEX IF EXISTS
> CategoryIndex_default_cast;"
>         ),
>         migrations.RunSQL(
>             "CREATE INDEX CONCURRENTLY CategoryIndex ON app_example USING
> BTREE (((data->'category')::TEXT));",
>             reverse_sql="DROP INDEX IF EXISTS CategoryIndex;"
>         ),
>     ]
> }}}
>
> Now if I use a regular filtering syntax on a query none of the indexes is
> used:
> {{{
> >>> print(Example.objects.filter(data__category='dog').explain())
> Seq Scan on app_example  (cost=0.00..29.05 rows=6 width=36)
>   Filter: ((data -> 'category'::text) = '"dog"'::jsonb)
> }}}
>
> In order to pick up the index I need to write a bit more complex query:
> {{{
> >>> print(Example.objects.annotate(_category=KeyTextTransform('category',
> 'data')).filter(_category=Value('dog')).explain())
> Bitmap Heap Scan on app_example  (cost=4.20..13.70 rows=6 width=68)
>   Recheck Cond: ((data ->> 'category'::text) = 'dog'::text)
>   ->  Bitmap Index Scan on categoryindex_default_cast  (cost=0.00..4.20
> rows=6 width=0)
>         Index Cond: ((data ->> 'category'::text) = 'dog'::text)
> }}}
>
> To be honest I wasn't able to figure out a query that would pick up the
> second index ("CategoryIndex") that's using explicit type casting.
>
> It would be very neat if the ORM could figure the type casting
> automatically.

New description:

 Hey folks!

 Been developing in Django for the past 5 years, excited to post my first
 ticket :)
 Thank you from the bottom of my heart for developing and maintaining this
 amazing framework.
 If you feel like the below is more of a feature request than a bug, feel
 free to recategorize it.

 Consider a following setup (assuming Postgres database, version 13.8):
 {{{
 class Example(models.Model):
     data = models.JSONField(null=False, blank=True, default=dict)
 }}}
 And the following migration adding two indexes (note the difference in
 "->" and "->>") on "data__category" key:

 {{{
 class Migration(migrations.Migration):

     dependencies = [
         (...),
     ]
     atomic = False
     operations = [
         migrations.RunSQL(
             "CREATE INDEX CONCURRENTLY CategoryIndex_default_cast ON
 app_example USING BTREE((data->>'category'));",
             reverse_sql="DROP INDEX IF EXISTS CategoryIndex_default_cast;"
         ),
         migrations.RunSQL(
             "CREATE INDEX CONCURRENTLY CategoryIndex ON app_example USING
 BTREE (((data->'category')::TEXT));",
             reverse_sql="DROP INDEX IF EXISTS CategoryIndex;"
         ),
     ]
 }}}

 Now if I use a regular filtering syntax on a query none of the indexes is
 used:
 {{{
 >>> print(Example.objects.filter(data__category='dog').explain())
 Seq Scan on app_example  (cost=0.00..29.05 rows=6 width=36)
   Filter: ((data -> 'category'::text) = '"dog"'::jsonb)
 }}}

 In order to pick up the index I need to write a bit more complex query:
 {{{
 >>> print(Example.objects.annotate(_category=KeyTextTransform('category',
 'data')).filter(_category=Value('dog')).explain())
 Bitmap Heap Scan on app_example  (cost=4.20..13.70 rows=6 width=68)
   Recheck Cond: ((data ->> 'category'::text) = 'dog'::text)
   ->  Bitmap Index Scan on categoryindex_default_cast  (cost=0.00..4.20
 rows=6 width=0)
         Index Cond: ((data ->> 'category'::text) = 'dog'::text)
 }}}

 To be honest I wasn't able to figure out a query that would pick up the
 second index ("CategoryIndex") that's using explicit type casting.

 It would be very neat if the ORM could figure the type casting
 automatically.

--

-- 
Ticket URL: <https://code.djangoproject.com/ticket/34107#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/01070183f0f9eebf-8d05fe64-5fbf-4f2d-8b72-e6da93cd23dd-000000%40eu-central-1.amazonses.com.

Reply via email to