#33749: Queries take exponential time when filtering in a loop
-------------------------------------+-------------------------------------
               Reporter:  Enhaloed   |          Owner:  nobody
                   Type:             |         Status:  new
  Uncategorized                      |
              Component:  Database   |        Version:  3.2
  layer (models, ORM)                |
               Severity:  Normal     |       Keywords:
           Triage Stage:             |      Has patch:  0
  Unreviewed                         |
    Needs documentation:  0          |    Needs tests:  0
Patch needs improvement:  0          |  Easy pickings:  0
                  UI/UX:  0          |
-------------------------------------+-------------------------------------
 I have a loop that builds up various queries, then applies them with a
 filter.

 I was having some performance issues and started looking into it more
 closely and it seemed to be exponential time instead of something
 relatively linear.

 I made the query execute on each loop by printing the results and used
 django-debug-toolbar to time each query.

 Here is the important loop:

 {{{#!python
 text = 'Choose one, if there are ten or more fallen'
 words = text.split(' ')
 for word in words:
     word_query = Q()
     for search_field in search_fields:
         word_query |= Q(**{search_field + '__icontains': word})

         if search_field == 'name':
             # Also check the alternative name
             word_query |= Q(**{'name_without_punctuation__icontains':
 word})

     if exactness_option == CONS.TEXT_CONTAINS_ALL:
         cards = cards.filter(word_query)
         print(word_query)
         print(cards)
   }}}

 The word_query object scome out as expected, essentially checking if each
 word exists in any of the relationships, one word at a time
 {{{
 (OR: ('name__icontains', 'Choose'),
 ('name_without_punctuation__icontains', 'Choose'),
 ('name_without_punctuation__icontains', 'Choose'),
 ('ability_texts__text__icontains', 'Choose'), ('races__name__icontains',
 'Choose'))
 (OR: ('name__icontains', 'one,'), ('name_without_punctuation__icontains',
 'one,'), ('name_without_punctuation__icontains', 'one,'),
 ('ability_texts__text__icontains', 'one,'), ('races__name__icontains',
 'one,'))
 (OR: ('name__icontains', 'if'), ('name_without_punctuation__icontains',
 'if'), ('name_without_punctuation__icontains', 'if'),
 ('ability_texts__text__icontains', 'if'), ('races__name__icontains',
 'if'))
 (OR: ('name__icontains', 'there'), ('name_without_punctuation__icontains',
 'there'), ('name_without_punctuation__icontains', 'there'),
 ('ability_texts__text__icontains', 'there'), ('races__name__icontains',
 'there'))
 (OR: ('name__icontains', 'are'), ('name_without_punctuation__icontains',
 'are'), ('name_without_punctuation__icontains', 'are'),
 ('ability_texts__text__icontains', 'are'), ('races__name__icontains',
 'are'))
 (OR: ('name__icontains', 'ten'), ('name_without_punctuation__icontains',
 'ten'), ('name_without_punctuation__icontains', 'ten'),
 ('ability_texts__text__icontains', 'ten'), ('races__name__icontains',
 'ten'))
 (OR: ('name__icontains', 'or'), ('name_without_punctuation__icontains',
 'or'), ('name_without_punctuation__icontains', 'or'),
 ('ability_texts__text__icontains', 'or'), ('races__name__icontains',
 'or'))
 (OR: ('name__icontains', 'more'), ('name_without_punctuation__icontains',
 'more'), ('name_without_punctuation__icontains', 'more'),
 ('ability_texts__text__icontains', 'more'), ('races__name__icontains',
 'more'))
 (OR: ('name__icontains', 'fallen'),
 ('name_without_punctuation__icontains', 'fallen'),
 ('name_without_punctuation__icontains', 'fallen'),
 ('ability_texts__text__icontains', 'fallen'), ('races__name__icontains',
 'fallen'))
 }}}

 Using django-debug-toolbar to check the times for these queries however go
 as follows:
 {{{
 0.47s
 0.83s
 1.04s
 2.43s
 4.74s
 26.27s
 60.18s
 124.46s
 175.75s
 }}}

 As far as I can see these values should be somewhat similar since each
 query is checking the same 4 columns for a string of comparable length.
 Instead, they are going somewhat exponential.

 This is using PostgreSQL 13.4 with the engine
 django.db.backends.postgresql, Django v3.2.6

 Let me know if there's any other details I can provide to help with this.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/33749>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/010701810fea5fd7-89c0ff32-8709-426f-8019-3a46b265ae24-000000%40eu-central-1.amazonses.com.

Reply via email to