gkatiforis opened a new issue, #11892:
URL: https://github.com/apache/lucene/issues/11892

   ### Description
   
   The query has several match clauses with fuzziness 2, which makes it quite expensive. I don't need relevance ranking or any kind of scoring; I only need the list of document IDs that match the query criteria.
   
   The query takes **5ms** to execute (when documents match) on an index of only 2 documents, which is a long time for such a simple query. The profile API shows that `build_scorer` takes **3.8ms** of that. Even though I use a bool filter, `constant_score` queries, and `"fuzzy_rewrite":"constant_score"`, the score calculation still appears to slow down the search. I also noticed that with the default rewrite method instead of `"fuzzy_rewrite":"constant_score"`, `rewrite_time` increases to **3ms** while `build_scorer` stays low. Is it possible to disable scoring or otherwise improve the performance? Any advice or suggestions would be appreciated.
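
To make the matching criterion concrete: fuzziness 2 accepts any indexed term within two single-character edits of the query term. The sketch below only illustrates that criterion with a plain Levenshtein distance. The helper names are hypothetical, a transposition counts as two edits here, and this is not a description of how Lucene evaluates fuzzy terms internally.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_match(term: str, candidate: str, max_edits: int = 2) -> bool:
    """True if candidate is within max_edits edits of term (illustrative only)."""
    return edit_distance(term, candidate) <= max_edits


for candidate in ("NICHOLAS", "NIKOLAS", "JACKDOE"):
    print(candidate, fuzzy_match("NICKOLAS", candidate))
```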
   
   
   
   A simplified version of the query:
   
   ```
   {
       "profile": true,
       "sort": "_doc",
       "query": {
           "constant_score": {
               "filter": {
                   "bool": {
                       "filter": [
                           {
                               "bool": {
                                   "should": [
                                       {
                                           "match": {
                                               "namesCombined": {
                                                   "query": "name1 name2 name3",
                                                   "operator": "AND",
                                                   "fuzziness": "2",
                                                   "fuzzy_rewrite": "constant_score"
                                               }
                                           }
                                       },
                                       {
                                           "match": {
                                               "namesCombined": {
                                                   "query": "name1",
                                                   "operator": "AND",
                                                   "fuzziness": "2",
                                                   "fuzzy_rewrite": "constant_score"
                                               }
                                           }
                                       }
                                   ],
                                   "minimum_should_match": "1"
                               }
                           }
                       ]
                   }
               },
               "boost": 1.0
           }
       },
       "_source": {
           "includes": [
               "id"
           ],
           "excludes": []
       }
   }
   ```
   
   A sample document:
   
   `{"id":"id","namesCombined":"name1 name2 name3"}`
   
   Index settings and mapping:
   
   ```
   {
     "settings": {
       "index": {
         "number_of_shards": 1,
         "number_of_replicas": 0,
         "store.type": "mmapfs",
         "refresh_interval": "300s"
       }
     },
     "mappings": {
       "properties": {
         "namesCombined": {
           "type": "text",
           "analyzer": "whitespace",
           "doc_values": false,
           "index_options": "docs",
           "norms": false,
           "store": false,
           "term_vector": "no"
         }
       }
     }
   }
   ```
   Profiler results:
   
   ```
   {
     "profile": {
       "shards": [
         {
           "id": "[v1VNjaOiQrqkDPKg4aTVfA][testindex][0]",
           "searches": [
             {
               "query": [
                 {
                   "type": "ConstantScoreQuery",
                   "description": "ConstantScore(((+namesCombined:NICKOLAS~2 +namesCombined:JACK~2 +namesCombined:DOE~2) namesCombined:JACKDOE~2)~1)",
                   "time_in_nanos": 3902030,
                   "breakdown": {
                     "set_min_competitive_score_count": 0,
                     "match_count": 0,
                     "shallow_advance_count": 0,
                     "set_min_competitive_score": 0,
                     "next_doc": 6278,
                     "match": 0,
                     "next_doc_count": 10,
                     "score_count": 0,
                     "compute_max_score_count": 0,
                     "compute_max_score": 0,
                     "advance": 9500,
                     "advance_count": 2,
                     "score": 0,
                     "build_scorer_count": 4,
                     "create_weight": 4852,
                     "shallow_advance": 0,
                     "create_weight_count": 1,
                     "build_scorer": 3881400
                   },
                   "children": [
                     {
                       "type": "BooleanQuery",
                       "description": "((+namesCombined:NICKOLAS~2 +namesCombined:JACK~2 +namesCombined:DOE~2) namesCombined:JACKDOE~2)~1",
                       "time_in_nanos": 3899555,
                       "breakdown": {
                         "set_min_competitive_score_count": 0,
                         "match_count": 0,
                         "shallow_advance_count": 0,
                         "set_min_competitive_score": 0,
                         "next_doc": 5909,
                         "match": 0,
                         "next_doc_count": 10,
                         "score_count": 0,
                         "compute_max_score_count": 0,
                         "compute_max_score": 0,
                         "advance": 9411,
                         "advance_count": 2,
                         "score": 0,
                         "build_scorer_count": 4,
                         "create_weight": 3493,
                         "shallow_advance": 0,
                         "create_weight_count": 1,
                         "build_scorer": 3880742
                       },
                       "children": [
                         {
                           "type": "BooleanQuery",
                           "description": "+namesCombined:NICKOLAS~2 +namesCombined:JACK~2 +namesCombined:DOE~2",
                           "time_in_nanos": 2702444,
                           "breakdown": {
                             "set_min_competitive_score_count": 0,
                             "match_count": 0,
                             "shallow_advance_count": 0,
                             "set_min_competitive_score": 0,
                             "next_doc": 5518,
                             "match": 0,
                             "next_doc_count": 10,
                             "score_count": 0,
                             "compute_max_score_count": 0,
                             "compute_max_score": 0,
                             "advance": 9329,
                             "advance_count": 2,
                             "score": 0,
                             "build_scorer_count": 6,
                             "create_weight": 2024,
                             "shallow_advance": 0,
                             "create_weight_count": 1,
                             "build_scorer": 2685573
                           },
                           "children": [
                             {
                               "type": "MultiTermQueryConstantScoreWrapper",
                               "description": "namesCombined:NICKOLAS~2",
                               "time_in_nanos": 1719693,
                               "breakdown": {
                                 "set_min_competitive_score_count": 0,
                                 "match_count": 0,
                                 "shallow_advance_count": 0,
                                 "set_min_competitive_score": 0,
                                 "next_doc": 0,
                                 "match": 0,
                                 "next_doc_count": 0,
                                 "score_count": 0,
                                 "compute_max_score_count": 0,
                                 "compute_max_score": 0,
                                 "advance": 4243,
                                 "advance_count": 12,
                                 "score": 0,
                                 "build_scorer_count": 6,
                                 "create_weight": 334,
                                 "shallow_advance": 0,
                                 "create_weight_count": 1,
                                 "build_scorer": 1715116
                               }
                             },
                             {
                               "type": "MultiTermQueryConstantScoreWrapper",
                               "description": "namesCombined:JACK~2",
                               "time_in_nanos": 543438,
                               "breakdown": {
                                 "set_min_competitive_score_count": 0,
                                 "match_count": 0,
                                 "shallow_advance_count": 0,
                                 "set_min_competitive_score": 0,
                                 "next_doc": 810,
                                 "match": 0,
                                 "next_doc_count": 10,
                                 "score_count": 0,
                                 "compute_max_score_count": 0,
                                 "compute_max_score": 0,
                                 "advance": 3695,
                                 "advance_count": 2,
                                 "score": 0,
                                 "build_scorer_count": 6,
                                 "create_weight": 20,
                                 "shallow_advance": 0,
                                 "create_weight_count": 1,
                                 "build_scorer": 538913
                               }
                             },
                             {
                               "type": "MultiTermQueryConstantScoreWrapper",
                               "description": "namesCombined:DOE~2",
                               "time_in_nanos": 416365,
                               "breakdown": {
                                 "set_min_competitive_score_count": 0,
                                 "match_count": 0,
                                 "shallow_advance_count": 0,
                                 "set_min_competitive_score": 0,
                                 "next_doc": 0,
                                 "match": 0,
                                 "next_doc_count": 0,
                                 "score_count": 0,
                                 "compute_max_score_count": 0,
                                 "compute_max_score": 0,
                                 "advance": 3024,
                                 "advance_count": 12,
                                 "score": 0,
                                 "build_scorer_count": 6,
                                 "create_weight": 20,
                                 "shallow_advance": 0,
                                 "create_weight_count": 1,
                                 "build_scorer": 413321
                               }
                             }
                           ]
                         },
                         {
                           "type": "MultiTermQueryConstantScoreWrapper",
                           "description": "namesCombined:JACKDOE~2",
                           "time_in_nanos": 1187767,
                           "breakdown": {
                             "set_min_competitive_score_count": 0,
                             "match_count": 0,
                             "shallow_advance_count": 0,
                             "set_min_competitive_score": 0,
                             "next_doc": 0,
                             "match": 0,
                             "next_doc_count": 0,
                             "score_count": 0,
                             "compute_max_score_count": 0,
                             "compute_max_score": 0,
                             "advance": 0,
                             "advance_count": 0,
                             "score": 0,
                             "build_scorer_count": 2,
                             "create_weight": 19,
                             "shallow_advance": 0,
                             "create_weight_count": 1,
                             "build_scorer": 1187748
                           }
                         }
                       ]
                     }
                   ]
                 }
               ],
               "rewrite_time": 28163,
               "collector": [
                 {
                   "name": "SimpleFieldCollector",
                   "reason": "search_top_hits",
                   "time_in_nanos": 75426
                 }
               ]
             }
           ],
           "aggregations": []
         }
       ]
     }
   }
   ```
   
   
   ### Version and environment details
   
   Lucene Version: 8.8.0
   
   Java Version: 11
   
   OS: Linux



