gkatiforis opened a new issue, #11892: URL: https://github.com/apache/lucene/issues/11892
### Description

The query has several match conditions with fuzziness 2, which makes it quite expensive. I don't need document relevance or any kind of scoring; I only want to identify the list of document IDs that match the query criteria.

The query takes **5ms** to execute (when the documents are matched) in an index of 2 documents total, which is a long time for such a simple query. After analyzing the query with the profile API, it seems that `build_scorer` takes **3.8ms**. Even though I use a bool filter, `constant_score` queries, and `"fuzzy_rewrite":"constant_score"`, the score calculation slows down the search process.

I also noticed that if I use the default rewrite method instead of `"fuzzy_rewrite":"constant_score"`, the `rewrite_time` increases to **3ms** and the `build_scorer` time remains low.

Is it possible to disable the scoring or improve the performance? Any advice or suggestion would be appreciated.

A simplified version of the query is:

```
{
  "profile": true,
  "sort": "_doc",
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "filter": [
            {
              "bool": {
                "should": [
                  {
                    "match": {
                      "namesCombined": {
                        "query": "name1 name2 name3",
                        "operator": "AND",
                        "fuzziness": "2",
                        "fuzzy_rewrite": "constant_score"
                      }
                    }
                  },
                  {
                    "match": {
                      "namesCombined": {
                        "query": "name1",
                        "operator": "AND",
                        "fuzziness": "2",
                        "fuzzy_rewrite": "constant_score"
                      }
                    }
                  }
                ],
                "minimum_should_match": "1"
              }
            }
          ]
        }
      },
      "boost": 1.0
    }
  },
  "_source": {
    "includes": [ "id" ],
    "excludes": []
  }
}
```

A sample document: `{"id":"id","namesCombined":"name1 name2 name3"}`

Index settings and mapping:

```
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "store.type": "mmapfs",
      "refresh_interval": "300s"
    }
  },
  "mappings": {
    "namesCombined": {
      "type": "text",
      "analyzer": "whitespace",
      "doc_values": false,
      "index_options": "docs",
      "norms": false,
      "store": false,
      "term_vector": "no"
    }
  }
}
```

Profiler results:

```
{ "profile": { "shards": [ { "id": "[v1VNjaOiQrqkDPKg4aTVfA][testindex][0]", "searches": [ {
"query": [ { "type": "ConstantScoreQuery", "description": "ConstantScore(((+namesCombined:NICKOLAS~2 +namesCombined:JACK~2 +namesCombined:DOE~2) namesCombined:JACKDOE~2)~1)", "time_in_nanos": 3902030, "breakdown": { "set_min_competitive_score_count": 0, "match_count": 0, "shallow_advance_count": 0, "set_min_competitive_score": 0, "next_doc": 6278, "match": 0, "next_doc_count": 10, "score_count": 0, "compute_max_score_count": 0, "compute_max_score": 0, "advance": 9500, "advance_count": 2, "score": 0, "build_scorer_count": 4, "create_weight": 4852, "shallow_advance": 0, "create_weight_count": 1, "build_scorer": 3881400 }, "children": [ { "type": "BooleanQuery", "description": "((+namesCombined:NICKOLAS~2 +namesCombined:JACK~2 +namesCombined:DOE~2) namesCombined:JACKDOE~2)~1", "time_in_nanos": 3899555, "breakdown": { "set_min_competitive_score_count": 0, "match_count": 0, "shallow_advance_count": 0, "set_min_competitive_score": 0, "next_doc": 5909, "match": 0, "next_doc_count": 10, "score_count": 0, "compute_max_score_count": 0, "compute_max_score": 0, "advance": 9411, "advance_count": 2, "score": 0, "build_scorer_count": 4, "create_weight": 3493, "shallow_advance": 0, "create_weight_count": 1, "build_scorer": 3880742 }, "children": [ { "type": "BooleanQuery", "description": "+namesCombined:NICKOLAS~2 +namesCombined:JACK~2 +namesCombined:DOE~2", "time_in_nanos": 2702444, "breakdown": { "set_min_competitive_score_count": 0, "match_count": 0, "shallow_advance_count": 0, "set_min_competitive_score": 0, "next_doc": 5518, "match": 0, "next_doc_count": 10, "score_count": 0, "compute_max_score_count": 0, "compute_max_score": 0, "advance": 9329, "advance_count": 2, "score": 0, "build_scorer_count": 6, "create_weight": 2024, "shallow_advance": 0, "create_weight_count": 1, "build_scorer": 2685573 }, "children": [ { "type": "MultiTermQueryConstantScoreWrapper", "description": "namesCombined:NICKOLAS~2", "time_in_nanos": 1719693, "breakdown": { "set_min_competitive_score_count": 
0, "match_count": 0, "shallow_advance_count": 0, "set_min_competitive_score": 0, "next_doc": 0, "match": 0, "next_doc_count": 0, "score_count": 0, "compute_max_score_count": 0, "compute_max_score": 0, "advance": 4243, "advance_count": 12, "score": 0, "build_scorer_count": 6, "create_weight": 334, "shallow_advance": 0, "create_weight_count": 1, "build_scorer": 1715116 } }, { "type": "MultiTermQueryConstantScoreWrapper", "description": "namesCombined:JACK~2", "time_in_nanos": 543438, "breakdown": { "set_min_competitive_score_count": 0, "match_count": 0, "shallow_advance_count": 0, "set_min_competitive_score": 0, "next_doc": 810, "match": 0, "next_doc_count": 10, "score_count": 0, "compute_max_score_count": 0, "compute_max_score": 0, "advance": 3695, "advance_count": 2, "score": 0, "build_scorer_count": 6, "create_weight": 20, "shallow_advance": 0, "create_weight_count": 1, "build_scorer": 538913 } }, { "type": "MultiTermQueryConstantScoreWrapper", "description": "namesCombined:DOE~2", "time_in_nanos": 416365, "breakdown": { "set_min_competitive_score_count": 0, "match_count": 0, "shallow_advance_count": 0, "set_min_competitive_score": 0, "next_doc": 0, "match": 0, "next_doc_count": 0, "score_count": 0, "compute_max_score_count": 0, "compute_max_score": 0, "advance": 3024, "advance_count": 12, "score": 0, "build_scorer_count": 6, "create_weight": 20, "shallow_advance": 0, "create_weight_count": 1, "build_scorer": 413321 } } ] }, { "type": "MultiTermQueryConstantScoreWrapper", "description": "namesCombined:JACKDOE~2", "time_in_nanos": 1187767, "breakdown": { "set_min_competitive_score_count": 0, "match_count": 0, "shallow_advance_count": 0, "set_min_competitive_score": 0, "next_doc": 0, "match": 0, "next_doc_count": 0, "score_count": 0, "compute_max_score_count": 0, "compute_max_score": 0, "advance": 0, "advance_count": 0, "score": 0, "build_scorer_count": 2, "create_weight": 19, "shallow_advance": 0, "create_weight_count": 1, "build_scorer": 1187748 } } ] } ] } ], 
"rewrite_time": 28163, "collector": [ { "name": "SimpleFieldCollector", "reason": "search_top_hits", "time_in_nanos": 75426 } ] } ], "aggregations": [] } ] } } ``` ### Version and environment details Lucene Version: 8.8.0 Java Version: 11 OS: Linux -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org