[ https://issues.apache.org/jira/browse/LUCENE-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henrik Hertel updated LUCENE-10562:
-----------------------------------
    Description:
I use Solr and have a large system with 1TB in one core and about 5 million documents. The textual content of large PDF files is indexed there. My query is extremely slow as soon as I use wildcards e.g. **searchvalue**, even though I put a filter query in front of it that reduces to less than 20 documents.

searchvalue -> less than 1 second
searchvalue* -> less than 1 second
**searchvalue** -> more than 30 seconds

My query:
select?defType=lucene&q=content_t:**searchvalue**&fq=metadataitemids_is:20950&fq=renditions_ss%3A&fl=id&rows=50&start=0

I've tried everything imaginable. It doesn't make sense to me why a search over a small subset should take so long. If I omit the filter query metadataitemids_is:20950, so search the entire inventory, then it also takes the same amount of time. Therefore, I suspect that despite the filter query, the main query runs over the entire index.

  was:
I use Solr and have a large system with 1TB in one core and about 5 million documents. The textual content of large PDF files is indexed there. My query is extremely slow as soon as I use wildcards e.g. {*}**{*}searchvalue**, even though I put a filter query in front of it that reduces to less than 20 documents.

searchvalue -> less than 1 second
searchvalue* -> less than 1 second
{*}**{*}searchvalue** -> more than 30 seconds

My query:
select?defType=lucene&q=content_t:*{*}searchvalue{*}*&fq=metadataitemids_is:20950&fq=renditions_ss%3A&fl=id&rows=50&start=0

I've tried everything imaginable. It doesn't make sense to me why a search over a small subset should take so long. If I omit the filter query metadataitemids_is:20950, so search the entire inventory, then it also takes the same amount of time. Therefore, I suspect that despite the filter query, the main query runs over the entire index.
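To compare the three query variants from the description side by side, a small script along the following lines could build the request URLs. This is only a sketch: the host and core name in `SOLR_SELECT` are hypothetical (the report does not give them), the function names are made up for illustration, and the second filter query (`renditions_ss`) is left out because its value is cut off in the report. The `time_query` helper needs a reachable Solr instance and is not called by default.

```python
import time
import urllib.request
from urllib.parse import urlencode

# Hypothetical endpoint: the report names neither the host nor the core.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"

# The three patterns whose timings are compared in the report.
VARIANTS = ["searchvalue", "searchvalue*", "**searchvalue**"]


def build_select_url(term_pattern, base=SOLR_SELECT):
    """Build a select URL using the parameters quoted in the report."""
    params = [
        ("defType", "lucene"),
        ("q", f"content_t:{term_pattern}"),
        ("fq", "metadataitemids_is:20950"),
        ("fl", "id"),
        ("rows", 50),
        ("start", 0),
    ]
    # urlencode percent-encodes ':' and '*' so the URL is safe to send as-is.
    return f"{base}?{urlencode(params)}"


def time_query(url):
    """Return wall-clock seconds for one request (requires a running Solr)."""
    started = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
    return time.perf_counter() - started


if __name__ == "__main__":
    for pattern in VARIANTS:
        print(pattern, "->", build_select_url(pattern))
```

Running each URL with and without the `fq` parameter would show whether the filter changes the wildcard timing at all, which is the behaviour the report questions.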
> Large system: Wildcard search leads to full index scan despite filter query
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-10562
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10562
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 8.11.1
>            Reporter: Henrik Hertel
>            Priority: Major
>              Labels: performance
>
> I use Solr and have a large system with 1TB in one core and about 5 million
> documents. The textual content of large PDF files is indexed there. My query
> is extremely slow as soon as I use wildcards e.g. **searchvalue**, even
> though I put a filter query in front of it that reduces to less than 20
> documents.
> searchvalue -> less than 1 second
> searchvalue* -> less than 1 second
> **searchvalue** -> more than 30 seconds
> My query:
> select?defType=lucene&q=content_t:**searchvalue**&fq=metadataitemids_is:20950&fq=renditions_ss%3A&fl=id&rows=50&start=0
> I've tried everything imaginable. It doesn't make sense to me why a search
> over a small subset should take so long. If I omit the filter query
> metadataitemids_is:20950, so search the entire inventory, then it also takes
> the same amount of time. Therefore, I suspect that despite the filter query,
> the main query runs over the entire index.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)