Hi Elaine,

I'm curious what happens if you remove the "pf" (phrase field) setting from your edismax config?
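
In case it helps, here is a minimal sketch of how one might compare the two cases from the client side. It assumes "pf" is passed as a request parameter rather than set in the request handler defaults in solrconfig.xml (if it lives there, it would have to be removed there instead). The collection URL, field names, and helper names are made up for illustration.

```python
from urllib.parse import urlencode


def edismax_params(query, qf, pf=None):
    """Build request parameters for an edismax query.

    `pf` is only included when given, so the same call can produce
    the "with pf" and "without pf" variants of a query.
    """
    params = {
        "defType": "edismax",
        "q": query,
        "qf": qf,
        "rows": "0",           # only timing and the parsed query matter here
        "debugQuery": "true",  # the debug output shows the parsed query
    }
    if pf:
        params["pf"] = pf
    return params


def select_url(base_url, params):
    """Return the /select URL for the given collection base URL."""
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical collection and field names, for illustration only.
BASE = "http://localhost:8983/solr/mycollection"
QUERY = '"heart attack" OR "myocardial infarction"'

url_with_pf = select_url(BASE, edismax_params(QUERY, "title body", pf="title body"))
url_without_pf = select_url(BASE, edismax_params(QUERY, "title body"))
```

Fetching both URLs against each cluster and comparing QTime, along with the parsed query in the debug section, should show whether the phrase-field clauses account for the difference.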

This question brought to mind
https://issues.apache.org/jira/browse/SOLR-12243?focusedCommentId=16836448#comment-16836448
and https://issues.apache.org/jira/browse/LUCENE-8531. This *could* have directly explained the behavior you're observing, except for the fact that pre-6.5.0, analyzeGraphPhrase(...) generated a fully-enumerated Lucene "GraphQuery" (since removed, but afaict similar to MultiPhraseQuery).

But the direct topic of SOLR-12243 was that SpanNearQuery, never mind its performance characteristics, was getting completely ignored by edismax. Curious about your case, I looked at ExtendedDismaxQParser for 6.4.2, and it appears that GraphQuery was similarly ignored?:
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/6.4.2/solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java#L1219-L1252

If this is in fact the case (and I could well be overlooking something), then it's possible that 6.4.2 was more performant mainly because edismax was completely ignoring the more complex phrase queries generated by analyzeGraphPhrase(...).

I'll be curious to hear what you find, and eager to be corrected if the above speculation is off-base!

Michael

On Wed, Aug 19, 2020 at 10:56 AM Elaine Cario <etca...@gmail.com> wrote:
>
> Hi Solr experts,
>
> We're in the process of upgrading SolrCloud from 6.4.2 to 8.3.1, and our
> performance testing is consistently showing search latencies are measurably
> higher in 8.3.1; for certain kinds of queries it may be as much as 200 ms
> higher on average.
>
> We've seen this now in 2 different environments. In one environment, we
> effectively doubled the OS memory for Solr 8 (by removing a replica set),
> and saw little improvement.
>
> The specs on the VMs we're using are the same for Solr 6 and 8, and the
> index sizes and shard distribution are also the same. We reviewed garbage
> collection logs, and didn't see any red flags there. We're still using
> Java 8 (sorry!).
> Content was re-fed into Solr 8 from scratch.
>
> We re-ran queries removing all the usual suspects for high latencies:
> grouping, faceting, highlighting. We saw some improvement (as we would
> expect), but nothing approaching the average Solr 6 latencies with all
> those features turned on.
>
> We've narrowed the largest overall latencies to queries which contain many
> terms OR'd together (essentially synonyms we add to the query ourselves);
> there may be as many as 0-38 or more quoted phrases OR'd together.
> Latencies increase the more synonyms we add (we always knew this), but it
> seems much worse in Solr 8. (It is an unfortunate quirk of our content that
> these terms often have pretty high frequencies.) But it's not clear if
> this is just amplifying an underlying issue, or if something fundamental
> changed in the way Solr (or Lucene) resolves queries with OR'd terms. We
> use a custom variant of edismax (but we also modified the queries to enable
> use of OOTB edismax, and still saw no improvement).
>
> We also noted that 0-term queries (*:*) with lots of facets perform as well
> as Solr 6, so it definitely seems related to searching for terms.
>
> I'm out of ideas here. Has anyone experienced similar degradation from
> older Solr versions?
>
> Thanks in advance for any help you can provide.