Hello Michael,

Sorry it took so long to get back to this; there were too many things to do.

Anyway, yes, we have WDF on our query-time analysers. I uploaded two log files, 
both for the same query of death, one with and one without the synonym filter 
enabled.

https://mail.openindex.io/export/solr-8983-console.log 23 MB
https://mail.openindex.io/export/solr-8983-console-without-syns.log 1.9 MB

Without the synonym filter we still see a huge number of entries. Many different 
parts of our analyser chain contribute to the expansion of queries, but pf itself 
really turns the problem on or off.

Since SOLR-12243 is new in 7.6, does anyone know whether SOLR-12243 could have 
this side effect?

Thanks,
Markus


-----Original message-----
> From:Michael Gibney <mich...@michaelgibney.net>
> Sent: Friday 8th February 2019 17:19
> To: solr-user@lucene.apache.org
> Subject: Re: Query of Death Lucene/Solr 7.6
> 
> Hi Markus,
> As of 7.6, LUCENE-8531 <https://issues.apache.org/jira/browse/LUCENE-8531>
> reverted a graph/Spans-based phrase query implementation (introduced in 6.5
> -- LUCENE-7699 <https://issues.apache.org/jira/browse/LUCENE-7699>) to an
> implementation that builds a separate phrase query for each possible
> enumerated path through the graph described by a parsed query.
> The potential for combinatoric explosion of the enumerated approach was (as
> far as I can tell) one of the main motivations for introducing the
> Spans-based implementation. Some real-world use cases would be good to
> explore. Markus, could you send (as an attachment) the debug toString() for
> the queries with/without synonyms enabled? I'm also guessing you may have
> WordDelimiterGraphFilter on the query analyzer?
> As an alternative to disabling pf, LUCENE-8531 only reverts to the
> enumerated approach for phrase queries where slop>0, so setting ps=0 would
> probably also help.
> Michael
> 
> On Fri, Feb 8, 2019 at 5:57 AM Markus Jelsma <markus.jel...@openindex.io>
> wrote:
> 
> > Hello (apologies for cross-posting),
> >
> > While working on SOLR-12743, using 7.6 on two nodes and 7.2.1 on the
> > remaining four, we stumbled upon a situation where the 7.6 nodes quickly
> > succumb when a 'Query-of-Death' is issued; 7.2.1 up to 7.5 are all
> > unaffected (tested and confirmed).
> >
> > Following Smiley's suggestion i used Eclipse MAT to find the problem in
> > the heap dump i obtained, this fantastic tool revealed within minutes that
> > a query thread ate 65 % of all resources, in the class variables i could
> > find the query, and reproduce the problem.
> >
> > The problematic query is 'dubbele dijk/rijke dijkproject in het dijktracé
> > eemshaven-delfzijl', on 7.6 this input produces a 40+ MB toString() output
> > in edismax' newFieldQuery. If the node survives it takes 2+ seconds for the
> > query to run (150 ms otherwise). If i disable all query time
> > SynonymGraphFilters it still takes a second and produces just a 9 MB
> > toString() for the query.
> >
> > I could not find anything like this in Jira. I did think of LUCENE-8479
> > and LUCENE-8531 but they were about graphs, this problem looked related
> > though.
> >
> > I think i tracked it further down to LUCENE-8589 or SOLR-12243. When i
> > leave Solr's edismax' pf parameter empty, everything runs fast. When all
> > fields are configured for pf, the node dies.
> >
> > I am now unsure whether this is a Solr or a Lucene issue.
> >
> > Please let me know.
> >
> > Many thanks,
> > Markus
> >
> > ps. in Solr i even got an 'Impossible Exception', my first!
> >
> 
