Has anyone looked at it?

On Sun, May 3, 2015 at 10:18 AM, jaime spicciati <jaime.spicci...@gmail.com> wrote:
> We ran into this as well on 4.10.3 (not related to an upgrade). It was
> identified during load testing when a small percentage of queries would
> take more than 20 seconds to return. We were able to isolate it by
> rerunning the same query multiple times and regardless of cache hits the
> queries would still take a long time to return. We used this method to
> narrow down the performance problem to a small number of very large records
> (many many fields in a single record).
>
> We fixed it by turning on hl.requireFieldMatch on the query so that only
> fields that have an actual hit are passed through the highlighter.
>
> Hopefully this helps,
> Jaime Spicciati
>
> On Sat, May 2, 2015 at 8:20 PM, Joel Bernstein <joels...@gmail.com> wrote:
>
> > Hi,
> >
> > Can you also include the details of your research that narrowed the issue
> > to the highlighter?
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Sat, May 2, 2015 at 5:27 PM, Ryan, Michael F. (LNG-DAY) <
> > michael.r...@lexisnexis.com> wrote:
> >
> > > Are you able to identify if there is a particular part of the code that
> > > is slow?
> > >
> > > A simple way to do this is to use the jstack command (assuming your
> > > server has the full JDK installed). You can run it like this:
> > > /path/to/java/bin/jstack PID
> > >
> > > If you run that a bunch of times while your highlight query is running,
> > > you might be able to spot the hotspot. Usually I'll do something like
> > > this to see the stacktrace for the thread running the query:
> > > /path/to/java/bin/jstack PID | grep SearchHandler -B30
> > >
> > > A few more questions:
> > > - What are response times you are seeing before and after the upgrade?
> > >   Is "unusably slow" 1 second, 10 seconds...?
> > > - If you run the exact same query multiple times, is it consistently
> > >   slow? Or is it only slow on the first run?
> > > - While the query is running, do you see high user CPU on your server,
> > >   or high IO wait, or both? (You can check this with the top command or
> > >   vmstat command in Linux.)
> > >
> > > -Michael
> > >
> > > -----Original Message-----
> > > From: Cheng, Sophia Kuen [mailto:sophia_ch...@hms.harvard.edu]
> > > Sent: Saturday, May 02, 2015 4:13 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Upgraded to 4.10.3, highlighting performance unusably slow
> > >
> > > Hello,
> > >
> > > We recently upgraded solr from 3.8.0 to 4.10.3. We saw that this upgrade
> > > caused an incredible slowdown in our searches. We were able to narrow it
> > > down to the highlighting. The slowdown is extreme enough that we are
> > > holding back our release until we can resolve this. Our research
> > > indicated that using TermVectors & FastHighlighter was the way to go;
> > > however, this still does nothing for the performance. I think we may be
> > > overlooking a crucial configuration, but cannot figure it out. I was
> > > hoping for some guidance and help. Sorry for the long email, I wanted to
> > > provide enough information.
> > >
> > > Our documents are largely dynamic fields, and so we have been using ‘*’
> > > as the field for highlighting. This is the same setting we used with
> > > prior versions of Solr. The dynamic fields are of type ’text’ and we
> > > added customizations to the schema.xml for the type ’text’:
> > >
> > > <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
> > >            storeOffsetsWithPositions="true" termVectors="true"
> > >            termPositions="true" termOffsets="true">
> > >   <analyzer type="index">
> > >     <!-- this charFilter removes all xml-tagging from the text: -->
> > >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >     <!-- Case insensitive stop word removal.
> > >          add enablePositionIncrements=true in both the index and query
> > >          analyzers to leave a 'gap' for more accurate phrase queries.
> > >     -->
> > >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> > >             words="stopwords.txt" enablePositionIncrements="true"/>
> > >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> > >             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> > >             catenateAll="0" splitOnCaseChange="1"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> > >             protected="protwords.txt"/>
> > >   </analyzer>
> > >   <analyzer type="query">
> > >     <!-- this charFilter removes all xml-tagging from the text. Needed
> > >          also in query due to autosuggest -->
> > >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> > >             words="stopwords.txt" enablePositionIncrements="true"/>
> > >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> > >             generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> > >             catenateAll="0" splitOnCaseChange="1"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> > >             protected="protwords.txt"/>
> > >   </analyzer>
> > > </fieldType>
> > >
> > > One of the two dynamic fields we use:
> > >
> > > <dynamicField name="DTPropValue_*" type="text" indexed="true"
> > >               stored="true" required="false" multiValued="true"/>
> > >
> > > In our solrConfig.xml file, we have:
> > >
> > > <requestHandler name="/eiHandler" class="solr.SearchHandler">
> > >   <lst name="defaults">
> > >     <str name="echoParams">explicit</str>
> > >     <int name="rows">13</int>
> > >     <bool name="tv">true</bool>
> > >     <bool name="hl.useFastVectorHighligter">true</bool>
> > >   </lst>
> > >   <arr name="last-components">
> > >     <str>tvComponent</str>
> > >   </arr>
> > > </requestHandler>
> > > <searchComponent name="tvComponent" class="solr.TermVectorComponent"/>
> > > <searchComponent class="solr.HighlightComponent" name="highlight">
> > >   <highlighting>
> > >     <fragmenter name="gap" default="true"
> > >                 class="solr.highlight.GapFragmenter">
> > >       <lst name="defaults">
> > >         <int name="hl.fragsize">100</int>
> > >       </lst>
> > >     </fragmenter>
> > >     <fragmenter name="regex" class="solr.highlight.RegexFragmenter">
> > >       <lst name="defaults">
> > >         <int name="hl.fragsize">70</int>
> > >         <float name="hl.regex.slop">0.5</float>
> > >         <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
> > >       </lst>
> > >     </fragmenter>
> > >
> > >     <formatter name="html" default="true"
> > >                class="solr.highlight.HtmlFormatter">
> > >       <lst name="defaults">
> > >         <str name="hl.simple.pre"><![CDATA[<i>]]></str>
> > >         <str name="hl.simple.post"><![CDATA[</i>]]></str>
> > >       </lst>
> > >     </formatter>
> > >
> > >     <encoder name="html" class="solr.highlight.HtmlEncoder" />
> > >     <fragListBuilder name="simple"
> > >                      class="solr.highlight.SimpleFragListBuilder"/>
> > >     <fragListBuilder name="single"
> > >                      class="solr.highlight.SingleFragListBuilder"/>
> > >     <fragListBuilder name="weighted" default="true"
> > >                      class="solr.highlight.WeightedFragListBuilder"/>
> > >     <fragmentsBuilder name="default" default="true"
> > >                       class="solr.highlight.ScoreOrderFragmentsBuilder">
> > >     </fragmentsBuilder>
> > >
> > >     <!-- multi-colored tag FragmentsBuilder -->
> > >     <fragmentsBuilder name="colored"
> > >                       class="solr.highlight.ScoreOrderFragmentsBuilder">
> > >       <lst name="defaults">
> > >         <str name="hl.tag.pre"><![CDATA[
> > >              <b style="background:yellow">,<b style="background:lawgreen">,
> > >              <b style="background:aquamarine">,<b style="background:magenta">,
> > >              <b style="background:palegreen">,<b style="background:coral">,
> > >              <b style="background:wheat">,<b style="background:khaki">,
> > >              <b style="background:lime">,<b style="background:deepskyblue">]]></str>
> > >         <str name="hl.tag.post"><![CDATA[</b>]]></str>
> > >       </lst>
> > >     </fragmentsBuilder>
> > >
> > >     <boundaryScanner name="default" default="true"
> > >                      class="solr.highlight.SimpleBoundaryScanner">
> > >       <lst name="defaults">
> > >         <str name="hl.bs.maxScan">10</str>
> > >         <str name="hl.bs.chars">.,!? 	 </str>
> > >       </lst>
> > >     </boundaryScanner>
> > >
> > >     <boundaryScanner name="breakIterator"
> > >                      class="solr.highlight.BreakIteratorBoundaryScanner">
> > >       <lst name="defaults">
> > >         <str name="hl.bs.type">WORD</str>
> > >         <str name="hl.bs.language">en</str>
> > >         <str name="hl.bs.country">US</str>
> > >       </lst>
> > >     </boundaryScanner>
> > >   </highlighting>
> > > </searchComponent>
> > >
> > > And in our code:
> > >
> > > final SolrQuery query = new SolrQuery( luceneQueryStr );
> > > query.setRequestHandler("/eiHandler");
> > > query.setStart( request.getStartIndex() );
> > > query.setRows( request.getMaxResults() );
> > > query.setSort(new SortClause(request.getSortOrder().getFieldName(),
> > >     request.getSortOrder().isAscending()?ORDER.asc:ORDER.desc) );
> > > query.addHighlightField( "*" );
> > > query.setFields( "*", "score" );
> > >
> > > Any assistance is greatly appreciated. Thank you.
> > >
> > > Sincerely,
> > > Sophia

--
Bill Bell
billnb...@gmail.com
cell 720-256-8076
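
For reference, here is a minimal sketch of the fix Jaime describes, written as handler defaults instead of per-query parameters. It assumes the /eiHandler definition from Sophia's message and shows only the highlighting-related defaults; the other settings are assumed unchanged. hl.requireFieldMatch and hl.useFastVectorHighlighter are standard Solr highlighting parameters. Note that the quoted config spells the second one hl.useFastVectorHighligter. As far as I know Solr does not complain about unrecognized parameters, so the fast vector highlighter may never have been switched on. Whether correcting the name resolves the slowdown in this particular case is untested.

```xml
<requestHandler name="/eiHandler" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">13</int>
    <!-- Only run the highlighter on fields the query actually matched
         (the setting Jaime reports fixed the problem for hl.fl=*). -->
    <bool name="hl.requireFieldMatch">true</bool>
    <!-- Spelling matters: "Highlighter", not "Highligter". -->
    <bool name="hl.useFastVectorHighlighter">true</bool>
  </lst>
  <!-- tv default and last-components from the original handler are
       assumed unchanged and omitted from this sketch. -->
</requestHandler>
```

The same two parameters could also be sent per request, which makes it easier to compare timings with and without them before changing solrconfig.xml.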