Pardon me, I didn't go through the details of your configs, and I guess you have already gone through the recent talks on highlighters. Sharing them anyway in case you haven't:
https://www.slideshare.net/lucidworks/solr-highlighting-at-full-speed-presented-by-timothy-rodriguez-bloomberg-david-smiley-d-w-smiley-llc
https://www.youtube.com/watch?v=tv5qKDKW8kk

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Wed, Aug 9, 2017 at 7:45 PM, sasarun <sasa...@gmail.com> wrote:

> Hi All,
>
> I found quite a few discussions on the highlighting performance issue.
> Though I tried to implement most of them, performance improvement was
> negative.
> Currently index count is really low with about 922 records. But the field
> on which highlighting is done is quite large data. Querying of data with
> highlighting is taking lots of time with 85-90% time taken on highlighting.
> Configuration of my set schema.xml is as below
>
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> <field name="customContent" type="text_general" indexed="true" stored="true"
>        termVectors="true" termPositions="true" termOffsets="true"
>        storeOffsetsWithPositions="true"/>
> <field name="customContent_term" type="text_general" indexed="false" stored="true"/>
> <copyField source="customContent" dest="customContent_term"/>
>
> Query used in solr is
>
> hl=true&hl.fl=customContent&hl.fragsize=500&hl.simple.pre=<HL>&hl.simple.post=</HL>&hl.snippets=1&hl.method=unified&hl.bs.type=SENTENCE&hl.fragListBuilder=simple&hl.maxAnalyzedChars=214748364&facet=true&facet.mincount=1&facet.limit=-1&facet.sort=count&debug=timing&facet.field=contentSpecific
>
> Also note that We had tried fastvectorhighlighter too but the result was
> not positive. Once when we tried to hl.offsetSource="term_vectors" with unified
> result came up in half a second but it didnt had any highlight snippets.
>
> One of the debug returned by solr is shared below for reference
>
> time=8833.0,prepare={time=0.0,query={time=0.0},facet={time=0.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},process={time=8826.0,query={time=867.0},facet={time=2.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=7953.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},loadFieldValues={time=28.0}}
>
> Any suggestions to improve the performance would be of great help
>
> Thanks,
> Arun
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-Performance-improvement-suggestions-required-Solr-6-5-1-tp4349767.html
> Sent from the Solr - User mailing list archive at Nabble.com.
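
For what it's worth, the share of request time spent in highlighting can be read straight off the debug=timing output quoted above. A minimal Python sketch, assuming the timing string is pasted with the mail line wraps removed; `highlight_share` is only an illustrative helper name, not part of any Solr API:

```python
import re

# Timing string copied from the quoted debug output, with the mail
# line wraps removed. Solr reports these times in milliseconds.
DEBUG_TIMING = (
    "time=8833.0,prepare={time=0.0,query={time=0.0},facet={time=0.0},"
    "facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},"
    "stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},"
    "process={time=8826.0,query={time=867.0},facet={time=2.0},"
    "facet_module={time=0.0},mlt={time=0.0},highlight={time=7953.0},"
    "stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},"
    "loadFieldValues={time=28.0}}"
)


def highlight_share(debug: str) -> float:
    """Fraction of the 'process' phase spent in the highlight component."""
    # Skip the 'prepare' phase; everything after 'process={' starts
    # with the total time of the process phase.
    process_part = debug.split("process={", 1)[1]
    total = float(re.match(r"time=([\d.]+)", process_part).group(1))
    # First highlight entry after 'process={' is the process-phase one.
    highlight = float(
        re.search(r"highlight=\{time=([\d.]+)\}", process_part).group(1)
    )
    return highlight / total


print(f"{highlight_share(DEBUG_TIMING):.1%}")  # prints 90.1%
```

That matches the 85-90% figure in the original message: 7953 ms of the 8826 ms process phase goes to highlighting, against 867 ms for the query itself.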