Hi All, 

I found quite a few discussions on highlighting performance issues.
Though I tried to implement most of the suggestions, performance got
worse rather than better.
Currently the index is really small, with about 922 records, but the field
on which highlighting is done holds quite large data. Querying with
highlighting takes a long time, with 85-90% of the time spent on highlighting.
The configuration in my schema.xml is as below:

<fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" />
        
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> 
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
<field name="customContent" type="text_general" indexed="true" stored="true"
termVectors="true" termPositions="true" termOffsets="true"
storeOffsetsWithPositions="true"/>
<field name="customContent_term" type="text_general" indexed="false"
stored="true"/>
    <copyField source="customContent"   dest="customContent_term"/>

The query used in Solr is:

hl=true&hl.fl=customContent&hl.fragsize=500&hl.simple.pre=<HL>&hl.simple.post=</HL>&hl.snippets=1&hl.method=unified&hl.bs.type=SENTENCE&hl.fragListBuilder=simple&hl.maxAnalyzedChars=214748364&facet=true&facet.mincount=1&facet.limit=-1&facet.sort=count&debug=timing&facet.field=contentSpecific
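
For anyone who wants to reproduce the request, here is a minimal sketch of
how the parameters above could be URL-encoded from a script. The parameter
names and values are copied from the query above; the dictionary name and
the idea of sending it from Python are only assumptions for illustration,
and no host, core name or q parameter is filled in because they are not
shown above.

```python
from urllib.parse import urlencode

# Parameters copied verbatim from the query string above.
highlight_params = {
    "hl": "true",
    "hl.fl": "customContent",
    "hl.fragsize": "500",
    "hl.simple.pre": "<HL>",
    "hl.simple.post": "</HL>",
    "hl.snippets": "1",
    "hl.method": "unified",
    "hl.bs.type": "SENTENCE",
    "hl.fragListBuilder": "simple",
    "hl.maxAnalyzedChars": "214748364",
    "facet": "true",
    "facet.mincount": "1",
    "facet.limit": "-1",
    "facet.sort": "count",
    "debug": "timing",
    "facet.field": "contentSpecific",
}

# urlencode percent-escapes the angle brackets and the slash in the tags,
# so the <HL> markers survive being placed in a URL.
query_string = urlencode(highlight_params)
print(query_string)
```

The resulting string would be appended to the select URL of the core being
queried, together with the main q parameter.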

Also note that we had tried the FastVectorHighlighter too, but the result was
not positive. Once, when we tried hl.offsetSource="term_vectors" with the
unified highlighter, the result came back in half a second, but it did not
have any highlight snippets.

One of the debug timing outputs returned by Solr is shared below for reference:

time=8833.0,prepare={time=0.0,query={time=0.0},facet={time=0.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},process={time=8826.0,query={time=867.0},facet={time=2.0},facet_module={time=0.0},mlt={time=0.0},highlight={time=7953.0},stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},loadFieldValues={time=28.0}}
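
To make the breakdown easier to read, here is a small sketch that works out
each component's share of the process phase. The numbers are copied from the
debug output above; the variable and function names are just illustrative.

```python
# Process-phase timings in milliseconds, copied from the debug=timing output.
process_total_ms = 8826.0
component_ms = {
    "query": 867.0,
    "facet": 2.0,
    "highlight": 7953.0,
}

def share_of_process(ms, total=process_total_ms):
    """Return the percentage of the process phase spent in one component."""
    return 100.0 * ms / total

shares = {name: round(share_of_process(ms), 1)
          for name, ms in component_ms.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}% of process time")
```

For this particular request, highlighting comes out at roughly 90% of the
process time, which is the upper end of the 85-90% range mentioned earlier.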

Any suggestions to improve the performance would be of great help.

Thanks, 
Arun



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Highlighting-Performance-improvement-suggestions-required-Solr-6-5-1-tp4349767.html
Sent from the Solr - User mailing list archive at Nabble.com.
