Hi Erick,

thank you for the reply. Yes, I'm already using the fast vector highlighter (Solr 4.3), and each request is limited to 10 results.
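For context, the highlighting part of such a request can be written as handler defaults roughly like the sketch below. The handler name /select and the exact values are assumptions for illustration, not a copy of our solrconfig.xml; hl.useFastVectorHighlighter is the standard switch for the fast vector highlighter in Solr 4.x.

```xml
<!-- Sketch only: handler name and values are illustrative assumptions -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- return 10 documents per request -->
    <int name="rows">10</int>
    <!-- turn highlighting on for the content field -->
    <str name="hl">true</str>
    <str name="hl.fl">content</str>
    <!-- use the fast vector highlighter; the highlighted field needs
         termVectors, termPositions and termOffsets enabled -->
    <str name="hl.useFastVectorHighlighter">true</str>
  </lst>
</requestHandler>
```

The same parameters can also be passed on the query string instead of being set as defaults.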
Here is the schema configuration for both fields:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="1" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
  </analyzer>
  <analyzer type="multiterm">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
  </analyzer>
</fieldType>

<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.SnowballPorterFilterFactory" language="German2" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.StopFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
    <filter class="solr.ShingleFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.SnowballPorterFilterFactory" language="German2" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.StandardFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
  <analyzer type="multiterm">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
  </analyzer>
</fieldType>

<field name="spell" type="textSpell" indexed="true" multiValued="true" />
<field name="content" type="text" stored="true" indexed="true" multiValued="true" termVectors="true" termPositions="true" termOffsets="true" />

On average, the content field contains around 5000 - 6000 words (only a rough estimate).

Best regards
Erwin

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, February 25, 2014 3:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Performance problem on Solr query on stemmed values

Right, highlighting may have to re-analyze the input in order to return the highlighted data. This will be significantly slower than the search, especially if you have a large number of rows you're returning.

You can get better performance in highlighting by using FastVectorHighlighter. See:
https://cwiki.apache.org/confluence/display/solr/FastVector+Highlighter

1000x is unusual, though, unless your fields are very large or you're returning a lot of documents.

Best,
Erick

On Tue, Feb 25, 2014 at 5:23 AM, Erwin Gunadi <festiva.s...@gmail.com> wrote:

> Hi,
>
> I would like to know whether anyone have experienced this kind of
> phenomena.
>
> We are having performance problem regarding query on stemmed value.
>
> I've documented the symptoms which I'm currently facing:
>
> Search on      | Search on    | Highlighting       | Processing
> field content  | field spell  | (on content field) | speed
> ---------------+--------------+--------------------+-----------
> active         | active       | active             | Slow
> active         | not active   | active             | Fast
> active         | active       | not active         | Fast
> not active     | active       | active             | Slow
> not active     | active       | not active         | Fast
>
> *Fast means 1000x faster than "slow".
>
> Field Content is our index field, which holds original text, and spell
> is the field with stemmed value.
>
> According to my measurement result, search on both fields (stemmed and
> not stemmed) is really fast.
>
> But when I start to take highlighting into our query it takes too long
> to process.
>
> Best Regards
>
> Erwin