Hello everybody,

I have a question about the performance and the internals of the update
process. We have an index containing nearly 200,000 entries (one field holds
a lot of content). The relevant parts of our schema.xml are:

// ...
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
</fieldType>
// ...
<fields>
   <field name="id" type="string" indexed="true" stored="true" required="true" />
   <field name="date" type="date" indexed="true" stored="false" required="true" />
   <field name="headline" type="text" indexed="true" stored="true" required="true" />
   <field name="companyid" type="integer" indexed="true" stored="false" required="true" />
   <field name="companyname" type="text" indexed="true" stored="true" required="true" />
   <field name="text" type="text" indexed="true" stored="true" required="true" />
   <field name="language" type="string" indexed="true" stored="false" required="true" />
</fields>
// ....

Every five minutes a cronjob updates a small number of records (between 1
and maybe 20) that have been edited. Its speed is not satisfactory: the time
needed grows continuously and was over 4 minutes before we restarted Tomcat.
The restart helped for the first updates (17 seconds), but the time soon
rose again to 170 seconds and more.
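
To make the update step concrete, here is a minimal sketch of the kind of XML
message such a job would post to Solr's update handler. It is an illustration
only, not our actual script: the helper name build_add_message and the sample
values are made up, and it assumes that all edited records of one run go into
a single <add> message followed by one <commit/>.

```python
import xml.etree.ElementTree as ET


def build_add_message(records):
    """Build one Solr XML <add> message holding every edited record.

    `records` is a list of dicts mapping schema field names to values.
    (Hypothetical helper, for illustration only.)
    """
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, <, > on output
    return ET.tostring(add, encoding="unicode")


# Made-up sample record using the field names from the schema above.
records = [
    {
        "id": "42",
        "date": "2008-01-01T00:00:00Z",
        "headline": "Example headline",
        "companyid": 7,
        "companyname": "Example & Co",
        "text": "Example body text",
        "language": "en",
    }
]

add_message = build_add_message(records)

# Assumption: one commit per cron run, sent after the whole batch.
commit_message = "<commit/>"
```

Both strings would then be sent to the update handler with Content-Type
text/xml; the HTTP part is left out because it depends on the setup.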

Does anyone have an idea where the problem is? Or is there no problem and
this performance is "normal" for our configuration? I hope there are some
tricks to improve it.

Best regards,
Christian
