The index is around 150 GB and holds around 6.5 million documents.
Searching a specific text field is very slow: wildcard queries like *test*
take 1 to 2 minutes, with no highlighting and no facets.
This field accounts for 90% of the index size.
This is my schema.xml:

<fieldType name="text_pl" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt"
                ignoreCase="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="0" generateNumberParts="0" catenateWords="0"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
      </analyzer>
</fieldType>

 </types>


 <fields>
 
 <field name="syndrome" type="text_pl" indexed="true" stored="true"
        required="true" multivalued="false" omitNorms="false"/>
 <field name="test_file_result_id" type="long" indexed="true" stored="true"
        omitNorms="true" multivalued="false"/>
 <field name="start_date" type="date" indexed="true" stored="true"
        omitNorms="true" multivalued="false"/>
 <field name="job_id" type="long" indexed="true" stored="true"
        omitNorms="true" multivalued="false"/>
 <field name="test_run_id" type="long" indexed="true" stored="true"
        omitNorms="true" multivalued="false"/>
 <field name="cluster" type="string" indexed="true" stored="true"
        omitNorms="true" multivalued="false"/>
 <field name="logfile" type="text_ws" indexed="true" stored="true"
        omitNorms="true" multivalued="false"/>


I am using DIH to index data from a database. The "syndrome" field ranges
in size from 1 KB to 5 MB.

I tried using the whitespace tokenizer, without much luck.
I am using Solr 3.6.0.
Indexing takes 1.5 hours.

Please let me know if there is anything I should add to improve the search speed.

Thanks
Meena







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-search-on-a-large-text-field-is-very-slow-tp4083310.html
Sent from the Solr - User mailing list archive at Nabble.com.
