Re: Solr search on a large text field is very slow

Erick Erickson Thu, 08 Aug 2013 09:14:26 -0700

Sometimes bolding comes through my e-mail as *, so is *test* with the
asterisk on each end really what you're doing? Assuming so, this
will inevitably be slow. With a leading wildcard, Solr must iterate through
all the terms in the field to see whether any of them match, so this is
generally bad practice.


You can go to n-grams if this kind of search is really what you
want.
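
As a rough sketch of the n-gram idea, a field type along these lines
might work. The name "text_ngram" and the gram sizes 3 and 10 are
assumptions for illustration and are not taken from your schema, so tune
them to the substrings you actually need to match.

```xml
<!-- Hypothetical field type: the name "text_ngram" and the gram sizes are assumptions. -->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: emit every 3- to 10-character substring of each token. -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="10"/>
  </analyzer>
  <!-- Query time: no n-gramming, so the query term is looked up directly among the indexed grams. -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a field of this type you would query for test with no asterisks,
and it becomes an ordinary term lookup. Query terms shorter than
minGramSize or longer than maxGramSize will not match, and the index
will grow considerably, which matters when this field is already most
of a 150 GB index.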

I'd expect a regular Solr search on a single term to be very fast; 1-2
seconds is very surprising.

Best
Erick


On Thu, Aug 8, 2013 at 10:45 AM, meena.sri...@mathworks.com <
meena.sri...@mathworks.com> wrote:

> The index is around 150 GB and contains around 6.5 million documents.
> Search on a specific text field is very slow: wildcard queries like *test*
> take 1 to 2 minutes, with no highlighting and no facets.
> This field accounts for 90% of the index size.
> This is my schema.xml:
>
> <fieldType name="text_pl" class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="0" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>   </analyzer>
> </fieldType>
>
>  </types>
>
>
>  <fields>
>
>  <field name="syndrome" type="text_pl" indexed="true" stored="true"
> required="true" multivalued="false" omitNorms="false"/>
>  <field name="test_file_result_id" type="long" indexed="true" stored="true"
> omitNorms="true" multivalued="false"/>
>  <field name="start_date" type="date" indexed="true" stored="true"
> omitNorms="true" multivalued="false"/>
>  <field name="job_id" type="long" indexed="true" stored="true"
> omitNorms="true" multivalued="false"/>
>  <field name="test_run_id" type="long" indexed="true" stored="true"
> omitNorms="true" multivalued="false"/>
>  <field name="cluster" type="string" indexed="true" stored="true"
> omitNorms="true" multivalued="false"/>
>  <field name="logfile" type="text_ws" indexed="true" stored="true"
> omitNorms="true" multivalued="false"/>
>
>
> I am using DIH to index data from a database. The "syndrome" field
> ranges in size from 1KB to 5MB.
>
> I tried using the whitespace tokenizer without much luck.
> I am using Solr 3.6.0.
> Indexing takes 1.5 hours.
>
> Please let me know if I need to add anything to improve the search speed.
>
> Thanks
> Meena
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-search-on-a-large-text-field-is-very-slow-tp4083310.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
