--- On Tue, 2/22/11, Jon Drukman <j...@cluttered.com> wrote:

> From: Jon Drukman <j...@cluttered.com>
> Subject: Sorting - bad performance
> To: solr-user@lucene.apache.org
> Date: Tuesday, February 22, 2011, 9:44 PM
> The performance factors wiki says:
> "If you do a lot of field based sorting, it is advantageous
> to add explicitly
> warming queries to the "newSearcher" and "firstSearcher"
> event listeners in your
> solrconfig which sort on those fields, so the FieldCache is
> populated prior to
> any queries being executed by your users."
> 
> I've got an index with 24+ million docs of forum posts from
> users.  I want to be
> able to get a given user's posts sorted by date.  It's
> taking 20 seconds right
> now.  What would I put in the newSearch/firstSearcher
> to make that quicker?  Is
> there any other general approach I can use to speed up
> sorting?
> 
> The schema looks like
> 
>  <fields>
>    <field name="type_id" type="string" indexed="true" stored="true" required="true" />
>    <field name="subhead" type="text" indexed="true" stored="true"/>
>    <field name="post_date" type="date" indexed="true" stored="true" />
>    <field name="author" type="cistring" indexed="true" stored="true" />
>    <field name="parent_author" type="cistring" indexed="true" stored="true" />
>  </fields>
> 
> cistring is a case-insensitive string type i created:
> 
>    <fieldType name="cistring" class="solr.StrField" sortMissingLast="true" omitNorms="true">
>        <analyzer type="index">
>            <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>        </analyzer>
>        <analyzer type="query">
>            <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>        </analyzer>
>    </fieldType>
> 

This is not directly related to sorting performance, but it will reduce the 
number of unique terms:

If you define a type with class="solr.StrField", the analyzer definition is 
ignored, even though analysis.jsp displays the output as if it were applied.

If you want the tokenizer (and any filters) to take effect, you need to use 
class="solr.TextField".

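As for your author fields, depending on your domain you may want to use 
KeywordTokenizerFactory instead of LowerCaseTokenizerFactory. 
LowerCaseTokenizerFactory splits the value at every non-letter character, so 
an author name containing spaces, digits, or punctuation becomes several 
tokens. KeywordTokenizerFactory keeps the whole value as a single token, but 
it does not lowercase, so you would pair it with LowerCaseFilterFactory to 
stay case-insensitive.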
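
Putting the two suggestions together, a revised cistring type might look 
roughly like the sketch below. It is untested against your index. It assumes 
you want each author value treated as one lowercased token, and it keeps the 
sortMissingLast and omitNorms attributes from your original definition. A 
single analyzer element with no type attribute is used for both indexing and 
querying. You would need to reindex after changing the type.

```xml
<!-- Sketch only: case-insensitive, single-token string type.
     Assumes author values should be matched and sorted as whole strings. -->
<fieldType name="cistring" class="solr.TextField"
           sortMissingLast="true" omitNorms="true">
  <!-- No type attribute: this analyzer applies at both index and query time -->
  <analyzer>
    <!-- Emits the entire field value as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercases that token so matching ignores case -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```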
 


