Steve,

A field named "name" sounds like a free text field.  What is its type, string 
or text?  Fields you sort by should not be tokenized and should be indexed.  I 
have a hunch your name field is tokenized.
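
If it is, one common approach is to keep the tokenized field for searching 
and sort on an untokenized copy.  A minimal schema.xml sketch, assuming your 
schema defines the stock "string" type (solr.StrField); the "name_sort" field 
name is hypothetical and not taken from your setup:

```xml
<!-- Hypothetical field: untokenized, indexed copy of "name" used only for sorting -->
<field name="name_sort" type="string" indexed="true" stored="false"/>

<!-- Copy the raw value of "name" into the sort field at index time -->
<copyField source="name" dest="name_sort"/>
```

After reindexing, queries would sort with sort=name_sort asc instead of 
sort=name asc.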


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Steve Conover <scono...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Friday, March 27, 2009 11:59:52 PM
> Subject: Re: optimization advice?
> 
> We sort by default on "name", which varies quite a bit (we're never
> going to make sorting by field go away).
> 
> The thing is, Solr has been pretty amazing across 1 million records.
> Now that we've doubled the size of the dataset things are definitely
> slower in a nonlinear way...I'm wondering what factors are involved
> here.
> 
> -Steve
> 
> On Fri, Mar 27, 2009 at 6:58 PM, Otis Gospodnetic
> wrote:
> >
> > OK, we are a step closer.  Sorting makes things slower.  What field(s) do
> > you sort on, what are their types, and if there is a date in there, are
> > the dates very granular, and if they are, do you really need them to be
> > that precise?
> >
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> >
> > ----- Original Message ----
> >> From: Steve Conover 
> >> To: solr-user@lucene.apache.org
> >> Sent: Friday, March 27, 2009 1:51:14 PM
> >> Subject: Re: optimization advice?
> >>
> >> > Steve,
> >> >
> >> > Maybe you can tell us about:
> >>
> >> sure
> >>
> >> > - your hardware
> >>
> >> 2.5GB RAM, pretty modern virtual servers
> >>
> >> > - query rate
> >>
> >> Let's say a few queries per second max... < 4
> >>
> >> And in general the challenge is to get latency on any given query down
> >> to something very low - we don't have to worry about a huge amount of
> >> load at the moment.
> >>
> >> > - document cache and query cache settings
> >>
> >>
> >>         class="solr.LRUCache"
> >>         size="512"
> >>         initialSize="512"
> >>         autowarmCount="256"/>
> >>
> >>
> >>         class="solr.LRUCache"
> >>         size="512"
> >>         initialSize="512"
> >>         autowarmCount="0"/>
> >>
> >> > - your current response times
> >>
> >> This depends on the query.  For queries that involve a total record
> >> count of < 1 million, we often see < 10ms response times, up to
> >> 400-500ms in the worst case.  When we do a page one, sorted query on our
> >> full record set of 2 million+ records, response times can get up into
> >> 2+ seconds.
> >>
> >> > - any pain points, any slow query patterns
> >>
> >> Something that can't be emphasized enough is that we can't predict
> >> what records people will want.  Almost every query is aimed at a
> >> different set of records.
> >>
> >> -Steve
> >
> >
