Steve,

A field named "name" sounds like a free-text field. What is its type, string or text? Fields you sort by should be indexed and should not be tokenized. I have a hunch your name field is tokenized.
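If the name field does turn out to be a tokenized text type, one common arrangement is to keep it for searching and copy it into a separate untokenized field used only for sorting. The fragment below is a rough sketch of that in schema.xml, not a description of Steve's actual schema: the field name "name_sort" is made up for illustration, and it assumes the stock "string" and "text" field types from the example schema are defined.

```xml
<!-- Sketch only: "name_sort" is a hypothetical field name, and the
     "text" and "string" types are assumed to be the ones shipped in
     the example schema.xml. -->

<!-- Existing searchable field, tokenized by its analyzer. -->
<field name="name" type="text" indexed="true" stored="true"/>

<!-- Untokenized copy used only for sorting: indexed, not stored. -->
<field name="name_sort" type="string" indexed="true" stored="false"/>

<!-- Populate the sort field from the searchable field at index time. -->
<copyField source="name" dest="name_sort"/>
```

Queries would then sort with sort=name_sort asc instead of sorting on name. A change like this only takes effect for documents indexed after the schema change, so the existing records would need to be reindexed.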
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Steve Conover <scono...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Friday, March 27, 2009 11:59:52 PM
> Subject: Re: optimization advice?
>
> We sort by default on "name", which varies quite a bit (we're never
> going to make sorting by field go away).
>
> The thing is solr has been pretty amazing across 1 million records.
> Now that we've doubled the size of the dataset things are definitely
> slower in a nonlinear way...I'm wondering what factors are involved
> here.
>
> -Steve
>
> On Fri, Mar 27, 2009 at 6:58 PM, Otis Gospodnetic wrote:
> >
> > OK, we are a step closer. Sorting makes things slower. What field(s) do
> > you sort on, what are their types, and if there is a date in there, are
> > the dates very granular, and if they are, do you really need them to be
> > that precise?
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> >> From: Steve Conover
> >> To: solr-user@lucene.apache.org
> >> Sent: Friday, March 27, 2009 1:51:14 PM
> >> Subject: Re: optimization advice?
> >>
> >> > Steve,
> >> >
> >> > Maybe you can tell us about:
> >>
> >> sure
> >>
> >> > - your hardware
> >>
> >> 2.5GB RAM, pretty modern virtual servers
> >>
> >> > - query rate
> >>
> >> Let's say a few queries per second max... < 4
> >>
> >> And in general the challenge is to get latency on any given query down
> >> to something very low - we don't have to worry about a huge amount of
> >> load at the moment.
> >>
> >> > - document cache and query cache settings
> >>
> >>
> >> class="solr.LRUCache"
> >> size="512"
> >> initialSize="512"
> >> autowarmCount="256"/>
> >>
> >>
> >> class="solr.LRUCache"
> >> size="512"
> >> initialSize="512"
> >> autowarmCount="0"/>
> >>
> >> > - your current response times
> >>
> >> This depends on the query. For queries that involve a total record
> >> count of < 1 million, we often see < 10ms response times, up to
> >> 4-500ms in the worst case. When we do a page one, sorted query on our
> >> full record set of 2 million+ records, response times can get up into
> >> 2+ seconds.
> >>
> >> > - any pain points, any slow query patterns
> >>
> >> Something that can't be emphasized enough is that we can't predict
> >> what records people will want. Almost every query is aimed at a
> >> different set of records.
> >>
> >> -Steve