Re: field queries seem slow

Lance Norskog Thu, 05 Nov 2009 16:32:17 -0800

Restarting Solr clears out all caching.

Doing a commit used to drop all of the caches for new requests, but it
no longer does this.

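Since a restart empties Solr's caches, a rough way to separate cold-start cost from steady-state cost is to time the same request several times right after a restart. Below is a minimal Python sketch. `time_repeated` and `run_query` are hypothetical names that do not come from this thread; `run_query` stands for whatever issues the request, for example a `urllib.request.urlopen(url).read()` call against your own Solr URL.

```python
import time


def time_repeated(run_query, repeats=3):
    """Call run_query() `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # hypothetical callable that sends one request and reads the response
        durations.append(time.perf_counter() - start)
    return durations
```

If only the first duration is large, the cost most likely comes from cold caches rather than from the query itself.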

On Linux you can clear the kernel's disk buffer cache with a special
hook. You echo '1' into a /proc/something and this tells the kernel to
drop its caches. Sorry, don't remember the exact command.
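
The hook in question is the procfs file /proc/sys/vm/drop_caches. Below is a minimal Python sketch, assuming root privileges and a Linux kernel that exposes this file. The function name and the `path` parameter are illustrative and do not come from this thread.

```python
import os

# Linux procfs control file for dropping kernel caches (writing to it needs root).
DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"


def drop_kernel_caches(mode=1, path=DROP_CACHES_PATH):
    """Ask the kernel to drop clean caches.

    mode 1 = page cache, 2 = dentries and inodes, 3 = both.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    os.sync()  # flush dirty pages first; only clean pages can be dropped
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Calling `drop_kernel_caches(1)` as root before a benchmark run should leave the index files uncached by the OS, which approximates the "index displaced from the OS buffers" case.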

On Thu, Nov 5, 2009 at 10:09 AM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> There is no way that I know of to clear Solr's caches (query, document,
> filter caches).
> FieldCache is a Lucene thing, and it's also something you can't clear, as
> far as I know.
>
> Slowness on start could be due to:
>
>  * OS has not cached the index yet (would be the case if your Solr was down
>    for a while and its index got displaced from the OS buffers)
>  * sort query run for the first time, FieldCache not populated yet
>  * expensive query run for the first time, its results and hits not cached
>    in Solr caches
>
>  * ...
>
> Otis
>
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
>> From: mike anderson <saidthero...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thu, November 5, 2009 11:34:59 AM
>> Subject: Re: field queries seem slow
>>
>> On production our servers are restarted very rarely (once a month). But this
>> raises a question: what does it take to clear the cache? On my benchmarking
>> platform I've been simply restarting the server as a method of starting
>> fresh. Is there a cache file I could delete to make sure I'm getting
>> unbiased results? Second of all, is there an internal cache for sort fields
>> separate from the cache for queries and filters which has settings found in
>> the solrconfig.xml file?
>>
>> I did a test as you suggested to determine whether that type of query is
>> always slow or slow only at startup; it seems to be slow only at startup.
>> However, it is slow at startup both with and without sorting.
>> (I'm still trying to figure out how to do good benchmarking with one
>> independent variable, so it's possible that this result is inconsistent)
>>
>> For reference, my query looks like this (with or without the sort field):
>>
>> http://10.0.20.174:8986/solr/select?mlt=false&rows=10&shards=localhost:8986/solr,localhost:8986/solr,localhost:8986/solr&q=abbrev_authors%3A%22Gallinger+S%22
>>
>> I like the suggestion on date resolution. We definitely don't need
>> second-level accuracy (which is what we have now), and in fact I think we'll
>> just start stamping documents with year/week and then sort by that.
>>
>>
>> thanks for all your help!
>>
>> Cheers,
>> Mike
>>
>>
>>
>> On Wed, Nov 4, 2009 at 2:07 PM, Erick Erickson wrote:
>>
>> > By readers, I meant your searchers. Perhaps you were shutting
>> > down your servers?
>> >
>> > The warming isn't to pre-load authors; it's to pre-populate caches,
>> > particularly for sort fields. There is considerable
>> > overhead in loading the sort field the first time you sort by it. So,
>> > my question was really based on the chance that "over the
>> > weekend" corresponded to "the first queries after the server
>> > restarted", or "the first query after the underlying index searchers
>> > were (re)opened".
>> >
>> > The real question comes down to whether the same form of query
>> > (i.e. searching for different values on the same fields with the
>> > same kind of sort) is slow all the time or just when things start up.
>> >
>> > How fine is the resolution for your dates? Assuming that the sorting
>> > is the issue, if you are storing dates in the millisecond range, that's
>> > probably 20M dates that have to be loaded to sort. You might
>> > want to think about a coarser resolution if this has any relevance.
>> >
>> > HTH
>> > Erick
>> >
>> > On Wed, Nov 4, 2009 at 1:54 PM, mike anderson wrote:
>> >
>> > > Erick, we are doing a sort by date first, and then by score. I'm not sure
>> > > what you mean by readers.
>> > >
>> > > Since we have nearly 6M authors attached to our 20M documents I'm not
>> > > sure that autowarming would help that much (especially since we have
>> > > very little overlap in what users are searching for). But maybe it would?
>> > >
>> > > Lance, I was just being a bit lazy. thanks though.
>> > >
>> > > -mike
>> > >
>> > >
>> > > On Mon, Nov 2, 2009 at 10:27 PM, Lance Norskog wrote:
>> > >
>> > > > This searches author:albert and (default text field): einstein. This
>> > > > may not be what you expect?
>> > > >
>> > > > On Mon, Nov 2, 2009 at 2:30 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> > > > > Hmmmm, are you sorting? And have your readers been reopened? Is the
>> > > > > second query of that sort also slow? If the answer to this last
>> > > > > question is "no", have you tried some autowarming queries?
>> > > > >
>> > > > > Best
>> > > > > Erick
>> > > > >
>> > > > > On Mon, Nov 2, 2009 at 4:34 PM, mike anderson <saidthero...@gmail.com> wrote:
>> > > > >
>> > > > >> I took a look through my Solr logs this weekend and noticed that the
>> > > > >> longest queries were on particular fields, like
>> > > > >> "author:albert einstein". Is this a result consistent with other
>> > > > >> setups out there? If not, is there a trick to make these go faster?
>> > > > >> I've read up on filter queries and use those when applicable, but
>> > > > >> they don't really solve all my problems.
>> > > > >>
>> > > > >> If anybody wants to take a shot at it but needs to see my solrconfig,
>> > > > >> etc., just let me know.
>> > > > >>
>> > > > >> Cheers,
>> > > > >> Mike
>> > > > >>
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Lance Norskog
>> > > > goks...@gmail.com
>> > > >
>> > >
>> >
>
>

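On the date-resolution point in the quoted thread: the fewer distinct values a sort field has, the less there is to load the first time it is sorted on. Below is a minimal Python sketch of the year/week stamping idea. The function name `year_week_key` and the integer key format are illustrative assumptions and do not come from this thread.

```python
from datetime import datetime


def year_week_key(ts):
    """Map a datetime to an integer sort key: ISO year * 100 + ISO week number."""
    # Note: the ISO year can differ from the calendar year for dates near January 1.
    iso_year, iso_week, _weekday = ts.isocalendar()
    return iso_year * 100 + iso_week


# Four second-resolution timestamps (four distinct values at full resolution).
stamps = [
    datetime(2009, 11, 2, 22, 27, 5),
    datetime(2009, 11, 4, 13, 54, 41),
    datetime(2009, 11, 5, 16, 32, 17),
    datetime(2009, 11, 12, 9, 0, 0),
]
keys = [year_week_key(t) for t in stamps]
distinct_keys = sorted(set(keys))  # far fewer distinct values than raw timestamps
```

Documents stamped in the same week share one key, so ordering within a week would have to come from a secondary sort such as score.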


-- 
Lance Norskog
goks...@gmail.com
