We only replicate twice an hour, so we are far from real-time indexing.
Our application never writes to the master; we just pick up all
changes using updated_at timestamps when delta-importing with DIH.
We don't have any warming queries in firstSearcher/newSearcher event
listeners. My initial post was asking how I would go about doing this
with a large number of queries. Our queries themselves tend to have a
lot of faceting and other restrictions on them, so I would rather not
list them all out in XML. I was hoping there was some sort of log
replayer handler or class that would replay a bunch of queries while the
node is offline. When it's done, it would bring the node back online, ready
to serve requests.
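
For illustration only, here is a minimal sketch of what such a replayer
could look like as an external script. It is not an existing Solr handler.
It assumes a hypothetical text file with one raw query string per line
(for example q=foo bar&fq=type:book&facet=true), a node reachable at a base
URL such as http://localhost:8983/solr, and the standard /select handler.
The function names are made up for this example.

```python
from urllib.parse import parse_qsl, urlencode
from urllib.request import urlopen


def load_queries(lines):
    """Return stripped query strings, skipping blank lines and # comments."""
    queries = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            queries.append(line)
    return queries


def build_url(base_url, query_string):
    """Build a /select URL, re-encoding the parameters so spaces etc. are escaped."""
    params = parse_qsl(query_string, keep_blank_values=True)
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def http_get(url):
    """Fetch the URL and discard the body; only the side effect (warming) matters."""
    with urlopen(url, timeout=30) as response:
        response.read()


def replay(base_url, queries, fetch=http_get):
    """Send every query to the node and return the number of failed requests."""
    failures = 0
    for query_string in queries:
        try:
            fetch(build_url(base_url, query_string))
        except OSError:  # urllib's URLError/HTTPError are subclasses of OSError
            failures += 1
    return failures
```

A hook could then call something like
replay("http://localhost:8983/solr", load_queries(open("queries.txt")))
while the node is still out of the cluster; both the URL and the file name
are placeholders.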
On 12/8/10 6:15 AM, Jonathan Rochkind wrote:
How often do you replicate? Do you know how long your warming queries take to
complete?
As others in this thread have mentioned, if your replications (or ordinary
commits, if you weren't using replication) happen quicker than warming takes to
complete, you can get overlapping indexes being warmed up, and run out of RAM
(causing garbage collection to take lots of CPU, if not an out-of-memory
error), or otherwise block on CPU with lots of new indexes being warmed at once.
Solr is not very good at providing 'real time indexing' for this reason,
although I believe there are some features in post-1.4 trunk meant to support
'near real time search' better.
________________________________________
From: Mark [static.void....@gmail.com]
Sent: Tuesday, December 07, 2010 10:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Warming searchers/Caching
Maybe I should explain my problem a little more in detail.
The problem we are experiencing is that after a delta-import we notice an
extremely high load time on the slave machines that just replicated. It
goes away after a minute or so of production traffic, once everything is cached.
I already have before/after hooks in place that run before and after
replication. The before hook removes the slave from the
cluster and then starts to replicate. When it's done, it calls the after
hook, and I would like to warm up the cache in this method so no users
experience extremely long wait times.
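
The ordering described above can be sketched roughly as follows. The four
callables are hypothetical stand-ins for whatever the real hooks do, and
putting the slave back into the cluster even when replication or warming
fails is an assumption of this sketch, not something stated in the thread.

```python
def replicate_with_warmup(remove_from_cluster, replicate, warm, add_to_cluster):
    """Take the slave out of rotation, replicate, warm, then put it back.

    All four arguments are hypothetical callables supplied by the caller.
    """
    remove_from_cluster()      # "before" hook: stop sending traffic to this slave
    try:
        replicate()            # pull the new index from the master
        warm()                 # run warming queries while no users are waiting
    finally:
        add_to_cluster()       # assumption: rejoin even if a step above failed
```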
On 12/7/10 4:22 PM, Markus Jelsma wrote:
XInclude works fine, but that's not what you're looking for, I guess. Having the
top 100 queries is overkill anyway, and it can take too long for a new searcher
to warm up.
Depending on the type of requests, I usually limit warming to popular
filter queries only, as they generate a very high hit ratio and make caching
useful [1].
If there are very popular user-entered queries with a high initial latency,
I'd have them warmed up as well.
[1]: http://wiki.apache.org/solr/SolrCaching#Tradeoffs
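
One way to find the popular filter queries is to count fq parameters in a
sample of past requests. The sketch below assumes the requests are already
available as raw query strings (one per entry); how they are extracted from
the logs is left out, and the function name is made up for this example.

```python
from collections import Counter
from urllib.parse import parse_qs


def top_filter_queries(query_strings, limit=10):
    """Return the most frequently used fq values, most common first."""
    counts = Counter()
    for query_string in query_strings:
        # A request may carry several fq parameters; count each one.
        for fq in parse_qs(query_string).get("fq", []):
            counts[fq] += 1
    return [fq for fq, _ in counts.most_common(limit)]
```

The resulting short list is the kind of thing that could reasonably go into
warming queries, instead of the full set of 100+ user queries.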
Warning: I haven't used this personally, but XInclude looks like what
you're after, see: http://wiki.apache.org/solr/SolrConfigXml#XInclude
Best
Erick
On Tue, Dec 7, 2010 at 6:33 PM, Mark <static.void....@gmail.com> wrote:
Is there any plugin or easy way to auto-warm/cache a new searcher with a
bunch of searches read from a file? I know this can be accomplished using
the EventListeners (newSearcher, firstSearcher), but I would rather not add 100+
queries to my solrconfig.xml.
If there is no hook/listener available, is there some sort of Handler
that performs this sort of function? Thanks!