Hi Tim,

Because of our performance needs, we optimize the index early in the morning
and then run the cache-warming queries once the optimized index is mounted
on our servers.  If you are indexing and serving from the same Solr
instance, you shouldn't have to re-run the cache-warming queries when you
add documents; I believe the disk writes caused by adding documents to the
index should put that data in the OS cache.

Actually, 1600 queries is not a lot.  If you are using actual user queries
from your logs, you may need more.  We used some tools based on Luke to
analyze our index and determine which words would benefit most from being in
the OS cache (assuming users enter phrase queries containing those words).
You can find out how many queries you need to fill memory by emptying the OS
cache, then sending queries while watching memory usage with top.
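
A rough sketch of that experiment in Python, in case it helps.  It assumes
Linux (where root can empty the page cache with
"sync; echo 3 > /proc/sys/vm/drop_caches") and reads the "Cached" line of
/proc/meminfo instead of watching top.  The function names, the base URL you
pass in, and the report interval are made up for illustration; this is not
the script we actually use.

```python
import urllib.parse
import urllib.request


def cached_kb(meminfo_text):
    """Return the 'Cached' value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        # "SwapCached:" does not start with "Cached:", so it is skipped.
        if line.startswith("Cached:"):
            return int(line.split()[1])
    raise ValueError("no 'Cached:' line found")


def phrase_query_url(base_url, phrase):
    """Build a Solr select URL for an exact-phrase query."""
    # rows=0: we do not need the documents themselves back.
    params = urllib.parse.urlencode({"q": '"%s"' % phrase, "rows": 0})
    return base_url + "?" + params


def run_experiment(base_url, phrases, report_every=100):
    """Send each phrase query and periodically print the page cache size."""
    for i, phrase in enumerate(phrases, 1):
        urllib.request.urlopen(phrase_query_url(base_url, phrase)).read()
        if i % report_every == 0:
            with open("/proc/meminfo") as f:
                print(i, "queries,", cached_kb(f.read()), "kB cached")
```

Empty the OS cache first, then call run_experiment() with the select URL of
your own instance and your list of phrases.  When the cached figure stops
growing, you have roughly as many queries as you need.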

Your options (assuming performance with your current hardware does not meet
your needs) are using SSDs, adding memory to the machine, or splitting the
index using Solr shards.  If you either add memory or split the index, you
will still have to run cache-warming queries.
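
To make the shard option concrete, here is a small sketch, again in Python.
Solr's distributed search takes a "shards" request parameter listing the
shard addresses as host:port/path, separated by commas.  The host names
below are placeholders and the helper functions are only illustrative.

```python
import urllib.parse


def distributed_query_url(front_url, shard_addrs, query):
    """Build a select URL that fans one query out over several shards."""
    params = urllib.parse.urlencode(
        {"q": query, "shards": ",".join(shard_addrs)}
    )
    return front_url + "?" + params


def warming_urls(shard_addrs, queries):
    """One URL per (shard, query) pair, so each shard gets warmed directly."""
    return [
        "http://%s/select?%s" % (addr, urllib.parse.urlencode({"q": q}))
        for addr in shard_addrs
        for q in queries
    ]


# Placeholder shard addresses, in the host:port/path form Solr expects.
shards = ["solr1:8983/solr", "solr2:8983/solr"]
print(distributed_query_url("http://solr1:8983/solr/select", shards, '"of the"'))
```

Each shard has its own OS cache, which is why the warming URLs above go to
every shard separately.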

One other thing you might consider is using stop words or CommonGrams to
reduce the disk I/O requirements for phrase queries containing common words.
(Our experiments with CommonGrams and cache warming are described in our
blog: http://www.hathitrust.org/blogs/large-scale-search )
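
For anyone unfamiliar with CommonGrams, the idea is roughly this: at index
time, whenever a word from a list of common words sits next to another word,
the pair is also indexed as a single bigram token.  A phrase query containing
a common word can then match on the bigram, whose postings are far shorter
than the common word's own.  The Python below is only a simplified
illustration of that idea, not Solr's implementation; it ignores token
positions, and the word list is made up.

```python
def common_grams(tokens, common_words):
    """Emit every token, plus a joined bigram for each adjacent pair
    in which at least one of the two tokens is a common word."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common_words or nxt in common_words:
                out.append(tok + "_" + nxt)
    return out


# Made-up common-word list, for illustration only.
print(common_grams(["the", "rain", "in", "spain"], {"the", "in"}))
# prints ['the', 'the_rain', 'rain', 'rain_in', 'in', 'in_spain', 'spain']
```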

Tom




Hi Tom,

1600 warming queries, that's quite a lot. Do you run them every time a
document is added to the index? Do you have any tips on warming?

If the index is larger than what fits in RAM, do you recommend splitting
it across several servers so that it can all be in RAM?

I do expect phrase queries. Total index size is 107 GB. The *prx files
total 65 GB and the *frq files 38 GB. It's probably worth buying more RAM.

/Tim


-- 
View this message in context: 
http://old.nabble.com/persistent-cache-tp27562126p27598026.html
Sent from the Solr - User mailing list archive at Nabble.com.
