Hi Tim,

We generally run about 1600 cache-warming queries to warm up the OS disk
cache and the Solr caches when we mount a new index.
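
As a rough illustration of what a warming run can look like, here is a
minimal sketch that sends a list of queries to Solr's /select handler and
throws away the responses. The base URL, the rows value, and the function
names are assumptions made up for this example, not our actual setup.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_warm_urls(base_url, queries, rows=10):
    """Return one /select URL per warming query (base_url is hypothetical)."""
    urls = []
    for q in queries:
        params = urlencode({"q": q, "rows": rows})
        urls.append(f"{base_url}/select?{params}")
    return urls


def warm(urls, fetch=urlopen):
    """Request each URL, read and discard the body, return how many were sent."""
    sent = 0
    for url in urls:
        with fetch(url) as response:
            response.read()  # reading forces Solr to finish the query
        sent += 1
    return sent


if __name__ == "__main__":
    # Hypothetical host; in practice the queries would come from a file.
    urls = build_warm_urls("http://localhost:8983/solr", ["digital library"])
    print(urls[0])
```

In practice the query list would be read from a file, and the base URL would
include the core name if you run more than one core.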

Do you have, or expect, phrase queries? If you don't, then you don't need
to get any position information into your OS disk cache. Our position
information (*prx files) takes about 85% of the total index size. So with a
100GB index, your *frq files might be only 15-20GB, and you could probably
get more than half of that into 16GB of memory.
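
The arithmetic behind that estimate can be sketched as follows. The 85%
figure is from our index and may differ for yours; the function names are
made up for this example, and the calculation assumes all of the stated
memory is available for the disk cache, which is optimistic since the JVM
and the OS need some of it.

```python
def non_position_size_gb(index_gb, prx_fraction):
    """Size of everything except position data (*prx), in GB."""
    return index_gb * (1.0 - prx_fraction)


def fraction_cacheable(data_gb, memory_gb):
    """Share of the data that fits in memory, capped at 1.0."""
    return min(1.0, memory_gb / data_gb)


if __name__ == "__main__":
    # 100GB index with 85% position data leaves roughly 15GB of other files.
    remaining = non_position_size_gb(100, 0.85)
    print(round(remaining, 1))
    # Upper end of the 15-20GB range against 16GB of memory.
    print(fraction_cacheable(20, 16))
```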

If you have limited memory and a large index, then you need to choose
cache-warming queries carefully: once the cache is full, further queries
will start evicting older data from it. The tradeoff is between populating
the cache with the data that would require the most disk access if it were
not cached, and populating it based on your best guess of what queries your
users will execute. A good overview of the issues is the paper by
Baeza-Yates et al., "The Impact of Caching on Search Engines"
( http://doi.acm.org/10.1145/1277741.1277775 ).
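
One way to think about that tradeoff is as a budget problem. The sketch
below is a hypothetical heuristic, not something from the paper or from our
setup: it ranks candidate queries by expected disk time saved per MB of
cache used (expected frequency times cold latency, divided by footprint) and
picks greedily until the memory budget is spent. All the numbers a real run
would need (cold latency, footprint, frequency) are assumed to be estimates
you supply yourself.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    query: str
    cold_ms: float    # estimated latency with nothing cached
    cache_mb: float   # estimated disk-cache footprint of the query
    frequency: float  # expected share of user traffic


def pick_warming_queries(candidates, budget_mb):
    """Greedily pick queries with the best disk time saved per MB of cache."""
    ranked = sorted(
        candidates,
        key=lambda c: c.frequency * c.cold_ms / c.cache_mb,
        reverse=True,
    )
    chosen, used = [], 0.0
    for c in ranked:
        # Skip anything that would overflow the budget and evict earlier picks.
        if used + c.cache_mb <= budget_mb:
            chosen.append(c.query)
            used += c.cache_mb
    return chosen


if __name__ == "__main__":
    # Made-up estimates, purely for illustration.
    candidates = [
        Candidate("the", cold_ms=900, cache_mb=400, frequency=0.05),
        Candidate("solr cache", cold_ms=120, cache_mb=20, frequency=0.02),
        Candidate("rare term", cold_ms=30, cache_mb=5, frequency=0.001),
    ]
    print(pick_warming_queries(candidates, budget_mb=410))
```

Setting every frequency to the same value ranks purely by disk cost, and
setting every cold latency to the same value ranks purely by expected
traffic, so the two ends of the tradeoff are both special cases.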


Tom Burton-West
Digital Library Production Service
University of Michigan Library
