On 7-Jun-07, at 1:04 AM, Manoharam Reddy wrote:

Some musing:-
(I have used Nutch before and one thing I observed there was that if I
delete the crawl folder when Nutch is running, users can still search
and obtain proper results. It seems Nutch caches all the indexes in
memory when it starts. I don't understand how that is feasible
when the size of the crawl is on the order of 10 GB, whereas you have
a RAM + swap of only a few GB.)

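This is also true for Solr, because it is an OS feature rather than anything Nutch does: on Unix-like systems, deleting a file that a process still has open only removes its directory entry. The data stays on disk and remains readable through the open handle until the last process closes it (check the disk usage stats: the space is not freed). So the index does not have to fit in memory for searches to keep working.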
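You can see the effect for yourself with a few lines of Python. This is only a sketch of the OS behaviour on a Unix-like system, not anything taken from Nutch or Solr; the function name read_after_unlink is made up for illustration. (On Windows the delete would normally fail instead, because the file is still open.)

```python
import os
import tempfile


def read_after_unlink(payload=b"index data"):
    """Open a file, delete it, then read it through the still-open handle.

    Hypothetical helper for illustration; assumes a Unix-like OS.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.close(fd)
        with open(path, "rb") as handle:
            os.remove(path)                      # removes the directory entry only
            still_listed = os.path.exists(path)  # the name is gone from the directory
            data = handle.read()                 # the open handle still reaches the data
        return still_listed, data
    finally:
        # Clean up if the delete above did not happen (e.g. an error was raised).
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    listed, data = read_after_unlink()
    print("still listed:", listed)
    print("bytes read after delete:", len(data))
```

The disk space is only given back once the handle is closed, which is why the disk usage stats do not change while the search process is still running.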

How is Solr caching better than this?

The two are unrelated. Solr can cache certain reusable components of queries (namely, filters), and it provides a fully customizable schema and arbitrary query execution against it.
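To give a rough idea of what caching a filter means, here is a toy sketch in Python. It is not Solr's implementation or API; the class FilterCache, its methods, and the sample documents are all hypothetical. The point is only that the set of documents matching a filter is computed once and reused by later queries that apply the same filter.

```python
class FilterCache:
    """Toy filter cache: remembers which doc ids match a (field, value) filter.

    Hypothetical illustration only; not how Solr is implemented.
    """

    def __init__(self, docs):
        self.docs = docs      # doc_id -> dict of fields
        self.cache = {}       # (field, value) -> frozenset of doc ids
        self.misses = 0       # how many times a filter had to be computed

    def matching_ids(self, field, value):
        key = (field, value)
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = frozenset(
                doc_id
                for doc_id, fields in self.docs.items()
                if fields.get(field) == value
            )
        return self.cache[key]

    def search(self, term, filters):
        # The main query is evaluated every time ...
        ids = {
            doc_id
            for doc_id, fields in self.docs.items()
            if term in fields.get("text", "")
        }
        # ... but each filter's doc set comes from the cache when possible.
        for field, value in filters:
            ids &= self.matching_ids(field, value)
        return sorted(ids)


if __name__ == "__main__":
    docs = {
        1: {"text": "solr caching", "lang": "en"},
        2: {"text": "nutch crawl", "lang": "en"},
        3: {"text": "solr schema", "lang": "de"},
    }
    fc = FilterCache(docs)
    print(fc.search("solr", [("lang", "en")]))
    print(fc.search("nutch", [("lang", "en")]))
    print("filters computed:", fc.misses)
```

Two different queries share the same lang filter here, so the filter is evaluated only once.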

-Mike
