Re: how to crawl when Solr is search engine?

Mike Klaas Thu, 07 Jun 2007 11:06:36 -0700

On 7-Jun-07, at 1:04 AM, Manoharam Reddy wrote:

Some musing:-
(I have used Nutch before and one thing I observed there was that if I
delete the crawl folder when Nutch is running, users can still search
and obtain proper results. It seems Nutch caches all the indexes in
the memory when it starts. I don't understand how is that feasible
when the size of the crawl is in the order of 10 GBs whereas you have
a RAM + swap of only a few GBs.)

This is true also for Solr, because it is an OS feature: if you delete a file that is open by certain processes, it isn't really deleted at all (check disk usage stats).
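
To see the behaviour for yourself, here is a minimal sketch, assuming a POSIX system such as Linux or macOS (on Windows, removing an open file normally fails instead). It has nothing to do with Solr or Nutch internals, and the function name read_after_unlink is made up for illustration.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, delete the file while it is still
    open for reading, and return what the open handle can still read.

    Hypothetical helper for illustration; assumes POSIX unlink semantics.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as writer:
        writer.write(payload)

    reader = open(path, "rb")  # stands in for the running search process
    try:
        os.remove(path)  # removes the directory entry only
        assert not os.path.exists(path)  # the name is gone...
        return reader.read()  # ...but the data is still readable
    finally:
        reader.close()  # only now can the OS reclaim the disk blocks


if __name__ == "__main__":
    print(read_after_unlink(b"pretend this is an index segment"))
```

The disk space is only given back once the last open handle is closed, which is why the disk usage stats do not drop right after the delete.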

How is Solr caching better than this?

It is unrelated. Solr can cache certain reusable components of queries (namely, filters), and provides for fully-customizable schema and arbitrary query execution on it.
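
As a rough illustration of what caching a filter means, here is a toy sketch. It is not Solr's implementation or API: the class ToyFilterCache, its methods, and the sample documents are all invented for this example. The idea is only that the set of documents matching a filter is computed once and then reused by later queries that apply the same filter.

```python
from typing import Dict, FrozenSet, List

Doc = Dict[str, str]


class ToyFilterCache:
    """Toy model of filter caching; all names here are hypothetical."""

    def __init__(self, docs: List[Doc]) -> None:
        self.docs = docs
        self.cache: Dict[str, FrozenSet[int]] = {}
        self.misses = 0  # how many times a filter had to be computed

    def doc_ids(self, field: str, value: str) -> FrozenSet[int]:
        """Return ids of docs where field == value, computing them once."""
        key = f"{field}:{value}"
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = frozenset(
                i for i, doc in enumerate(self.docs) if doc.get(field) == value
            )
        return self.cache[key]

    def search(self, word: str, field: str, value: str) -> List[int]:
        """Match word in the text of docs that pass the cached filter."""
        allowed = self.doc_ids(field, value)
        return [i for i in sorted(allowed) if word in self.docs[i].get("text", "")]


if __name__ == "__main__":
    index = ToyFilterCache([
        {"type": "pdf", "text": "solr caching"},
        {"type": "html", "text": "nutch crawling"},
        {"type": "pdf", "text": "nutch and solr"},
    ])
    print(index.search("solr", "type", "pdf"))
    print(index.search("nutch", "type", "pdf"))
    print(index.misses)  # the type:pdf filter was computed only once
```

The two searches use different query words but the same filter, so the second one reuses the cached document set instead of scanning again.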


-Mike
