Re: Solr Stale pages

2018-08-30 Thread Cassandra Targett
As Jan pointed out, unless your client sends Solr specific instructions for what to do with those documents, Solr doesn't do anything. In your example, Nutch crawls 30 documents at first, and 30 documents are sent to Solr and added to the index. On the next crawl, it finds 27 documents, and 2
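
To illustrate the point above, here is a minimal sketch of what the client side might do: compare the IDs already in the index with the IDs seen in the latest crawl, and build a delete-by-ID request body in Solr's JSON update format. The helper names (find_stale_ids, build_delete_payload) and the abc.com page URLs are hypothetical, made up for this example; it also assumes the document ID is the page URL, which may not match your schema.

```python
import json


def find_stale_ids(indexed_ids, crawled_ids):
    """Return IDs present in the index but absent from the latest crawl."""
    return sorted(set(indexed_ids) - set(crawled_ids))


def build_delete_payload(stale_ids):
    """Build a Solr JSON update body that deletes documents by ID."""
    return json.dumps({"delete": list(stale_ids)})


# Hypothetical data: 30 pages indexed earlier, 27 found by the latest crawl.
indexed = [f"http://abc.com/page{i}" for i in range(30)]
crawled = indexed[:27]

stale = find_stale_ids(indexed, crawled)
payload = build_delete_payload(stale)

print(len(stale), "stale documents")
print(payload)
```

The resulting body would then be POSTed by the client to the collection's /update handler (with a commit) for the deletions to take effect. How the list of indexed IDs is obtained, and whether the crawler can issue such deletes itself, depends on your setup.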

Re: Solr Stale pages

2018-08-30 Thread kunhu0...@gmail.com
Thanks for the update. I'm using Nutch 1.14, Solr 6.6.3, and ZooKeeper 3.4.12. We are running two Solr instances configured as SolrCloud. Please let me know if anything is missing. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr Stale pages

2018-08-30 Thread Jan Høydahl
Hi, please give us more context. You can start by telling us which crawler you are using and more about your architecture. It is NOT Solr's responsibility to add/delete documents on its own. It is the client (crawler) that has to know when a document is stale or gone from the source, and then

Solr Stale pages

2018-08-30 Thread kunhu0...@gmail.com
Hello All, I would like to know how Solr handles stale pages. For example, there are 30 documents indexed for the domain abc.com, and in the second collection I have only 27 documents for the same abc.com domain, which need to be indexed in Solr. So how will Solr handle the old pages alr