Re: Seeking a simple way to test my index.

2018-09-19 Thread Erick Erickson
will do so. thank you.
>
> Chip
>
> From: Alexandre Rafalovitch
> Sent: Wednesday, September 19, 2018 2:05:41 PM
> To: solr-user
> Subject: Re: Seeking a simple way to test my index.
>
> Have you looked at Apache Nutch? Seems like the direct match for your
> - g

Re: Seeking a simple way to test my index.

2018-09-19 Thread Chip Calhoun
I do use Nutch as my crawler, but just as my crawler, so I hadn't thought to look for an answer there. I will do so. Thank you.

Chip

From: Alexandre Rafalovitch
Sent: Wednesday, September 19, 2018 2:05:41 PM
To: solr-user
Subject: Re: Seeking a simple w

Re: Seeking a simple way to test my index.

2018-09-19 Thread Alexandre Rafalovitch
Have you looked at Apache Nutch? It seems like a direct match for your (growing) requirements, and it does integrate with Solr. Or try one of the other solutions, like http://stormcrawler.net/ or http://www.norconex.com/collectors/

Otherwise, this does not really feel like a Solr question.

Regards, A

Seeking a simple way to test my index.

2018-09-19 Thread Chip Calhoun
I've got a Solr instance which crawls roughly 3,500 seed pages, to a depth of 1, at 240 institutions, all but one of which I don't control. I recrawl once a month or so. Naturally, if one of the sites I crawl changes, I need to know so that I can update my seed URLs. I've been checking this by hand, which was
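
One possible way to automate the by-hand check described above is a small script that fetches each seed URL and flags the ones that fail or redirect elsewhere. This is only a sketch under assumptions that are not in the thread: that the seeds live in a plain-text file with one URL per line, and that a changed site shows up as an HTTP error or a redirect. The names `classify`, `check_seed`, `report`, and the `seed-check/0.1` user agent are all hypothetical. It uses only the Python standard library.

```python
import urllib.error
import urllib.request


def classify(seed_url, status, final_url):
    """Label one seed URL from its HTTP status and post-redirect URL."""
    if status is None:
        return "unreachable"          # DNS failure, timeout, refused connection
    if status >= 400:
        return "broken"               # 404, 410, 500, ...
    if final_url.rstrip("/") != seed_url.rstrip("/"):
        return "moved"                # redirects were followed to a different URL
    return "ok"


def check_seed(seed_url, timeout=10):
    """Fetch one seed URL and return (label, status, final_url)."""
    request = urllib.request.Request(
        seed_url, headers={"User-Agent": "seed-check/0.1"}  # hypothetical UA string
    )
    try:
        # urlopen follows redirects; geturl() reports where we ended up.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status, final_url = response.status, response.geturl()
    except urllib.error.HTTPError as err:
        status, final_url = err.code, err.geturl()
    except (urllib.error.URLError, OSError):
        status, final_url = None, seed_url
    return classify(seed_url, status, final_url), status, final_url


def report(seed_file):
    """Print every seed that is not 'ok'. Assumes one URL per line; '#' starts a comment."""
    with open(seed_file, encoding="utf-8") as handle:
        seeds = [line.split()[0] for line in handle
                 if line.strip() and not line.lstrip().startswith("#")]
    for seed in seeds:
        label, status, final_url = check_seed(seed)
        if label != "ok":
            print(f"{label}\t{status}\t{seed}\t{final_url}")
```

Calling `report("seed.txt")` (a hypothetical file name) before each monthly recrawl would print only the seeds that need a look. A page that still returns 200 at the same address but has had its content replaced would not be caught by this check.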