Some months ago I tested YaCy; it works pretty well.
http://yacy.net/en/
You can install it stand-alone and set up your own crawler (single or
cluster).
It has a very nice admin and control interface.
After installation, disable the internal database and enable the feed to Solr;
that's it.
Regards,
Maybe you can take a look at Crawl-Anywhere, which has an administration
web interface, a Solr indexer, and a search web application.
www.crawl-anywhere.com
Regards.
Dominique
On 05/09/12 17:05, Lochschmied, Alexander wrote:
This may be a bit off topic: How do you index an existing website and c
-Original message-
> From:Lochschmied, Alexander
> Sent: Thu 06-Sep-2012 16:04
> To: solr-user@lucene.apache.org
> Subject: AW: Website (crawler for) indexing
>
> Thanks Rafał and Markus for your comments.
>
> I think Droids has a serious problem with URL parameters in the current version
Hello!
You can implement your own crawler using Droids
(http://incubator.apache.org/droids/) or use Apache Nutch
(http://nutch.apache.org/), which is very easy to integrate with Solr
and is a very powerful crawler.
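
To give a rough idea of what "integrating with Solr" comes down to if you write
the crawler yourself, here is a minimal Python sketch (not how Droids or Nutch
do it internally). It turns one fetched HTML page into a Solr document and
builds the JSON update request. The field names (id, title, content) and the
update URL are assumptions; they depend on your schema, your core name, and
your Solr version.

```python
import json
import urllib.request
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Collects the <title> and the visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip and data.strip():
            self.chunks.append(data.strip())


def page_to_doc(url, html):
    """Map a crawled page to a Solr document (field names are assumed)."""
    parser = PageText()
    parser.feed(html)
    parser.close()
    return {
        "id": url,  # the URL doubles as the unique key
        "title": parser.title.strip(),
        "content": " ".join(parser.chunks),
    }


def build_update_request(solr_update_url, docs):
    """Build (but do not send) a JSON POST for Solr's update handler."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        solr_update_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually index, pass the request to urllib.request.urlopen(), with an update
URL such as http://localhost:8983/solr/update?commit=true (again, only an
example; adjust it to your installation).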
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch -
Please take a look at the Apache Nutch project.
http://nutch.apache.org/
-Original message-
> From:Lochschmied, Alexander
> Sent: Wed 05-Sep-2012 17:09
> To: solr-user@lucene.apache.org
> Subject: Website (crawler for) indexing
>
> This may be a bit off topic: How do you index an exis