On Fri, Apr 8, 2011 at 6:23 AM, Jens Mueller <supidupi...@googlemail.com> wrote:

> Hello all,
>
> thanks for your generous help.
>
> I think I now know everything: (What I want to do is to build a web crawler
> and index the documents found). I will start with the setup as suggested by
>
>
Writing a web crawler from scratch is... ambitious.
Have you looked at Nutch (http://nutch.apache.org/)?  It uses Solr for
indexing, and it may help you get a head start.
Nutch is built on Hadoop, so if you've never used Hadoop before it may take
some getting used to. That said, I have helped a customer implement it and
helped a couple of their devs (medium-seniority) get up to speed, and it
didn't take them too long.

Andrea
