Thanks to everyone who responded. No wonder I was getting confused: I was
completely focusing on the wrong half of the equation.

I had a cursory look through some of the Nutch documentation available and
it is looking promising.

Thanks everyone.

Mark

On Tue, Dec 7, 2010 at 10:19 PM, webdev1977 <webdev1...@gmail.com> wrote:

>
> In my experience, the hardest (but most flexible) part is exactly what was
> mentioned: processing the data.  Nutch has a really easy plugin interface
> that you can use, and the example plugin is a great place to start.  Once
> you have the raw parsed text, you can do whatever you want with it.  For
> example, I wrote a plugin to add geospatial information to my
> NutchDocument.  You then map the fields you added to the NutchDocument to
> the fields you want Solr to index.  In my case I created a geography field
> where I put lat, lon info.  You then add that same geography field to the
> Nutch-to-Solr mapping file as well as to your Solr schema.xml file.  Then,
> when you run the crawl and tell it to use "solrindex", it will send the
> document to Solr to be indexed.  Since the new field is in the schema, Solr
> knows what to do with it at index time.  Now you can build a user
> interface around what you want to do with that field.
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Newbie-need-a-point-in-the-right-direction-tp2031381p2033687.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
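
As a rough illustration of the field mapping described in the quoted
message, the two config fragments below sketch how a "geography" field
might be declared. Only the field name comes from the message. The file
name conf/solrindex-mapping.xml, the "string" type, and the attribute
values are assumptions based on a typical Nutch 1.x / Solr setup, so check
them against the versions you actually run.

```xml
<!-- Nutch side: conf/solrindex-mapping.xml (assumed file name).
     Maps the field the plugin added to the NutchDocument (source)
     to the Solr field it should be sent to (dest). -->
<mapping>
  <fields>
    <field dest="geography" source="geography"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</mapping>
```

```xml
<!-- Solr side: schema.xml, inside the <fields> section.
     The "string" type is an assumption; a spatial field type may
     suit lat, lon data better, depending on the Solr version. -->
<field name="geography" type="string" indexed="true" stored="true"/>
```

With both in place, the value the plugin puts in the geography field should
arrive in Solr under the same name when the "solrindex" step runs.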
