Nutch already indexes the site and host fields, so you can filter on those. If
that is not enough, you can use its subcollection plugin to write values for
URL patterns.
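
A rough sketch of limiting a search to one source with a Solr filter query on
one of those fields. The base URL, the example host and the helper name
build_select_url are assumptions for illustration only; adjust the field name
to whatever your schema.xml actually defines.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, field, value):
    """Build a Solr /select URL that only matches docs where field == value.

    Hypothetical helper: the field name must exist in your schema.xml.
    """
    # Escape backslashes and double quotes, then wrap the value in quotes
    # so it is matched as a single phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": query, "fq": f'{field}:"{escaped}"'}
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Assumed Solr location and host value, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr", "nutch", "host", "wiki.example.org"
)
print(url)
```

The same filter should work for a field written by the subcollection plugin,
once you know which field name it uses in your index.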

On Wednesday 02 November 2011 15:52:37 Fred Zimmerman wrote:
> I want to be able to limit some searches to particular sources, e.g. "wiki
> only", "crawled only", etc.  So I think I need to create a source field in
> the schema.xml.  However, the native data for these sources does not
> contain source info (e.g. "crawled").  So I want to use (I think)
> <copyField> to add a string to each data set as I import it, e.g.
> "website-X-crawl".  So my question is, how do I insert a string value into
> a blank field?

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
