I want to crawl http://www.amazone.com/ and extract only the product title,
product information, writer, and publisher.

All other data should be ignored.

How about 
http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html

or, if you're prepared to wait or help out, there's
http://svn.apache.org/repos/asf/labs/droids/README.TXT
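
Independent of which crawler you pick, the "keep only a few fields" part could look roughly like the sketch below. It is not Nutch or Droids code, just plain Python using the standard library's html.parser. The element ids in WANTED_IDS (productTitle, productInfo, author, publisher) are made up for illustration; you would have to inspect the real pages and substitute whatever markup they actually use.

```python
from html.parser import HTMLParser

# Hypothetical mapping from element id to the field to keep.
# These ids are assumptions; real pages will use different markup.
WANTED_IDS = {
    "productTitle": "title",
    "productInfo": "information",
    "author": "writer",
    "publisher": "publisher",
}

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class FieldExtractor(HTMLParser):
    """Collects the text of elements whose id is in WANTED_IDS; ignores the rest."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # name of the field being captured, if any
        self._depth = 0       # nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        field = WANTED_IDS.get(dict(attrs).get("id"))
        if field:
            self._current = field
            self._depth = 1
            self.fields[field] = ""

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace once the element is complete.
            text = self.fields[self._current]
            self.fields[self._current] = " ".join(text.split())
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def extract_fields(html):
    """Return a dict of the wanted fields found in one HTML page."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields
```

You would call extract_fields() on each fetched page and index only the returned dict. In Nutch, the equivalent logic would presumably live in a parse plugin rather than a standalone script.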
