Thanks saurish.
My office intranet is a SharePoint website. When I crawl it using
Nutch, I get an "Unauthorized access(404)" error. The website uses an
NTLM realm.
I saw on a Nutch JIRA issue that SharePoint can be accessed using
Nutch. Nutch has the below properties in nutch-
Hi,
It looks like there is support for SharePoint as well as Windows shares in
ManifoldCF.
Yes, you can crawl folders with Nutch (at least I have done so on a Windows PC
with a local file folder).
Nutch 1.7 and Solr 4.5.1 have worked for me.
Regards,
Rashmi,
As far as I know, Nutch is a web crawler. I don't think it can crawl documents
from Microsoft SharePoint. ManifoldCF is a better fit in your case.
Regarding versioning: if you don't have previous setups, use the latest
versions of each.
Ahmet
On Sunday, January 26, 2014 5:24 PM, ra