First: Please pardon the cross-post to solr-user for reference. I hope
to continue this thread on solr-dev. Please reply on solr-dev.
1) more documentation (and possibly some locking configuration options) on
how you can use Solr to access an index generated by the Nutch crawler (I
think Thors
: In that respect I agree with the original posting that Solr lacks
: functionality with respect to desired functionality. One can argue that
: more or less random data should be structured by the user writing a
: decent application. However a more easy to use and configurable plugin
: architectur
Hi,
I want to pick up this old thread from the summer (see below). I do
understand that Solr is intended for more structured data, and that Nutch
is a good basis for cluttered information, particularly when fetched by
crawlers.
However, Solr's ease of setup and flexible schemas make it a viable
: the text out of these types of documents. You could borrow the
: document parsing pieces from Lucene's contrib and Nutch and glue them
: together into your client that speaks to Solr, or perhaps Solr isn't
: the right approach for your needs? It certainly is possible to add
: these capabiliti
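
To make the "glue them together into your client" suggestion above concrete, here is a minimal sketch of such a client. It is only an illustration under stated assumptions, not how Nutch or Solr do it: it uses nothing but the Python standard library, handles only HTML and plain-text files (doc, ppt and pdf would need an external parser plugged into extract_text), and assumes a local Solr update URL and a schema with "id" and "text" fields. All function names are made up for this example.

```python
import os
import urllib.request
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Assumption: a local Solr instance; adjust the URL to your own setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def extract_text(path):
    """Return plain text for .html/.htm/.txt files, or None if unsupported."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in (".html", ".htm", ".txt"):
        return None  # doc, ppt, pdf would need an external parser here
    with open(path, encoding="utf-8", errors="replace") as fh:
        raw = fh.read()
    if ext == ".txt":
        return raw
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(parser.chunks)


def crawl(root):
    """Walk the directory tree and yield (path, text) for parseable files."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            text = extract_text(path)
            if text is not None:
                yield path, text


def build_add_xml(docs):
    """Build a Solr <add> message; assumes "id" and "text" schema fields."""
    add = ET.Element("add")
    for doc_id, text in docs:
        doc = ET.SubElement(add, "doc")
        ET.SubElement(doc, "field", name="id").text = doc_id
        ET.SubElement(doc, "field", name="text").text = text
    return ET.tostring(add, encoding="unicode")


def post_to_solr(xml_body, url=SOLR_UPDATE_URL):
    """POST an XML update message to Solr and return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A run over a share would then look like post_to_solr(build_add_xml(crawl("/mnt/share"))) followed by post_to_solr("<commit/>"), where the path is a placeholder. Keeping the parsing in the client leaves Solr itself untouched, at the cost of running the crawl outside the server.
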
On Aug 30, 2006, at 2:42 AM, Bruno wrote:
Browsing through the message thread, I tried to find a trail addressing
file system crawls. I want to implement an enterprise search over a
networked filesystem, crawling all sorts of documents, such as html, doc,
ppt and pdf.
Nutch provides plugins
formats.
Is there support for the same functionality in Solr?
Bruno
--
View this message in context:
http://www.nabble.com/document-support-for-file-system-crawling-tf2188066.html#a6053318
Sent from the Solr - User forum at Nabble.com.