Re: Solr Indexing Rich Documents

Ahmet Arslan Fri, 26 Apr 2013 08:04:35 -0700

Hi Furkan,

post.jar is meant to be used as an example, for a quick start, etc. For
production (incremental updates, deletes), consider using
http://manifoldcf.apache.org for indexing rich documents. It utilises the
ExtractingRequestHandler feature of Solr.
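
If you want to see what a direct call to the ExtractingRequestHandler looks like without post.jar, here is a minimal Python sketch. It assumes Solr is reachable at a base URL such as http://localhost:8983/solr (adjust for your host and core), that the handler is mapped to /update/extract as in the example solrconfig.xml, and that your unique key field is "id". The function names are only illustrative, and this is not how ManifoldCF does it internally.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base_url, doc_id, commit=True):
    """Build the /update/extract URL for one document.

    literal.id supplies the unique key for the extracted document
    (assumes the schema's unique key field is named "id").
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return solr_base_url.rstrip("/") + "/update/extract?" + urlencode(params)


def post_rich_document(solr_base_url, path, doc_id,
                       content_type="application/pdf"):
    """Send the raw file bytes to Solr and return the HTTP status code."""
    with open(path, "rb") as fh:
        body = fh.read()
    request = Request(
        build_extract_url(solr_base_url, doc_id),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urlopen(request) as response:
        return response.status


# Hypothetical usage (base URL and file path are assumptions):
# post_rich_document("http://localhost:8983/solr", "report.pdf", "doc1")
```

Committing on every request is fine for a quick test, but for a large corpus you would normally pass commit=False and commit once at the end.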


--- On Fri, 4/26/13, Furkan KAMACI <furkankam...@gmail.com> wrote:

> From: Furkan KAMACI <furkankam...@gmail.com>
> Subject: Re: Solr Indexing Rich Documents
> To: solr-user@lucene.apache.org
> Date: Friday, April 26, 2013, 3:39 PM
> Thanks for the answer. I get an error now: FileNotFound Exception, as I
> mentioned in the other thread. Now I'm trying to solve it.
> 
> 2013/4/26 Jack Krupansky <j...@basetechnology.com>
> 
> > It's called SolrCell or the ExtractingRequestHandler (/update/extract),
> > which the newer post.jar knows to use for some file types:
> > http://wiki.apache.org/solr/ExtractingRequestHandler
> >
> > -- Jack Krupansky
> >
> > -----Original Message----- From: Furkan KAMACI
> > Sent: Friday, April 26, 2013 4:48 AM
> > To: solr-user@lucene.apache.org
> > Subject: Solr Indexing Rich Documents
> >
> >
> > I have a large corpus of rich documents, i.e. pdf and doc files. I think
> > that I can use the example jar of Solr directly. However, for a real-time
> > environment, what should I take care of? Also, how do you send such
> > documents to Solr for indexing? I think post.jar does not handle that
> > file type. I should mention that I don't store the documents in a
> > database.
> >
>
