Hi Furkan,

post.jar is meant to be used as an example, for a quick start, etc. For production (incremental updates, deletes), consider using http://manifoldcf.apache.org for indexing rich documents. It utilises the ExtractingRequestHandler feature of Solr.
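If you want to see what a direct call to the ExtractingRequestHandler looks like without post.jar, here is a rough sketch in Python. It assumes Solr is reachable at http://localhost:8983/solr (adjust for your setup) and that your schema's unique key field is named id; the function names are made up for illustration. It sends the raw file as the request body to /update/extract, with literal.id supplying the document id and commit=true making the document searchable right away.

```python
import mimetypes
import urllib.parse
import urllib.request


def build_extract_url(solr_base, doc_id, commit=True):
    """Build the /update/extract URL for one document.

    solr_base is an assumption, e.g. "http://localhost:8983/solr".
    literal.id sets the unique key field (assumed to be named "id").
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    query = urllib.parse.urlencode(params)
    return solr_base.rstrip("/") + "/update/extract?" + query


def build_extract_request(solr_base, path, doc_id, commit=True):
    """Prepare (but do not send) a POST carrying the raw file bytes."""
    # Guess the MIME type from the file name; fall back to a generic type.
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as fh:
        body = fh.read()
    return urllib.request.Request(
        build_extract_url(solr_base, doc_id, commit),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def index_file(solr_base, path, doc_id, commit=True):
    """Send the file to Solr and return the HTTP status code."""
    request = build_extract_request(solr_base, path, doc_id, commit)
    with urllib.request.urlopen(request) as response:
        return response.status
```

For example, index_file("http://localhost:8983/solr", "report.pdf", "report-1") would post report.pdf under the id report-1 (both names are placeholders). Committing on every document is fine for a quick test but slow for a large corpus; for bulk loads you would probably pass commit=False and commit once at the end.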
--- On Fri, 4/26/13, Furkan KAMACI <furkankam...@gmail.com> wrote:

> From: Furkan KAMACI <furkankam...@gmail.com>
> Subject: Re: Solr Indexing Rich Documents
> To: solr-user@lucene.apache.org
> Date: Friday, April 26, 2013, 3:39 PM
>
> Thanks for the answer, I get an error now: FileNotFound Exception as I
> mentioned at other thread. Now I'm trying to solve it.
>
> 2013/4/26 Jack Krupansky <j...@basetechnology.com>
>
> > It's called SolrCell or the ExtractingRequestHandler (/update/extract),
> > which the newer post.jar knows to use for some file types:
> > http://wiki.apache.org/solr/ExtractingRequestHandler
> >
> > -- Jack Krupansky
> >
> > -----Original Message-----
> > From: Furkan KAMACI
> > Sent: Friday, April 26, 2013 4:48 AM
> > To: solr-user@lucene.apache.org
> > Subject: Solr Indexing Rich Documents
> >
> > I have a large corpus of rich documents i.e. pdf and doc files. I think
> > that I can use directly the example jar of Solr. However for a real time
> > environment what should I care? Also how do you send such kind of documents
> > into Solr to index, I think post.jar does not handle that file type? I
> > should mention that I don't store documents in a database.