Re: ExtractingRequestHandler

Ravish Bhagdev Tue, 03 Apr 2012 09:16:39 -0700

(A bit off-topic, but...) I understand that Solr isn't meant to
'store' everything, but because highlighting matches requires the field to be
stored, I would expect most people to end up storing the full document
content in their indexes? I can't think of any good workaround for
this...


Rav

On Sun, Apr 1, 2012 at 6:15 PM, Erick Erickson <[email protected]> wrote:

> Yes, you can, but... Generally, storing the raw input in Solr is
> not the best approach. The problem here is that pretty soon
> you get a huge index that contains *everything*. Solr was not
> intended to be a data store.
>
> Besides, you then need to store the binary form of the file. Solr
> only deals with text, not markup.
>
> Most people index the text in Solr, and enough information
> so the application knows where to go to fetch the original
> document when the user drills down (e.g. file path, database
> PK, etc). Would that work for your situation?
>
> Best
> Erick
>
> On Sat, Mar 31, 2012 at 3:55 PM,  <[email protected]> wrote:
> > Hi,
> >
> > I want to index various filetypes in Solr; this can easily be done with
> > ExtractingRequestHandler. But I also need the extracted content back.
> > I know about ext.extract.only, but then nothing gets indexed, right?
> >
> > Can I index the document AND get the content back as with
> > ext.extract.only? In a single request?
> >
> > Thank you
> >
> >
>

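For illustration, the approach Erick describes (index the extracted text, plus enough information to find the original file again) might look roughly like the sketch below. It only builds the request URL for ExtractingRequestHandler; it does not send anything. The base URL, the document id and the field name "source_path" are assumptions made up for this example, and the field would have to exist in your schema. The handler path /update/extract and the literal.<field> parameters are the usual Solr Cell conventions, but check them against your own solrconfig.xml.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, doc_id, source_path, commit=True):
    """Build an /update/extract URL that indexes the extracted text and
    records where the original file lives, instead of storing the file."""
    params = {
        # literal.<field> sets a field value directly on the indexed document.
        "literal.id": doc_id,
        # "source_path" is a hypothetical field name; define it in your schema.
        "literal.source_path": source_path,
        "commit": "true" if commit else "false",
    }
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


# Assumed local Solr instance and file location, for illustration only.
url = build_extract_url(
    "http://localhost:8983/solr", "doc1", "/data/files/report.pdf"
)
print(url)
```

The file itself would be POSTed as the body of that request. At search time the application reads source_path from the hit and fetches the original from there, so only the fields needed for search (and for highlighting, if used) have to be stored in the index.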