Hello Cam,
The wiki for RichDocuments explains how you can add metadata to the
RDUpdater.
http://wiki.apache.org/solr/UpdateRichDocuments
I have used the patch to index docs and their metadata, but it was not
exactly what we needed.
Brian.
On Wednesday, 14.05.2008, 12:38 +0300, wrote
Hello Elizabeth;
Yes, I have PDF files, and the metadata about them is already extracted,
so I need something like:
someone
content of my pdf file
It seems that the updaterichdocument patch can only accept PDFs in raw form,
so it is not possible to feed metadata.
Have you found a solution other th
C.B., are you saying you have metadata about your PDF files (i.e.,
title, author, etc) separate from the PDF file itself, or are you
saying you want to extract that information from the PDF file? The
first of these is pretty easy; the second can be difficult
or impossible, dependin
Yes, I have seen the documentation on RichDocumentRequestHandler at the
http://wiki.apache.org/solr/UpdateRichDocuments page.
However, from what I understand, this just feeds documents to Solr. How can I
construct something like: document_id, document_name, document_text and feed
it in? (i.e. my doc
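
The document_id, document_name, document_text structure asked about here maps
onto Solr's standard XML update message (<add><doc><field name="...">). What
follows is only a minimal sketch, not the patch's method. It assumes the PDF
text has already been extracted by some other tool, that the schema defines
fields with those three names, and that Solr is reachable at the hypothetical
URL in SOLR_UPDATE_URL. The helper names build_add_xml and post_xml are
illustrative only.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical endpoint: adjust host, port, and path to your own Solr install.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(fields):
    """Build a Solr <add><doc>...</doc></add> message from a dict of fields."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        # ElementTree escapes &, < and > when the tree is serialized.
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_xml(xml_body, url=SOLR_UPDATE_URL):
    """POST an XML update message to Solr and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Field names are taken from the question; the values are made up.
    message = build_add_xml({
        "document_id": "42",
        "document_name": "report.pdf",
        "document_text": "text extracted from the PDF goes here",
    })
    post_xml(message)
    post_xml("<commit/>")  # make the added document visible to searches
```

Building the message with an XML library rather than string concatenation
means characters such as & and < in the extracted text are escaped
automatically.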
Solr does not have this support built in, but there's a patch for it:
https://issues.apache.org/jira/browse/SOLR-284
On Mon, May 12, 2008 at 2:02 PM, Cam Bazz <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Before making a little program to extract the text from my PDFs and feed it
> into Solr with XML,