Re: Solr indexing binary files

Jack Krupansky Thu, 14 Mar 2013 13:21:42 -0700

Take a look at Solr Cell:

http://wiki.apache.org/solr/ExtractingRequestHandler

Include a dynamicField with a "*" pattern and you will see the wide variety of metadata that is available for PDF and other rich document formats.
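
As a rough illustration only, here is a small Python sketch that builds the request URLs you might send to Solr Cell. It assumes the handler is registered at the conventional /update/extract path and that Solr runs at http://localhost:8983/solr; the helper name build_extract_url is made up for this example. The parameters literal.id, commit and extractOnly are ExtractingRequestHandler request parameters; the file itself would still have to be sent as the body of a POST to the resulting URL.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, doc_id=None, extract_only=False):
    """Build a Solr Cell request URL (hypothetical helper, not part of Solr).

    solr_base    -- assumed base URL of the Solr core, e.g. http://localhost:8983/solr
    doc_id       -- value for the unique key field, passed as literal.id
    extract_only -- if True, ask Solr to return the extracted text and
                    metadata without indexing anything
    """
    if extract_only:
        # extractOnly=true returns what Tika extracted, so you can inspect
        # which metadata fields a given PDF actually produces.
        params = [("extractOnly", "true")]
    else:
        if doc_id is None:
            raise ValueError("doc_id is required when indexing")
        # literal.<field>=<value> sets a field value directly on the document.
        params = [("literal.id", doc_id), ("commit", "true")]
    # Assumes the handler is mapped to /update/extract in solrconfig.xml.
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


if __name__ == "__main__":
    print(build_extract_url("http://localhost:8983/solr", extract_only=True))
    print(build_extract_url("http://localhost:8983/solr", doc_id="doc1"))
```

Sending a PDF to the extractOnly URL first is a cheap way to see the field names before deciding which ones deserve explicit fields in schema.xml; with the "*" dynamicField in place, the indexing URL should keep whatever metadata the document carries.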


-- Jack Krupansky

-----Original Message-----
From: Luis
Sent: Thursday, March 14, 2013 3:30 PM
To: solr-user@lucene.apache.org
Subject: Solr indexing binary files

Hi, I am new to Solr and I am extracting metadata from binary files through
URLs stored in my database.  I would like to know what fields are available
for indexing from PDFs (the ones that would be initiated as in column="").
For example, how would I extract something like file size, format or file
type?

I would also like to know how to create customized fields in Solr.  How
are the metadata and text content mapped into the Solr schema?  Would I have
to declare that in the solrconfig.xml or do some more tweaking somewhere
else?  If someone has a code snippet that could show me, it would be greatly
appreciated.

Thank you in advance.




--

View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-binary-files-tp4047470.html
Sent from the Solr - User mailing list archive at Nabble.com.
