And when you do use the ExtractingRequestHandler (aka Solr Cell), you
can find the metadata fields by using the ext.extract.only=true setting.
You might also find this article by Sami Siren helpful:
<http://www.lucidimagination.com/index.php?option=com_content&task=view&id=106>
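
For illustration, here is a minimal Python sketch of building a request URL that carries the ext.extract.only=true setting mentioned above. The host, port, and the /update/extract handler path are assumptions (the path depends on how the handler is registered in your solrconfig.xml), and build_extract_only_url is a hypothetical helper name, not part of Solr.

```python
from urllib.parse import urlencode


def build_extract_only_url(solr_base_url, handler_path="/update/extract"):
    """Return a URL that asks Solr Cell to extract content without indexing it.

    handler_path is an assumption: use whatever path the
    ExtractingRequestHandler is mapped to in your solrconfig.xml.
    """
    # ext.extract.only=true is the setting named in this thread.
    params = {"ext.extract.only": "true"}
    return solr_base_url.rstrip("/") + handler_path + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed local Solr instance; adjust to your own setup.
    print(build_extract_only_url("http://localhost:8983/solr"))
```

Posting a PDF to that URL with any HTTP client should return the extracted content and metadata in the response instead of indexing the document, which lets you see which metadata fields Tika produced for that file.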
Erik
On Feb 20, 2009, at 8:39 PM, Otis Gospodnetic wrote:
Josh,
You didn't mention whether you are using
http://wiki.apache.org/solr/ExtractingRequestHandler, but if you are not,
maybe this already has what you need:
http://wiki.apache.org/solr/ExtractingRequestHandler#head-c413be32c951c89c0a28f4f8336aa7d2774ec2d6
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Josh Joy <joshjd...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Saturday, February 21, 2009 9:11:01 AM
Subject: mapping pdf metadata
Hi,
I'm having trouble figuring out how to map the Tika metadata fields to my
own Solr schema document fields. I guess the first hurdle I need to
overcome is: where can I find a list of the Tika PDF metadata fields that
are available for mapping?
Thanks,
Josh