Maybe this helps:
http://www.packtpub.com/article/indexing-data-solr-1.4-enterprise-search-server-2
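In case it is useful alongside the article: the attr_* field names in your
output suggest the file went through Solr Cell (the ExtractingRequestHandler)
with uprefix=attr_, so the extracted text sits in attr_content. A plain word
query goes to the default search field, which would not contain that text
unless it is copied there. Below is a minimal sketch of the schema.xml side.
It assumes the default search field is called "text" and that a field type
named "text" exists, as in the Solr 1.4 example schema. Both names are
assumptions about your setup, not something I can see from your mail.

```xml
<!-- schema.xml fragment: a sketch under the assumptions above, not a complete schema -->

<!-- Catch-all for the attr_* fields created when extraction runs with uprefix=attr_ -->
<dynamicField name="attr_*" type="text" indexed="true" stored="true" multiValued="true"/>

<!-- Assumed default search field (the name "text" is an assumption) -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<!-- Copy the extracted PDF body into the default search field
     so that a plain word query can match it -->
<copyField source="attr_content" dest="text"/>

<defaultSearchField>text</defaultSearchField>
```

Without touching the schema, a field-qualified query such as
attr_content:microarray from the admin page should show whether the content
was indexed at all. After a schema change the document has to be re-indexed.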
Cheers,
Stefan
On 12.08.2010 19:45, Ma, Xiaohui (NIH/NLM/LHC) [C] wrote:
Does anyone know whether I need to define fields in schema.xml for indexing PDF
files? If so, please tell me how to do it.
I defined fields in schema.xml and created a data-configuration file using
XPath for XML files. Would you please tell me whether I need to do the same for
PDF files, and how?
Thanks so much for your help, as always!
-----Original Message-----
From: Marco Martinez [mailto:mmarti...@paradigmatecnologico.com]
Sent: Thursday, August 12, 2010 11:45 AM
To: solr-user@lucene.apache.org
Subject: Re: index pdf files
To help you, we need the field definitions from your schema.xml and
the query you run when you search for a single word.
Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42
2010/8/12 Ma, Xiaohui (NIH/NLM/LHC) [C] <xiao...@mail.nlm.nih.gov>
I wrote a simple Java program to import a PDF file. I get a result when
I search for *:* from the admin page, but I get nothing when I search for a
word. I wonder whether I did something wrong or missed a setting.
Here is part of the result I get for the *:* search:
*********************************************
<doc>
<arr name="attr_Author">
<str>Hristovski D</str>
</arr>
<arr name="attr_Content-Type">
<str>application/pdf</str>
</arr>
<arr name="attr_Keywords">
<str>microarray analysis, literature-based discovery, semantic
predications, natural language processing</str>
</arr>
<arr name="attr_Last-Modified">
<str>Thu Aug 12 10:58:37 EDT 2010</str>
</arr>
<arr name="attr_content">
<str>Combining Semantic Relations and DNA Microarray Data for Novel
Hypotheses Generation Combining Semantic Relations and DNA Microarray Data
for Novel Hypotheses Generation Dimitar Hristovski, PhD,1 Andrej
Kastrin,2...............
*********************************************
If anyone has experience with PDF files, please help me out. I would really
appreciate it!
Thanks so much,
--
*******************************************
Stefan Moises
Senior Softwareentwickler
shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck
Tel.: 0911/25566-25
Fax: 0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
*******************************************