Maybe this helps: http://www.packtpub.com/article/indexing-data-solr-1.4-enterprise-search-server-2
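
As a rough sketch of the schema side (my own assumption, not taken from the article above): the attr_ prefix in your output suggests the extracted PDF fields are landing in a dynamic field, while a plain one-word query only searches the default search field. Assuming your schema.xml resembles the Solr 1.4 example schema, with a field type called "text" and a catch-all field also called "text", something along these lines might make the PDF body searchable:

```xml
<!-- Hypothetical schema.xml fragment; field and type names are assumed
     from the Solr 1.4 example schema and may differ in your setup. -->
<fields>
  <!-- Catches attr_Author, attr_Keywords, attr_content, ... -->
  <dynamicField name="attr_*" type="text" indexed="true" stored="true" multiValued="true"/>
  <!-- Catch-all field that unqualified queries are run against -->
  <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
</fields>

<!-- Copy the extracted PDF body into the catch-all field -->
<copyField source="attr_content" dest="text"/>

<!-- Field searched when the query does not name a field -->
<defaultSearchField>text</defaultSearchField>
```

You would need to reindex the PDF after changing the schema. Alternatively, querying the field directly, e.g. attr_content:microarray, may work without any schema change, provided attr_content is indexed.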

Cheers,
Stefan

On 12.08.2010 19:45, Ma, Xiaohui (NIH/NLM/LHC) [C] wrote:
Does anyone know whether I need to define fields in schema.xml to index PDF
files? If so, please tell me how to do it.

I defined fields in schema.xml and created a data-configuration file using
XPath for XML files. Could you please tell me whether I need to do the same
for PDF files, and how?

Thanks so much for your help as always!

-----Original Message-----
From: Marco Martinez [mailto:mmarti...@paradigmatecnologico.com]
Sent: Thursday, August 12, 2010 11:45 AM
To: solr-user@lucene.apache.org
Subject: Re: index pdf files

To help you, we need the definitions of your fields in schema.xml and
the query you run when you search for a single word.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/12 Ma, Xiaohui (NIH/NLM/LHC) [C]<xiao...@mail.nlm.nih.gov>

I wrote a simple Java program to import a PDF file. I get a result when
I search for *:* from the admin page, but I get nothing if I search for a
word. I wonder whether I did something wrong or missed a setting.

Here is part of the result I get from a *:* search:
*********************************************
<doc>
  <arr name="attr_Author">
    <str>Hristovski D</str>
  </arr>
  <arr name="attr_Content-Type">
    <str>application/pdf</str>
  </arr>
  <arr name="attr_Keywords">
    <str>microarray analysis, literature-based discovery, semantic
predications, natural language processing</str>
  </arr>
  <arr name="attr_Last-Modified">
    <str>Thu Aug 12 10:58:37 EDT 2010</str>
  </arr>
  <arr name="attr_content">
    <str>Combining Semantic Relations and DNA Microarray Data for Novel
Hypotheses Generation Combining Semantic Relations and DNA Microarray Data
for Novel Hypotheses Generation Dimitar Hristovski, PhD,1 Andrej
Kastrin,2...............
*********************************************
Please help me out if anyone has experience with PDF files. I would really
appreciate it!

Thanks so much,



--
*******************************************
Stefan Moises
Senior Software Developer

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck

Tel.: 0911/25566-25
Fax:  0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
*******************************************
