I am new to Solr, though not so new to search. I have an existing data model
that I am pushing into a Solr index. For example, I am indexing a product that
includes product brochures in multiple locales, so a single Solr document
contains multiple text fields, each of which requires its own linguistic
analyzer. The data for these text fields comes from PDF files. As I currently
support 4 locales, there is a different PDF file for each locale. In addition,
I have a number of other fields that are used by the application. Solr will
return a reference that the application uses to determine which data and PDF
to display. With the ExtractingRequestHandler I don't see how I would be able
to support multiple PDF files in one document. I am planning to use Solr 1.4
for this project.

 

My assumption is that I will need to do the PDF parsing before sending the
document to Solr. Is there a way to do the extraction at the field level? I
want to make sure I am not missing something in Solr Cell before I invest the
effort to parse the documents on the client side.
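
A minimal sketch of the client-side approach described above, assuming the text
of each PDF has already been extracted by some external tool. It only shows how
the per-locale text could be combined with the application fields into one Solr
XML update message. The field names (`id`, `sku`, `brochure_text_<locale>`) and
the function name `build_add_xml` are hypothetical and would have to match the
actual schema.

```python
import xml.etree.ElementTree as ET


def build_add_xml(product_id, app_fields, brochure_text_by_locale):
    """Build a Solr <add> message holding one document.

    brochure_text_by_locale maps a locale code (e.g. "en") to the text
    already extracted from that locale's PDF. Each locale gets its own
    field, so the schema can assign a different analyzer per field.
    All field names here are assumptions, not part of any real schema.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def add_field(name, value):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value

    add_field("id", product_id)

    # Application fields that are not derived from the PDFs.
    for name, value in sorted(app_fields.items()):
        add_field(name, str(value))

    # One text field per locale, e.g. brochure_text_en, brochure_text_fr.
    for locale, text in sorted(brochure_text_by_locale.items()):
        add_field("brochure_text_" + locale, text)

    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Placeholder strings stand in for text extracted from the PDFs.
    message = build_add_xml(
        "product-1",
        {"sku": "A-100"},
        {"en": "English brochure text", "fr": "Texte de la brochure"},
    )
    print(message)
```

The resulting string is what would be posted to Solr's XML update handler. The
PDF extraction step itself is deliberately left out, since the choice of
extraction tool is the open question here.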

 

Thanks
