Re: Indexing multiple pdf's and partial update of pdf

2016-03-24 Thread Alexandre Rafalovitch
" > --data-binary @example/exampledocs/sample.html -H 'Content-type:text/html' > > > > Thanks > > Jay > > > > -----Original Message----- > From: Reth RM [mailto:reth.ik...@gmail.com] > Sent: Thursday, March 24, 2016 12:24 AM > To: solr-user@lucene

RE: Indexing multiple pdf's and partial update of pdf

2016-03-24 Thread Jay Parashar
tp://localhost:8983/solr/techproducts/update/extract?&extractOnly=true" --data-binary @example/exampledocs/sample.html -H 'Content-type:text/html' Thanks Jay -----Original Message----- From: Reth RM [mailto:reth.ik...@gmail.com] Sent: Thursday, March 24, 2016 12:24 AM To: s

Re: Indexing multiple pdf's and partial update of pdf

2016-03-23 Thread Reth RM
Are you using the Apache Tika parser to parse the PDF files? 1) Solr supports parent-child block join, which lets you index the data of more than one file within a single document object (if that is what you are looking for): https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParse
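
A minimal sketch of the block-join idea mentioned above, assuming one parent document per logical record and one child document per extracted file. The helper names (`build_block`, `parent_query`) and the field names (`doc_type`, `filename`, `content`) are hypothetical and would have to match your schema; `_childDocuments_` is the key Solr's JSON update format uses for nested children.

```python
import json


def build_block(parent_id, files):
    """Build one parent document with one child document per file.

    `files` maps a file name to its already-extracted text.
    Field names here are illustrative assumptions, not a fixed schema.
    """
    children = [
        {
            "id": f"{parent_id}-{i}",
            "doc_type": "file",
            "filename": name,
            "content": text,
        }
        for i, (name, text) in enumerate(sorted(files.items()))
    ]
    return {"id": parent_id, "doc_type": "parent", "_childDocuments_": children}


def parent_query(term):
    # Block join parent query: return parents that have a child matching `term`.
    return '{!parent which="doc_type:parent"}content:' + term


block = build_block("report-1", {"a.pdf": "text of a", "b.pdf": "text of b"})
payload = json.dumps([block])  # body for a JSON POST to the /update handler
```

Note that a parent and its children are indexed as one block, so changing a single child generally means re-sending the whole block rather than updating that child alone.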

Indexing multiple pdf's and partial update of pdf

2016-03-23 Thread Jay Parashar
Hi, I have a couple of questions regarding indexing files (say, PDFs). 1) Is there any way to index more than one file into one document with a unique id? One way I can think of is to do an “extractOnly” of all the documents and then index that extracted content separately. Is there an easier way? 2) If my
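
The “extractOnly” approach from question 1 could be sketched roughly as follows, assuming each file is first sent to the extract handler and the returned text is then merged client-side into one document. The helpers `extract_only_url` and `merge_extracts` and the fields `filenames` and `content` are hypothetical (the fields would need to be multi-valued in the schema); `extractOnly` and `extractFormat` are Solr Cell request parameters.

```python
from urllib.parse import urlencode


def extract_only_url(base_url, collection):
    """URL that asks Solr Cell to return extracted text without indexing it."""
    params = {"extractOnly": "true", "extractFormat": "text", "wt": "json"}
    return f"{base_url}/solr/{collection}/update/extract?{urlencode(params)}"


def merge_extracts(doc_id, extracts):
    """Combine several extracted texts into one document with a single id.

    `extracts` maps a file name to the text returned for that file.
    """
    names = sorted(extracts)
    return {
        "id": doc_id,
        "filenames": names,                        # assumed multi-valued field
        "content": [extracts[n] for n in names],   # assumed multi-valued field
    }


url = extract_only_url("http://localhost:8983", "techproducts")
doc = merge_extracts("bundle-1", {"b.pdf": "second", "a.pdf": "first"})
```

The merged `doc` would then be sent to the regular `/update` handler as a normal document; the extraction requests themselves are left out here.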