Re: PDF Indexing

2014-04-02 Thread Jack Krupansky
Sent: Wednesday, April 2, 2014 3:35 PM To: solr-user@lucene.apache.org Subject: Re: PDF Indexing Hi Sujatha, There is no built-in mechanism. Prepare the page documents outside of Solr. http://searchhub.org/2012/02/14/indexing-with-solrj/ And you may want to save text content somewhere too. If you change

Re: PDF Indexing

2014-04-02 Thread Ahmet Arslan
Hi Sujatha, There is no built-in mechanism. Prepare the page documents outside of Solr. http://searchhub.org/2012/02/14/indexing-with-solrj/ And you may want to save the text content somewhere too. If you change something in the index analysis/schema, you need to reindex. If you save text data, you can
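
A minimal sketch of the "prepare page documents outside of Solr" idea, assuming the per-page text has already been extracted by some PDF library of your choice. The field names (`id`, `source`, `page`, `text`) and the `<pdf name>_p<page>` key scheme are assumptions made for illustration, not anything prescribed in this thread; adapt them to your schema. The resulting JSON array could then be sent to Solr's update handler, and saved to disk as well so that a later reindex does not require re-extracting the PDFs.

```python
import json


def build_page_docs(pdf_name, page_texts):
    """Return one Solr-style document (a plain dict) per page of a PDF.

    page_texts is a list of strings, one per page, extracted elsewhere.
    All field names here are hypothetical and must match your schema.
    """
    docs = []
    for page_number, text in enumerate(page_texts, start=1):
        docs.append({
            "id": f"{pdf_name}_p{page_number}",  # assumed unique-key scheme
            "source": pdf_name,                  # which PDF the page came from
            "page": page_number,                 # 1-based page number
            "text": text,                        # extracted page text
        })
    return docs


def to_update_payload(docs):
    """Serialize the page documents as a JSON array of documents."""
    return json.dumps(docs)


if __name__ == "__main__":
    pages = ["First page text.", "Second page text."]
    print(to_update_payload(build_page_docs("report.pdf", pages)))
```

Keeping the source file name and page number as separate fields makes it possible to report which page of which PDF matched a query.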

Re: PDF indexing issues

2013-11-18 Thread Marcello Lorenzi
Hi, I have checked the PDF Jira issue, but it does not offer a solution, because some users experienced the same issue with different CMAP entries. Would it be possible to update the PDFBox library in the Solr installation? Thanks, Marcello On 11/15/2013 06:27 PM, Furkan KAMACI wrote: You shou

Re: PDF indexing issues

2013-11-15 Thread Furkan KAMACI
You should check the Apache PDFBox project. A similar question: https://issues.apache.org/jira/browse/PDFBOX-940 2013/11/15 Marcello Lorenzi > Hi, > during our testing of Apache Solr 4.3, we have noticed some errors > that occurred during PDF indexing: > > ERROR - 2013-11-15 15:14:26.248; org.apache.pd

Re: PDF indexing

2012-05-08 Thread Lance Norskog
post.jar and curl do the same thing. Look at post.sh, which uses curl. On Mon, May 7, 2012 at 12:57 PM, Tolga wrote: > On 05/07/2012 10:35 PM, Jack Krupansky wrote: >> >> Try SolrCell (ExtractingRequestHandler). >> >> See: >> http://wiki.apache.org/solr/ExtractingRequestHandler >> >> -- Jack Krup

Re: PDF indexing

2012-05-07 Thread Tolga
On 05/07/2012 10:35 PM, Jack Krupansky wrote: Try SolrCell (ExtractingRequestHandler). See: http://wiki.apache.org/solr/ExtractingRequestHandler -- Jack Krupansky -Original Message- From: Tolga Sent: Monday, May 07, 2012 3:24 PM To: solr-user@lucene.apache.org Subject: PDF indexing H

Re: PDF indexing

2012-05-07 Thread Jack Krupansky
Try SolrCell (ExtractingRequestHandler). See: http://wiki.apache.org/solr/ExtractingRequestHandler -- Jack Krupansky -Original Message- From: Tolga Sent: Monday, May 07, 2012 3:24 PM To: solr-user@lucene.apache.org Subject: PDF indexing Hi, From what I have read, I think I have
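
As a rough illustration of the SolrCell suggestion, the sketch below only builds the request URL for the ExtractingRequestHandler; the PDF bytes would then be sent as the POST body, for example with curl. The base URL is a hypothetical local installation, and the helper name `extract_url` is made up for this example. The `literal.id` and `commit` parameters are the ones described on the ExtractingRequestHandler wiki page linked above.

```python
from urllib.parse import urlencode


def extract_url(base_url, doc_id, commit=True):
    """Build an /update/extract URL for one document.

    base_url is assumed to point at the Solr instance (or core),
    e.g. "http://localhost:8983/solr"; adjust it for your setup.
    """
    params = {"literal.id": doc_id}  # sets the unique key of the document
    if commit:
        params["commit"] = "true"    # make the document searchable right away
    return f"{base_url.rstrip('/')}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    print(extract_url("http://localhost:8983/solr", "doc1"))
```

Committing on every request is convenient for testing; for bulk loads it is usually better to pass `commit=False` and commit once at the end.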