: Wednesday, April 2, 2014 3:35 PM
To: solr-user@lucene.apache.org
Subject: Re: PDF Indexing
Hi Sujatha,
There is no built-in mechanism. Prepare the page documents outside of Solr.
http://searchhub.org/2012/02/14/indexing-with-solrj/
And you may want to save the text content somewhere too. If you change something in
the index analysis/schema, you need to reindex. If you save text data, you can
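
A minimal sketch of what "prepare page documents outside of Solr" could look like, assuming the per-page text has already been extracted from the PDF by some other tool. The field names (pdf_id, page, content), the id scheme, and the update URL are illustrative assumptions, not something the thread specifies, and the POST assumes a Solr version whose update handler accepts a JSON array of documents.

```python
import json
import urllib.request


def build_page_docs(pdf_id, page_texts):
    """Return one Solr document (a dict) per page of an already-extracted PDF."""
    docs = []
    for page_number, text in enumerate(page_texts, start=1):
        docs.append({
            "id": f"{pdf_id}_page_{page_number}",  # hypothetical id scheme
            "pdf_id": pdf_id,                      # hypothetical field name
            "page": page_number,                   # hypothetical field name
            "content": text,                       # hypothetical field name
        })
    return docs


def post_docs(docs, update_url):
    """POST the documents as a JSON array and commit; needs a running Solr."""
    request = urllib.request.Request(
        update_url + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Writing the same list to disk with json.dump before posting is one way to keep the text around, so a schema change only requires re-posting rather than re-extracting.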
Hi,
I have checked the PDFBox Jira issue, but it does not offer a solution,
because some users experienced the same issue with different CMap
entries. Would it be possible to update the PDFBox library in the Solr
installation?
Thanks,
Marcello
On 11/15/2013 06:27 PM, Furkan KAMACI wrote:
You should check the Apache PDFBox project. A similar question:
https://issues.apache.org/jira/browse/PDFBOX-940
2013/11/15 Marcello Lorenzi
> Hi,
> during our testing of Apache Solr 4.3, we noticed some errors
> during PDF indexing:
>
> ERROR - 2013-11-15 15:14:26.248; org.apache.pd
post.jar and curl do the same thing. Look at post.sh, which uses curl.
On Mon, May 7, 2012 at 12:57 PM, Tolga wrote:
> On 05/07/2012 10:35 PM, Jack Krupansky wrote:
>>
>> Try SolrCell (ExtractingRequestHandler).
>>
>> See:
>> http://wiki.apache.org/solr/ExtractingRequestHandler
>>
>> -- Jack Krup
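
For reference, a rough sketch of sending a PDF to the ExtractingRequestHandler without curl or post.jar. It assumes the handler is mapped at /update/extract (as in the example configuration) and uses the literal.id and commit parameters; the base URL, file path, and document id passed in are placeholders.

```python
import urllib.parse
import urllib.request


def extract_url(solr_base_url, doc_id, commit=True):
    """Build an /update/extract URL that assigns doc_id as the document id."""
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    query = urllib.parse.urlencode(params)
    return solr_base_url.rstrip("/") + "/update/extract?" + query


def post_pdf(pdf_path, solr_base_url, doc_id):
    """Send the raw PDF bytes to Solr Cell; needs a running Solr."""
    with open(pdf_path, "rb") as pdf_file:
        body = pdf_file.read()
    request = urllib.request.Request(
        extract_url(solr_base_url, doc_id),
        data=body,
        headers={"Content-Type": "application/pdf"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

This does the same job as the curl call in post.sh: the PDF goes in the request body and Solr Cell extracts the text on the server side.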
Try SolrCell (ExtractingRequestHandler).
See:
http://wiki.apache.org/solr/ExtractingRequestHandler
-- Jack Krupansky
-----Original Message-----
From: Tolga
Sent: Monday, May 07, 2012 3:24 PM
To: solr-user@lucene.apache.org
Subject: PDF indexing
Hi,
From what I have read, I think I have