mplications. If it turns out that you
> have to do this, consider running Tika in the app layer and
> doing the extraction on demand there. It's not very hard, see:
> https://lucidworks.com/blog/indexing-with-solrj/
> and ignore the db bits.
>
> Best,
> Erick
>
> On T
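A rough sketch of what "extraction on demand in the app layer" could look like for the preview case. Note that this does not use Tika or SolrJ as the linked post does: it is a standard-library Python stand-in that only handles .docx, and the names docx_preview and max_chars are made up for illustration. The idea is that Solr keeps only the indexed (unstored) content, and the application reads the original file and extracts a preview when the user asks for one.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx archive.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_preview(data, max_chars=500):
    """Return up to max_chars of plain text from the raw bytes of a .docx file.

    Hypothetical helper: a stand-in for a Tika call, limited to .docx.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more w:t elements.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)[:max_chars]
```

The application would look up the file path from the Solr hit (for example from a stored "filename" field), read the bytes, and call docx_preview on them. Other formats such as pptx would need their own handling, which is where a general extractor like Tika is the better fit.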
Hi everyone,
I use Solr to index and search Office files (docx, pptx, ...). To reduce
the size of the Solr index, I do not store the content of the files in Solr;
however, my customer now wants to preview the content of a file.
I have read the documentation for ExtractingRequestHandler, but it seems that
Hi Uwe,
Today I downloaded Solr 5.1 and it worked fine. It seems that the bug fix
SOLR-7139 is only included in 5.1, not 5.0.
Thanks, everyone, for your support.
Trung.
On Tue, Apr 28, 2015 at 10:21 AM, trung.ht wrote:
> Hi Uwe,
>
> Thanks for the answer, but it looks like it does no
> >
> > I haven't experimented with our OCR parser yet, but this should give a good
> > start: https://wiki.apache.org/tika/TikaOCR .
> >
> > Have you installed tesseract?
> >
> > Tika colleagues,
> > Any other tips? What else has to be configured an
>> >
>> > Regards,
>> > Alex
>> > On 23 Apr 2015 10:24 pm, "Ahmet Arslan"
>> wrote:
>> >
>> > > Hi Trung,
>> > >
>> > > I didn't know about OCR capabilities of tika.
>> > > Someone
"Ahmet Arslan"
> wrote:
> >
> > > Hi Trung,
> > >
> > > I didn't know about OCR capabilities of tika.
> > > Someone who is familiar with solr-cell can inform us whether this
> > > functionality is added to solr or not.
> > >
es not do OCR. It cannot extract text from image-based
> PDFs.
>
> Ahmet
>
>
>
> On Thursday, April 23, 2015 7:33 AM, trung.ht wrote:
>
>
>
> Hi,
>
> I want to use Solr to index some scanned documents. After setting up a Solr
> document with two fields, "c
Hi,
I want to use Solr to index some scanned documents. After setting up a Solr
document with two fields, "content" and "filename", I tried to upload the
attached file, but it seems that the content of the file is only "\n \n
\n".
But when I ran tesseract from the command line, I got the result corre
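For what it's worth, the symptom described here (extracted content that is only whitespace, while the tesseract command line works) can be detected in the application and handled with a fallback. The following is only a sketch under assumptions: the function names are hypothetical, the scan is assumed to be in an image format the tesseract CLI accepts, and tesseract is assumed to be on the PATH. It is not how Solr or Tika wires up OCR internally.

```python
import subprocess


def is_blank_extraction(content):
    """True when the extracted content holds nothing but whitespace."""
    return content.strip() == ""


def tesseract_command(image_path):
    # Passing "stdout" as the output base makes the tesseract CLI print
    # the recognized text instead of writing it to a file.
    return ["tesseract", image_path, "stdout"]


def ocr_fallback(content, image_path):
    """Return content unchanged, or tesseract's output if content is blank."""
    if not is_blank_extraction(content):
        return content
    result = subprocess.run(
        tesseract_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout
```

The text returned by ocr_fallback could then be sent to the "content" field in place of the blank extraction.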