Re: Indexing PDF

ahmad ajiloo Tue, 04 Oct 2011 11:05:00 -0700

I have this problem too when indexing some Persian PDF files.

2011/10/4 Héctor Trujillo <hecto...@gmail.com>


> Hi all, I'm indexing PDF files with SolrJ, and most of them work. But with
> some files I have problems because strange characters get stored. This is
> the content that was stored:
> +++++++
>
> Starting a Search Application
>
> 
> Abstract
>
> Starting
> a Search Application A Lucid Imagination White Paper ¥ April 2009 Page i
>
> 
> Starting a Search Application A Lucid Imagination White Paper ¥ April 2009
> Page ii Do You Need Full-text Search?
>
> ∞
>
> ∞
> ∞
>
> Starting
> a Search Application A Lucid Imagination White Paper ¥ April 2009 Page 1
>
> Identifying
> Ideal Results
>
> Starting
> a Search Application A Lucid Imagination White Paper ¥ April 2009 Page 2
>
> 
> Starting
> a Search Application A Lucid Imagination White Paper
>
>
> +++++++
>
> But if I open the PDF file, the content displays correctly.
>
> I think this is a charset encoding issue, but I don't know whether I can
> avoid this behaviour by applying a different analyzer or tokenizer at
> indexing time.
>
> I have this problem with some documents downloaded from Lucid's website.
>
>
>
> I don't know whether anyone has had the same problem and knows how to solve it.
>
> Thanks
>
> Best regards
>
