Re: Indexing PDF

Héctor Trujillo Wed, 05 Oct 2011 02:01:04 -0700

Sorry, you are right: this file was indexed with a .NET web service
client that calls a Java application (a web service), which in turn calls
Solr using SolrJ.


I will try to index it in a different way; maybe that will resolve the
problem.
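
One possible cause in a chain like this (an assumption, nothing in this thread confirms it) is that the PDF bytes get decoded as text somewhere between the client and Solr. The sketch below uses only the JDK; the class and method names are hypothetical and it does not touch SolrJ. It shows that binary bytes do not survive a round trip through a UTF-8 string, while a Base64 round trip keeps them intact.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class BinaryRoundTrip {

    // Decodes raw bytes as UTF-8 text and encodes them again, which is what
    // effectively happens when a binary payload travels in a plain string field.
    public static byte[] viaUtf8String(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // Carries the same bytes as Base64 text, which is reversible for any input.
    public static byte[] viaBase64(byte[] raw) {
        String encoded = Base64.getEncoder().encodeToString(raw);
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // An ASCII PDF header followed by four non-ASCII bytes, standing in
        // for the binary part of a real file (illustrative sample data).
        byte[] sample = {'%', 'P', 'D', 'F', '-', '1', '.', '4', '\n', '%',
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        // The invalid UTF-8 sequences are replaced on decoding, so the
        // original bytes cannot be recovered.
        System.out.println("UTF-8 string round trip intact: "
                + Arrays.equals(sample, viaUtf8String(sample)));  // false

        System.out.println("Base64 round trip intact: "
                + Arrays.equals(sample, viaBase64(sample)));      // true
    }
}
```

If the web service chain does pass the file as a string, sending it as a byte array or as Base64 text would be one thing to try.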

Thanks

Best regards



On 5 October 2011 08:42, Héctor Trujillo <hecto...@gmail.com> wrote:

>   It seems unreasonable that if I want to index a local file, I have to
> reference this local file by a URL.
>
> This isn't a strange file; it is a file downloaded from the Lucid web portal,
> called: Starting a Search Application.pdf
>
> This may be an encoding or charset problem. I can open
> this file with a PDF reader without problems, and I don't know why
> referencing this file with a URL would fix this problem. Can you help me?
>
> I'm working with SolrJ, from Java. Has anyone had the same problem with
> SolrJ?
>
>
>
> Thanks to Paul Libbrecht for your suggestion.
>
>
>
> Best regards
>
>
>
>
>
>
> 2011/10/4 Paul Libbrecht <p...@hoplahup.net>
>
>> full of boxes for me.
>> Héctor, you need another way to reference these!
>> (e.g. a URL)
>>
>> paul
>>
>>
>> On 4 Oct 2011 at 16:49, Héctor Trujillo wrote:
>>
>> > Hi all, I'm indexing PDF files with SolrJ, and most of them work. But
>> > with some files I have problems because strange characters get stored.
>> > This is the content that was stored:
>> > +++++++
>> >
>> > Starting a Search Application
>> >
>> 
>> > Abstract
>> >
>> Starting
>> > a Search Application A Lucid Imagination White Paper ¥ April 2009 Page
>> i
>> >
>> 
>> > Starting a Search Application A Lucid Imagination White Paper ¥ April
>> 2009
>> > Page ii Do You Need Full-text Search?
>> >
>> ∞
>> >
>> ∞
>> > ∞
>> >
>> Starting
>> > a Search Application A Lucid Imagination White Paper ¥ April 2009 Page
>> 1
>> >
>> Identifying
>> > Ideal Results
>> >
>> Starting
>> > a Search Application A Lucid Imagination White Paper ¥ April 2009 Page
>> 2
>> >
>> 
>> Starting
>> > a Search Application A Lucid Imagination White Paper
>> >
>> >
>> > +++++++
>> >
>> > But if I open the PDF file, I can see the content correctly.
>> >
>> > I think this is a question of charset encoding, but I don't know
>> > whether I can avoid this behaviour by applying a different analyzer or
>> > tokenizer at indexing time.
>> >
>> > I've got this problem with some documents downloaded from Lucid's website.
>> >
>> >
>> >
>> > I don't know if anyone has had the same problem and knows how to solve
>> > this.
>> >
>> > Thanks
>> >
>> > Best regards
>>
>>
>
