Hey Erick,

thanks for your answer. They are not indexed correctly. Also, through the Solr
admin interface I see the typical question marks inside a diamond where a
blank space should be.
I have now figured out the following (not sure if it is relevant at all):
- PDF documents created with "Acrobat PDFMaker 10.0 for Word" are indexed 
correctly, no issues
- PDF documents (with editable form fields) created with "Adobe InDesign CS5 
(7.0.1)" are indexed with the blank space issue

Best
Steve

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, April 22, 2015 17:11
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms

Are they not _indexed_ correctly or not being displayed correctly?
Take a look at admin UI >> schema browser >> your field and press the "load
terms" button. That'll show you what is _in_ the index as opposed to what the
raw data looked like.
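
The same check can also be scripted rather than clicked through. A minimal
sketch, assuming Solr's Terms component is reachable through a /terms request
handler in solrconfig.xml (it may not be enabled in every setup); the host,
the core name "collection1", and the field name "content" are placeholders,
not values from this thread:

```python
from urllib.parse import urlencode


def build_terms_url(base_url, core, field, limit=50):
    """Build a Terms component URL listing indexed terms for one field.

    Assumes a /terms request handler exists for the core; base_url, core
    and field are hypothetical placeholders chosen for illustration.
    """
    params = {
        "terms": "true",       # switch the Terms component on
        "terms.fl": field,     # field whose indexed terms are listed
        "terms.limit": limit,  # maximum number of terms to return
        "wt": "json",          # ask for a JSON response
    }
    return f"{base_url.rstrip('/')}/{core}/terms?{urlencode(params)}"


url = build_terms_url("http://localhost:8983/solr", "collection1", "content")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client,
should list the terms as they exist in the index, which is roughly what the
"load terms" button displays.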

When you return the field in a Solr search, you get a verbatim, un-analyzed 
copy of your original input. My guess is that your browser isn't using a 
compatible character encoding for display.
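
One way to tell a display problem from a data problem is to look at the actual
code points in the returned field value. A minimal sketch, assuming the stored
text has already been fetched as a Python string; the function name and the
sample string are made up for illustration. U+FFFD is the Unicode replacement
character, which browsers typically draw as a question mark inside a diamond:

```python
import unicodedata


def suspicious_chars(text):
    """Return (index, code point, Unicode name) for every character that is
    U+FFFD or a whitespace character other than space, tab, CR or LF."""
    found = []
    for i, ch in enumerate(text):
        odd_space = ch.isspace() and ch not in " \t\r\n"
        if ch == "\ufffd" or odd_space:
            name = unicodedata.name(ch, "UNKNOWN")
            found.append((i, f"U+{ord(ch):04X}", name))
    return found


# Hypothetical sample: one replacement character and one no-break space
# standing where ordinary blanks were expected.
sample = "korrekt\ufffdund\u00a0vollst\u00e4ndig"
for hit in suspicious_chars(sample):
    print(hit)
```

If U+FFFD shows up in the stored value itself, that would suggest the original
character was already lost during extraction or indexing, and the browser is
not to blame. An unusual space such as U+00A0 would instead point at how the
PDF encodes its blanks.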

Best,
Erick

On Wed, Apr 22, 2015 at 7:08 AM,  <steve.sch...@t-systems.com> wrote:
> Thanks for your answer. Maybe my English is not good enough, what are you 
> trying to say? Sorry I didn't get the point.
> :-(
>
>
> -----Original Message-----
> From: LAFK [mailto:tomasz.bo...@gmail.com]
> Sent: Wednesday, April 22, 2015 14:01
> To: solr-user@lucene.apache.org; solr-user@lucene.apache.org
> Subject: Odp.: solr issue with pdf forms
>
> Off the top of my head, I'd look into how writable PDFs are created and
> encoded.
>
> @LAFK_PL
>   Original message
> From: steve.sch...@t-systems.com
> Sent: Wednesday, April 22, 2015 12:41
> To: solr-user@lucene.apache.org
> Reply-To: solr-user@lucene.apache.org
> Subject: solr issue with pdf forms
>
> Hi guys,
>
> hopefully you can help me with my issue. We are using a Solr setup and have 
> the following issue:
> - ordinary PDF files are indexed just fine
> - PDF files with writable form fields look like this:
> Ich bestätige mit meiner Unterschrift, dass alle Angaben korrekt und
> vollständig sind
>
> Somehow the blank space character is not indexed correctly.
>
> Is this a known issue? Does anybody have an idea?
>
> Thanks a lot
> Best
> Steve
