Hey Erick,

thanks for your answer. They are not indexed correctly. Also, through the Solr admin interface I see the typical question marks inside a diamond where a blank space should be.

I have now figured out the following (not sure whether it is relevant at all):
- PDF documents created with "Acrobat PDFMaker 10.0 for Word" are indexed correctly, no issues
- PDF documents (with editable form fields) created with "Adobe InDesign CS5 (7.0.1)" are indexed with the blank space issue

Best
Steve

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, 22 April 2015 17:11
To: solr-user@lucene.apache.org
Subject: Re: Odp.: solr issue with pdf forms

Are they not _indexed_ correctly or not being displayed correctly? Take a look at admin UI >> schema browser >> your field and press the "load terms" button. That'll show you what is _in_ the index as opposed to what the raw data looked like.

When you return the field in a Solr search, you get a verbatim, un-analyzed copy of your original input. My guess is that your browser isn't using a compatible character encoding for display.

Best,
Erick

On Wed, Apr 22, 2015 at 7:08 AM, <steve.sch...@t-systems.com> wrote:
> Thanks for your answer. Maybe my English is not good enough, what are you
> trying to say? Sorry I didn't get the point.
> :-(
>
> -----Original Message-----
> From: LAFK [mailto:tomasz.bo...@gmail.com]
> Sent: Wednesday, 22 April 2015 14:01
> To: solr-user@lucene.apache.org; solr-user@lucene.apache.org
> Subject: Odp.: solr issue with pdf forms
>
> Off the top of my head, I'd look into how writable PDFs are created and encoded.
>
> @LAFK_PL
> Original message
> From: steve.sch...@t-systems.com
> Sent: Wednesday, 22 April 2015 12:41
> To: solr-user@lucene.apache.org
> Reply-To: solr-user@lucene.apache.org
> Subject: solr issue with pdf forms
>
> Hi guys,
>
> hopefully you can help me with my issue. We are using a Solr setup and have
> the following issue:
> - usual pdf files are indexed just fine
> - pdf files with writable form-fields look like this:
> Ich bestätige mit meiner Unterschrift, dass alle Angaben korrekt und vollständig sind
>
> Somehow the blank space character is not indexed correctly.
>
> Is this a known issue? Does anybody have an idea?
>
> Thanks a lot
> Best
> Steve
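
One way to follow up on the encoding suggestions in this thread is to look at which code points actually sit where the blanks should be. The following is a minimal Python sketch and not something posted in the thread. It assumes the extracted text is already available as a Python string (how it is fetched from Solr or from the PDF extractor is left out), and the helper names `find_suspect_chars` and `normalize_spaces` are hypothetical. That the offending character is a no-break space or the replacement character U+FFFD is only a guess; the thread does not confirm it.

```python
import unicodedata

# Assumption: characters that often show up where a plain space (U+0020)
# was expected. U+FFFD is what browsers draw as a question mark in a diamond.
SUSPECT = {"\u00a0", "\u2007", "\u202f", "\ufffd"}


def _is_suspect(ch):
    """True for non-ASCII space separators and the other SUSPECT characters."""
    if ch == " ":
        return False
    return ch in SUSPECT or unicodedata.category(ch) == "Zs"


def find_suspect_chars(text):
    """Return (index, code point, Unicode name) for every suspect character."""
    hits = []
    for i, ch in enumerate(text):
        if _is_suspect(ch):
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


def normalize_spaces(text):
    """Replace every suspect character with a plain space.

    Mapping U+FFFD to a space is an assumption: the original character
    is already lost once U+FFFD appears, so this is only a workaround.
    """
    return "".join(" " if _is_suspect(ch) else ch for ch in text)


if __name__ == "__main__":
    # Hypothetical sample modelled on the sentence quoted in the thread.
    sample = "korrekt\u00a0und\ufffdvollständig sind"
    for index, codepoint, name in find_suspect_chars(sample):
        print(index, codepoint, name)
    print(normalize_spaces(sample))
```

If the report lists something other than U+0020 at the positions of the blanks, the problem most likely lies in text extraction or encoding rather than in how the browser displays the field, and a normalization step before indexing may be worth trying.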