Are they not being _indexed_ correctly, or not being _displayed_ correctly? Take a look at Admin UI > Schema Browser > your field and press the "load terms" button. That'll show you what is actually _in_ the index, as opposed to what the raw data looked like.
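To tell the two cases apart, it can help to look at the actual code points in a string copied from either place (a term from the schema browser, or a stored value from a search response). The following is only a minimal sketch: `describe_non_ascii` is a hypothetical helper name, and the U+00A0 (no-break space) in the sample string is an illustrative guess, since the thread does not say which character the PDF form really contains.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (code point, Unicode name) for every non-ASCII character in text."""
    rows = []
    for ch in text:
        if ord(ch) > 127:
            # unicodedata.name raises ValueError for unnamed characters
            # unless a default is supplied as the second argument.
            rows.append((f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return rows


if __name__ == "__main__":
    # Hypothetical sample: a no-break space where a plain space was expected.
    sample = "Ich\u00a0best\u00e4tige"
    for code_point, name in describe_non_ascii(sample):
        print(code_point, name)
```

If the string already contains U+FFFD (REPLACEMENT CHARACTER), the original byte was lost somewhere before this point, which suggests a decoding problem upstream and not a display problem in the browser.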
When you return the field in a Solr search, you get a verbatim, un-analyzed copy of your original input. My guess is that your browser isn't using a compatible character encoding for display.

Best,
Erick

On Wed, Apr 22, 2015 at 7:08 AM, <steve.sch...@t-systems.com> wrote:
> Thanks for your answer. Maybe my English is not good enough; what are you
> trying to say? Sorry, I didn't get the point.
> :-(
>
>
> -----Original Message-----
> From: LAFK [mailto:tomasz.bo...@gmail.com]
> Sent: Wednesday, April 22, 2015 14:01
> To: solr-user@lucene.apache.org
> Subject: Re: solr issue with pdf forms
>
> Off the top of my head, I'd look into how writable PDFs are created and encoded.
>
> @LAFK_PL
>   Original Message
> From: steve.sch...@t-systems.com
> Sent: Wednesday, April 22, 2015 12:41
> To: solr-user@lucene.apache.org
> Reply-To: solr-user@lucene.apache.org
> Subject: solr issue with pdf forms
>
> Hi guys,
>
> Hopefully you can help me with my issue. We are using a Solr setup and have
> the following issue:
> - regular PDF files are indexed just fine
> - PDF files with writable form fields look like this:
> Ich�bestätige�mit�meiner�Unterschrift,�dass�alle�Angaben�korrekt�und�vollständig�sind
>
> Somehow the blank space character is not indexed correctly.
>
> Is this a known issue? Does anybody have an idea?
>
> Thanks a lot
> Best
> Steve