You can always replace the String type with a Text type that has a
KeywordTokenizer-based analyzer definition.

That keeps the whole input as one token, but still lets you modify it,
e.g. normalize spaces with PatternReplaceCharFilterFactory or even apply
one of the ICU filters (warning: ICU is dark magic...)
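
Something along these lines might work as a starting point. It is only
a sketch: the type name "string_normalized" is made up for
illustration, and the pattern covers just the non-breaking space
(code 160), not other Unicode space characters.

```xml
<!-- Hypothetical field type: behaves like "string" but normalizes NBSP -->
<fieldType name="string_normalized" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <!-- Replace each non-breaking space (U+00A0, code 160) with a regular space (code 32) -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\u00A0" replacement=" "/>
    <!-- Emit the whole (filtered) input as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Because there is a single analyzer element, the same normalization
should apply at both index and query time, so a query typed with a
regular space would match data that was crawled with a 160. The stored
value is not changed, only the indexed token, and existing documents
would need to be reindexed after switching the field to this type.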

Regards,
   Alex.
On Mon, 19 Nov 2018 at 18:53, Angel Todorov <attodo...@gmail.com> wrote:
>
> That’s correct - the original source of my data which I was crawling had
> 160 as space. This took a while to find. :)  Solr is working fine. Thank
> you !
>
>
> On Tue, 20 Nov 2018 at 1:28, Shawn Heisey <apa...@elyograg.org> wrote:
>
> > On 11/19/2018 3:31 PM, Angel Todorov wrote:
> > > the *real* issue is that SOLR expects a character with a code of 160 for
> > > space, while the standard space as typed from a keyboard has a code of 32.
> > > Both appear exactly the same. Here's where the issue comes from. If I
> > > generate the 160 space, and copy paste it, it works fine for string even
> > > like this "Some Text".
> >
> > If you have to send a character with code 160 to get a match, then that
> > is what was indexed in the original document.  A field using class
> > StrField (which is what the "string" type is almost always configured
> > as) does not change the input -- it would not change a 32 to a 160.
> > With that type, the query must match the indexed data EXACTLY -- if the
> > indexed data is code 160, then a regular space will not match it.
> >
> > You can't use the StrField class if you're expecting a match to a
> > non-breaking space when you use a standard space.
> >
> > Thanks,
> > Shawn
> >
> >
