You can also use any of the other tokenizers, WhitespaceTokenizer for instance. There are a couple that use regular expressions, etc. See: https://cwiki.apache.org/confluence/display/solr/Tokenizers
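As a sketch of what that looks like in schema.xml (the field type name here is made up for illustration), a WhitespaceTokenizer-based analyzer would keep 'i+d' as a single token:

```xml
<!-- Hypothetical field type: whitespace tokenization does not split on '+',
     so 'i+d' survives as one token. Trailing punctuation (e.g. "d.") will
     also stay attached to tokens, which may need a cleanup filter. -->
<fieldType name="text_ws_example" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```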
Each one has its considerations. WhitespaceTokenizer, for instance, won't separate out punctuation, so you might then have to use a filter to remove it. Regexes can be tricky to get right ;). Etc.

Best,
Erick

On Mon, May 22, 2017 at 5:26 AM, Muhammad Zahid Iqbal
<zahid.iq...@northbaysolutions.net> wrote:
> Hi,
>
> Before applying the tokenizer, you can replace your special symbols with
> some phrase to preserve them, and after tokenizing you can replace them
> back.
>
> For example:
> <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(\+)"
> replacement="xxx" />
>
> Thanks,
> Zahid Iqbal
>
> On Mon, May 22, 2017 at 12:57 AM, Fundera Developer <
> funderadevelo...@outlook.com> wrote:
>
>> Hi all,
>>
>> I am a bit stuck on a problem that I feel must be easy to solve. In
>> Spanish it is usual to find the term 'i+d'. We are working with Solr 5.5,
>> and StandardTokenizer splits 'i' and 'd'. Since our index holds documents
>> in both Spanish and Catalan, and in Catalan 'i' is a frequent word, when
>> a user searches for 'i+d' they get Catalan documents as results.
>>
>> I have tried to use the SynonymFilter, with something like:
>>
>> i+d => investigacionYdesarrollo
>>
>> But it does not seem to change anything.
>>
>> Is there a way I could set an exception to the Tokenizer so that it does
>> not split this word?
>>
>> Thanks in advance!
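Zahid's replace-and-restore idea can be sketched as a complete analyzer chain. This is an illustrative assumption, not tested config: the field type name and the 'iplusd' placeholder are invented, and the placeholder must be a string that cannot occur in real text.

```xml
<fieldType name="text_preserve_id" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Before tokenizing: rewrite 'i+d' to a placeholder that
         StandardTokenizer will not split. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\bi\+d\b" replacement="iplusd"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- After tokenizing: restore the original form in the emitted tokens. -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^iplusd$" replacement="i+d"/>
  </analyzer>
</fieldType>
```

The same chain needs to run at both index and query time (a single <analyzer> element applies to both), so that a query for 'i+d' goes through the identical placeholder round trip and matches the indexed token.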