There certainly is a lot to learn!
Right, the only problem I have with your analysis chain is that
the WhitespaceTokenizer doesn't strip punctuation so you'll
have terms like "texto." (note the period).
Something like PatternReplaceFilterFactory would help here.
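
To illustrate the effect, here is a minimal sketch in plain Python (not
Solr code). The function names are made up for this example; it only
mimics splitting on whitespace and then applying a regex replacement to
each token, which is roughly the job a pattern-replace filter would do
in the analysis chain. The exact pattern you would want in Solr is an
assumption and depends on your data.

```python
import re

# Hypothetical helper: mimics a whitespace-only tokenizer.
def whitespace_tokenize(text):
    # Split on runs of whitespace; punctuation stays attached to tokens.
    return text.split()

# Hypothetical helper: mimics a regex replace applied to each token.
def strip_non_alphanumeric(tokens):
    # Remove every character that is not a letter or digit (Unicode-aware),
    # then drop tokens that end up empty.
    cleaned = (re.sub(r"[\W_]+", "", tok) for tok in tokens)
    return [tok for tok in cleaned if tok]

raw = whitespace_tokenize("Um texto. Outro texto!")
print(raw)                          # ['Um', 'texto.', 'Outro', 'texto!']
print(strip_non_alphanumeric(raw))  # ['Um', 'texto', 'Outro', 'texto']
```

Without the second step, "texto." and "texto" are different terms, so a
search for one would not match the other.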
Best,
Erick
On Tue, Jun 28, 2016
Hi Erick,
Thanks for your comments! In fact, I started with Solr one month ago, so I
am still learning! =)
I understand the differences between the Solr tokenizers, but there are so
many options that it takes some time to find the one that fits our needs.
I found a solution to my problem with the con
OK, you really have to get familiar with the
admin/analysis page. WhitespaceTokenizer
is really simple: it breaks only on whitespace, so
punctuation is kept in the index, which is very
rarely what you want. Use something like
StandardTokenizer or maybe a filter that
removes all non-alpha-num charact