Did you configure PatternReplaceFilterFactory?
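
In case it helps, here is a rough sketch of what I mean, not a tested config. Note one swap: PatternReplaceFilterFactory is a token filter, so it runs after the tokenizer and would rewrite 'hello_world' without splitting it. The sketch below instead uses the char filter variant, PatternReplaceCharFilterFactory, which runs before StandardTokenizerFactory and turns underscores into spaces so the tokenizer sees two words. The field type name "text_underscore_split" is made up for illustration, and I am assuming you want the same analysis at index and query time.

```xml
<!-- Hypothetical field type; the name is an assumption, pick your own. -->
<fieldType name="text_underscore_split" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Char filters run before the tokenizer: replace each "_" with a space. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="_" replacement=" "/>
    <!-- The tokenizer now sees "hello world" instead of "hello_world". -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

You would need to reindex after changing the analysis chain, since the terms already in the index still contain the underscores.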

At 2021-01-08 12:16:06, "Rahul Goswami" <rahul196...@gmail.com> wrote:
>Hello,
>So recently I was debugging a problem on Solr 7.7.2 where the query wasn't
>returning the desired results. It turned out that the indexed terms were
>underscore-separated, but the query terms weren't. I was under the
>impression that terms separated by underscores are also tokenized by
>StandardTokenizerFactory, but it turns out that's not the case. E.g.,
>'hello-world' would be tokenized into 'hello' and 'world', but
>'hello_world' is treated as a single token.
>Is this a bug or designed behavior?
>
>If this is by design, it would be helpful if this behavior were included in
>the documentation, since it is similar to the behavior with periods.
>
>https://lucene.apache.org/solr/guide/6_6/tokenizers.html#Tokenizers-StandardTokenizer
>"Periods (dots) that are not followed by whitespace are kept as part of the
>token, including Internet domain names. "
>
>Thanks,
>Rahul
