Hello,

I was recently debugging a problem on Solr 7.7.2 where a query wasn't returning the expected results. It turned out that the indexed values contained underscore-separated terms, but the query didn't. I was under the impression that StandardTokenizerFactory also splits on underscores, but that's not the case. For example, 'hello-world' is tokenized into 'hello' and 'world', but 'hello_world' is kept as a single token. Is this a bug or intended behavior?
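To make the difference concrete, here is a small Python sketch that mimics what I'm seeing. It is only a rough approximation based on a regular expression, not Lucene's StandardTokenizer, and the function name approx_tokenize is made up for illustration. It relies on the fact that Python's \w class includes the underscore but not the hyphen.

```python
import re

# Rough stand-in for the observed behavior, NOT Lucene's StandardTokenizer.
# \w matches letters, digits, and '_', so an underscore stays inside a token
# while a hyphen acts as a separator.
TOKEN_RE = re.compile(r"\w+")


def approx_tokenize(text):
    """Return the runs of word characters found in text (hypothetical helper)."""
    return TOKEN_RE.findall(text)


print(approx_tokenize("hello-world"))  # ['hello', 'world']
print(approx_tokenize("hello_world"))  # ['hello_world']
```

With tokens like these, a query for 'hello' or 'world' would have nothing to match against an indexed 'hello_world', which is the mismatch I ran into.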
If this is by design, it would be helpful to mention it in the documentation, since it is similar to the documented behavior for periods:

https://lucene.apache.org/solr/guide/6_6/tokenizers.html#Tokenizers-StandardTokenizer

"Periods (dots) that are not followed by whitespace are kept as part of the token, including Internet domain names."

Thanks,
Rahul