On 1/5/2017 3:08 AM, Sebastian Riemer wrote:
> I now face the problem, that searching for a book with
> text:978-3-8052-5094-8* does not return the single result I expect.
> However searching for text:9783805250948* instead returns a result.
> Note, that I am adding a wildcard at the end automatically, to further
> broaden the resultset. Note also, that it does not seem to matter
> whether I put backslashes in front of the hyphen or not (to be exact,
> when sending via SolrJ from my application, I put in the backslashes,
> but I don't see a difference when using SolrAdmin as I guess SolrAdmin
> automatically inserts backslashes if needed?)

As soon as you use a wildcard, the query is no longer run through the analysis chain, which means that it keeps all those hyphens. That will never match anything in the index, because the StandardTokenizer has removed all the hyphens in the tokens that it puts into the index.

The fact that wildcards skip analysis is a source of major confusion. I assume that the analysis skip is required for correct operation, although I have never delved that deeply into the internals.

A hyphen is only a special character if it's the first character in a word. It's generally a good idea to escape the special characters anyway, but in this case it doesn't matter, which is why you can send it unescaped.

If you want to use wildcards, you're going to have to use them on an untokenized (normally "string") field, or the results will probably not be what you expect.

Thanks,
Shawn
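
Since wildcard terms skip analysis, one possible client-side workaround is to strip the hyphens from the input before the wildcard is appended, so the term has the same hyphen-free form that the original message reports as matching. The following is only a minimal sketch under that assumption. The class `IsbnWildcardQuery` and its methods are hypothetical helpers, not part of SolrJ, and the code assumes ISBN-like input (digits, an optional X, hyphens, spaces), so it does no further query escaping.

```java
// Hypothetical helper, not part of SolrJ. It assumes the index holds the
// hyphen-free form of the ISBN, as the original message suggests.
final class IsbnWildcardQuery {

    // Remove hyphens and whitespace from the user's input. Wildcard terms
    // are not analyzed, so this has to happen before the query is sent.
    static String normalize(String input) {
        return input.replaceAll("[\\s-]", "");
    }

    // Build a prefix query of the form field:term* from the raw input.
    static String build(String field, String input) {
        return field + ":" + normalize(input) + "*";
    }

    public static void main(String[] args) {
        // Prints the query string that would be handed to the Solr client.
        System.out.println(build("text", "978-3-8052-5094-8"));
    }
}
```

This only papers over the mismatch on the query side. Running wildcards against an untokenized field, as suggested above, remains the more predictable option.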