Steve: You _really_ want to get acquainted with the admin UI/Analysis page ;). Choose a core/collection and you should see the Analysis choice. It shows you exactly what transformations your data goes through. If you hover over the light gray pairs of letters, you’ll get a tooltip showing you what part of your analysis chain is responsible for a particular change. I un-check the “verbose” box 95% of the time BTW.
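If you'd rather script this than click through the UI, the same kind of information should be available from Solr's field analysis request handler (/analysis/field). Here is a minimal sketch that only builds the request URL. The host, the core name "mycore" and the field type "text_en" are assumptions for illustration, so substitute whatever your schema actually uses.

```python
from urllib.parse import urlencode


def field_analysis_url(base_url, core, field_type, index_text, query_text):
    """Build a URL for Solr's field analysis handler (/analysis/field)."""
    params = {
        "analysis.fieldtype": field_type,   # hypothetical field type name
        "analysis.fieldvalue": index_text,  # text run through the index-time chain
        "analysis.query": query_text,       # text run through the query-time chain
        "wt": "json",
    }
    return f"{base_url}/{core}/analysis/field?{urlencode(params)}"


# Assumed local Solr instance and core name; adjust to your setup.
url = field_analysis_url(
    "http://localhost:8983/solr", "mycore", "text_en", "kinase", "kinase"
)
print(url)
```

Fetching that URL (with curl or a browser) should return the output of each stage of the analysis chain, which is where you would look for the final indexed token.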
The critical bit is that what comes out of the end of the analysis pipe are the tokens that are actually _in_ the index. From there, problems like this make more sense. My bet is that, as Walter says, you have a stemmer in the analysis chain and the actual token in the index is “kinas”, so of course “kinase*” won’t be found. By adding OR kinase to the query, that token is stemmed to “kinas” and matches.

Also, adding &debug=query to your URL will show you what the query looks like after parsing and analysis, which is another major tool for figuring out what’s really happening.

Wildcards are not stemmed, which can lead to surprising results. There’s no perfect answer here. Let’s claim wildcards _were_ stemmed. Then you’d have to try to explain why “running*” returned a doc with only “run” or “runner” or “runs” or... in it, but searching for “runnin*” did not, because the stemmer doesn’t recognize it as a stemmable word.

Finally, one of my personal hot buttons is wildcards in general. They’re very often over-used because people are used to simple search capabilities. Something about “if your only tool is a hammer, every problem looks like a nail”. That gets into training users too though...

Best,
Erick

> On Feb 11, 2020, at 9:24 PM, Fischer, Stephen <sfisc...@pennmedicine.upenn.edu> wrote:
>
> Hi,
>
> I am a solr newbie. I was surprised to discover that a search for kinase*
> returned fewer results than kinase.
>
> Then I read the wildcard
> documentation<https://lucene.apache.org/solr/guide/6_6/the-standard-query-parser.html#TheStandardQueryParser-WildcardSearches>,
> and saw why. kinase* will not match the word "kinase".
>
> Our end-users won't expect this behavior. Presumably the solution would be
> for them (actually us, on their behalf), to use kinase* OR kinase.
>
> But that is kind of a hack.
>
> Is there a way we can configure solr to have wildcards match on end-of-word?
>
> Thanks,
> Steve