My guess is that your field analysis isn't stripping the various non-alphanumeric characters, so "the]" is actually a token in your index, square bracket and all. If that's true, it certainly doesn't match the stopword "the".
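As a rough illustration of that guess, here is a minimal sketch in plain Python. It is not Solr code and does not reproduce any real analysis chain; the function names and the stopword set are made up for the example. It only shows why a token that keeps its trailing punctuation slips past a stopword filter.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an"}


def analyze_whitespace_only(text):
    """Split on whitespace, lowercase, then drop stopwords.

    Punctuation stays attached to the token, so "the]" is not
    recognized as the stopword "the".
    """
    tokens = [t.lower() for t in text.split()]
    return [t for t in tokens if t not in STOPWORDS]


def analyze_strip_punct(text):
    """Split on runs of non-alphanumeric characters, lowercase,
    then drop stopwords. Here "the]" becomes "the" and is removed.
    """
    tokens = [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]
    return [t for t in tokens if t not in STOPWORDS]


print(analyze_whitespace_only("see the] list"))  # ['see', 'the]', 'list']
print(analyze_strip_punct("see the] list"))      # ['see', 'list']
```

Whether your field behaves like the first or the second function depends entirely on the tokenizer and filters configured for it.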
You can check by using the TermsComponent, pointing it at your field and setting terms.prefix=the

See: https://cwiki.apache.org/confluence/display/solr/The+Terms+Component

Best,
Erick

On Tue, Jul 5, 2016 at 2:34 PM, Steven White <swhite4...@gmail.com> wrote:
> HI Everyone,
>
> I'm trying to understand why I get a hit when I search for "the}" but not
> when I search for "the" (searches are done without the quotes and "the" is
> a stopword in my case).
>
> Here is the debugQuery output using "the}":
> "debug": {
> "rawquerystring": "the}",
> "querystring": "the}",
> "parsedquery": "(+DisjunctionMaxQuery(((ALL_FIELDS:the}
> ALL_FIELDS:the))~1.0))/no_coord",
> "parsedquery_toString": "+((ALL_FIELDS:the} ALL_FIELDS:the))~1.0",
> "explain": {
> "-1.5.1804": "\n0.14220011 = sum of:\n 0.14220011 =
> weight(ALL_FIELDS:the in 0) [DefaultSimilarity], result of:\n 0.14220011
> = score(doc=0,freq=2.0), product of:\n 0.51863563 = queryWeight,
> product of:\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.20899205 = queryNorm\n 0.27418116 = fieldWeight in 0, product of:\n
> 1.4142135 = tf(freq=2.0), with freq of:\n 2.0 =
> termFreq=2.0\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.078125 = fieldNorm(doc=0)\n",
> "-1.5.3552": "\n0.14220011 = sum of:\n 0.14220011 =
> weight(ALL_FIELDS:the in 0) [DefaultSimilarity], result of:\n 0.14220011
> = score(doc=0,freq=2.0), product of:\n 0.51863563 = queryWeight,
> product of:\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.20899205 = queryNorm\n 0.27418116 = fieldWeight in 0, product of:\n
> 1.4142135 = tf(freq=2.0), with freq of:\n 2.0 =
> termFreq=2.0\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.078125 = fieldNorm(doc=0)\n",
> "-1.5.3554": "\n0.14220011 = sum of:\n 0.14220011 =
> weight(ALL_FIELDS:the in 1) [DefaultSimilarity], result of:\n 0.14220011
> = score(doc=1,freq=2.0), product of:\n 0.51863563 = queryWeight,
> product of:\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.20899205 = queryNorm\n 0.27418116 = fieldWeight in 1, product of:\n
> 1.4142135 = tf(freq=2.0), with freq of:\n 2.0 =
> termFreq=2.0\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.078125 = fieldNorm(doc=1)\n",
> "-1.5.1802": "\n0.1137601 = sum of:\n 0.1137601 =
> weight(ALL_FIELDS:the in 0) [DefaultSimilarity], result of:\n 0.1137601
> = score(doc=0,freq=2.0), product of:\n 0.51863563 = queryWeight,
> product of:\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.20899205 = queryNorm\n 0.21934493 = fieldWeight in 0, product of:\n
> 1.4142135 = tf(freq=2.0), with freq of:\n 2.0 =
> termFreq=2.0\n 2.4816046 = idf(docFreq=4, maxDocs=22)\n
> 0.0625 = fieldNorm(doc=0)\n"
> },
> "QParser": "ExtendedDismaxQParser",
> "altquerystring": null,
> "boost_queries": null,
> "parsed_boost_queries": [],
> "boostfuncs": null,
> "filter_queries": [
> "ISBN_GROUP_ID:2"
> ],
> "parsed_filter_queries": [
> "ISBN_GROUP_ID:2"
> ],
>
> Here is the debugQuery output using "the"
> "debug": {
> "rawquerystring": "the",
> "querystring": "the",
> "parsedquery": "(+())/no_coord",
> "parsedquery_toString": "+()",
> "explain": {},
> "QParser": "ExtendedDismaxQParser",
> "altquerystring": null,
> "boost_queries": null,
> "parsed_boost_queries": [],
> "boostfuncs": null,
> "filter_queries": [
> "ISBN_GROUP_ID:2"
> ],
> "parsed_filter_queries": [
> "ISBN_GROUP_ID:2"
> ],
>
> As expected, I get no hits when I search for just "}":
> "debug": {
> "rawquerystring": "}",
> "querystring": "}",
> "parsedquery": "(+DisjunctionMaxQuery((ALL_FIELDS:})~1.0))/no_coord",
> "parsedquery_toString": "+(ALL_FIELDS:})~1.0",
> "explain": {},
> "QParser": "ExtendedDismaxQParser",
> "altquerystring": null,
> "boost_queries": null,
> "parsed_boost_queries": [],
> "boostfuncs": null,
> "filter_queries": [
> "ISBN_GROUP_ID:2"
> ],
> "parsed_filter_queries": [
> "ISBN_GROUP_ID:2"
> ],
>
> In case it matters, I'm also getting a hit when I search for "the." or
> "the]" or "the/" or "the," or "the=" etc.
>
> Thanks in advanced.
>
> Steve