On 10/14/12 12:19 PM, Jack Krupansky wrote:
> There's a miscommunication here somewhere. Is Solr 4.0 still passing
> "*:*" to the analyzer? Show us the parsed query for "*:*", as well as
> the debugQuery "explain" for the score.
I'm not quite sure what you mean by the parsed query for "*:*".
The fake analyzer, which uses NGramTokenizer, deliberately splits "*:*"
into three tokens, "*", ":", and "*", to simulate our own Tokenizer's
behavior.
An excerpt of the XML results from the query is pasted at the bottom of
this message.
> I mean, "*:*" (MatchAllDocsQuery) has a "constant score", so there
> isn't any way for it to be "suboptimal".
That's exactly the point I'd like to raise.
No matter what analyzers are assigned to fields, the hit score for "*:*"
must remain 1.0, but that is not what happens when an analyzer that
splits "*:*" into several tokens is in use.
Here's an excerpt of the query response. Notice this element, which
should not be there, in my opinion:
DisjunctionMaxQuery((name:"* : *"^0.5))
There is a space between * and :, and another space between : and *.
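To illustrate where that element comes from, here is a minimal sketch in Python. It is not Solr or Lucene code: unigram_tokens and phrase_clause are hypothetical helpers that only mimic a tokenizer emitting one token per character, and the way the phrase is assembled from the pf parameter is my assumption about what edismax does with the analyzed tokens.

```python
def unigram_tokens(text):
    """Hypothetical stand-in for a tokenizer that emits one token per character."""
    return [ch for ch in text if not ch.isspace()]


def phrase_clause(field, text, boost):
    """Render the analyzed tokens the way a phrase query prints: space-separated, quoted."""
    tokens = unigram_tokens(text)
    return '{}:"{}"^{}'.format(field, " ".join(tokens), boost)


# q=*:* with pf=name^0.5, as in the request parameters of the response.
print(phrase_clause("name", "*:*", 0.5))
```

The printed clause has the same shape as the one in the parsed query, including the two spaces, which suggests that the raw "*:*" string reached the analyzer instead of being treated as a match-all marker only.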
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">33</int>
<lst name="params">
<str name="indent">on</str>
<str name="wt"/>
<str name="version">2.2</str>
<str name="rows">10</str>
<str name="defType">edismax</str>
<str name="pf">name^0.5</str>
<str name="fl">*,score</str>
<str name="debugQuery">on</str>
<str name="start">0</str>
<str name="q">*:*</str>
<str name="qt"/>
<str name="fq"/>
</lst>
</lst>
<result name="response" numFound="32" start="0" maxScore="0.14764866">
<doc>
<str name="id">GB18030TEST</str>
<str name="name">Test with some GB18030 encoded characters</str>
<arr name="features">
<str>No accents here</str>
<str>这是一个功能</str>
<str>This is a feature (translated)</str>
<str>这份文件是很有光泽</str>
<str>This document is very shiny (translated)</str>
</arr>
<float name="price">0.0</float>
<str name="price_c">0,USD</str>
<bool name="inStock">true</bool>
<long name="_version_">1415830106215022592</long>
<float name="score">0.14764866</float>
</doc>
...
</result>
<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">
(+MatchAllDocsQuery(*:*) DisjunctionMaxQuery((name:"* : *"^0.5)))/no_coord
</str>
<str name="parsedquery_toString">+*:* (name:"* : *"^0.5)</str>
<lst name="explain">
<str name="GB18030TEST">
0.14764866 = (MATCH) sum of: 0.14764866 = (MATCH) MatchAllDocsQuery,
product of: 0.14764866 = queryNorm
</str>
</lst>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boostfuncs"/>
...
</lst>
</lst>
</lst>
</response>
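
As a rough check on why the score is 0.14764866 and not 1.0, the queryNorm in the explain output can be reproduced by hand. The sketch below rests on several assumptions that the response does not confirm: that the default TF-IDF similarity is in use with idf = 1 + ln(numDocs / (docFreq + 1)) and queryNorm = 1 / sqrt(sumOfSquaredWeights), that the index holds exactly the 32 documents reported by numFound with no deletions, that neither "*" nor ":" occurs in the name field (docFreq 0), and that a phrase's idf is the sum of its terms' idfs. The function names are illustrative only.

```python
import math


def idf(doc_freq, num_docs):
    """Assumed default-similarity idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


def query_norm(sum_of_squared_weights):
    """Assumed default-similarity queryNorm: 1 / sqrt(sumOfSquaredWeights)."""
    return 1.0 / math.sqrt(sum_of_squared_weights)


NUM_DOCS = 32        # numFound for *:*, assuming no deleted documents
PHRASE_BOOST = 0.5   # from pf=name^0.5

# Assumption: the three phrase terms are absent from the index (docFreq 0).
phrase_idf = sum(idf(0, NUM_DOCS) for _ in ("*", ":", "*"))

match_all_weight_sq = 1.0                            # MatchAllDocsQuery, boost 1.0
phrase_weight_sq = (phrase_idf * PHRASE_BOOST) ** 2  # (idf * boost)^2

norm = query_norm(match_all_weight_sq + phrase_weight_sq)
print(norm)
```

Under those assumptions the result agrees with the reported queryNorm to about four decimal places. If that reading is right, the phrase clause never matches any document, yet its weight still enters the normalization and pulls the MatchAllDocsQuery score below 1.0.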