Regarding the edismax relevancy changes I describe below, some issues have 
already been filed:

On the difference in how the phrase queries are merged into the main query, 
see Michael Dodsworth's comment of 25/Sep/12 on this issue:
  https://issues.apache.org/jira/browse/SOLR-2058
(The ticket is closed, but this problem is not addressed.)

On terms being skipped in phrase boosting when part of the query is a 
phrase:
  https://issues.apache.org/jira/browse/SOLR-4130
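
For anyone who wants to re-check either ticket against their own build, here is a minimal sketch that builds the two debugQuery requests from my earlier message (edismax vs. dismax). The base URL and the helper name debug_url are assumptions for illustration only; adjust them to your own core and handler. It only constructs the URLs and does not contact Solr.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance; replace with your own select URL.
SOLR_SELECT = "http://localhost:8983/solr/select"


def debug_url(parser, user_query, field="bi_fld", qs=5, ps=4):
    """Build a debugQuery URL using the same local params as in the thread.

    parser is "edismax" (the handler default, so no parser name is put in
    the local params) or "dismax".
    """
    prefix = "" if parser == "edismax" else parser + " "
    local_params = "{!%sqf=%s pf=%s}" % (prefix, field, field)
    params = {
        "q": local_params + user_query,
        "qs": qs,
        "ps": ps,
        "debugQuery": "true",
    }
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    for parser in ("edismax", "dismax"):
        print(parser, debug_url(parser, '"a b" c d'))
```

Fetching both URLs and comparing the parsedquery_toString values side by side should show whether the phrase boost covers the whole query string or only the unquoted terms.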

- Naomi


On Sep 3, 2013, at 5:54 PM, Naomi Dushay wrote:

> When a field uses CJKBigramFilter, CJK queries get a differently structured 
> parsedquery than non-CJK queries.
> 
>   (旧小说 is 3 chars, so 2 bigrams)
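
The bigram count can be illustrated with a rough sketch. This is only a simplification of what CJKBigramFilter does with outputUnigrams="false": it assumes the input is a single run of two or more CJK characters and ignores script boundaries and everything else in the analysis chain. The function name cjk_bigrams is made up for this example.

```python
def cjk_bigrams(text):
    """Return the overlapping two-character tokens of a CJK run.

    Simplified illustration only: assumes text is one unbroken run of
    at least two CJK characters.
    """
    return [text[i:i + 2] for i in range(len(text) - 1)]


if __name__ == "__main__":
    print(cjk_bigrams("旧小说"))
```

For 旧小说 this gives the two tokens 旧小 and 小说, which are the two bi_fld clauses in the parsed query below.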
> 
> args sent in:       q={!qf=bi_fld}旧小说&pf=&pf2=&pf3=
> 
>  debugQuery
>    <str name="rawquerystring">{!qf=bi_fld}旧小说</str>
>    <str name="querystring">{!qf=bi_fld}旧小说</str>
>    <str name="parsedquery">(+DisjunctionMaxQuery((((bi_fld:旧小 
> bi_fld:小说)~2))~0.01) ())/no_coord</str>
>    <str name="parsedquery_toString">+(((bi_fld:旧小 bi_fld:小说)~2))~0.01 ()</str>
> 
> 
> If I use a non-CJK query string, with the same field:
> 
> args sent in:      q={!qf=bi_fld}foo bar&pf=&pf2=&pf3=
> 
> debugQuery:
>    <str name="rawquerystring">{!qf=bi_fld}foo bar</str>
>    <str name="querystring">{!qf=bi_fld}foo bar</str>
>    <str name="parsedquery">(+((DisjunctionMaxQuery((bi_fld:foo)~0.01) 
> DisjunctionMaxQuery((bi_fld:bar)~0.01))~2))/no_coord</str>
>    <str name="parsedquery_toString">+(((bi_fld:foo)~0.01 
> (bi_fld:bar)~0.01)~2)</str>
> 
> 
> Why are the parsedquery_toString formulas different?  And is there any 
> difference in the actual relevancy formula?
> 
> How can you tell the difference between the MinNrShouldMatch and a qs or ps 
> or tie value, if they are all represented as ~n  in the parsedQuery string?
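
One heuristic for reading the ~n values, sketched below. It rests on my assumption about how the Lucene query classes print themselves: a ~n right after a closing quote is phrase slop (qs or ps), a ~n with a decimal point after a closing parenthesis is the DisjunctionMaxQuery tie, and an integer ~n after a closing parenthesis is the BooleanQuery minimum-should-match. The function name classify_tildes is made up, and this is a guess at the convention rather than a documented rule.

```python
import re

# A "~number" that directly follows a closing quote or closing parenthesis.
TILDE = re.compile(r'(?P<prev>["\)])~(?P<val>\d+(?:\.\d+)?)')


def classify_tildes(parsed):
    """Label each ~n in a parsedquery_toString as slop, tie, or mm.

    Heuristic only: based on the assumed toString() conventions
    described in the text above.
    """
    labels = []
    for match in TILDE.finditer(parsed):
        value = match.group("val")
        if match.group("prev") == '"':
            kind = "slop"   # phrase slop, from qs or ps
        elif "." in value:
            kind = "tie"    # dismax tie breaker (printed as a float)
        else:
            kind = "mm"     # minimum number of should-match clauses
        labels.append((kind, value))
    return labels


if __name__ == "__main__":
    sample = ('+(((bi_fld:"a b"~5)~0.01 (bi_fld:c)~0.01 '
              '(bi_fld:d)~0.01)~3) (bi_fld:"c d"~4)~0.01')
    for kind, value in classify_tildes(sample):
        print(kind, value)
```

Applied to the edismax output further down, this reads ~5 and ~4 as slop, ~3 as mm, and each ~0.01 as the tie.
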
> 
> 
> To try to get a handle on qs, ps, tie and mm:
> 
>  args:  q={!qf=bi_fld pf=bi_fld}"a b" c d&qs=5&ps=4
> 
> debugQuery:
>   <str name="rawquerystring">{!qf=bi_fld pf=bi_fld}"a b" c d</str>
>   <str name="querystring">{!qf=bi_fld pf=bi_fld}"a b" c d</str>
>   <str name="parsedquery">(+((DisjunctionMaxQuery((bi_fld:"a b"~5)~0.01) 
> DisjunctionMaxQuery((bi_fld:c)~0.01) DisjunctionMaxQuery((bi_fld:d)~0.01))~3) 
> DisjunctionMaxQuery((bi_fld:"c d"~4)~0.01))/no_coord</str>
>   <str name="parsedquery_toString">+(((bi_fld:"a b"~5)~0.01 (bi_fld:c)~0.01 
> (bi_fld:d)~0.01)~3) (bi_fld:"c d"~4)~0.01</str>
> 
> 
> I get that qs, the query slop, applies to explicit phrases in the query, so 
> "a b"~5 makes sense.  I also get that ps applies to phrase boosting, so I 
> get (bi_fld:"c d"~4) … but where is (bi_fld:"a b c d"~4)?
> 
> 
> Using dismax (instead of edismax):
> 
> args:   q={!dismax  qf=bi_fld pf=bi_fld}"a b" c d&qs=5&ps=4
> 
> debugQuery:
>   <str name="rawquerystring">{!dismax qf=bi_fld pf=bi_fld}"a b" c d</str>
>   <str name="querystring">{!dismax qf=bi_fld pf=bi_fld}"a b" c d</str>
>   <str name="parsedquery">(+((DisjunctionMaxQuery((bi_fld:"a b"~5)~0.01) 
> DisjunctionMaxQuery((bi_fld:c)~0.01) DisjunctionMaxQuery((bi_fld:d)~0.01))~3) 
> DisjunctionMaxQuery((bi_fld:"a b c d"~4)~0.01))/no_coord</str>
>   <str name="parsedquery_toString">+(((bi_fld:"a b"~5)~0.01 (bi_fld:c)~0.01 
> (bi_fld:d)~0.01)~3) (bi_fld:"a b c d"~4)~0.01</str>
> 
> 
> So is this an edismax bug?
> 
> 
> 
> FYI,   I am running Solr 4.4. I have fields defined like so:
> <fieldtype name="text_cjk_bi" class="solr.TextField" 
> positionIncrementGap="10000" autoGeneratePhraseQueries="false">
>   <analyzer>
>     <tokenizer class="solr.ICUTokenizerFactory" />
>     <filter class="solr.CJKWidthFilterFactory"/>
>     <filter class="solr.ICUTransformFilterFactory" 
> id="Traditional-Simplified"/>
>     <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
>     <filter class="solr.ICUFoldingFilterFactory"/>
>     <filter class="solr.CJKBigramFilterFactory" han="true" hiragana="true" 
> katakana="true" hangul="true" outputUnigrams="false" />
>   </analyzer>
> </fieldtype>
> 
> The request handler uses edismax:
> 
> <requestHandler name="search" class="solr.SearchHandler" default="true">
> <lst name="defaults">
> <str name="defType">edismax</str>
> <str name="q.alt">*:*</str>
> <str name="mm">6&lt;-1 6&lt;90%</str>
> <int name="qs">1</int>
> <int name="ps">0</int>
> 
