Hello all,
What is the difference between the following two queries that causes
them to give different results? Is there a parsing issue with "OR NOT"
or is something else going on?
a) ("batman" AND "indiana jones") OR NOT ("cancer") /*only seems to
match the AND clause*/
parsedquery=BoostedQuery(boost(+(+((+((_text_ws:batman)^2.0 |
(_text_txt:batman)^0.5 | (_text_txt_en_split:batman)^0.1)
+((_text_ws:"indiana jones")^2.0 | (_text_txt:"indiana jones")^0.5 |
(_text_txt_en_split:"indiana jone")^0.1)) -(+((_text_ws:cancer)^2.0 |
(_text_txt:cancer)^0.5 | (_text_txt_en_split:cancer)^0.1))))
b) ("batman" AND "indiana jones") OR (NOT ("cancer")) /*gives the
results we expected*/
parsedquery=BoostedQuery(boost(+(+((+((_text_ws:batman)^2.0 |
(_text_txt:batman)^0.5 | (_text_txt_en_split:batman)^0.1)
+((_text_ws:"indiana jones")^2.0 | (_text_txt:"indiana jones")^0.5 |
(_text_txt_en_split:"indiana jone")^0.1)) (-(+((_text_ws:cancer)^2.0 |
(_text_txt:cancer)^0.5 | (_text_txt_en_split:cancer)^0.1)) +*:*)^1.0))
The first thing I notice is the '+*:*)^1.0' component in the 2nd query's
parsedquery, which is absent from the 1st query's parsedquery. In the
1st query the "cancer" group appears only as a prohibited (-) clause
next to the AND group, so it does not seem to add any of the articles
that lack "cancer" to the union of sets, and we do not get the results
we expected.
Is wrapping a "NOT" clause in its own parentheses a general requirement
when it follows another operator?
We are running SolrCloud 6.6 and using edismax with q.op=AND.
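To make the difference concrete, here is a toy set model of how I read
the two parsed queries. It is only a sketch and not Solr or Lucene code:
the document ids, the match sets, and the helper boolean_query are all
made up for illustration. It assumes the usual Lucene clause semantics,
where a prohibited (-) clause only removes documents that other clauses
selected and never selects documents itself.

```python
# Toy model only: documents are ids, each clause is the set of ids it matches.
# All ids and match sets below are hypothetical.
ALL_DOCS = {1, 2, 3, 4, 5}
BATMAN_AND_JONES = {1, 2}   # docs matching "batman" AND "indiana jones"
CANCER = {2, 3}             # docs matching "cancer"


def boolean_query(all_docs, must=(), should=(), must_not=()):
    """Combine clause match sets roughly the way a boolean query does."""
    if must:
        # Required (+) clauses: a document has to match every one of them.
        candidates = set(all_docs)
        for clause in must:
            candidates &= clause
    elif should:
        # Only optional clauses: a document has to match at least one.
        candidates = set().union(*should)
    else:
        # Only prohibited clauses: there is nothing to select from.
        candidates = set()
    # Prohibited (-) clauses only subtract from what was selected.
    for clause in must_not:
        candidates -= clause
    return candidates


# a) (AND group) -(cancer): the NOT acts as a filter on the AND group.
query_a = boolean_query(ALL_DOCS, should=[BATMAN_AND_JONES], must_not=[CANCER])

# b) (AND group) (-(cancer) +*:*): the NOT group is its own optional clause,
#    and *:* gives it every document to subtract from.
not_cancer = boolean_query(ALL_DOCS, must=[ALL_DOCS], must_not=[CANCER])
query_b = boolean_query(ALL_DOCS, should=[BATMAN_AND_JONES, not_cancer])

# A clause made of nothing but a NOT selects no documents at all.
pure_negative = boolean_query(ALL_DOCS, must_not=[CANCER])

print(sorted(query_a))        # [1]
print(sorted(query_b))        # [1, 2, 4, 5]
print(sorted(pure_negative))  # []
```

In this model query a) behaves like "(batman AND indiana jones) AND NOT
cancer", while query b) is the union we were after.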
Thanks!
-Michael/NewsRx
Full debug outputs:
{rawquerystring={!boost
b=recip(ms(NOW/DAY,issuedate_tdt),3.16e-11,1,1)}{!edismax}(("batman" AND
"indiana jones") OR NOT ("cancer")), querystring={!boost
b=recip(ms(NOW/DAY,issuedate_tdt),3.16e-11,1,1)}{!edismax}(("batman" AND
"indiana jones") OR NOT ("cancer")),
parsedquery=BoostedQuery(boost(+(+((+((_text_ws:batman)^2.0 |
(_text_txt:batman)^0.5 | (_text_txt_en_split:batman)^0.1)
+((_text_ws:"indiana jones")^2.0 | (_text_txt:"indiana jones")^0.5 |
(_text_txt_en_split:"indiana jone")^0.1)) -(+((_text_ws:cancer)^2.0 |
(_text_txt:cancer)^0.5 |
(_text_txt_en_split:cancer)^0.1)))),1.0/(3.16E-11*float(ms(const(1506916800000),date(issuedate_tdt)))+1.0))),
parsedquery_toString=boost(+(+((+((_text_ws:batman)^2.0 |
(_text_txt:batman)^0.5 | (_text_txt_en_split:batman)^0.1)
+((_text_ws:"indiana jones")^2.0 | (_text_txt:"indiana jones")^0.5 |
(_text_txt_en_split:"indiana jone")^0.1)) -(+((_text_ws:cancer)^2.0 |
(_text_txt:cancer)^0.5 |
(_text_txt_en_split:cancer)^0.1)))),1.0/(3.16E-11*float(ms(const(1506916800000),date(issuedate_tdt)))+1.0)),
QParser=ExtendedDismaxQParser, altquerystring=null, boost_queries=null,
parsed_boost_queries=[], boostfuncs=null,
boost_str=recip(ms(NOW/DAY,issuedate_tdt),3.16e-11,1,1),
boost_parsed=org.apache.lucene.queries.function.valuesource.ReciprocalFloatFunction:1.0/(3.16E-11*float(ms(const(1506916800000),date(issuedate_tdt)))+1.0),
filter_queries=[issuedate_tdt:[2000\-09\-18T04\:00\:00Z/DAY TO
2017\-10\-02T04\:00\:00Z/DAY+1DAY}, types_ss:(TrademarkApp OR
Stockmarket OR AllClinicalTrials OR PressRelease OR Patent OR SEC OR
Scholarly OR ClinicalTrial)],
parsed_filter_queries=[+issuedate_tdt:[969249600000 TO 1507003200000},
+(types_ss:TrademarkApp types_ss:Stockmarket types_ss:AllClinicalTrials
types_ss:PressRelease types_ss:Patent types_ss:SEC types_ss:Scholarly
types_ss:ClinicalTrial)]}
{rawquerystring={!boost
b=recip(ms(NOW/DAY,issuedate_tdt),3.16e-11,1,1)}{!edismax}(("batman" AND
"indiana jones") OR (NOT ("cancer"))), querystring={!boost
b=recip(ms(NOW/DAY,issuedate_tdt),3.16e-11,1,1)}{!edismax}(("batman" AND
"indiana jones") OR (NOT ("cancer"))),
parsedquery=BoostedQuery(boost(+(+((+((_text_ws:batman)^2.0 |
(_text_txt:batman)^0.5 | (_text_txt_en_split:batman)^0.1)
+((_text_ws:"indiana jones")^2.0 | (_text_txt:"indiana jones")^0.5 |
(_text_txt_en_split:"indiana jone")^0.1)) (-(+((_text_ws:cancer)^2.0 |
(_text_txt:cancer)^0.5 | (_text_txt_en_split:cancer)^0.1))
+*:*)^1.0)),1.0/(3.16E-11*float(ms(const(1506916800000),date(issuedate_tdt)))+1.0))),
parsedquery_toString=boost(+(+((+((_text_ws:batman)^2.0 |
(_text_txt:batman)^0.5 | (_text_txt_en_split:batman)^0.1)
+((_text_ws:"indiana jones")^2.0 | (_text_txt:"indiana jones")^0.5 |
(_text_txt_en_split:"indiana jone")^0.1)) (-(+((_text_ws:cancer)^2.0 |
(_text_txt:cancer)^0.5 | (_text_txt_en_split:cancer)^0.1))
+*:*)^1.0)),1.0/(3.16E-11*float(ms(const(1506916800000),date(issuedate_tdt)))+1.0)),
QParser=ExtendedDismaxQParser, altquerystring=null, boost_queries=null,
parsed_boost_queries=[], boostfuncs=null,
boost_str=recip(ms(NOW/DAY,issuedate_tdt),3.16e-11,1,1),
boost_parsed=org.apache.lucene.queries.function.valuesource.ReciprocalFloatFunction:1.0/(3.16E-11*float(ms(const(1506916800000),date(issuedate_tdt)))+1.0),
filter_queries=[issuedate_tdt:[2000\-09\-18T04\:00\:00Z/DAY TO
2017\-10\-02T04\:00\:00Z/DAY+1DAY}, types_ss:(TrademarkApp OR
Stockmarket OR AllClinicalTrials OR PressRelease OR Patent OR SEC OR
Scholarly OR ClinicalTrial)],
parsed_filter_queries=[+issuedate_tdt:[969249600000 TO 1507003200000},
+(types_ss:TrademarkApp types_ss:Stockmarket types_ss:AllClinicalTrials
types_ss:PressRelease types_ss:Patent types_ss:SEC types_ss:Scholarly
types_ss:ClinicalTrial)]}