Dirk,

There are 3 open JIRAs related to this behavior:

https://issues.apache.org/jira/browse/SOLR-3739
https://issues.apache.org/jira/browse/SOLR-3740 
https://issues.apache.org/jira/browse/SOLR-3741

We worked around it by adding explicit + signs whenever the query matched one 
of the problematic patterns.  A pain, I know.
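
For illustration only, a workaround of this kind could look roughly like the sketch below. It is not the code referred to above: the function names are hypothetical, and it assumes queries without explicit AND/OR/NOT keywords or escaped characters.

```python
def split_top_level(query):
    """Split on whitespace that is outside parentheses and double quotes."""
    clauses, current, depth, in_quotes = [], "", 0, False
    for ch in query:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")":
            depth -= 1
        if ch.isspace() and depth == 0 and not in_quotes:
            if current:
                clauses.append(current)
            current = ""
        else:
            current += ch
    if current:
        clauses.append(current)
    return clauses


def add_explicit_plus(query):
    """Mark every clause, including those inside groups, as required.

    Hypothetical helper: clauses that already start with + or - are kept.
    """
    out = []
    for clause in split_top_level(query):
        if clause[0] in "+-":
            prefix, clause = clause[0], clause[1:]
        else:
            prefix = "+"
        open_idx = clause.find("(")
        # Recurse into "(...)" and "field:(...)" groups, but not into phrases.
        if clause.endswith(")") and open_idx != -1 and '"' not in clause[:open_idx]:
            inner = add_explicit_plus(clause[open_idx + 1:-1])
            clause = clause[:open_idx + 1] + inner + ")"
        out.append(prefix + clause)
    return " ".join(out)
```

With this sketch, add_explicit_plus("(green red) author:Jessie") yields the fully explicit form used in test #7 below.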

-----Original Message-----
From: Dirk Buchhorn [mailto:dirk.buchh...@finkundpartner.de] 
Sent: Thursday, June 18, 2015 3:31 AM
To: solr-user@lucene.apache.org
Subject: Extended Dismax Query Parser with AND as default operator

Hello,

I have a question about the extended dismax query parser. If the default 
operator is changed to AND (q.op=AND), the search results seem to be incorrect. 
I will explain with some examples. For this test I use Solr v5.1 and the tika 
core from the example directory.
== Preparation ==
Add the following lines to the schema.xml file
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <uniqueKey>id</uniqueKey>
Change the field "text" to stored="true"
Remove the multiValued attribute from the title and text fields (we don't need 
multivalued fields in our test)

Add test data (use curl or Fiddler)
Url: http://localhost:8983/solr/tika/update/json?commit=true
Header: Content-type: application/json
[
  {"id":"1", "title":"green", "author":"Jon", "text":"blue"},
  {"id":"2", "title":"green", "author":"Jon Jessie", "text":"red"},
  {"id":"3", "title":"yellow", "author":"Jessie", "text":"blue"},
  {"id":"4", "title":"green", "author":"Jessie", "text":"blue"},
  {"id":"5", "title":"blue", "author":"Jon", "text":"yellow"},
  {"id":"6", "title":"red", "author":"Jon", "text":"green"} ]
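
As an alternative to curl or Fiddler, the same request could be sent from a short script. This is only a sketch: the URL, header and documents are the ones given above, while the function names are made up for the example.

```python
import json
import urllib.request

UPDATE_URL = "http://localhost:8983/solr/tika/update/json?commit=true"

DOCS = [
    {"id": "1", "title": "green", "author": "Jon", "text": "blue"},
    {"id": "2", "title": "green", "author": "Jon Jessie", "text": "red"},
    {"id": "3", "title": "yellow", "author": "Jessie", "text": "blue"},
    {"id": "4", "title": "green", "author": "Jessie", "text": "blue"},
    {"id": "5", "title": "blue", "author": "Jon", "text": "yellow"},
    {"id": "6", "title": "red", "author": "Jon", "text": "green"},
]


def build_update_request(docs, url=UPDATE_URL):
    """Build (but do not send) the JSON update request."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_docs(docs, url=UPDATE_URL):
    """Send the request; needs a running Solr instance at the given URL."""
    with urllib.request.urlopen(build_update_request(docs, url)) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Calling post_docs(DOCS) would then index the six documents and commit them.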

== Test ==
The following parameters are always set:
default operator is AND: q.op=AND
use the extended dismax query parser: defType=edismax
set the default query fields to title and text: qf=title text
sort: id asc
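
A request with these fixed parameters could be put together as in the following sketch. The /select handler path, wt=json and debugQuery=true (used here to get the parsedquery_toString output) are assumptions that are not spelled out above, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed handler path on the tika core used in this test.
SELECT_URL = "http://localhost:8983/solr/tika/select"

COMMON_PARAMS = {
    "q.op": "AND",            # default operator
    "defType": "edismax",     # extended dismax query parser
    "qf": "title text",       # default query fields
    "sort": "id asc",
    "debugQuery": "true",     # assumption: returns the parsed query
    "wt": "json",             # assumption: JSON response format
}


def build_query_url(q):
    """Return the full select URL for the query string q."""
    return SELECT_URL + "?" + urlencode({"q": q, **COMMON_PARAMS})
```

For example, build_query_url("(green red) author:Jessie") gives the URL for test #4 with the query string percent-encoded.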

=== #1 test ===
q=red green
response:
{ "numFound":2,"start":0,
  "docs":[
    {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
    {"id":"6","title":"red","author":"Jon","text":"green"}]
}
parsedquery_toString: "+(((text:green | title:green) (text:red | title:red))~2)"

This test works as expected.

=== #2 test ===
We use a group
q=(red green)
Same response as test one.
parsedquery_toString: "+(((text:green | title:green) (text:red | title:red))~2)"

This test works as expected.

=== #3 test ===
q=green red author:Jessie
response:
{ "numFound":1,"start":0,
  "docs":[{"id":"2","title":"green","author":"Jon Jessie","text":"red"}] }
parsedquery_toString: "+(((text:green | title:green) (text:red | title:red) 
author:jessie)~3)"

This test works as expected.

=== #4 test ===
q=(green red) author:Jessie
response:
{ "numFound":2,"start":0,
  "docs":[
    {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
    {"id":"4","title":"green","author":"Jessie","text":"blue"}]
}
parsedquery_toString: "+((((text:green | title:green) (text:red | title:red)) 
author:jessie)~2)"

The same result as in the third test was expected. Why is AND not used for the 
query group?

=== #5 test ===
q=(+green +red) author:Jessie
response:
{ "numFound":4,"start":0,
  "docs":[
    {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
    {"id":"3","title":"yellow","author":"Jessie","text":"blue"},
    {"id":"4","title":"green","author":"Jessie","text":"blue"},
    {"id":"6","title":"red","author":"Jon","text":"green"}]
}
parsedquery_toString: "+((+(text:green | title:green) +(text:red | title:red)) 
author:jessie)"

Now AND is used for the group, but the author clause is combined with OR. Why?

=== #6 test ===
q=(+green +red) +author:Jessie
response:
{ "numFound":3,"start":0,
  "docs":[
    {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
    {"id":"3","title":"yellow","author":"Jessie","text":"blue"},
    {"id":"4","title":"green","author":"Jessie","text":"blue"}]
}
parsedquery_toString: "+((+(text:green | title:green) +(text:red | title:red)) 
+author:jessie)"

Still not the expected result.

=== #7 test ===
q=+(+green +red) +author:Jessie
response:
{ "numFound":1,"start":0,
  "docs":[{"id":"2","title":"green","author":"Jon Jessie","text":"red"}] }
parsedquery_toString: "+(+(+(text:green | title:green) +(text:red | title:red)) 
+author:jessie)"

Now the result is OK. But if every operator must be given explicitly, then 
q.op=AND is useless.

=== #8 test ===
q=green author:(Jon Jessie)
Found four results; one was expected. The query must be changed to '+green 
+author:(+Jon +Jessie)' to get the expected result.

Is this a bug in the extended dismax parser, or what is the reason for not 
consistently applying q.op=AND to the query expression?

Kind regards

Dirk Buchhorn
