Re: Solr Clustering Issue

Joseph Obernberger Fri, 24 Jul 2015 06:23:44 -0700

Thank you Upayavira and Shawn. Yes - the query works correctly using the standard select. I have a workaround where I simply specify the fields I want to search in each part of the query and do not specify a df. Just an FYI in case someone else runs into this.
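
For anyone who wants to try the same workaround, here is a rough sketch of what it could look like. It assumes FULL_DOCUMENT is the field the bare terms were meant to hit (that is the df value from the original URL); the helper name fielded_query is made up for illustration and is not part of Solr or any client library.

```python
from urllib.parse import urlencode


def fielded_query(collections, terms,
                  collection_field="Collection",
                  text_field="FULL_DOCUMENT"):
    """Build a query string in which every clause names its field.

    Hypothetical helper: nothing is left for df/qf to resolve, because
    each term carries an explicit field prefix.
    """
    coll_clause = " OR ".join(collections)
    text_clause = " AND ".join(f"{text_field}:{term}" for term in terms)
    return f"{collection_field}:({coll_clause}) AND ({text_clause})"


q = fielded_query(["COLLECT1008", "COLLECT2587"], ["amazon", "soap"])

# Same request parameters as the original URL, minus df.
params = urlencode({
    "q": q,
    "wt": "json",
    "indent": "true",
    "clustering": "true",
    "rows": 1,
})
url = f"http://server1:9100/solr/MYCOL1/clustering?{params}"
print(url)
```

Letting urlencode do the escaping avoids hand-encoding the colons, spaces and parentheses in q.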


-Joe


On 7/23/2015 10:51 AM, Shawn Heisey wrote:

On 7/23/2015 7:51 AM, Joseph Obernberger wrote:

Hi Upayavira - the URL was:

http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true


Here is the relevant part of the response - notice that the default
field (FULL_DOCUMENT) is not in the response, and that it appears to
ignore parts of the query string.

<snip>

     "parsedquery_toString":"+(Collection:(COLLECT1008 (id:OR^10.0 |
text:or^0.5) (id:COLLECT2587)^10.0 | text:collect2587^0.5) (id:AND^10.0
| text:and^0.5) (id:(amazon^10.0 | text:amazon^0.5) (id:AND^10.0 |
text:and^0.5) (id:soap)^10.0 | text:soap^0.5))",
     "QParser":"ExtendedDismaxQParser",


According to the last line I quoted above, you are using the edismax
parser.  This parser does not use the df parameter; it uses qf and other
parameters to determine which fields to search.  It appears that you do
have a qf parameter, listing the id field with a boost of 10, and the
text field with a boost of 0.5.
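
To illustrate the difference, here is a minimal sketch of request parameters that name the search fields through qf instead of df. The function name edismax_params is hypothetical, and whether a request-level qf actually overrides the handler's configured value depends on how the /clustering handler is set up in solrconfig.xml (for example, it will not if qf is listed under invariants), so treat this as an assumption to verify.

```python
from urllib.parse import urlencode


def edismax_params(q, fields_with_boosts):
    """Return request parameters that select fields via qf.

    Hypothetical helper. fields_with_boosts maps field name -> boost,
    and is rendered in the usual "field^boost" qf syntax.
    """
    qf = " ".join(f"{field}^{boost}"
                  for field, boost in fields_with_boosts.items())
    return {"q": q, "defType": "edismax", "qf": qf}


# The qf that the parsed query above implies is already configured:
configured = edismax_params("amazon AND soap", {"id": 10.0, "text": 0.5})

# What a request aimed at FULL_DOCUMENT might send instead (assumed field):
wanted = edismax_params("amazon AND soap", {"FULL_DOCUMENT": 1.0})

print(urlencode(configured))
print(urlencode(wanted))
```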

Something else I noticed, not sure if it's relevant:  The presence of
"id:OR^10.0" in that parsed query is very strange.  That is something I
would expect from the dismax parser, not edismax.

There have been some bugs with edismax and parentheses, so it's conceivable
that there might be more problems:

https://issues.apache.org/jira/browse/SOLR-5435
https://issues.apache.org/jira/browse/SOLR-3377

Sometimes bugs with parentheses are fixed by adding spaces to separate
them from their contents.
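
A quick way to try that is to pad the parentheses before sending the query. The sketch below is a naive illustration only: pad_parentheses is a made-up helper, and it does not account for parentheses inside quoted phrases or escaped with a backslash.

```python
import re


def pad_parentheses(query):
    """Put one space after every "(" and one space before every ")".

    Hypothetical helper; ignores quoting and backslash escapes.
    """
    padded = re.sub(r"\(\s*", "( ", query)
    padded = re.sub(r"\s*\)", " )", padded)
    return padded


original = "Collection:(COLLECT1008 OR COLLECT2587) AND (amazon AND soap)"
print(pad_parentheses(original))
```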

Thanks,
Shawn
