Re: Solr Clustering Issue

Shawn Heisey Thu, 23 Jul 2015 07:52:35 -0700

On 7/23/2015 7:51 AM, Joseph Obernberger wrote:
> Hi Upayavira - the URL was:
> 
> http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true
> 
> 
> Here is the relevant part of the response - notice that the default
> field (FULL_DOCUMENT) is not in the response, and that it appears to
> ignore parts of the query string.


<snip>

>     "parsedquery_toString":"+(Collection:(COLLECT1008 (id:OR^10.0 |
> text:or^0.5) (id:COLLECT2587)^10.0 | text:collect2587^0.5) (id:AND^10.0
> | text:and^0.5) (id:(amazon^10.0 | text:amazon^0.5) (id:AND^10.0 |
> text:and^0.5) (id:soap)^10.0 | text:soap^0.5))",
>     "QParser":"ExtendedDismaxQParser",


According to the last line I quoted above, you are using the edismax
parser.  This parser does not use the df parameter; it uses qf and other
parameters to determine which fields to search.  It appears that you do
have a qf parameter, listing the id field with a boost of 10 and the
text field with a boost of 0.5.
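
If you want the search to hit FULL_DOCUMENT, one option is to send qf
on the request instead of df.  Here is a rough Python sketch that
rebuilds your URL that way.  The host, collection, handler, and other
parameters are copied from your URL; the build_url helper is a name I
made up, and I am assuming your handler config allows qf to be
overridden per request (that it is not locked down as an invariant).

```python
from urllib.parse import urlencode

# Host, port, collection, and handler are copied from the quoted URL.
BASE = "http://server1:9100/solr/MYCOL1/clustering"

def build_url(query, field):
    """Build a request URL that names the search field with qf, not df."""
    params = {
        "q": query,
        "qf": field,  # assumption: the handler accepts a per-request qf
        "wt": "json",
        "indent": "true",
        "clustering": "true",
        "rows": "1",
        "debugQuery": "true",
    }
    # urlencode handles the escaping of spaces, colons, and parentheses.
    return BASE + "?" + urlencode(params)

url = build_url(
    "Collection:(COLLECT1008 OR COLLECT2587) AND (amazon AND soap)",
    "FULL_DOCUMENT",
)
print(url)
```

With debugQuery=true still set, the parsedquery_toString output should
show whether FULL_DOCUMENT replaced id and text.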

Something else I noticed, not sure if it's relevant:  The presence of
"id:OR^10.0" in that parsed query is very strange.  That is something I
would expect from the dismax parser, not edismax.

There have been some bugs with edismax and parentheses, so it's
conceivable that there might be more problems:

https://issues.apache.org/jira/browse/SOLR-5435
https://issues.apache.org/jira/browse/SOLR-3377

Sometimes bugs with parentheses are fixed by adding spaces to separate
them from their contents.
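
To try that quickly, here is a small Python sketch that pads every
parenthesis in a query string with a space.  The pad_parens name is
just something I made up, and it is deliberately naive: it does not
know about quoted phrases or escaped parentheses.  I have not verified
that it changes the outcome for your query.

```python
import re

def pad_parens(query):
    """Separate each parenthesis from its contents with a space."""
    # Naive: also touches parentheses inside quoted phrases or escapes.
    padded = query.replace("(", "( ").replace(")", " )")
    # Collapse any doubled spaces created by the padding.
    return re.sub(r" {2,}", " ", padded)

print(pad_parens("Collection:(COLLECT1008 OR COLLECT2587) AND (amazon AND soap)"))
# prints: Collection:( COLLECT1008 OR COLLECT2587 ) AND ( amazon AND soap )
```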

Thanks,
Shawn
