Hello all,

I'm running queries with a large list of terms combined with OR on our Solr instance, and I ran into this odd situation:


- A. a query with N alternatives returns ~130,000 documents

- B. a query with N-3 alternatives returns ~6,000,000 documents


N is relatively small in this case, but in general it can be large.


How is it possible that specifying fewer terms to match makes the number of results go up?

The query is purely positive (there is no - or NOT in it).


Queries A and B are attached.

I also tried debug=all and noticed

"parsedquery": "+(DisjunctionMaxQuery((abstract_methods:tuberculosi | ... ) DisjunctionMaxQuery( | .. | .. )


only for the first parenthesized group of the query. Why is that? Is this the reason for the change in the number of results? If so, how can I create a pure OR query (with every clause optional)?


If you are wondering why I'm adding the nested parentheses, it is to avoid the max boolean clauses error. (If you know another method that still allows phrase searches, please tell me.)
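
To make the grouping concrete, here is a minimal Python sketch of the idea. It is not the exact code that produced the attached queries: the function names and the group size are made up for illustration, and the terms in the usage note are placeholders. Each term is quoted so that multi-word terms are searched as phrases, and the terms are split into parenthesized groups joined with explicit OR.

```python
def quote_term(term: str) -> str:
    """Wrap a term in double quotes so multi-word terms are treated as phrases."""
    # Escape backslashes first, then embedded double quotes.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_grouped_or_query(terms, group_size=500):
    """Join terms with OR, split into parenthesized groups of at most group_size.

    group_size is a hypothetical value; it would have to stay below the
    configured boolean clause limit of the Solr instance.
    """
    if group_size < 1:
        raise ValueError("group_size must be positive")
    groups = []
    for start in range(0, len(terms), group_size):
        chunk = terms[start:start + group_size]
        groups.append("(" + " OR ".join(quote_term(t) for t in chunk) + ")")
    # An explicit OR between groups keeps the query independent of the
    # default operator configured on the server.
    return " OR ".join(groups)
```

For example, build_grouped_or_query(["tuberculosis", "latent infection", "bcg"], group_size=2) gives ("tuberculosis" OR "latent infection") OR ("bcg").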


Thank you

Danilo




--
Danilo Tomasoni
COSBI


Attachment: A.json
Description: application/json

Attachment: B.json
Description: application/json
