Required operator (+) is being ignored when using default conjunction operator AND
Using Solr 8.3.0, it seems like the required operator isn't functioning properly when the default conjunction operator is AND.

Steps to reproduce:

20 docs, all with a text field:
  17 have the value A
  13 have the value B
  10 have both A and B (the intersection)

=== the data ===
[
  { "id": "0", "_text_": [ "abc", "123", "xyz" ] },
  { "id": "1", "_text_": [ "abc", "123", "xyz" ] },
  { "id": "2", "_text_": [ "abc", "123", "xyz" ] },
  { "id": "3", "_text_": [ "abc", "123" ] },
  { "id": "4", "_text_": [ "abc", "123" ] },
  { "id": "5", "_text_": [ "abc", "123", "xyz" ] },
  { "id": "6", "_text_": [ "abc", "123" ] },
  { "id": "7", "_text_": [ "abc", "123" ] },
  { "id": "8", "_text_": [ "abc", "123", "xyz" ] },
  { "id": "9", "_text_": [ "abc", "123" ] },
  { "id": "10", "_text_": [ "abc", "xyz" ] },
  { "id": "11", "_text_": [ "abc" ] },
  { "id": "12", "_text_": [ "abc", "xyz" ] },
  { "id": "13", "_text_": [ "abc", "xyz" ] },
  { "id": "14", "_text_": [ "abc", "xyz" ] },
  { "id": "15", "_text_": [ "abc", "xyz" ] },
  { "id": "16", "_text_": [ "abc", "xyz" ] },
  { "id": "17", "_text_": [ "xyz", "123" ] },
  { "id": "18", "_text_": [ "def", "123", "xyz" ] },
  { "id": "19", "_text_": [ "def", "123" ] }
]

== default operator is set to AND ==

My query is:

http://localhost:8983/solr/new_core/select?debug.explain.structured=true&debugQuery=on&q=%7B!q.op%3DAND%7D%20%2Babc%20OR%20123&rows=20

The response:

{
  "responseHeader":{
    "status":0,
    "QTime":7,
    "params":{
      "q":"{!q.op=AND} +abc OR 123",
      "rows":"20",
      "debug.explain.structured":"true",
      "debugQuery":"on"}},
  "response":{"numFound":20,"start":0,"docs":[
      { "id":"3", "_version_":1662786291343818752},
      { "id":"4", "_version_":1662786291343818753},
      { "id":"6", "_version_":1662786291344867329},
      { "id":"7", "_version_":1662786291345915904},
      { "id":"9", "_version_":1662786291346964480},
      { "id":"0", "_version_":1662786291339624448},
      { "id":"1", "_version_":1662786291342770176},
      { "id":"2", "_version_":1662786291342770177},
      { "id":"5", "_version_":1662786291344867328},
      { "id":"8", "_version_":1662786291345915905},
      { "id":"17", "_version_":1662786291350110209},
      { "id":"19", "_version_":1662786291351158784},
      { "id":"18", "_version_":1662786291350110210},
      { "id":"11", "_version_":1662786291348013056},
      { "id":"10", "_version_":1662786291346964481},
      { "id":"12", "_version_":1662786291348013057},
      { "id":"13", "_version_":1662786291348013058},
      { "id":"14", "_version_":1662786291349061632},
      { "id":"15", "_version_":1662786291349061633},
      { "id":"16", "_version_":1662786291350110208}]
  },
  "debug":{
    "rawquerystring":"{!q.op=AND} +abc OR 123",
    "querystring":"{!q.op=AND} +abc OR 123",
    "parsedquery":"_text_:abc _text_:123",
    "parsedquery_toString":"_text_:abc _text_:123",
    "explain":{
      "3":{
        "match":true,
        "value":0.29721633,
        "description":"sum of:",
        "details":[{
            "match":true,
            "value":0.08681979,
            "description":"weight(_text_:abc in 3) [SchemaSimilarity], result of:",
            "details":[{
                "match":true,
                "value":0.08681979,
                "description":"score(freq=1.0), computed as boost * idf * tf from:"
Re: Required operator (+) is being ignored when using default conjunction operator AND
Hoss, thanks a lot for the response.

OK, so it seems like I got into the "uncanny valley" of the search operators :/

I read your attached blog post (and more), but the penny still hasn't dropped about what causes the operator clash when the default operator is AND. I read that when q.op=AND, OR will change the Occur of the left clause (if it is not MUST_NOT) and of the right clause to SHOULD. Does that mean the "order of operations" in this case gives the infix operator the mandate to control the prefix operator?

A little background: I am trying to implement a Google-like search service and want to support the required and prohibited operators while still keeping intersection as the default operation. How can I achieve this given this limitation?

On Wed, Apr 1, 2020, 20:08 Chris Hostetter wrote:

> : Using solr 8.3.0 it seems like required operator isn't functioning properly
> : when default conjunction operator is AND.
>
> You're mixing the "prefix operators" with the "infix operators", which is
> always a recipe for disaster.
>
> The use of q.op=AND vs q.op=OR in these examples only complicates the
> issue, because q.op isn't really overriding any sort of implicit "infix
> operator" when clauses exist w/o an infix operator between them; it is
> overriding the implicit MUST/SHOULD/MUST_NOT given to each clause as
> parsed ... but in general, setting q.op=AND really only makes sense when
> you expect/intend to only be using "infix operators".
>
> This write-up I did several years ago is still very accurate -- the bottom
> line is you REALLY don't want to mix infix and prefix operators.
>
> https://lucidworks.com/post/why-not-and-or-and-not/
>
> ...because the results of mixing them really only "make sense" given the
> context that the parser goes left to right (ie: no precedence) and has
> no explicit "prefix" operator syntax for "SHOULD".
>
> -Hoss
> http://www.lucidworks.com/
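To make the numbers concrete: if both clauses of "+abc OR 123" end up as SHOULD, which is what the parsedquery "_text_:abc _text_:123" in the first message suggests, the hit count is the union of the two term sets rather than the abc-only set. The sketch below checks this with plain Python sets transcribed from the posted data. It does not talk to Solr; the helper name matching is made up for illustration.

```python
# Token sets per document id, transcribed from the data in the first message.
docs = {}
for i in (0, 1, 2, 5, 8):
    docs[i] = {"abc", "123", "xyz"}
for i in (3, 4, 6, 7, 9):
    docs[i] = {"abc", "123"}
for i in (10, 12, 13, 14, 15, 16):
    docs[i] = {"abc", "xyz"}
docs[11] = {"abc"}
docs[17] = {"xyz", "123"}
docs[18] = {"def", "123", "xyz"}
docs[19] = {"def", "123"}


def matching(term):
    """Return the ids of documents whose token set contains `term` (hypothetical helper)."""
    return {doc_id for doc_id, tokens in docs.items() if term in tokens}


has_abc = matching("abc")
has_123 = matching("123")

# MUST(abc) SHOULD(123): only documents containing abc can match.
required_abc = has_abc
# SHOULD(abc) SHOULD(123): any document containing either term matches.
either = has_abc | has_123

print(len(has_abc), len(has_123), len(has_abc & has_123))  # 17 13 10
print(len(required_abc), len(either))                      # 17 20
```

The reported numFound of 20 equals the size of the union, while a required abc clause would be expected to give 17.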
Re: Required operator (+) is being ignored when using default conjunction operator AND
Hoss, thanks a lot for the informative response. I now understand where I went wrong with the infix and prefix operators. I need to rethink the term occurrence support in my search service.

Cheers!

On Mon, Apr 6, 2020, 20:43 Chris Hostetter wrote:

> : I red your attached blog post (and more) but still the penny hasn't dropped
> : yet about what causes the operator clash when the default operator is AND.
> : I red that when q.op=AND, OR will change the left(if not MUST_NOT) and
> : right clause Occurs to SHOULD - what that means is that the "order of
> : operations" in this case is giving the infix operator the mandate to
> : control the prefix operator?
>
> Not quite anything that complex... sorry, but the blog post was focused on
> describing *what* happens when parsing, to explain why mixing prefix/infix
> is bad ... I avoided getting bogged down in *why* it happens exactly the
> way it does.
>
> To get to the "why" you have to circle back to the higher level concept
> that the "prefix" operators very closely align to the underlying concepts
> of the BooleanQuery/BooleanClause data structures: that each clause has an
> "Occur" property which is either: MUST/SHOULD/MUST_NOT (or FILTER, but
> setting aside scoring that's functionally equivalent to MUST).
>
> The 'infix' operators just manipulate the Occur property of the clauses on
> either side of them.
>
> 'q.op=AND' and 'q.op=OR' are functionally really about setting the
> "Default Occur Value For All Clauses That Do Not Have An Explicit Occur
> Value" (ie: q.op=Occur.MUST and q.op=Occur.SHOULD) ... where the explicit
> Occur value for each clause would be specified by its prefix (+=MUST,
> -=MUST_NOT ... there is no supported prefix for SHOULD, which is why
> q.op=SHOULD is the default and changing it complicates the parser logic)
>
> In essence: after the q.op/default.occur is applied to all clauses (that
> don't already have a prefix), there is a left to right parsing that
> lets the infix operators modify the "Occur" values of the clauses on
> either side of them -- if those Occur values match the "default" for this
> parser.
>
> So let's imagine 2 requests...
>
> 1) {!q.op=AND}a +b OR c +d AND e
> 2) {!q.op=OR} x +y OR z +r AND s
>
> Here's what those wind up looking like internally with the default
> applied...
>
> 1) q.op=MUST:   MUST(a) MUST(b) OR MUST(c) MUST(d) AND MUST(e)
> 2) q.op=SHOULD: SHOULD(x) MUST(y) OR SHOULD(z) MUST(r) AND SHOULD(s)
>
> And here's how the infix operators change things as it parses left to
> right building up the clauses...
>
> 1) q.op=MUST:   MUST(a) SHOULD(b) SHOULD(c) MUST(d) MUST(e)
> 2) q.op=SHOULD: SHOULD(x) MUST(y) SHOULD(z) MUST(r) MUST(s)
>
> It's not actually done in "two passes" -- it's just that as the parsing
> is done left to right, the default Occur is used unless/until set by a
> prefix operator, and infix operators not only set the Occur value
> for the "next" clause, but also reach back to override the prior
> Occur value if it matches the default: because there is no "history" kept
> to indicate that it was explicitly set, or how. The left to right parsing
> just does the best it can with the context it's got.
>
> : A little background - I am trying to implement a google search like
> : service and want to have the ability to have required and prohibit
> : operators while still allowing default intersection operation as default
> : operator. How can I achieve this with this limitation?
>
> If you want "intersection" to be the default, I'm not sure why you care
> about having a "required" operator? (You didn't mention anything about an
> "optional" operator even though your original example explicitly used
> "OR" ... so not really sure if that was just a contrived example or if you
> actually care about supporting it?)
>
> If you're not hung up on using a specific syntax, you might want to
> consider the "simple" QParser -- it unfortunately re-uses the 'q.op=AND'
> param syntax to indicate what the default Occur should be for clauses, but
> the overall syntax is much simpler: there is a prefix negation operator,
> but otherwise the infix "+" and "|" operators support boolean AND and OR
> -- there are no prefix operators for MUST/SHOULD. You can also turn off
> individual operators you don't like...
>
> https://lucene.apache.org/solr/guide/8_5/other-parsers.html#OtherParsers-SimpleQueryParser
>
> -Hoss
> http://www.lucidworks.com/
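The left-to-right behaviour described in the quoted reply can be modelled in a few lines of Python. This is only a sketch of that description, not Lucene's or Solr's actual parser code. The function name assign_occurs is invented here, the tokenizer is a naive whitespace split, and applying the "only override an Occur that equals the default" rule to the clause after an infix operator as well as the one before it is one reading of the explanation.

```python
from typing import List, Tuple

# Occur value each infix operator pushes onto its neighbouring clauses.
INFIX = {"AND": "MUST", "OR": "SHOULD"}
# Occur value set explicitly by a prefix operator (there is none for SHOULD).
PREFIX = {"+": "MUST", "-": "MUST_NOT"}


def assign_occurs(query: str, q_op: str) -> List[Tuple[str, str]]:
    """Model of the described left-to-right Occur assignment (hypothetical helper).

    `query` is the query text without local params; `q_op` is "AND" or "OR".
    """
    default = INFIX[q_op]
    clauses: List[List[str]] = []  # [occur, term] pairs, built left to right
    pending = None                 # Occur requested by the most recent infix operator

    for token in query.split():
        if token in INFIX:
            pending = INFIX[token]
            # Reach back: override the prior clause only if it still looks like
            # the default, since no history of explicit prefixes is kept.
            if clauses and clauses[-1][0] == default:
                clauses[-1][0] = pending
            continue

        if len(token) > 1 and token[0] in PREFIX:
            occur, term = PREFIX[token[0]], token[1:]
        else:
            occur, term = default, token

        # The infix operator also sets the Occur of the clause that follows it,
        # under the same "equals the default" assumption.
        if pending is not None and occur == default:
            occur = pending
        pending = None
        clauses.append([occur, term])

    return [(occur, term) for occur, term in clauses]


for q_op, query in [("AND", "a +b OR c +d AND e"),
                    ("OR", "x +y OR z +r AND s"),
                    ("AND", "+abc OR 123")]:
    rendered = " ".join(f"{occur}({term})" for occur, term in assign_occurs(query, q_op))
    print(f"q.op={q_op}: {rendered}")
```

Under these assumptions the first two queries reproduce the final Occur lists given in the reply, and "+abc OR 123" with q.op=AND comes out as SHOULD(abc) SHOULD(123), which is consistent with the parsedquery shown in the original report.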
Can Solr index the replacement character
Hi community,

During integration tests with a new data source I noticed a weird scenario: the replacement character can't be searched, although it seems to be stored. Honestly, I don't want that irrelevant data stored in my index, but I wondered whether Solr can index the replacement character (U+FFFD �) as a string and, if so, how to search for it. And in general, is there any built-in character filtering?

Thanks
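If the goal is simply to keep U+FFFD out of the index, one option is to drop it on the client before documents are sent to Solr. The sketch below is plain Python pre-processing, not a Solr feature; the function names and the sample document are made up for illustration, and it assumes field values are strings or lists of strings.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def strip_replacement_chars(value: str) -> str:
    """Remove every U+FFFD from a single string value."""
    return value.replace(REPLACEMENT_CHAR, "")


def clean_doc(doc: dict) -> dict:
    """Return a copy of `doc` with U+FFFD removed from string and list-of-string fields."""
    cleaned = {}
    for field, value in doc.items():
        if isinstance(value, str):
            cleaned[field] = strip_replacement_chars(value)
        elif isinstance(value, list):
            cleaned[field] = [
                strip_replacement_chars(v) if isinstance(v, str) else v
                for v in value
            ]
        else:
            cleaned[field] = value
    return cleaned


# Hypothetical document containing the replacement character.
sample = {"id": "42", "_text_": ["abc\ufffd", "\ufffd123"]}
print(clean_doc(sample))
```

Doing this before indexing affects both the stored value and the indexed terms, which matches the stated wish not to keep that data in the index at all.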