Hi Hoss,

Thanks for that lengthy feedback; it is much appreciated.
Let me reset, and bear in mind that I'm new to Solr.

I'm using Solr 5.0 (will switch over to 5.1 later this week) and my need is
as follows. In my application, a user types "Apache Solr Notes". I take
that text and send it over to Solr like so:

http://localhost:8983/solr/db/select?q=title:(Apache%20Solr%20Notes)&fl=id%2Cscore%2Ctitle&wt=xml&indent=true&q.op=AND

And I get a hit on "Apache Solr Release Notes". This is all good.

Now if the same user types "Apache: Solr Notes" (notice the ":" after
"Apache") I will get a SyntaxError. The fix is to escape ":" before I send
it to Solr. What I want to figure out is: how can I tell Solr / Lucene to
ignore ":" and escape it for me? In this example I used ":", but my need is
for all other operators and reserved Solr / Lucene characters.

This needs to be configurable via a URL parameter to Solr / Lucene, because
there are times I will send text to Solr that has valid operators and other
times not. If such a URL parameter exists, then my client application no
longer has to maintain a list of operators to escape, and it doesn't have
to keep up with Solr as new operators are added.

What do you think? I hope I got my message across better this time.

PS: The simple parser described at

https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-SimpleQueryParser

seems promising, but the page doesn't include an example so I wasn't able
to figure it out, and it looks to me like the list of operators is not
complete (there is no "{", for example).

Thanks

Steve

On Fri, Apr 17, 2015 at 3:02 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

>
> : It looks to me that "f" with "qq" is doing phrase search, that's not what I
> : want.
> : The data in the field "title" is "Apache Solr Release Notes"
>
> if you don't want phrase queries then you don't want phrase queries and
> that's fine -- but it wasn't clear from any of your original emails
> because you never provided (that i saw) any concrete examples of the types
> of queries you expected, the types of matches you wanted, and the types of
> matches you did *NOT* want.  details matter....
>
> https://wiki.apache.org/solr/UsingMailingLists
>
>
> Based on that one concrete example i've now seen of what you *do* want to
> match: it seems that maybe a general description of your objective is that
> each of the "words" in your user input should be treated as a mandatory
> clause in a boolean query -- but the concept of a "word" is already
> something that violates your earlier statement about not wanting the query
> parser to treat any "reserved characters" as special -- in order to
> recognize that "Apache", "Solr" and "Notes" should each be treated as
> independent mandatory clauses in a boolean query, then some query parser
> needs to recognize that *whitespace* is a syntactically significant
> character in your query string: it's what separates the "words" in your
> input.
>
> the reason the "field" parser produces phrase queries in the example URLs
> you mentioned is because that parser doesn't have *ANY* special reserved
> characters -- not even whitespace.  it passes the entire input string to
> the analyzer of the configured (f) field.
> if you are using TextField with
> a Tokenizer that means it gets split on whitespace, resulting in multiple
> *sequential* tokens, which will result in a phrase query (on the other
> hand, using something like StrField will cause the entire input string,
> spaces and all, to be searched as one single Term)
>
> : I looked over the links you provided and tried out the examples, in each
> : case if the user-typed-text contains any reserved characters, it will fail
> : with a syntax error (the exception is when I used "f" and "qq" but like I
> : said, that gave me 0 hit).
>
> As i said: Details matter.  which examples did you try? what configs were
> you using? what data were you using? which version of solr are you using?
> what exactly was the syntax error? etc.... ?
>
> "f" and "qq" are not magic -- saying you used them just means you used
> *some* parser that supports an "f" param ... if you tried it with the
> "term" or "field" parser then i don't know why you would have gotten a
> SyntaxError, but based on your goal it sounds like those parsers aren't
> really useful to you. (see below)
>
> : If you can give me a concrete example, please do.  My need is to pass to
> : Solr the text "Apache: Solr Notes" (without quotes) and get a hit as if I
> : passed "Apache\: Solr Notes" ?
>
> To re-iterate, saying you want the same behavior as if you passed "Apache\:
> Solr Notes" is a vague statement -- as if you passed that string to *what*
> ? to the standard parser? to the dismax parser? using what request
> options? (q.op? qf? df?) ... query strings don't exist in a vacuum.  the
> details & context matter.
>
> (I'm sorry if it feels like i keep hitting you over the head about this,
> i'm just trying to help you realize the breadth and scope of the variables
> involved in a question like the one you are asking, so you consider the
> full context and understand *how* to think about the problem you are
> trying to solve, and what questions to ask yourself / this list)
>
>
> My *BEST* guess as to a parser that might help you is the "simple"
> parser...
>
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-SimpleQueryParser
>
> ...by default it supports several syntactically significant operators
> (which can be escaped), but those can be disabled using the q.operators
> option.  As the documentation notes "Any errors in syntax are ignored and
> the query parser will interpret as best it can. This can mean, however,
> odd results in some cases." so a lot of experimentation with a
> large sample of expected good/bad queries is important to make sure you
> understand what types of query structures & search results you'll get out
> of them
>
> A trivial example of using the "simple" parser, with the Solr 5.1
> "bin/solr -e techproducts" example configs/data would be...
>
> http://localhost:8983/solr/techproducts/select?fl=id,name&debug=query&defType=simple&q.op=AND&q.operators=&df=name&q=apple%20-ipod
>
> which matches the name "Apple 60 GB iPod with Video Playback Black" even
> though there is a "-" in front of ipod, because the "q.operators=" param
> tells the parser to ignore all of its operators. (at which point the
> literal string "-ipod" is passed to the analyzer for the "name" field, and
> the "-" is stripped off by the tokenizer).  On the other hand it does not
> match the name "Belkin Mobile Power Cord for iPod w/ Dock" because it
> doesn't contain "apple".
>
>
> That was a trivial "good" example query -- it's important to remember
> however that localparam parsing happens *before* the actual query parser
> is given the input string (it has to since the local params may be used to
> specify the parser) so a "bad" example query like this will still produce
> a syntax error because the localparams are malformed...
>
> http://localhost:8983/solr/techproducts/select?fl=id,name&debug=query&defType=simple&q.op=AND&q.operators=&df=name&q={!bogus%20apple%20-ipod
>
> ...but here again the local param variable dereferencing i mentioned in my
> previous email can solve this problem for you and prevent syntax errors...
>
> http://localhost:8983/solr/techproducts/select?fl=id,name&debug=query&defType=simple&q.op=AND&q.operators=&df=name&q={!simple%20v=$qq}&qq={!bogus%20apple%20-ipod
>
> ...because now the "simple" parser is passed the full string
> including the "{!bogus" prefix, and it treats it as a term, which matches
> no docs.
>
>
>
> -Hoss
> http://www.lucidworks.com/
>
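
For reference, here is a minimal Python sketch of how a client might assemble the last URL in Hoss's message, so the user-typed text only ever travels in the "qq" parameter and is never interpreted as local params. The request parameters are copied from his example; the function name, the constant, and the choice of Python are my own assumptions, and it presumes the "bin/solr -e techproducts" example on localhost. I have not run it against a live Solr, so treat it as a starting point only.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the techproducts example URLs quoted above.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def build_simple_query_url(user_text, base_url=SOLR_SELECT, field="name"):
    """Build a select URL that hands user_text verbatim to the "simple" parser.

    Hypothetical helper: the name and signature are illustrative only.
    """
    params = [
        ("fl", "id,name"),
        ("debug", "query"),
        ("defType", "simple"),
        ("q.op", "AND"),           # each word becomes a mandatory clause
        ("q.operators", ""),       # empty value: disable the parser's operators
        ("df", field),
        ("q", "{!simple v=$qq}"),  # dereference qq instead of inlining user text
        ("qq", user_text),         # raw user input, percent-encoded by urlencode
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_simple_query_url("{!bogus apple -ipod"))
    print(build_simple_query_url("Apache: Solr Notes"))
```

The client only percent-encodes the text for transport in the URL. It does not need its own list of Solr / Lucene operators, which is the part I was hoping to avoid maintaining.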