On 1/22/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
On Jan 21, 2007, at 11:12 PM, Yonik Seeley wrote:
> On 1/21/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>> Yes, I think different syntaxes in different places would be useful.
>> For example, a user enters a full-text search query that is suitable
>> to use with Solr's QueryParser, and then the user facets a bit. The
>> facet "queries" really aren't queries at all, but rather terms that
>> don't need to be parsed. Building up a string to be parsed like
>> "#{params[:field]}:#{params[:value]}" is tricky because of escaping
>> syntax (like colons). So fq wouldn't need to be parsed at all,
>> except to pull the field name out to build a TermQuery
>> straightforwardly.
>
> fq params are often queries too (think price ranges, etc).

Oh, I realize that quite well :) However, for basic faceted browsing,
fq parameters are generally just plain terms and any type of
meta-QueryParser syntax in the terms could get in the way when
QueryParser is being used.

> For syntax, what about
>
> <!term>myfield:my unescaped value all as a single term
>
> Or
>
> <!term field='myfield'>my unescaped value all as a single term

This second option allows for parameters to be passed to the parser,
which is a nice point of extensibility.

> Of course, while that prefix looks decent in bare text, adding it to
> an XML config file would look ugly since < would need escaping.

No biggie there. It's not often that it'd be in a config file. And
there is already escaping ugliness for the ping and warmup queries in
solrconfig.xml.
But it could be coming back in an XML response, or seen in an access log.
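
To make the XML concern concrete, here is a minimal sketch (Python, chosen only for illustration; the sample values are made up) of what the two prefix styles look like once they are escaped for an XML response or config file. It uses the standard library's `xml.sax.saxutils.escape`, which rewrites `&`, `<` and `>`.

```python
from xml.sax.saxutils import escape

# Illustrative values only; the prefix syntaxes are the ones proposed in this thread.
angle_style = "<!term field='myfield'>my unescaped value"
pipe_style = "|term field='myfield'|my unescaped value"

# escape() replaces &, < and > with entities; quotes are left alone by default.
angle_in_xml = escape(angle_style)
pipe_in_xml = escape(pipe_style)

print(angle_in_xml)  # the angle-bracket prefix picks up &lt; and &gt;
print(pipe_in_xml)   # the pipe prefix passes through unchanged
```

The pipe-delimited form survives untouched, while the `<!...>` form gains two entities wherever it is embedded in XML.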
> Another syntax option... something like
>
> !term:myfield:my unescaped value all as a single term
> #term:
> %term:
> @term:
>
> (basically, find a prefix that would be unlikely to appear as an
> actual term or wildcard in lucene queryparser syntax)

I like the <!term field='myfield'>.... syntax above best of the ones
you've mentioned because of the ability to provide parameters cleanly.
I had plans for providing params for all of the syntaxes... I just
left it for later since I didn't know if it would be controversial.
I wanted to keep it simple at first to avoid scaring people :-)

|qp|+a +b    #lucene query parser syntax
|qp(field='myfield')|+a +b    #providing a different default field
OR
|qp field='myfield'|+a +b    #providing a different default field

I don't know if it's important enough to consider, but there is also
URL escaping to consider (readability of raw access logs, etc).
Here are alternatives sent from firefox to netcat:

http://localhost:5000/foo?q=<!term fieldname='foo'>bar
GET /foo?q=%3C!term%20fieldname='foo'%3Ebar HTTP/1.1

http://localhost:5000/foo?q=|term fieldname='foo'|bar
GET /foo?q=|term%20fieldname='foo'|bar HTTP/1.1

http://localhost:5000/foo?q=|term,fieldname='foo'|bar
GET /foo?q=|term,fieldname='foo'|bar HTTP/1.1

<! is growing on me since it will look sort of familiar to people as
metadata... but more readable (less escaped) would be nice too.

Sorry for worrying about all these little trivial details, but I have
a feeling the syntax might be with us a while.

This could be a way to add metadata to other parameters as well (not
*all* of them, but we could reuse the facility if needed).

so <!facet field='foo' offset='50' limit='10', mincount=1> anyone ;-) ;-)
OR |facet,field='foo',offset='50',limit='10',mincount='1'|
OR [!facet field='foo' offset='50' limit='10', mincount=1]

-Yonik
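
For anyone who wants to experiment with the trade-offs above, here is a rough sketch in Python (for illustration only; this is not Solr code). `parse_prefixed` and `as_logged` are hypothetical helpers, the grammar handles only the `<!name key='value' ...>body` form with single-quoted values, and the fallback parser name "qp" is an assumption borrowed from the examples above. The `safe` set passed to `urllib.parse.quote` is picked to mimic the request lines captured above, not taken from any spec.

```python
import re
from urllib.parse import quote

# Hypothetical grammar: <!name key='value' key2='value2'>body
_PREFIX = re.compile(r"^<!(\w+)((?:\s+\w+='[^']*')*)\s*>(.*)$", re.DOTALL)
_PARAM = re.compile(r"(\w+)='([^']*)'")


def parse_prefixed(raw):
    """Split a parameter into (parser name, params dict, unparsed body)."""
    m = _PREFIX.match(raw)
    if m is None:
        # No recognizable prefix: hand the whole string to the default
        # query parser (the name "qp" is an assumption).
        return "qp", {}, raw
    name, params, body = m.groups()
    return name, dict(_PARAM.findall(params)), body


def as_logged(raw):
    """Approximate how a value shows up in a raw access log.

    The safe characters are an assumption chosen to match the captured
    request lines in this message.
    """
    return quote(raw, safe="!'=|,")


print(parse_prefixed("<!term field='myfield'>my unescaped value"))
print(parse_prefixed("+a +b"))
print(as_logged("<!term fieldname='foo'>bar"))
print(as_logged("|term,fieldname='foo'|bar"))
```

The body after the prefix is returned untouched, which is the point of the term syntax: a value containing colons or other query parser metacharacters never needs escaping by the client.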