On 1/22/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:

On Jan 21, 2007, at 11:12 PM, Yonik Seeley wrote:
> On 1/21/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>> Yes, I think different syntaxes in different places would be useful.
>> For example, a user enters a full-text search query that is suitable
>> to use with Solr's QueryParser, and then the user facets a bit.  The
>> facet "queries" really aren't queries at all, but rather terms that
>> don't need to be parsed.  Building up a string to be parsed like
>> "#{params[:field]}:#{params[:value]}" is tricky because of escaping
>> syntax (like colons).  So fq wouldn't need to be parsed at all,
>> except to pull the field name out to build a TermQuery
>> straightforwardly.
>
> fq params are often queries too (think price ranges, etc).

Oh, I realize that quite well :)

However, for basic faceted browsing, fq parameters are generally just
plain terms and any type of meta-QueryParser syntax in the terms
could get in the way when QueryParser is being used.
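
To make the escaping problem concrete, here is a minimal Python sketch of what a client has to do today when it builds a field:value string for the query parser. The helper names (escape_term, build_fq) and the exact character list are assumptions for illustration, not part of Solr or Lucene.

```python
import re

# Assumed list of characters the Lucene query parser treats as syntax,
# plus whitespace, which would otherwise split the value into several terms.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\&|\s])')


def escape_term(text):
    """Backslash-escape every character in the assumed special set."""
    return _SPECIAL.sub(r"\\\1", text)


def build_fq(field, value):
    """Build a field:value string that is meant to parse as a single term."""
    return f"{field}:{escape_term(value)}"


if __name__ == "__main__":
    print(build_fq("category", "sci-fi: classics"))
```

Every client has to carry a copy of this list and keep it in sync with the parser, which is the argument for a syntax that skips parsing for plain terms.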

> For syntax, what about
>
> <!term>myfield:my unescaped value all as a single term
>
> Or
>
> <!term field='myfield'>my unescaped value all as a single term

This second option allows for parameters to be passed to the parser,
which is a nice point of extensibility.

> Of course, while that prefix looks decent in bare text, adding it to
> an XML config file would look ugly since < would need escaping.

No biggie there.  It's not often that it'd be in a config file.  And
there is already escaping ugliness for the ping and warmup queries in
solrconfig.xml.

But it could be coming back in an XML response, or seen in an access log.

> Another syntax option... something like
>
> !term:myfield:my unescaped value all as a single term
> #term:
> %term:
> @term:
>
> (basically, find a prefix that would be unlikely to appear as an
> actual term or wildcard in lucene queryparser syntax)

I like the <!term field='myfield'>.... syntax above best of the ones
you've mentioned because of the ability to provide parameters cleanly.

I had plans for providing params for all of the syntaxes... I just
left it for later since I didn't know if it would be controversial.  I
wanted to keep it simple at first to avoid scaring people :-)

|qp|+a +b                        #lucene query parser syntax
|qp(field='myfield')|+a +b   #providing a different default field
 OR
|qp field='myfield'|+a +b    #providing a different default field
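
As a rough sketch of how the space-separated pipe variant could be split into parser name, params, and the remaining query text, something like the Python below would do. The grammar, the function name split_prefix, and the fallback to "qp" when no prefix is present are all assumptions for illustration; nothing here is an existing Solr API.

```python
import re

# Hypothetical grammar: |name key='value' key2='value2'|rest of the query
_PREFIX = re.compile(r"^\|(\w+)((?:\s+\w+='[^']*')*)\s*\|(.*)$", re.DOTALL)
_PARAM = re.compile(r"(\w+)='([^']*)'")


def split_prefix(raw, default_parser="qp"):
    """Return (parser_name, params, body) for a raw parameter value.

    If the value has no |...| prefix, it is handed to the default parser
    unchanged, so existing queries keep working.
    """
    m = _PREFIX.match(raw)
    if m is None:
        return default_parser, {}, raw
    name, params, body = m.groups()
    return name, dict(_PARAM.findall(params)), body


if __name__ == "__main__":
    print(split_prefix("|qp field='myfield'|+a +b"))
    print(split_prefix("|term field='myfield'|my unescaped value: one term"))
```

The body after the closing pipe is returned untouched, so a term parser could use it as-is without any escaping on the client side.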


I don't know if it's important enough to worry about, but there is also
URL escaping to consider (readability of raw access logs, etc).  Here
are alternatives sent from Firefox to netcat:

http://localhost:5000/foo?q=<!term fieldname='foo'>bar
GET /foo?q=%3C!term%20fieldname='foo'%3Ebar HTTP/1.1

http://localhost:5000/foo?q=|term fieldname='foo'|bar
GET /foo?q=|term%20fieldname='foo'|bar HTTP/1.1

http://localhost:5000/foo?q=|term,fieldname='foo'|bar
GET /foo?q=|term,fieldname='foo'|bar HTTP/1.1
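
For comparison, here is a small Python sketch of how the same three values come out of a percent-encoder. The function names and the two `safe` sets are assumptions: the first is chosen to reproduce the request lines above, the second shows what a stricter client library might send, in which case the pipe and comma forms lose their readability advantage.

```python
from urllib.parse import quote, unquote

CANDIDATES = [
    "<!term fieldname='foo'>bar",
    "|term fieldname='foo'|bar",
    "|term,fieldname='foo'|bar",
]


def encode_lenient(value):
    # Assumption: leave ! ' = , and | alone, as in the requests shown above.
    return quote(value, safe="!'=,|")


def encode_strict(value):
    # Escape everything except unreserved characters (letters, digits, _ . - ~).
    return quote(value, safe="")


if __name__ == "__main__":
    for value in CANDIDATES:
        print(encode_lenient(value))
        print(encode_strict(value))
        # Both encodings decode back to the original value.
        assert unquote(encode_strict(value)) == value
```

So how readable a prefix is in an access log depends on the client doing the encoding, not only on the prefix characters.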

<! is growing on me since it will look sort of familiar to people as
metadata... but more readable (less escaped) would be nice too.

Sorry for worrying about all these little trivial details, but I have
a feeling the syntax might be with us a while.


This could be a way to add metadata to other parameters as well (not
*all* of them, but we could reuse the facility if needed).

so <!facet field='foo' offset='50' limit='10', mincount=1> anyone ;-) ;-)
OR
|facet,field='foo',offset='50',limit='10',mincount='1'|
OR
[!facet field='foo' offset='50' limit='10', mincount=1]


-Yonik
