On Sep 2, 2010, at 12:35pm, Michael Lackhoff wrote:
According to http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
only these characters need escaping:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
but with this simple query:
TI:stroke; AND TI:journal
I got the error message:
HTTP ERROR: 400
Unknown sort order: TI:journal
My first guess was that it was a URL encoding issue but everything
looks fine:
http://localhost:8983/solr/select/?q=TI%3Astroke%3B+AND+TI%3Ajournal&version=2.2&start=0&rows=10&indent=on
As you can see, the semicolon is encoded as %3B.
There is no problem when the query ends with the semicolon:
TI:stroke;
gives no error.
The first query also works if I escape the semicolon:
TI:stroke\; AND TI:journal
From this I conclude that there is a bug either in the docs or in the
query parser or I missed something. What is wrong here?
The docs need to be updated, I believe. From some code I wrote back in
2006...
// Also note that we escape ';', as Solr uses this to support embedding
// commands into the query string (yikes), and the code base we're using
// has a bug where if the ';' doesn't have two tokens after it
// (whitespace separated) then you get an array index out of bounds error.
I also had this note, no idea if it's still an issue:
// Before we do regular escaping, work around a bug in the Lucene query
// parser. If the last character is a '\', we can escape it as '\\', but
// if we build an expression that looks like xxx AND (<querytext\>) then
// the Lucene query parser will treat the final '\' before the ')' as
// a signal to escape the ')' character. That's just wrong, but for now
// we'll just strip off any trailing '\' characters in the clause.
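For illustration, the two workarounds described in those comments could be combined roughly as follows. This is only a sketch, not the original 2006 code: the class name SolrQueryEscaper and the method escapeClause are hypothetical, and the character set is simply the list from the query parser syntax page plus ';'. It also escapes every single '&' and '|' rather than only the doubled forms, which is a simplification.

```java
// Hypothetical helper, not part of Solr or Lucene.
final class SolrQueryEscaper {

    // Characters from the Lucene 2.9.1 query syntax page, plus ';',
    // which the Solr version discussed here treats as a separator.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\;";

    private SolrQueryEscaper() {}

    // Escapes one clause of user text so it can be placed after a
    // field prefix such as "TI:".
    static String escapeClause(String text) {
        // Workaround 1: drop trailing backslashes so they cannot end up
        // escaping a ')' that is appended after the clause.
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == '\\') {
            end--;
        }

        // Workaround 2: backslash-escape every special character,
        // including ';'.
        StringBuilder sb = new StringBuilder(end + 8);
        for (int i = 0; i < end; i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this, "TI:" + SolrQueryEscaper.escapeClause("stroke;") + " AND TI:" + SolrQueryEscaper.escapeClause("journal") produces the escaped form of the query that worked above. The result still has to be URL-encoded before it goes into the q parameter.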
But in general escaping characters in a query gets tricky - if you can
directly build queries versus pre-processing text sent to the query
parser, you'll save yourself some pain and suffering.
Also, since I wrote the code above, the DisMaxRequestHandler has been
added to Solr, and it (IIRC) tries to be smart about handling this
type of escaping for you.
-- Ken
--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c w e b m i n i n g