On Sep 2, 2010, at 12:35pm, Michael Lackhoff wrote:

According to http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
only these characters need escaping:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
but with this simple query:
TI:stroke; AND TI:journal
I got the error message:
HTTP ERROR: 400
Unknown sort order: TI:journal

My first guess was that it was a URL encoding issue but everything looks
fine:
http://localhost:8983/solr/select/?q=TI%3Astroke%3B+AND+TI%3Ajournal&version=2.2&start=0&rows=10&indent=on
as you can see, the semicolon is encoded as %3B
There is no problem when the query ends with the semicolon:
TI:stroke;
gives no error.
The first query also works if I escape the semicolon:
TI:stroke\; AND TI:journal

From this I conclude that there is a bug either in the docs or in the
query parser or I missed something. What is wrong here?

The docs need to be updated, I believe. From some code I wrote back in 2006...

// Also note that we escape ';', as Solr uses this to support embedding
// commands into the query string (yikes), and the code base we're using
// has a bug where if the ';' doesn't have two tokens after it
// (whitespace separated) then you get an array index out of bounds error.
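URL encoding does not help here, because the servlet container decodes the q parameter before the query parser ever sees it, so %3B arrives as a plain ';'. A minimal sketch of that step, using only java.net.URLDecoder (the class and method names DecodeDemo and decodeParam are made up for illustration and are not part of Solr):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it only mimics the decoding a servlet
// container applies to request parameters before Solr parses them.
final class DecodeDemo {
    // Decode one URL-encoded parameter value as UTF-8.
    static String decodeParam(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling DecodeDemo.decodeParam("TI%3Astroke%3B+AND+TI%3Ajournal") gives back the original text with the bare semicolon in it, which is what the parser then has to deal with.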

I also had this note, no idea if it's still an issue:

// Before we do regular escaping, work around a bug in the Lucene query
// parser. If the last character is a '\', we can escape it as '\\', but
// if we build an expression that looks like xxx AND (<querytext\>) then
// the Lucene query parser will treat the final '\' before the ')' as
// a signal to escape the ')' character. That's just wrong, but for now
// we'll just strip off any trailing '\' characters in the clause.
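Putting the two notes together, the pre-processing looked roughly like the following. This is a reconstruction for illustration, not the original 2006 code: the class name SolrQueryEscaper and the method escapeTerm are hypothetical, and the character list is simply the one from the Lucene 2.9 syntax page with ';' added.

```java
// Hypothetical helper, not part of Solr or Lucene.
final class SolrQueryEscaper {
    // Special characters from the Lucene 2.9 query syntax docs, plus ';'
    // (assumption: '&' and '|' are escaped one character at a time).
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\;";

    // Escape user-supplied text so it can be used as a single term/clause.
    static String escapeTerm(String text) {
        // Strip trailing backslashes first, so the clause can never end
        // in a '\' that would swallow a following ')'.
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == '\\') {
            end--;
        }

        StringBuilder sb = new StringBuilder(end * 2);
        for (int i = 0; i < end; i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // prefix each special character with '\'
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Only the user-supplied text goes through escapeTerm; the field prefix and operators are added afterwards, e.g. "TI:" + SolrQueryEscaper.escapeTerm("stroke;") + " AND TI:" + SolrQueryEscaper.escapeTerm("journal"), which produces the TI:stroke\; AND TI:journal form that worked in your test.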

But in general, escaping characters in a query gets tricky. If you can build queries directly rather than pre-processing text that is sent to the query parser, you'll save yourself some pain and suffering.

Also, since I wrote the above code, the DisMaxRequestHandler has been added to Solr, and it (IIRC) tries to be smart about handling this type of escaping for you.

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g




