On 4/15/2015 3:54 PM, Steven White wrote:
> Hi folks,
>
> If a user types in the search box (without quotes): "{!q.op=AND df=text
> solr sys" and I take that text and build the URL like so:
>
> http://localhost:8983/solr/db/select?q={!q.op=AND%20df=text%20solr%20sys&fl=id%2Cscore%2Ctitle&wt=xml&indent=true
>
> This will fail with "Expected identifier" because it is not a valid Solr
> text.

That isn't valid syntax for the lucene query parser ... the localparams
are not closed (it would require a } character), and after the
localparams there would need to be some additional text.

> My question is this: is there a flag I can send to Solr with the URL
> telling it to treat what's in "q" as raw text vs. having it process it
> as Solr syntax?  If not, then it means I have to escape all Solr reserved
> characters and words.  If so, where can I find the complete list?  Also,
> what happens when a new reserved character or word is added to Solr down
> the road?  It means I have to upgrade my application too, which is
> something I would like to avoid.

One way to treat the entire input as literal text is to use the term
query parser (linked below) ... but that requires the localparams
syntax, and I do not know exactly what is going to happen if you use a
query string that itself is localparams syntax -- {!xxxx other params}
... so escaping is probably safer.

https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermQueryParser
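
For illustration only, here is a rough Python sketch of one way to keep the
user's text out of the "q" string entirely when using the term query parser:
put the text in a separate request parameter and reference it from the
localparams with v=$name. The parameter name "qq" and the helper name are
made up for this example, and I have not tested this against the
{!q.op=AND input from the question, so treat it as an assumption to verify.

```python
from urllib.parse import urlencode


def term_query_params(field, user_text):
    """Build request params that hand user_text to the term query parser.

    The text travels in its own parameter ("qq", a name chosen for this
    sketch), so it never appears inside the localparams block in "q".
    """
    return {
        "q": "{!term f=%s v=$qq}" % field,
        "qq": user_text,
    }


if __name__ == "__main__":
    params = term_query_params("text", "{!q.op=AND df=text solr sys")
    # urlencode takes care of URL escaping for both parameters.
    print("http://localhost:8983/solr/db/select?" + urlencode(params))
```

Keep in mind that the term query parser looks up the whole input as one
term, so multi-word input like this will probably only match a field that
indexes the entire value as a single token.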

The other way to handle it is to escape every special character with a
backslash.  The escapeQueryChars method in SolrJ is always kept up to
date, and can escape every special character.

http://lucene.apache.org/solr/4_10_3/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html#escapeQueryChars%28java.lang.String%29

The javadoc for that method points to the queryparser syntax for more
info on characters that need escaping.  Scroll to the very end of this page:

http://lucene.apache.org/core/4_10_3/queryparser/org/apache/lucene/queryparser/classic/package-summary.html?is-external=true

That page lists || and && rather than just the single characters | and &
... the escapeQueryChars method in SolrJ will escape both characters, as
it only works at the character level, not the string level.

If you want the *spaces* in your query to be treated literally also, you
must escape them too.  The escapeQueryChars method I've mentioned will
NOT escape spaces.
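
If you are not using SolrJ, the character-level approach described above is
easy to approximate.  Below is a minimal Python sketch, not a copy of the
SolrJ code: the character set is my reading of the list on the queryparser
page (with | and & handled singly), and the escape_spaces flag is an
addition for this example so that space handling is an explicit choice.

```python
# Special characters per the queryparser syntax page, with | and & treated
# as single characters. This set is an assumption for the sketch; check it
# against the documentation for the Solr version you run.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_query_chars(text, escape_spaces=False):
    """Prefix every special character in text with a backslash.

    Works one character at a time, so "&&" becomes "\\&\\&".
    Whitespace is only escaped when escape_spaces is True.
    """
    out = []
    for ch in text:
        if ch in SPECIAL_CHARS or (escape_spaces and ch.isspace()):
            out.append("\\")
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(escape_query_chars("{!q.op=AND df=text solr sys"))
    print(escape_query_chars("{!q.op=AND df=text solr sys", escape_spaces=True))
```

Character-level escaping like this does not touch operator words such as
AND, OR and NOT, so those need separate handling if they must be literal.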

Note that this does not cover URL escaping -- the & character must be
sent as %26 or the servlet container will treat it as a special
character, before it even gets to Solr.
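
As a rough illustration of the URL escaping step, here is a short Python
sketch that builds the select URL from the question with the standard
library doing the encoding.  The helper name and the default field list are
just placeholders taken from the example URL; the query passed in is assumed
to have already been escaped for the query parser.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields="id,score,title"):
    """Return a select URL with every parameter value URL-encoded.

    urlencode turns & into %26 and spaces into +, so the servlet
    container sees one intact "q" parameter.
    """
    params = {"q": query, "fl": fields, "wt": "xml", "indent": "true"}
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("http://localhost:8983/solr/db/select",
                           "solr \\& sys"))
```

The point is that query-syntax escaping and URL escaping are two separate
layers: backslashes first, then URL encoding of the result.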

Thanks,
Shawn
