Thanks Shawn.

I cannot use the escapeQueryChars method because my app interacts with Solr
via REST.

The summary of your email is: clients must escape the search string to
prevent Solr from failing.

It would be a nice addition to Solr to provide a new query parameter that
tells it to treat the query text as literal text.  Doing so would remove
the burden placed on clients to understand and escape reserved Solr /
Lucene tokens.

Steve

On Wed, Apr 15, 2015 at 7:18 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/15/2015 3:54 PM, Steven White wrote:
> > Hi folks,
> >
> > If a user types in the search box (without quotes): "{!q.op=AND df=text
> > solr sys" and I take that text and build the URL like so:
> >
> >
> > http://localhost:8983/solr/db/select?q={!q.op=AND%20df=text%20solr%20sys&fl=id%2Cscore%2Ctitle&wt=xml&indent=true
> >
> > This will fail with "Expected identifier" because it is not a valid Solr
> > text.
>
> That isn't valid syntax for the lucene query parser ... the localparams
> are not closed (it would require a } character), and after the
> localparams there would need to be some additional text.
>
> > My question is this: is there a flag I can send to Solr with the URL
> > telling it to treat what's in "q" as raw text vs. having it process it
> > as Solr syntax?  If not, then it means I have to escape all Solr
> > reserved characters and words.  If so, where can I find the complete
> > list?  Also, what happens when a new reserved character or word is
> > added to Solr down the road?  It means I have to upgrade my application
> > too, which is something I would like to avoid.
>
> One way to treat the entire input as literal text is to use the terms
> query parser ... but that requires the localparams syntax, and I do not
> know exactly what is going to happen if you use a query string that
> itself is localparams syntax -- {!xxxx other params} ... so escaping is
> probably safer.
>
>
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermQueryParser
>
> The other way to handle it is to escape every special character with a
> backslash.  The escapeQueryChars method in SolrJ is always kept up to
> date, and can escape every special character.
>
>
> http://lucene.apache.org/solr/4_10_3/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html#escapeQueryChars%28java.lang.String%29
>
> The javadoc for that method points to the queryparser syntax for more
> info on characters that need escaping.  Scroll to the very end of this
> page:
>
>
> http://lucene.apache.org/core/4_10_3/queryparser/org/apache/lucene/queryparser/classic/package-summary.html?is-external=true
>
> That page lists || and && rather than just the single characters | and &
> ... the escapeQueryChars method in SolrJ will escape both characters, as
> it only works at the character level, not the string level.
>
> If you want the *spaces* in your query to be treated literally also, you
> must escape them too.  The escapeQueryChars method I've mentioned will
> NOT escape spaces.
>
> Note that this does not cover URL escaping -- the & character must be
> sent as %26 or the servlet container will treat it as a special
> character, before it even gets to Solr.
>
> Thanks,
> Shawn
>
>
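
For clients that talk to Solr over plain HTTP and cannot call SolrJ, the
character-level escaping described above can be approximated on the client
side.  The Python sketch below is an illustration only, not SolrJ's
implementation.  The function names escape_query_chars and build_select_url
are hypothetical, and the character list is an assumption taken from the
Lucene 4.10 classic query parser syntax page linked above, with | and &
escaped individually.  URL escaping is left to urlencode, so & goes out as
%26.

```python
from urllib.parse import urlencode

# Assumption: special characters as listed on the Lucene 4.10 classic query
# parser syntax page, with | and & escaped one character at a time.
SPECIAL_CHARS = set('\\+-!(){}[]^"~*?:/|&')


def escape_query_chars(text, escape_spaces=False):
    """Prefix every query-syntax character in text with a backslash.

    Whitespace is escaped only when escape_spaces is True, for the case
    where spaces should be treated literally as well.
    """
    out = []
    for ch in text:
        if ch in SPECIAL_CHARS or (escape_spaces and ch.isspace()):
            out.append("\\")
        out.append(ch)
    return "".join(out)


def build_select_url(base_url, user_text):
    """Build a select URL: escape for the query parser, then URL-encode."""
    params = {
        "q": escape_query_chars(user_text),
        "fl": "id,score,title",
        "wt": "xml",
        "indent": "true",
    }
    # urlencode percent-encodes each value, so the backslashes, braces and
    # any & inside the user's text cannot be mistaken for URL syntax.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("http://localhost:8983/solr/db/select",
                           "{!q.op=AND df=text solr sys"))
```

With the input from the original question, escape_query_chars returns
\{\!q.op=AND df=text solr sys before URL encoding.  Reserved words such as
AND, OR and NOT are not handled by this sketch.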
