Hi, I have developed a search API for this kind of case, and the Solr query is generated inside that API. I also have my own query syntax.
When a search query comes into my API, I generate the Solr query myself and do not allow anything like *:*. I also escape the query string and attach the appropriate field to it, like this: field:(escaped_value). That way there is no security concern about users reaching other fields of the schema, and no escaping concern. I think building a search API like that, and handling security, escaping, etc. within it, is an approach you should consider. If you try something like that, I can answer your questions.

Thanks,
Furkan KAMACI

2014-04-09 18:29 GMT+03:00 Erick Erickson <erickerick...@gmail.com>:

> Note that when I mentioned "filter these characters out" I had
> something like PatternReplaceCharFilterFactory or LowerCaseTokenizer
> in mind rather than you having to do it manually. Doesn't help
> figuring out what to escape on the URL though.
>
> Best,
> Erick
>
> On Wed, Apr 9, 2014 at 8:05 AM, Shawn Heisey <s...@elyograg.org> wrote:
> > On 4/9/2014 8:39 AM, Philip Durbin wrote:
> >> Filtering out special characters sounds like a good idea, or possibly
> >> escaping some of them. I definitely want to avoid brittleness.
> >>
> >> Right now I'm passing the query relatively "as is" which means users
> >> can type "title:foo" to find documents that have "foo" in the "title"
> >> field. But a query for just a colon (":") throws an error
> >> (org.apache.solr.search.SyntaxError: Cannot parse ':') so obviously I
> >> need to do more processing of the query before I pass it to Solr. I
> >> need to escape that colon or something.
> >>
> >> Is there some general advice on doing some sanity checks or escaping
> >> special characters on user-supplied queries before you pass them to
> >> Solr? Is it documented in the wiki? I'm using Solrj but I imagine the
> >> advice applies to everyone.
> >
> > SolrJ has the ClientUtils.escapeQueryChars method, which will
> > automatically escape any character that has special meaning to the query
> > parser. It does so by preceding it with a backslash.
> >
> > http://lucene.apache.org/solr/4_7_1/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html#escapeQueryChars%28java.lang.String%29
> >
> > You do need to be careful with it, though. For a query formatted like
> > field:(value) you'd only want to apply it to the 'value' part, because
> > if you applied it to the whole query, the colon and parentheses would
> > become part of the query text -- probably not what you want.
> >
> > Thanks,
> > Shawn
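
For illustration, here is a rough Java sketch of the field:(escaped_value) approach discussed in this thread, where only the user-supplied value is escaped and the field name is chosen by the API. The class and method names (FieldQuerySketch, escape, buildFieldQuery) are hypothetical, and escape is a hand-written stand-in so the example runs without SolrJ on the classpath. Its list of special characters is an assumption based on the usual Lucene/Solr query syntax. In a real SolrJ project you would most likely call ClientUtils.escapeQueryChars instead.

```java
final class FieldQuerySketch {

    // Assumed set of query-parser special characters (whitespace is handled separately).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    /** Stand-in for ClientUtils.escapeQueryChars: puts a backslash before each special character. */
    public static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() * 2);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /**
     * Builds field:(escaped_value). Only the user input is escaped, so the
     * colon and parentheses added here keep their meaning as query syntax.
     */
    public static String buildFieldQuery(String field, String userInput) {
        return field + ":(" + escape(userInput) + ")";
    }

    public static void main(String[] args) {
        // A lone colon no longer reaches the parser unescaped.
        System.out.println(buildFieldQuery("title", ":"));   // title:(\:)
        // A match-all attempt becomes literal text within the chosen field.
        System.out.println(buildFieldQuery("title", "*:*")); // title:(\*\:\*)
    }
}
```

Escaping the whole string, including the field prefix, would turn "title:" into literal text, which is the pitfall Shawn describes above.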