Thanks,

Here is a Ruby translation for those who want it:

    solr_query = ""
    doi_part.each_char do |c|
      # Query syntax characters get a preceding backslash
      if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '(' ||
          c == ')' || c == ':' || c == '^' || c == '[' || c == ']' ||
          c == '"' || c == '{' || c == '}' || c == '~' || c == '*' ||
          c == '?' || c == '|' || c == ';')
        solr_query += '\\'
        solr_query += c
      elsif (c == '&')
        solr_query += "%26"
      else
        solr_query += c
      end
    end
    solr_query

It still seems to get confused by & characters, and turning them into %26
does not work from the solr-ruby connection. But it works for most of the
DOIs that I have tried, so it is still a big improvement. Thanks.
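
For anyone hitting the same & problem, here is a more compact sketch of the
same idea. It treats & like every other special character and gives it a
preceding backslash, which is what the QueryParser.escape method quoted below
does, instead of turning it into %26. My assumption (not verified against
solr-ruby) is that the client URL-encodes request parameters itself, so a
literal %26 inside the query would be encoded a second time. The names
LUCENE_SPECIAL_CHARS and escape_solr_query are made up for this example.

```ruby
# Characters the Lucene QueryParser treats as syntax (the set from the quoted
# QueryParser.escape method), plus ';', which the loop above also escapes.
LUCENE_SPECIAL_CHARS = /[\\+\-!():\^\[\]"{}~*?|&;]/

# Hypothetical helper: put a backslash in front of every special character.
# The block form of gsub avoids backslash handling in replacement strings.
def escape_solr_query(value)
  value.gsub(LUCENE_SPECIAL_CHARS) { |c| "\\#{c}" }
end

doi = '10.1002/(SICI)1096-9136(199604)13:4<390::AID-DIA121>3.0.CO;2-4'
puts escape_solr_query(doi)
# prints 10.1002/\(SICI\)1096\-9136\(199604\)13\:4<390\:\:AID\-DIA121>3.0.CO\;2\-4
```

The escaped string would then be passed to the client as the query value,
leaving any URL encoding to the client library.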

On Tue, Feb 10, 2009 at 11:19 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi Ian,
>
> I'll assume this actually did get indexed as a single token, so there is no
> problem there.
> As for query string escaping, perhaps this method from Lucene's QueryParser
> will help:
>
>  /**
>   * Returns a String where those characters that QueryParser
>   * expects to be escaped are escaped by a preceding <code>\</code>.
>   */
>  public static String escape(String s) {
>    StringBuffer sb = new StringBuffer();
>    for (int i = 0; i < s.length(); i++) {
>      char c = s.charAt(i);
>      // These characters are part of the query syntax and must be escaped
>      if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '('
>        || c == ')' || c == ':' || c == '^' || c == '[' || c == ']'
>        || c == '\"' || c == '{' || c == '}' || c == '~' || c == '*'
>        || c == '?' || c == '|' || c == '&') {
>        sb.append('\\');
>      }
>      sb.append(c);
>    }
>    return sb.toString();
>  }
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
>
> ________________________________
> From: Ian Connor <ian.con...@gmail.com>
> To: solr <solr-user@lucene.apache.org>
> Sent: Tuesday, February 10, 2009 9:28:11 PM
> Subject: Is there a way to query for this value?
>
> I have tried to escape the characters as best I can, but cannot seem to
> find one that works.
>
> The value is:
>
> 10.1002/(SICI)1096-9136(199604)13:4<390::AID-DIA121>3.0.CO;2-4
>
> It is a DOI (see http://doi.org), so it is a valid value to search on.
> However, when I query this through Ruby or even the admin interface, the
> parser does not like it and returns an error.
>
> What is the way to escape this? Is there such code for ruby?
> --
> Regards,
>
> Ian Connor
>



-- 
Regards,

Ian Connor
