Aw: Re: Cannot parse ":", using HTTP-URL as id

sysrq Wed, 12 Sep 2012 10:41:01 -0700

> term query parser is your friend in this case. With this you don't need to 
> escape anything.
>   SolrQuery query = new SolrQuery();
>   query.setQuery("{!term f=id}bar_http://bar.com/?doc=452");


But how can I *store* a document with a URL as a field value? E.g.
"domain_http://www.domain.com/?p=12345"
The term query parser may be able to *retrieve* field values containing a ":",
but my current problem is that I can't store a value containing ":" with
*Solrj*, the Java library for communicating with Solr.
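
To make the problem concrete: the exception quoted below is a ParseException from the classic query parser, so it may well be a query string (for example a delete-by-query) and not addField() that trips over the URL. Here is a minimal sketch, assuming that is the case. The class QueryEscapeSketch and its list of special characters are hypothetical, written by hand for illustration. Solrj ships a similar helper, ClientUtils.escapeQueryChars, which would be the natural thing to use in a real application. Note that "/" is escaped as well as ":", since the second error below suggests the parser also stumbles over "/?p=12345".

```java
public class QueryEscapeSketch {
    // Assumed list of characters the classic query parser treats as syntax.
    // This is an illustrative, hand-written list, not taken from Solr itself.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Prefix every special character (and any whitespace) with a backslash.
    public static String escapeQueryChars(String raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String id = "domain_http://www.domain.com/?p=12345";
        // The raw value is what would go into the document field, unescaped.
        System.out.println("field value: " + id);
        // Only a query string built from the id gets the escaped form.
        System.out.println("query: id:" + escapeQueryChars(id));
    }
}
```

The idea is that the value passed to document.addField("id", ...) stays unescaped, and only strings handed to the query parser are escaped.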

> --- On Wed, 9/12/12, sy...@web.de <sy...@web.de> wrote:
> 
> > From: sy...@web.de <sy...@web.de>
> > Subject: Cannot parse ":", using HTTP-URL as id
> > To: solr-user@lucene.apache.org
> > Date: Wednesday, September 12, 2012, 7:40 PM
> > Hi,
> > 
> > I defined a field "id" in my schema.xml and use it as an
> > <uniqueKey>:
> >   <field name="id" type="string" indexed="true"
> > stored="true" required="true" />
> >   <uniqueKey>id</uniqueKey>
> > 
> > I want to store URLs with a prefix in this field to be sure
> > that every id is unique among websites. For example:
> >   domain_http://www.domain.com/?p=12345
> >   foo_http://foo.com
> >   bar_http://bar.com/?doc=452
> > I wrote a Java app, which uses Solrj to communicate with a
> > running Solr instance. Solr (or Solrj, not sure about this)
> > complains that it can't parse ":":
> >   Exception in thread "main"
> > org.apache.solr.common.SolrException:
> >  
> > org.apache.lucene.queryparser.classic.ParseException:
> >   Cannot parse 'id:domain_http://www.domain.com/?p=12345': Encountered " 
> > ":" ":
> > "" at line 1, column 14.
> > 
> > How should I handle characters like ":" to solve this
> > problem?
> > 
> > I already tried to escape the ":" like this:
> >   String id = "domain_http://www.domain.com/?p=12345".replaceAll(":",
> > "\\\\:");
> >   ...
> >   document.addField("id", id);
> >   ...
> > But then Solr (or Solrj) complains again:
> >   Exception in thread "main"
> > org.apache.solr.common.SolrException:
> >  
> > org.apache.lucene.queryparser.classic.ParseException:
> >   Cannot parse
> > 'id:domain_http\://www.domain.com/?p=12345': Lexical error
> > at line 1, column 42.  Encountered: <EOF> after :
> > "/?p=12345"
> > I use 4 backslashes (\\\\) for double-escape. The first
> > escape is for Java itself, the second is for Solr to handle
> > it (I guess).
> > 
> > So what is the correct or usual way to deal with special
> > characters like ":" in Solr (or Solrj)? I don't know if Solr
> > or Solrj is the problem, but I guess it is Solrj?
> >
>
