Aw: Re: Cannot parse ":", using HTTP-URL as id

sysrq Wed, 12 Sep 2012 11:16:35 -0700

My bad: using the "term query parser" works. Thanks, Ahmet.
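
For the archive, a minimal sketch of what ended up working, assuming the goal is an exact lookup on the "id" field. The class and helper names (TermQuerySketch, termQuery) are made up for illustration and are not part of SolrJ; only the {!term f=...} local-params syntax comes from Ahmet's reply. The sketch only builds the query string, so it runs without a Solr instance.

```java
class TermQuerySketch {
    // Hypothetical helper (not a SolrJ API): wraps a raw field value in a
    // {!term} local-params query so the value itself needs no escaping.
    static String termQuery(String field, String rawValue) {
        return "{!term f=" + field + "}" + rawValue;
    }

    public static void main(String[] args) {
        // The id is used exactly as stored, colons and slashes included.
        String q = termQuery("id", "bar_http://bar.com/?doc=452");
        System.out.println(q);
        // With SolrJ, this string would then be handed to
        // new SolrQuery().setQuery(q), as in the quoted reply.
    }
}
```

The id goes into the query unmodified, which is why the escaping attempts from my first mail are no longer needed.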


> Sent: Wednesday, 12 September 2012 at 19:40
> From: sy...@web.de
> To: solr-user@lucene.apache.org
> Subject: Aw: Re: Cannot parse ":", using HTTP-URL as id
>
> > term query parser is your friend in this case. With this you don't need to 
> > escape anything.
> >   SolrQuery query = new SolrQuery();
> >   query.setQuery("{!term f=id}bar_http://bar.com/?doc=452");
> 
> But how can I *store* a document with a URL as a field value? E.g. 
> "domain_http://www.domain.com/?p=12345"
> The "term query parser" may be able to *retrieve* field values with a ":", 
> but my current problem is that I can't store a value with ":" using *Solrj*, 
> the Java library to communicate with Solr.
> 
> > --- On Wed, 9/12/12, sy...@web.de <sy...@web.de> wrote:
> > 
> > > From: sy...@web.de <sy...@web.de>
> > > Subject: Cannot parse ":", using HTTP-URL as id
> > > To: solr-user@lucene.apache.org
> > > Date: Wednesday, September 12, 2012, 7:40 PM
> > > Hi,
> > > 
> > > I defined a field "id" in my schema.xml and use it as an
> > > <uniqueKey>:
> > >   <field name="id" type="string" indexed="true"
> > > stored="true" required="true" />
> > >   <uniqueKey>id</uniqueKey>
> > > 
> > > I want to store URLs with a prefix in this field to be sure
> > > that every id is unique among websites. For example:
> > >   domain_http://www.domain.com/?p=12345
> > >   foo_http://foo.com
> > >   bar_http://bar.com/?doc=452
> > > I wrote a Java app, which uses Solrj to communicate with a
> > > running Solr instance. Solr (or Solrj, not sure about this)
> > > complains that it can't parse ":":
> > >   Exception in thread "main" org.apache.solr.common.SolrException:
> > >   org.apache.lucene.queryparser.classic.ParseException:
> > >   Cannot parse 'id:domain_http://www.domain.com/?p=12345':
> > >   Encountered " ":" ": "" at line 1, column 14.
> > > 
> > > How should I handle characters like ":" to solve this
> > > problem?
> > > 
> > > I already tried to escape the ":" like this:
> > >   String id = "domain_http://www.domain.com/?p=12345".replaceAll(":", "\\\\:");
> > >   ...
> > >   document.addField("id", id);
> > >   ...
> > > But then Solr (or Solrj) complains again:
> > >   Exception in thread "main" org.apache.solr.common.SolrException:
> > >   org.apache.lucene.queryparser.classic.ParseException:
> > >   Cannot parse 'id:domain_http\://www.domain.com/?p=12345':
> > >   Lexical error at line 1, column 42.  Encountered: <EOF> after : "/?p=12345"
> > > I use 4 backslashes (\\\\) for double-escape. The first
> > > escape is for Java itself, the second is for Solr to handle
> > > it (I guess).
> > > 
> > > So what is the correct or usual way to deal with special
> > > characters like ":" in Solr (or Solrj)? I don't know if Solr
> > > or Solrj is the problem, but I guess it is Solrj?
> > >
> > 
>
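
A note on the escaping attempt quoted above, as a hedged sketch rather than a definitive fix. The second error looks consistent with "/" also being query syntax in this Lucene version, so escaping only ":" would not be enough if the classic query parser is used. SolrJ provides ClientUtils.escapeQueryChars for this purpose; the escapeQueryChars method below is a hand-written stand-in so the example runs without SolrJ, and its character list is an assumption, not a copy of the library's. The value passed to document.addField should presumably stay unescaped, since escaping applies to query strings and not to stored field values.

```java
class QueryEscapeSketch {
    // Characters treated as query syntax in this sketch (assumed list).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Stand-in for a library escaper: prefixes every special or
    // whitespace character with a backslash.
    static String escapeQueryChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String rawId = "domain_http://www.domain.com/?p=12345";
        // The raw id is what would be stored; only the query text is escaped.
        System.out.println("id:" + escapeQueryChars(rawId));
    }
}
```

Compared with this, the {!term} approach avoids escaping altogether, which is why it is the simpler choice for exact id lookups.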
