My bad, using the "term query parser" works. Thanks, Ahmet.
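
For anyone finding this thread later: the ParseException in the quoted trace is raised by the query parser, which suggests the id was reaching Solr inside a query string such as "id:" + id rather than through addField. Below is a minimal sketch of the two ways to build such a query string. The class and helper names are made up for illustration, and escapeQueryChars here is a plain-Java stand-in written for this example; SolrJ ships its own ClientUtils.escapeQueryChars, which is the better choice in real code. As far as I can tell, "/" and "?" need escaping as well as ":", which would explain the second "Lexical error".

```java
// Sketch only: builds query strings, does not talk to Solr.
class TermQueryExample {

    // Hypothetical helper: prefix the raw value with the term query parser
    // local params, so the value itself is not parsed and needs no escaping.
    static String termQuery(String field, String rawValue) {
        return "{!term f=" + field + "}" + rawValue;
    }

    // Hypothetical stand-in for SolrJ's ClientUtils.escapeQueryChars:
    // backslash-escape characters that the classic query parser treats
    // as syntax, plus whitespace.
    static String escapeQueryChars(String s) {
        String special = "\\+-!():^[]\"{}~*?|&;/";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (special.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String id = "domain_http://www.domain.com/?p=12345";

        // Option 1: term query parser, value passed through untouched.
        System.out.println(termQuery("id", id));

        // Option 2: classic parser, every special character escaped.
        System.out.println("id:" + escapeQueryChars(id));
    }
}
```

Either string can then be handed to query.setQuery(...). The value passed to document.addField("id", id) should stay unescaped, since field values are not run through the query parser.
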
> Sent: Wednesday, September 12, 2012 at 19:40
> From: sy...@web.de
> To: solr-user@lucene.apache.org
> Subject: Re: Re: Cannot parse ":", using HTTP-URL as id
>
> > term query parser is your friend in this case. With this you don't need to
> > escape anything.
> >
> > SolrQuery query = new SolrQuery();
> > query.setQuery("{!term f=id}bar_http://bar.com/?doc=452");
>
> But how can I *store* a document with a URL as a field value? E.g.
> "domain_http://www.domain.com/?p=12345"
> The "term query parser" may be able to *retrieve* field values with a ":",
> but my current problem is that I can't store a value with ":" with *Solrj*, the
> Java library to communicate with Solr.
>
> > --- On Wed, 9/12/12, sy...@web.de <sy...@web.de> wrote:
> >
> > > From: sy...@web.de <sy...@web.de>
> > > Subject: Cannot parse ":", using HTTP-URL as id
> > > To: solr-user@lucene.apache.org
> > > Date: Wednesday, September 12, 2012, 7:40 PM
> > >
> > > Hi,
> > >
> > > I defined a field "id" in my schema.xml and use it as the
> > > <uniqueKey>:
> > > <field name="id" type="string" indexed="true"
> > > stored="true" required="true" />
> > > <uniqueKey>id</uniqueKey>
> > >
> > > I want to store URLs with a prefix in this field to be sure
> > > that every id is unique among websites. For example:
> > > domain_http://www.domain.com/?p=12345
> > > foo_http://foo.com
> > > bar_http://bar.com/?doc=452
> > >
> > > I wrote a Java app, which uses Solrj to communicate with a
> > > running Solr instance. Solr (or Solrj, not sure about this)
> > > complains that it can't parse ":":
> > > Exception in thread "main"
> > > org.apache.solr.common.SolrException:
> > > org.apache.lucene.queryparser.classic.ParseException:
> > > Cannot parse 'id:domain_http://www.domain.com/?p=12345':
> > > Encountered " ":" ": "" at line 1, column 14.
> > >
> > > How should I handle characters like ":" to solve this
> > > problem?
> > >
> > > I already tried to escape the ":" like this:
> > > String id = "domain_http://www.domain.com/?p=12345".replaceAll(":",
> > > "\\\\:");
> > > ...
> > > document.addField("id", id);
> > > ...
> > > But then Solr (or Solrj) complains again:
> > > Exception in thread "main"
> > > org.apache.solr.common.SolrException:
> > > org.apache.lucene.queryparser.classic.ParseException:
> > > Cannot parse
> > > 'id:domain_http\://www.domain.com/?p=12345': Lexical error
> > > at line 1, column 42. Encountered: <EOF> after :
> > > "/?p=12345"
> > >
> > > I use 4 backslashes (\\\\) for double-escape. The first
> > > escape is for Java itself, the second is for Solr to handle
> > > it (I guess).
> > >
> > > So what is the correct or usual way to deal with special
> > > characters like ":" in Solr (or Solrj)? I don't know if Solr
> > > or Solrj is the problem, but I guess it is Solrj?