I do in fact see your problem with an earlier 4.0 build, but not with
4.0-BETA.
-- Jack Krupansky
-----Original Message-----
From: Alexandre Rafalovitch
Sent: Thursday, September 06, 2012 10:13 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.0alpha: edismax complaints on certain characters
I am on 4.0 alpha. Maybe it was fixed in beta. But I am most
definitely seeing this in edismax. If I get rid of / and use
debugQuery, I get:
'responseHeader'=>{
'status'=>0,
'QTime'=>14,
'params'=>{
'debugQuery'=>'true',
'indent'=>'true',
'q'=>'foobar',
'qf'=>'TitleEN DescEN',
'wt'=>'ruby',
'defType'=>'edismax'}},
'response'=>{'numFound'=>0,'start'=>0,'docs'=>[]
},
'debug'=>{
'rawquerystring'=>'foobar',
'querystring'=>'foobar',
'parsedquery'=>'(+DisjunctionMaxQuery((DescEN:foobar |
TitleEN:foobar)))/no_coord',
'parsedquery_toString'=>'+(DescEN:foobar | TitleEN:foobar)',
'explain'=>{},
'QParser'=>'ExtendedDismaxQParser',
....
I'll check beta on my machine by tomorrow.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working. (Anonymous - via GTD
book)
On Thu, Sep 6, 2012 at 10:06 AM, Jack Krupansky <j...@basetechnology.com>
wrote:
That's what I was thinking, but when I tried foo/bar in Solr 3.6 and
4.0-BETA it was working fine - it split the term and generated the proper
query without any error.
I think the problem occurs when you use the default Lucene query parser,
not edismax. I removed &defType=edismax from my query request and the
problem reproduces.
My two test queries:
http://localhost:8983/solr/select/?debugQuery=true&defType=edismax&qf=features&q=foo/bar
http://localhost:8983/solr/select/?debugQuery=true&df=features&q=foo/bar
The first works; the second fails as reported (in 4.0-BETA, but works in
3.6).
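For anyone who wants to reproduce this, the two requests differ only in how the parser is selected. Below is a small Python sketch that builds both URLs. `build_url` is a hypothetical helper, and the host, port and field name are simply the ones from my test URLs above. Note that `urlencode` percent-encodes the slash as %2F, which should decode back to `/` on the server side.

```python
from urllib.parse import urlencode

# Base URL taken from the test queries in this thread.
BASE = "http://localhost:8983/solr/select/"

def build_url(params):
    """Hypothetical helper: append URL-encoded params to the select handler."""
    return BASE + "?" + urlencode(params)

# Request 1: edismax parser, searching the 'features' field via qf.
edismax_url = build_url({
    "debugQuery": "true",
    "defType": "edismax",
    "qf": "features",
    "q": "foo/bar",
})

# Request 2: default Lucene parser, 'features' as the default field via df.
lucene_url = build_url({
    "debugQuery": "true",
    "df": "features",
    "q": "foo/bar",
})

print(edismax_url)
print(lucene_url)
```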
-- Jack Krupansky
-----Original Message-----
From: Yonik Seeley
Sent: Thursday, September 06, 2012 9:53 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.0alpha: edismax complaints on certain characters
I believe this is caused by the regex support in
https://issues.apache.org/jira/browse/LUCENE-2039
It certainly seems wrong to interpret a slash in the middle of the
word as the start of a regex, so I've reopened the issue.
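
In the meantime, if a mid-word slash really is being read as the start of a regex, one possible client-side workaround is to backslash-escape it before the query is sent. Here is a minimal Python sketch. `SPECIAL_CHARS` and `escape_query_chars` are hypothetical names, and the character list is an assumption based on the classic Lucene query syntax plus `/`, not something confirmed in this thread. Escaping everything also disables intentional operators such as `+` or quotes, so it only suits raw user text.

```python
# Assumed set of characters with special meaning in the classic Lucene
# query syntax; '/' is included because 4.0 treats it as a regex delimiter.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')

def escape_query_chars(text):
    """Prefix every special character in text with a backslash."""
    return ''.join('\\' + ch if ch in SPECIAL_CHARS else ch for ch in text)

# The query from the original report, with the slash escaped.
escaped = escape_query_chars('foo/bar')
print(escaped)
```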
-Yonik
http://lucidworks.com
On Thu, Sep 6, 2012 at 9:34 AM, Alexandre Rafalovitch
<arafa...@gmail.com> wrote:
Hello,
I was under the impression that edismax was supposed to be crash-proof
and just ignore bad syntax. But either I am misconfiguring it or I have
hit a weird bug. I basically searched for text containing '/' and got this:
{
'responseHeader'=>{
'status'=>400,
'QTime'=>9,
'params'=>{
'qf'=>'TitleEN DescEN',
'indent'=>'true',
'wt'=>'ruby',
'q'=>'foo/bar',
'defType'=>'edismax'}},
'error'=>{
'msg'=>'org.apache.lucene.queryparser.classic.ParseException:
Cannot parse \'foo/bar \': Lexical error at line 1, column 9.
Encountered: <EOF> after : "/bar "',
'code'=>400}}
Is that normal? If it is, is there a known list of characters I need
to escape, or do I just have to catch the exception and tell the user
not to do this again?
Regards,
Alex.