Opened a JIRA issue: https://issues.apache.org/jira/browse/SOLR-3589, which also lists a couple other related mailing list posts.
On Thu, Jun 28, 2012 at 12:18 PM, Tom Burton-West <tburt...@umich.edu> wrote:
> Hello,
>
> My previous e-mail with a CJK example has received no replies. I
> verified that this problem also occurs for English. For example, in the
> case of the word "fire-fly", the ICUTokenizer and the WordDelimiterFilter
> both split this into two tokens, "fire" and "fly".
>
> With an edismax query and a must match of 2 (q={!edismax mm=2}), if the
> words are entered separately as [fire fly], the edismax parser honors the
> mm parameter and does the equivalent of a Boolean AND query. However, if
> the words are entered as a hyphenated word [fire-fly], the tokenizer splits
> these into two tokens "fire" and "fly" and the edismax parser does the
> equivalent of a Boolean OR query.
>
> I'm not sure I understand the output of the debugQuery, but judging by the
> number of hits returned it appears that edismax is not honoring the mm
> parameter. Am I missing something, or is this a bug?
>
> I'd like to file a JIRA issue, but want to find out if I am missing
> something here.
>
> Details of several queries are appended below.
>
> Tom Burton-West
>
> edismax query with mm=2 and the hyphenated word [fire-fly]
>
> <lst name="debug">
> <str name="rawquerystring">{!edismax mm=2}fire-fly</str>
> <str name="querystring">{!edismax mm=2}fire-fly</str>
> <str name="parsedquery">+DisjunctionMaxQuery(((ocr:fire ocr:fly)))</str>
> <str name="parsedquery_toString">+((ocr:fire ocr:fly))</str>
>
>
> Entered as separate words [fire fly] numFound="184962"
> edismax mm=2
> <lst name="debug">
> <str name="rawquerystring">{!edismax mm=2}fire fly</str>
> <str name="querystring">{!edismax mm=2}fire fly</str>
> <str name="parsedquery">
> +((DisjunctionMaxQuery((ocr:fire)) DisjunctionMaxQuery((ocr:fly)))~2)
> </str>
>
>
> Regular Boolean AND query: [fire AND fly] numFound="184962"
> <str name="rawquerystring">fire AND fly</str>
> <str name="querystring">fire AND fly</str>
> <str name="parsedquery">+ocr:fire +ocr:fly</str>
> <str name="parsedquery_toString">+ocr:fire +ocr:fly</str>
>
> Regular Boolean OR query: [fire OR fly] numFound="366047"
> <lst name="debug">
> <str name="rawquerystring">fire OR fly</str>
> <str name="querystring">fire OR fly</str>
> <str name="parsedquery">ocr:fire ocr:fly</str>
> <str name="parsedquery_toString">ocr:fire ocr:fly</str>
>
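
For anyone who wants to rerun the comparison, here is a minimal Python sketch that builds request URLs for the four queries quoted above. The endpoint (host, port, and /solr/select path) is an assumption and will differ per install; only the query strings and mm=2 come from the quoted message. It only constructs the URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port, and core path for your install.
SOLR_SELECT = "http://localhost:8983/solr/select"

# Query strings copied from the quoted message; the labels are illustrative only.
QUERIES = {
    "edismax, hyphenated": "{!edismax mm=2}fire-fly",
    "edismax, separate words": "{!edismax mm=2}fire fly",
    "boolean AND": "fire AND fly",
    "boolean OR": "fire OR fly",
}


def build_url(q, base=SOLR_SELECT):
    """Return a select URL that asks for debug output and no documents."""
    # rows=0 keeps the response small; numFound is still reported.
    params = {"q": q, "rows": 0, "debugQuery": "true"}
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    for label, q in QUERIES.items():
        print(f"{label}: {build_url(q)}")
```

Comparing numFound and parsedquery across the four responses should show whether the hyphenated form behaves like the AND query or the OR query on a given index.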