Re: problems when hunspell returns multiple stems

Michael Sokolov Tue, 18 Nov 2014 11:40:03 -0800

followup - hunspell has:

follow/SDRZGJ
follower/M
following/M


follow/G generates following

I guess the reason for the /M entries is to represent the nouns, which have plural endings, so that


following->followings

-- I'm not really sure where the bug is, but it seems as if generating multiple "stems" causes issues
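
To make the double-stem case concrete, here is a toy sketch of how the entries above could yield two stems for one surface form. It is only a model: the `stems` function and the `SUFFIXES` table are hypothetical, the only affix rule modelled is the one noted above (follow/G generates following), and this is not Hunspell's actual algorithm or affix file format.

```python
# Toy model only: not Hunspell's real algorithm or .aff/.dic parsing.
# Dictionary entries copied from the message above: word -> set of flags.
DICTIONARY = {
    "follow": set("SDRZGJ"),
    "follower": set("M"),
    "following": set("M"),
}

# Assumption for illustration: flag G appends "ing"
# (taken from "follow/G generates following"); other flags are not modelled.
SUFFIXES = {"G": "ing"}


def stems(word):
    """Return every dictionary entry that can account for `word`."""
    found = []
    # Case 1: the word is itself a dictionary entry.
    if word in DICTIONARY:
        found.append(word)
    # Case 2: stripping a known suffix leaves an entry that carries the flag.
    for flag, suffix in SUFFIXES.items():
        if word.endswith(suffix):
            base = word[: -len(suffix)]
            if flag in DICTIONARY.get(base, set()):
                found.append(base)
    return found


print(stems("following"))  # ['following', 'follow']
```

In this model "following" is reachable both directly and through follow + G, which would match the two terms at the same position seen in the analysis output.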



On 11/18/2014 02:33 PM, Michael Sokolov wrote:

I find that a query for stemmed terms sometimes fails with the edismax query parser and hunspell stemmer. Looking at the output of analysis for the query (text:following) I can see that it generates two different terms at the same position: "follow" and "following". Then edismax seems to generate a sloppy phrase query from that; in the debug output of the query I can see ( text:following text:follow)~2. This doesn't match anything, even though the words follow and following (as well as followed, follows, etc) both occur in various documents.
First, I'm confused as to what the source of the sloppy query is. Here are the relevant settings from solrconfig:
<str name="defType">edismax</str>
<str name="qf">archive_id^1 author^20 chapter_title^15 isbn^1 publisher^5 subjects^5 text^1 title^120</str>
<str name="pf">chapter_title~2^1 subjects~2^20 text~10^1 title~2^4</str>
<str name="mm">100%</str>
<str name="q.op">OR</str>

Is there some process that generates a slop query for co-occurring terms?
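
For anyone who wants to reproduce the comparison, a minimal sketch that builds two debug requests for the same term, one per query parser, so the parsed queries can be read side by side. The host, port and core name in `BASE_URL` are placeholders, `debug_url` is a hypothetical helper, and `rows=0` is just a choice to keep the response small; `q`, `defType`, `qf` and `debugQuery` are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Placeholder endpoint: host, port and core name are assumptions.
BASE_URL = "http://localhost:8983/solr/collection1/select"


def debug_url(query, def_type, extra=None):
    """Build a select URL that asks Solr to include query debug output."""
    params = {
        "q": query,
        "defType": def_type,
        "debugQuery": "true",  # adds the parsed query to the response
        "rows": "0",           # only the debug section is of interest here
    }
    params.update(extra or {})
    return BASE_URL + "?" + urlencode(params)


# Same term through both parsers; qf is only meaningful for edismax.
edismax_url = debug_url("following", "edismax", {"qf": "text"})
lucene_url = debug_url("text:following", "lucene")

print(edismax_url)
print(lucene_url)
```

Fetching both URLs and comparing the parsed query in each debug section should show whether the phrase form only appears under edismax.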
As an aside, the same query returns a document when we use the lucene query parser: it matches one document. But when I search across our unstemmed field, it returns more. It appears as if
It seems as if when hunspell returns multiple terms from a single one, this causes problems?
So in summary: why would hunspell generate "following" as a stem for "following"? Probably just a buggy dictionary entry; we could fix that, but I wouldn't expect the phrase behavior in that case from edismax either. Can anybody shed some light as to what's going on here?
Thanks

-Mike
