On Wed, May 2, 2012 at 12:16 PM, Ken Krugler
<kkrugler_li...@transpac.com> wrote:

> What confuses me is that Suggester says it's based on SpellChecker, which 
> supposedly does work with shards.
>

It is based on the spellchecker APIs, but spellchecker's ranking is based
on simple comparators like string similarity, whereas suggesters use
weights.

When spellchecker merges results from shards, it just combines all their
top-N suggestions into one set and recomputes the same string-distance
scores over again.
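
A rough sketch of that merge style, for illustration only. This is not
Solr's code: merge_spellcheck is a hypothetical name, and difflib's
SequenceMatcher ratio stands in for whatever string-distance measure the
spellchecker is configured with. The point is that the score depends only
on the query term and the candidate, so the merging node can recompute it
without knowing anything about the shards.

```python
from difflib import SequenceMatcher


def merge_spellcheck(query_term, shard_results, top_n=5):
    """Union the per-shard top-N lists, then re-rank by string similarity.

    shard_results is a list of lists of suggestion strings, one list per shard.
    """
    # Collapse duplicates: the same suggestion may come back from several shards.
    candidates = set()
    for suggestions in shard_results:
        candidates.update(suggestions)

    # Recompute the similarity from scratch on the merging node.
    # (Assumption: SequenceMatcher is only a stand-in for the real distance.)
    scored = [
        (SequenceMatcher(None, query_term, candidate).ratio(), candidate)
        for candidate in candidates
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored[:top_n]]


shard_a = ["lucent", "lucerne"]
shard_b = ["lucene", "lucent"]
merged = merge_spellcheck("lucene", [shard_a, shard_b], top_n=2)
```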

So the suggester can't possibly work correctly this way (forget about
any technical details): how can it make assumptions about the
weights you provided? If they were, e.g., log() weights from your query
logs, then it would need to do log-summation across the shards, etc., for
the final combined weight to be correct. This is specific to how you
originally computed the weights you gave it. It certainly cannot
recompute anything the way spellchecker does :)
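
To make the log() case concrete, here is a minimal sketch, assuming each
shard stored log(count) as the weight for a term. merge_log_weights and the
dict-per-shard input shape are hypothetical, not anything Solr provides.
Under that assumption the combined weight is log(count_a + count_b), which
is neither the sum nor the max of the per-shard weights.

```python
import math
from collections import defaultdict


def merge_log_weights(shard_results):
    """Combine per-shard log(count) weights into log(total count) per term.

    shard_results is a list of dicts mapping term -> log weight, one per shard.
    """
    per_term = defaultdict(list)
    for suggestions in shard_results:
        for term, log_weight in suggestions.items():
            per_term[term].append(log_weight)

    merged = {}
    for term, logs in per_term.items():
        # log-sum-exp: subtract the largest value before exponentiating
        # so large weights do not overflow.
        peak = max(logs)
        merged[term] = peak + math.log(sum(math.exp(w - peak) for w in logs))
    return merged


# Assumed data: "solr" was seen 30 times on one shard and 70 on the other.
shard_a = {"solr": math.log(30), "solar": math.log(5)}
shard_b = {"solr": math.log(70)}
combined = merge_log_weights([shard_a, shard_b])
```

Here combined["solr"] comes out as log(100). Weights computed some other
way would need a different combining rule, and a term that falls outside a
shard's top-N contributes nothing from that shard, so even this merge is
only approximate.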

Anyway, if you really want to do it, maybe
https://issues.apache.org/jira/browse/SOLR-2848 is helpful. The
background is that in 3.x there is really only one spellchecker impl
(AbstractLucene or something like that). I don't think distributed
spellcheck works with any other SpellChecker subclasses in 3.x; I
think it's "wired" to only work with the Abstract-Lucene ones.

When we added another subclass to 4.0, DirectSpellChecker, James saw
that it was broken here and cleaned up the APIs so that spellcheckers
can override this merge() operation. Unfortunately I forgot to commit
those refactorings (which let any spellchecker override merge()ing) to
the 3.x branch, but the ideas might be useful.

-- 
lucidimagination.com
