On Wed, May 2, 2012 at 12:16 PM, Ken Krugler <kkrugler_li...@transpac.com> wrote:
> What confuses me is that Suggester says it's based on SpellChecker, which
> supposedly does work with shards.
>

It is based on the spellchecker APIs, but spellchecker's ranking is based on simple comparators like string similarity, whereas suggesters use weights.

When spellchecker merges results from shards, it just merges all their top-N into one set and recomputes the same distance measure over again. So suggester can't possibly work correctly this way (forget about any technical details): how can it make assumptions about the weights you provided? If they were, e.g., log() weights from your query logs, then it would need to do log-summation across the shards, etc., for the final combined weight to be correct. This is specific to how you originally computed the weights you gave it. It certainly cannot be recomputing anything like spellchecker does :)

Anyway, if you really want to do it, maybe https://issues.apache.org/jira/browse/SOLR-2848 is helpful. The background is that in 3.x there is really only one spellchecker impl (AbstractLucene or something like that). I don't think distributed spellcheck works with any other SpellChecker subclasses in 3.x; I think it's "wired" to only work with the Abstract-Lucene ones.

When we added another subclass to 4.0, DirectSpellChecker, James saw that it was broken here and cleaned up the APIs so that spellcheckers can override this merge() operation. Unfortunately I forgot to commit those refactorings James did (which let any spellchecker override merge()ing) to the 3.x branch, but the ideas might be useful.

--
lucidimagination.com
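
P.S. To make the log-weight case concrete, here is a minimal sketch in Python. It is not Solr code and not what SOLR-2848 does. It assumes each shard returns (term, weight) pairs whose weights are natural-log query counts; the names log_sum_exp and merge_shard_suggestions are made up for illustration. Under that assumption, the combined weight of a term is the log of the summed counts, not the max or the plain sum of the per-shard weights.

```python
import math
from collections import defaultdict


def log_sum_exp(values):
    """Return log(sum(exp(v) for v in values)) in a numerically stable way."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def merge_shard_suggestions(shard_results, top_n):
    """Merge per-shard suggestion lists whose weights are log counts.

    shard_results: one list per shard of (term, log_weight) pairs.
    Returns the top_n (term, combined_log_weight) pairs, highest weight first.
    """
    per_term = defaultdict(list)
    for shard in shard_results:
        for term, log_weight in shard:
            per_term[term].append(log_weight)

    # Combine in count space: log(count_1 + count_2 + ...).
    merged = [(term, log_sum_exp(ws)) for term, ws in per_term.items()]

    # Sort by weight descending, breaking ties by term for a stable order.
    merged.sort(key=lambda tw: (-tw[1], tw[0]))
    return merged[:top_n]


# Hypothetical data: two shards, each reporting log(query count) per term.
shard_a = [("solr", math.log(100)), ("solar", math.log(40))]
shard_b = [("solar", math.log(90)), ("solr", math.log(20))]
merged = merge_shard_suggestions([shard_a, shard_b], top_n=2)
```

With this data "solar" ends up ahead of "solr" (130 combined queries vs. 120), even though "solr" has the single largest per-shard weight. Note that if a shard only returns its own top-N, a term missing from that list contributes nothing to the sum, so the combined weight can still be an underestimate. And if the weights were computed some other way, the merge function would have to change accordingly, which is the point made above.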