I've been working on getting AnalyzingInfixSuggester to make suggestions
using tokens drawn from multiple fields. I've done this by copying
tokens from each of those fields into a destination field, and building
suggestions using that destination field. This allows me to use
different analysis strategies for each of the fields, which I need, but
it doesn't address a couple of remaining issues:
1. Some source fields are more important than others, and it would be
good to be able to give their tokens greater weight.
2. The threshold is applied equally across all tokens, but for some
fields we want to suggest singletons (threshold=0), while for others we
want to use the threshold to exclude low-frequency terms.
I looked a little at how to extend the whole framework, from Solr on
down, to handle multiple source fields intrinsically rather than by
copying, and it looks like I could manage this by extending
DocumentDictionary and plugging in a different DictionaryFactory. Does
that sound like a good approach? Is there a better way to solve this
problem?
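To make the question concrete, here is a rough sketch of the per-field
behavior I'm after, written in plain Java with no Lucene or Solr
classes. Everything in it is hypothetical: FieldConfig, buildWeights,
and the field names are made up for illustration, and reading the
threshold as a fraction of documents is my assumption about how it
should apply per field. A real version would presumably live inside a
custom Dictionary, which is the part I'm unsure about.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class MultiFieldSuggestSketch {

    /** Hypothetical per-field settings (not a Lucene or Solr class). */
    public static final class FieldConfig {
        final double boost;      // multiplier applied to a term's document frequency
        final double threshold;  // minimum fraction of documents a term must appear in

        public FieldConfig(double boost, double threshold) {
            this.boost = boost;
            this.threshold = threshold;
        }
    }

    /**
     * Merges per-field document frequencies into one term -> weight map.
     * A term is dropped from a field when docFreq < threshold * numDocs,
     * so threshold = 0 keeps singletons. If a term survives in several
     * fields, the largest boosted weight wins.
     */
    public static Map<String, Long> buildWeights(
            Map<String, Map<String, Integer>> docFreqsByField,
            Map<String, FieldConfig> configs,
            int numDocs) {
        Map<String, Long> weights = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> field : docFreqsByField.entrySet()) {
            FieldConfig cfg = configs.get(field.getKey());
            if (cfg == null) {
                continue; // field is not configured as a suggestion source
            }
            double minDocs = cfg.threshold * numDocs;
            for (Map.Entry<String, Integer> term : field.getValue().entrySet()) {
                int docFreq = term.getValue();
                if (docFreq < minDocs) {
                    continue; // below this field's own threshold
                }
                long weight = Math.round(docFreq * cfg.boost);
                weights.merge(term.getKey(), weight, Long::max);
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        // Made-up frequencies for two made-up fields in a 10-document index.
        Map<String, Map<String, Integer>> freqs = new LinkedHashMap<>();
        Map<String, Integer> title = new LinkedHashMap<>();
        title.put("lucene", 1);
        Map<String, Integer> body = new LinkedHashMap<>();
        body.put("lucene", 3);
        body.put("search", 6);
        freqs.put("title", title);
        freqs.put("body", body);

        Map<String, FieldConfig> configs = new LinkedHashMap<>();
        configs.put("title", new FieldConfig(2.0, 0.0)); // important field, keep singletons
        configs.put("body", new FieldConfig(1.0, 0.5));  // drop low-frequency terms

        System.out.println(buildWeights(freqs, configs, 10));
    }
}
```

Taking the maximum weight when a term appears in more than one field is
just one choice; summing the boosted weights would be another, and I
don't know which fits the suggester better.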
Thanks
-Mike
PS Sorry for the cross-post; I realized after I hit send this was
probably a better question for solr-user than lucene...