I've been working on getting AnalyzingInfixSuggester to make suggestions using tokens drawn from multiple fields. I've done this by copying tokens from each of those fields into a destination field, and building suggestions using that destination field. This allows me to use different analysis strategies for each of the fields, which I need, but it doesn't address a couple of remaining issues:

1. Some source fields are more important than others, and it would be good to be able to give their tokens greater weight.

2. The threshold is applied equally across all tokens, but for some fields we want to suggest singletons (threshold=0), while for others we want to use the threshold to exclude low-frequency terms.
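
To make those two requirements concrete, here is a rough sketch in plain Java of the behavior I'm after. It deliberately does not use the Lucene or Solr APIs, and every name in it (FieldAwareSuggestions, FieldPolicy, build) is hypothetical. It also assumes that the threshold is a minimum fraction of documents a term must appear in, and that a field's weight is a simple multiplier on the term's document frequency; a real implementation might want something different for either.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical names, not Lucene or Solr API.
final class FieldAwareSuggestions {

    /** Per-field settings: a weight multiplier and a document-frequency threshold. */
    static final class FieldPolicy {
        final long weight;
        final double threshold; // assumed: fraction of documents, 0.0 keeps singletons

        FieldPolicy(long weight, double threshold) {
            this.weight = weight;
            this.threshold = threshold;
        }
    }

    /** A candidate suggestion with its combined weight across fields. */
    static final class Suggestion {
        final String term;
        final long weight;

        Suggestion(String term, long weight) {
            this.term = term;
            this.weight = weight;
        }
    }

    /**
     * Combines per-field document frequencies into one weighted list.
     * docFreqsByField maps field name -> (term -> number of documents containing it).
     */
    static List<Suggestion> build(Map<String, Map<String, Long>> docFreqsByField,
                                  Map<String, FieldPolicy> policies,
                                  long numDocs) {
        Map<String, Long> combined = new HashMap<>();
        for (Map.Entry<String, Map<String, Long>> field : docFreqsByField.entrySet()) {
            FieldPolicy policy = policies.get(field.getKey());
            if (policy == null) {
                continue; // field is not configured as a suggestion source
            }
            // Each field gets its own cutoff instead of one shared threshold.
            long minDocs = (long) (policy.threshold * numDocs);
            for (Map.Entry<String, Long> term : field.getValue().entrySet()) {
                long docFreq = term.getValue();
                if (docFreq < minDocs) {
                    continue; // too rare for this particular field
                }
                // Important fields contribute more to the final weight.
                combined.merge(term.getKey(), docFreq * policy.weight, Long::sum);
            }
        }
        List<Suggestion> out = new ArrayList<>();
        combined.forEach((term, weight) -> out.add(new Suggestion(term, weight)));
        // Highest weight first; ties broken alphabetically for stable output.
        out.sort(Comparator.comparingLong((Suggestion s) -> s.weight).reversed()
                .thenComparing(s -> s.term));
        return out;
    }
}
```

With a policy like title = (weight 10, threshold 0) and body = (weight 1, threshold 0.5), a term that appears once in a title is still suggested, while a term that appears only once in a body field is dropped.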

I looked a little bit at how to extend the whole framework from Solr on down to handle multiple source fields intrinsically, rather than using the copying technique, and it looks like I could possibly manage something like this by extending DocumentDictionary and plugging in a different DictionaryFactory. Does that sound like a good approach? Is there some better way to approach this problem?

Thanks

-Mike

PS Sorry for the cross-post; I realized after I hit send this was probably a better question for solr-user than lucene...
