On 12/19/2017 4:38 AM, Markus Jelsma wrote:
> I have an interesting issue with mm and SynonymQuery and KeywordRepeatFilter. 
> We do query time synonym expansion and use KeywordRepeat for not only finding 
> stemmed tokens. Our synonyms are already preprocessed and contain only 
> stemmed tokens. Synonym file contains: traject,verbind
>
> So, any non-root stem that ends up in a synonym is actually a search for 
> three terms: +DisjunctionMaxQuery(((title_nl:trajecten 
> Synonym(title_nl:traject title_nl:verbind))))
>
> But, our default mm requires that two terms must match if the input query 
> consists of two terms: 2<-1 5<-2 6<90%
>
> So, a simple query looking for a plural (trajecten) will not match a document 
> where the title contains only its singular form: q=trajecten will not match 
> document with title_nl:"een traject"

I would think that doing synonym expansion at index time would remove
any possible confusion about the number of terms at query time.  Queries
that involve synonyms would be slightly less complex, but the index would
be larger, so it's difficult to say whether those queries would be any
faster.

There is one clear disadvantage to index-time synonym expansion: If you
change your synonyms, you have to reindex.
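
To make the term-count question concrete, here is a rough Python sketch of
how an mm spec like the one quoted above (2<-1 5<-2 6<90%) resolves to a
required number of matches for a given clause count. It follows my reading
of the documented mm syntax and is not Solr's actual code; the function
names are made up for illustration, and the clause count is simply an
input, since how many clauses Solr counts after synonym expansion is
exactly what is in question here.

```python
def _apply(rule: str, clause_count: int) -> int:
    """Apply one unconditional mm rule (e.g. "3", "-1", "90%", "-25%")."""
    if rule.endswith("%"):
        # Percentages are truncated toward zero (assumption: mirrors the
        # documented "rounded down" behaviour for positive percentages).
        calc = int(clause_count * int(rule[:-1]) / 100)
    else:
        calc = int(rule)
    # A negative value means "all clauses except this many".
    return clause_count + calc if calc < 0 else calc


def min_should_match(spec: str, clause_count: int) -> int:
    """Resolve an mm spec to the number of clauses that must match."""
    if "<" not in spec:
        return max(_apply(spec.strip(), clause_count), 0)

    # Conditional form: "N<rule" applies only when there are MORE than N
    # clauses. Until some condition applies, every clause is required.
    result = clause_count
    for part in spec.split():
        bound, rule = part.split("<", 1)
        if clause_count <= int(bound):
            break  # keep the result of the last condition that applied
        result = _apply(rule, clause_count)
    return max(result, 0)


if __name__ == "__main__":
    spec = "2<-1 5<-2 6<90%"
    for n in range(1, 11):
        print(n, "clauses ->", min_should_match(spec, n), "must match")
```

With this spec, one or two clauses must all match, so if a single input
word ends up counted as more than one clause, the requirement changes
accordingly.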

Thanks,
Shawn
