On 12/19/2017 4:38 AM, Markus Jelsma wrote:
> I have an interesting issue with mm and SynonymQuery and KeywordRepeatFilter.
> We do query time synonym expansion and use KeywordRepeat for not only finding
> stemmed tokens. Our synonyms are already preprocessed and contain only
> stemmed tokens. Synonym file contains: traject,verbind
>
> So, any non-root stem that ends up in a synonym is actually a search for
> three terms: +DisjunctionMaxQuery(((title_nl:trajecten
> Synonym(title_nl:traject title_nl:verbind))))
>
> But, our default mm requires that two terms must match if the input query
> consists of two terms: 2<-1 5<-2 6<90%
>
> So, a simple query looking for a plural (trajecten) will not match a document
> where the title contains only its singular form: q=trajecten will not match
> document with title_nl:"een traject"

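For reference, here is a rough sketch of how an mm spec like the one quoted above resolves to a required clause count, following the documented meaning of conditional mm expressions. This is not Solr's code: the function names are hypothetical, the spec is assumed to have no spaces around "<", and how Solr counts the clauses of a synonym-expanded query is not modeled at all.

```python
def _resolve_simple(count: int, expr: str) -> int:
    """Resolve one unconditional mm expression: 3, -1, 90% or -25%."""
    if expr.endswith("%"):
        percent = int(expr[:-1])
        # Integer math, truncated toward zero (90% of 7 clauses is 6).
        portion = abs(count * percent) // 100
        return count - portion if percent < 0 else portion
    value = int(expr)
    # A negative number means "all clauses but this many".
    return count + value if value < 0 else value


def min_should_match(count: int, spec: str) -> int:
    """Hypothetical helper: how many of `count` optional clauses must match."""
    result = count  # by default every clause is required
    spec = spec.strip()
    if "<" in spec:
        # Conditional parts such as "2<-1" apply only when count exceeds the bound.
        for part in spec.split():
            bound, expr = part.split("<", 1)
            if count <= int(bound):
                break
            result = _resolve_simple(count, expr)
    else:
        result = _resolve_simple(count, spec)
    # Never require more clauses than exist, and never fewer than zero.
    return max(0, min(result, count))


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 6, 7):
        print(n, min_should_match(n, "2<-1 5<-2 6<90%"))
```

With this spec, one or two clauses are all required, and three clauses require two matches. So if the expanded query is treated as three clauses, a document containing only "traject" would fall short of the minimum.
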
I would think that doing synonym expansion at index time would remove any possible confusion about the number of terms at query time. Queries that involve synonyms will be slightly less complex, but the index would be larger, so it's difficult to say whether those kinds of queries would be any faster or not.

There is one clear disadvantage to index-time synonym expansion: If you change your synonyms, you have to reindex.

Thanks,
Shawn