We help clients that expand terms to their hypernyms at index time. For example, they will have a synonyms file that does the following:

wing_tips => wing_tips, dress_shoes, shoes
dress_shoes => dress_shoes, shoes
oxfords => oxfords, dress_shoes, shoes

Then at query time, we rely on the differing IDF of these terms in the same position to bring up matches on the rare, specific terms first, followed by increasingly semantically broad matches. For example, a search for wing_tips previously got turned into "wing_tips OR dress_shoes OR shoes". Shoes, being very common, would get scored lowest. Wing tips, being very specific, would get scored very highly. (I have a blog post about this, which uses Elasticsearch: http://opensourceconnections.com/blog/2016/12/23/elasticsearch-synonyms-patterns-taxonomies/ )

As our clients upgrade to Solr 6 and above, we're noticing our technique no longer works due to SynonymQuery, which blends the doc freq of synonyms at query time. SynonymQuery seems to be the right direction for most people :) Still, I would like to figure out how/if there's a setting anywhere to return to the legacy behavior (a boolean query of term queries) so I don't have to go back to the drawing board for clients that rely on this technique.

I've been going through QueryBuilder and I don't see where we could go back to the legacy behavior. It seems to be based on position overlap.

Thanks!
-Doug

--
Consultant, OpenSource Connections. Contact info at http://o19s.com/about-us/doug-turnbull/; Free/Busy (http://bit.ly/dougs_cal)
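
P.S. In case it helps to see the scoring difference, here is a rough Python sketch rather than actual Lucene code. It assumes a BM25-style IDF and assumes SynonymQuery scores all terms in a position with a single shared doc freq (my understanding is that it takes the largest one, but please correct me if that's wrong). The corpus size and doc freqs are made up purely for illustration.

```python
import math


def bm25_idf(doc_freq, num_docs):
    """BM25-style inverse document frequency (assumed formula)."""
    return math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical corpus statistics, invented for this illustration only.
NUM_DOCS = 100_000
DOC_FREQ = {"wing_tips": 50, "dress_shoes": 2_000, "shoes": 40_000}

# Legacy behavior: a boolean query of term queries, so each term keeps
# its own IDF and rare terms contribute more to the score.
legacy_idf = {term: bm25_idf(df, NUM_DOCS) for term, df in DOC_FREQ.items()}

# Blended behavior (assumption): every term in the position is scored
# with one shared doc freq, taken here as the largest of the group.
blended_idf = bm25_idf(max(DOC_FREQ.values()), NUM_DOCS)

for term, idf in sorted(legacy_idf.items(), key=lambda kv: -kv[1]):
    print(f"{term:12s} legacy idf = {idf:.3f}")
print(f"blended idf shared by all terms = {blended_idf:.3f}")
```

With per-term IDF, the specific-to-broad ordering falls out of the statistics. With a shared doc freq, a wing_tips match and a shoes match get the same IDF weight, which is the ranking we lose.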