: For instance, my dictionary holds the following terms:
:
: 1 - a b c d
: 2 - c d e
: 3 - a b
: 4 - a e f g h
:
: If I put the sentence [a b c d f g h] in as a query, I want to receive
: dictionary items 1 (matching all words a b c d) and 3 (matching words a b)
: as matches

this is a pretty hard problem in general ... in my mind i call it the "longest matching sub-phrase" problem, but i have no idea if it has a real name.

the only solution i know of using Lucene is to construct a phrase query for each of the sub-phrases of the input, giving a bigger query boost to the "longer" phrases ... but it might be possible to design a custom Query impl for solving this problem. (i've never had an important enough use case to dedicate a significant amount of time to figuring it out)

-Hoss
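
to make the "one phrase query per sub-phrase" idea concrete, here is a minimal sketch in plain Python (no Lucene dependency) that enumerates the contiguous sub-phrases of the input and emits a query string in the classic query parser syntax, `field:"phrase"^boost`. the field name `term`, the helper names, and the choice of "boost = number of words" are all assumptions made for illustration; nothing here escapes query-parser special characters.

```python
def sub_phrases(words):
    """Return every contiguous sub-phrase of `words`, longest first."""
    n = len(words)
    return [words[i:i + length]
            for length in range(n, 0, -1)      # longest phrases first
            for i in range(n - length + 1)]    # slide a window of that length


def build_boosted_query(sentence, field="term"):
    """Build one quoted phrase clause per sub-phrase, OR'ed together.

    The boost is simply the phrase length in words (an assumed weighting),
    so a match on a longer sub-phrase contributes more to the score.
    `field` is a hypothetical field name.
    """
    clauses = []
    for phrase in sub_phrases(sentence.split()):
        clauses.append('%s:"%s"^%d' % (field, " ".join(phrase), len(phrase)))
    return " OR ".join(clauses)


if __name__ == "__main__":
    print(build_boosted_query("a b c d f g h"))
```

for the seven-word example sentence this produces 28 clauses (n(n+1)/2 for n words), so the query grows quadratically with the input length. note also that on a tokenized field a phrase clause matches any entry that contains the phrase, not only entries that equal it: entry 4 would still match through "f g h", just with a lower boost than entry 1 gets from "a b c d". the boosts only influence ranking; they do not filter.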