: Query: "foo bar"
: Doc1: "foo bar baz"
: Doc2: "foo bar foo bar"
:
: These two documents should be scored exactly the same. I accomplished the
: above in the "normal" query use-case by using the SweetSpotSimilarity class.

You can change this by subclassing SweetSpotSimilarity (or any Similarity class) and overriding the tf(float) function. tf(int) is called for terms, while tf(float) is called for phrases -- the float value is lower for phrases with a lot of slop, and higher for exact matches.

Unfortunately, the input to tf(float) is lossy in accounting for docs that match the phrase multiple times ... the value of "1.0f" might mean it matches the phrase once exactly, or it might mean that it matches many times in a sloppy manner.

In your case, it sounds like you just want it to return "1" for any input except "0.0f".

-Hoss
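
A rough sketch of that override, so the idea is concrete. To keep it runnable without Lucene on the classpath, StandInSimilarity below is a made-up stand-in for the two tf methods discussed above (its square-root defaults are an assumption, not SweetSpotSimilarity's actual curve), and FlatPhraseTfSimilarity is a hypothetical name. In real code you would extend SweetSpotSimilarity (or whichever Similarity you use) instead, and should check the tf signatures against your Lucene version.

```java
// Stand-in for the relevant slice of a Lucene Similarity class (an assumption,
// not the real API): tf(int) scores single terms, tf(float) scores phrases.
class StandInSimilarity {
    public float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    public float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}

// Hypothetical subclass: any phrase match at all counts as exactly one match,
// so repeated or sloppy phrase matches no longer change the tf factor.
class FlatPhraseTfSimilarity extends StandInSimilarity {
    @Override
    public float tf(float freq) {
        // 0.0f means the phrase did not match; everything else is flattened to 1.
        return freq > 0.0f ? 1.0f : 0.0f;
    }
}

class FlatPhraseTfDemo {
    public static void main(String[] args) {
        StandInSimilarity sim = new FlatPhraseTfSimilarity();
        System.out.println(sim.tf(0.0f));  // phrase did not match
        System.out.println(sim.tf(1.0f));  // one exact match, or several sloppy ones
        System.out.println(sim.tf(2.0f));  // e.g. two exact matches
        System.out.println(sim.tf(4));     // term scoring via tf(int) is untouched
    }
}
```

With the real classes you would install the subclass on your searcher with setSimilarity. Note that this only flattens the tf factor; length normalization and the other parts of the score still apply, so that has to be handled separately if the two docs are to score exactly the same.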