Thanks for the suggestion. Unfortunately, my implementation requires the Standard query parser: I sanitize and expand user queries into deeply nested queries with custom boosts and other bells and whistles that make Dismax unappealing.

I see from the docs that Similarity.sloppyFreq() is the method that gives a higher score to sloppy phrase matches with smaller edit distances, but it's not clear when it is called. If I make a (Standard) query like

    a AND b AND c AND "a b c"~1000000

does that imply that sloppyFreq() will be called during the computation of the score for "a b c"~1000000? That would be great for my needs, assuming the 1000000 slop doesn't increase query time horribly.
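
For reference, my reading of DefaultSimilarity is that sloppyFreq(int distance) returns 1 / (distance + 1). Below is a minimal Python sketch of that formula only, assuming it is what gets applied to each sloppy phrase match. The function name and the distances are made-up inputs for illustration, not values computed by Lucene:

```python
def sloppy_freq(distance: int) -> float:
    """Sketch of the formula I believe DefaultSimilarity.sloppyFreq uses.

    `distance` stands for the edit distance of one sloppy phrase match;
    0 means the terms appear exactly as in the phrase.
    """
    return 1.0 / (distance + 1)


if __name__ == "__main__":
    # Hypothetical distances, chosen only to show how the weight decays.
    for distance in (0, 1, 2, 6, 1000):
        print(f"distance={distance:>4}  weight={sloppy_freq(distance):.4f}")
```

If that is right, an exact phrase match contributes a weight of 1, and the contribution shrinks as the terms drift apart, which is the ordering I'm after.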

Michael

On Mon, Aug 17, 2009 at 10:15 AM, Mark Miller <markrmil...@gmail.com> wrote:

> Dismax QueryParser with pf and ps params?
>
> http://wiki.apache.org/solr/DisMaxRequestHandler
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
> Michael wrote:
>
>> Anybody have any suggestions or hints? I'd love to score my queries in a
>> way that pays attention to how close together terms appear.
>> Michael
>>
>> On Thu, Aug 13, 2009 at 12:01 PM, Michael <solrco...@gmail.com> wrote:
>>
>>> Hello,
>>> I'd like to score documents higher that have the user's search terms
>>> nearer each other. For example, if a user searches for
>>>
>>> a AND b AND c
>>>
>>> the standard query handler should return all documents with [a] [b] and
>>> [c] in them, but documents matching the phrase "a b c" should get a boost
>>> over those with "a x b c" over those with "b x y c z a", etc.
>>>
>>> To accomplish this, I thought I might replace the user's query with
>>>
>>> "a b c"~1000000000
>>>
>>> hoping that the slop term gets a higher and higher score the closer
>>> together [a] [b] and [c] appear. This doesn't seem to be the case in my
>>> experiments; when I debug the query, there's no component of the score
>>> based on how close together [a] [b] and [c] are. And I'm suspicious that
>>> this would make my queries a whole lot slower -- in reality my users'
>>> queries get expanded quite a bit already, and I'd thus need to add many
>>> slop terms.
>>>
>>> Perhaps instead I could modify the Standard query handler to examine the
>>> distance between all ANDed tokens, and boost proportionally to the
>>> inverse of their average distance apart. I've never modified a query
>>> handler before so I have no idea if this is possible.
>>>
>>> Any suggestions on what approach I should take? The less I have to
>>> modify Solr, the better -- I'd prefer a query-side solution over writing
>>> a plugin over forking the standard query handler.
>>>
>>> Thanks in advance!
>>> Michael