Great, thank you Mark!

Michael

On Mon, Aug 17, 2009 at 10:48 AM, Mark Miller <markrmil...@gmail.com> wrote:
> PhraseQueries do score higher if the terms are found closer together.
>
>> does that imply that during the computation of the score for
>> "a b c"~1000000, sloppyFreq() will be called?
>
> Yes. PhraseQuery uses PhraseWeight, which creates a SloppyPhraseScorer,
> which takes into account Similarity.sloppyFreq(matchLength).
>
> Michael wrote:
>
>> Thanks for the suggestion. Unfortunately, my implementation requires the
>> Standard query parser -- I sanitize and expand user queries into deeply
>> nested queries with custom boosts and other bells and whistles that make
>> Dismax unappealing.
>>
>> I see from the docs that Similarity.sloppyFreq() is a method for returning
>> a higher score for small edit distances, but it's not clear when that is
>> used. If I make a (Standard) query like
>>
>>   a AND b AND c AND "a b c"~1000000
>>
>> does that imply that during the computation of the score for
>> "a b c"~1000000, sloppyFreq() will be called? That's great for my needs,
>> assuming the 1000000 slop doesn't increase query time horribly.
>>
>> Michael
>>
>> On Mon, Aug 17, 2009 at 10:15 AM, Mark Miller <markrmil...@gmail.com> wrote:
>>
>>> Dismax QueryParser with pf and ps params?
>>>
>>> http://wiki.apache.org/solr/DisMaxRequestHandler
>>>
>>> --
>>> - Mark
>>>
>>> http://www.lucidimagination.com
>>>
>>> Michael wrote:
>>>
>>>> Anybody have any suggestions or hints? I'd love to score my queries in
>>>> a way that pays attention to how close together terms appear.
>>>>
>>>> Michael
>>>>
>>>> On Thu, Aug 13, 2009 at 12:01 PM, Michael <solrco...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'd like to score documents higher that have the user's search terms
>>>>> nearer each other. For example, if a user searches for
>>>>>
>>>>>   a AND b AND c
>>>>>
>>>>> the standard query handler should return all documents with [a] [b] and
>>>>> [c] in them, but documents matching the phrase "a b c" should get a
>>>>> boost over those with "a x b c" over those with "b x y c z a", etc.
>>>>>
>>>>> To accomplish this, I thought I might replace the user's query with
>>>>>
>>>>>   "a b c"~1000000000
>>>>>
>>>>> hoping that the slop term gets a higher and higher score the closer
>>>>> together [a] [b] and [c] appear. This doesn't seem to be the case in my
>>>>> experiments; when I debug the query, there's no component of the score
>>>>> based on how close together [a] [b] and [c] are. And I'm suspicious
>>>>> that this would make my queries a whole lot slower -- in reality my
>>>>> users' queries get expanded quite a bit already, and I'd thus need to
>>>>> add many slop terms.
>>>>>
>>>>> Perhaps instead I could modify the Standard query handler to examine
>>>>> the distance between all ANDed tokens, and boost proportionally to the
>>>>> inverse of their average distance apart. I've never modified a query
>>>>> handler before so I have no idea if this is possible.
>>>>>
>>>>> Any suggestions on what approach I should take? The less I have to
>>>>> modify Solr, the better -- I'd prefer a query-side solution over
>>>>> writing a plugin over forking the standard query handler.
>>>>>
>>>>> Thanks in advance!
>>>>> Michael
>
> --
> - Mark
>
> http://www.lucidimagination.com
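
For anyone finding this thread later, here is a rough sketch of the query-side approach discussed above. It is only an illustration, not Solr or Lucene code: the function names are made up, the terms are assumed to be plain tokens that need no escaping, and sloppy_freq() assumes the default Similarity computes 1 / (distance + 1), which is worth checking against the Lucene version you run.

```python
def sloppy_freq(distance):
    """Assumed stand-in for the default Similarity.sloppyFreq():
    the contribution shrinks as the match distance grows."""
    return 1.0 / (distance + 1)


def proximity_boosted_query(terms, slop=1000000):
    """Build `a AND b AND c AND "a b c"~slop` from a list of terms.

    Hypothetical helper: terms are joined as-is, with no escaping of
    query-parser special characters.
    """
    required = " AND ".join(terms)
    phrase = '"{}"~{}'.format(" ".join(terms), slop)
    return "{} AND {}".format(required, phrase)


if __name__ == "__main__":
    print(proximity_boosted_query(["a", "b", "c"]))
    for distance in (0, 1, 3):
        print(distance, sloppy_freq(distance))
```

The ANDed terms keep the original matching behaviour, while the very sloppy phrase clause exists only to add a proximity-dependent component to the score.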