Re: threshold of result rankings

Francisco Sanmartin Fri, 30 May 2008 08:05:23 -0700

I've done that already. All you need to do is create your own custom request handler.


Among other things, my handler does the following:

It receives a threshold factor, such as 0.85. The score of the first document returned is taken as the score of the "best" matching document. Then document #30 (configurable), or the last document if fewer than 30 are returned, is taken as the "worst" document.


factor = 0.85 (for example)
bestScore = 1000 (for example)
worstScore = 500 (for example score of the document #30)

Then the handler applies the function: threshold = bestScore * factor + worstScore * (1 - factor)

In the example case, threshold = 1000 * 0.85 + 500 * 0.15 = 925. This means that the documents whose score is above 925 are treated as at least 85% similar to the first document returned.
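
As a rough illustration of the arithmetic above, here is a small Python sketch. The function name score_threshold is made up for this example; it is not part of Solr or of the handler described in this message.

```python
def score_threshold(best_score, worst_score, factor):
    """Interpolate between the worst and the best score.

    factor = 1.0 gives best_score, factor = 0.0 gives worst_score.
    (Hypothetical helper, written only to illustrate the formula.)
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return best_score * factor + worst_score * (1 - factor)


# Values from the example above; the result is approximately 925
# (floating-point rounding may add a tiny error).
threshold = score_threshold(best_score=1000, worst_score=500, factor=0.85)
```

A higher factor pushes the threshold toward the best score, so fewer documents pass.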

So the threshold is derived from the scores of the documents returned. Why 30? Because statistically there is not much difference between 30 and 50 or 100. (This may depend on the number of documents you want to return; in my case it is the best 3 or 4.)

Once we have the threshold, all I need to do is check whether the score of the next document to include in the returned set is above the threshold.
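
The whole procedure could be sketched roughly as follows, in Python rather than as an actual Solr request handler. The function name filter_by_threshold and its parameters are hypothetical, and the sketch assumes that the scores arrive sorted from best to worst and that the top document is always kept.

```python
def filter_by_threshold(scores, factor=0.85, reference_rank=30):
    """Return the leading scores that lie above the interpolated threshold.

    `scores` must be sorted in descending order, as a search engine
    would return them. (Hypothetical helper, not a Solr API.)
    """
    if not scores:
        return []
    best = scores[0]
    # Use document #reference_rank, or the last one if fewer were returned.
    worst = scores[min(reference_rank, len(scores)) - 1]
    threshold = best * factor + worst * (1 - factor)

    kept = [best]  # assumption: the best document is always returned
    for score in scores[1:]:
        if score <= threshold:
            break  # scores are sorted, so no later document can pass
        kept.append(score)
    return kept


scores = [1000, 950, 930, 900, 500]
kept = filter_by_threshold(scores)  # threshold is about 925 here
```

With these made-up scores the first three documents are kept, since 900 and 500 fall below the threshold of about 925.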


If you need any further help, don't hesitate to ask for it.

Pako



Umar Shah wrote:

Hi,

Is there some way of limiting the results to those above some fixed threshold?

thanks in anticipation
-umar
