I've done that already. All you need to do is to create your custom
request handler.
Among other things, my handler does the following:
It receives a factor threshold, such as 0.85. The score of the first
document returned is assumed to be that of the "best" matching document.
Document #30 (this number is configurable), or the last document if
fewer than 30 are returned, is taken as the "worst" document.
factor = 0.85 (for example)
bestScore = 1000 (for example)
worstScore = 500 (for example score of the document #30)
Then the handler applies the function:
threshold = bestScore * factor + worstScore * (1 - factor)
In the example, threshold = 1000 * 0.85 + 500 * 0.15 = 925. This means
that documents whose score is above 925 lie at least 85% of the way from
the worst score to the best, so they are treated as at least 85% similar
to the first document returned.
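
As a rough illustration only (this is not the handler code itself), the
calculation can be sketched in Python. The function name
score_threshold and the parameter worst_rank are made up for this
example, and the scores are assumed to arrive sorted from best to worst.

```python
def score_threshold(scores, factor=0.85, worst_rank=30):
    """Interpolate a cut-off between the best and the "worst" score.

    scores     -- scores in descending order, as returned by the search
    factor     -- weight of the best score, between 0 and 1
    worst_rank -- 1-based rank of the document treated as "worst"
                  (hypothetical parameter name; 30 in the example above)
    """
    if not scores:
        raise ValueError("at least one score is required")
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    best = scores[0]
    # Fall back to the last document when fewer than worst_rank came back.
    worst = scores[min(worst_rank, len(scores)) - 1]
    return best * factor + worst * (1 - factor)


# Example from above: best score 1000, score of document #30 is 500.
example_scores = [1000.0] + [900.0 - 10 * i for i in range(28)] + [500.0]
threshold = score_threshold(example_scores, factor=0.85)
print(round(threshold, 6))  # roughly 925, up to floating-point rounding
```

With fewer than 30 results, the last score in the list stands in for
worstScore, as described above.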
So we obtain the threshold based on the score of the documents returned.
Why 30? Because statistically there is not much difference between 30
and 50 or 100 (this may depend on the number of documents you want to
return; in my case it is the best 3 or 4).
Once we have the threshold based on the score, all I need to do is
check whether the score of the next document to include in the result
set is above the threshold.
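
A minimal sketch of that check, again in Python and only as an
illustration: take_above_threshold and max_results are hypothetical
names, and the documents are assumed to be (id, score) pairs already
sorted by descending score.

```python
def take_above_threshold(scored_docs, threshold, max_results=4):
    """Collect documents in rank order while their score beats the threshold.

    scored_docs -- iterable of (doc_id, score) pairs, best score first
    threshold   -- cut-off computed from the best and worst scores
    max_results -- hypothetical cap on how many documents to return
    """
    selected = []
    for doc_id, score in scored_docs:
        # Stop at the first document that fails the check: the input is
        # sorted, so nothing after it is expected to pass.
        if score <= threshold or len(selected) >= max_results:
            break
        selected.append((doc_id, score))
    return selected


docs = [("a", 1000.0), ("b", 950.0), ("c", 930.0), ("d", 920.0)]
print(take_above_threshold(docs, threshold=925.0))
# [('a', 1000.0), ('b', 950.0), ('c', 930.0)]
```

The comparison here is strict, so if every document has the same score
the threshold equals that score and nothing is returned; use >= instead
if you want to keep such results.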
If you need any further help, don't hesitate to ask for it.
Pako
Umar Shah wrote:
Hi,
is there some way of limiting the results above some fixed threshold?
thanks in anticipation
-umar