Re: Modify solr score

2017-04-24 Thread tstusr
We came up with a simple solution. We use termfreq and wrote a simple processor that counts words, so that we can build a boost function that only calculates the ratio between the words that hit terms and the whole field length. Some tests are being made, ma
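As a rough illustration of the ratio described in this message, here is a minimal Python sketch, assuming a simple word tokenizer. The function name `term_hit_ratio` and its parameters are hypothetical and are not part of Solr; in the setup described, the numerator would come from termfreq and the denominator from the word-counting processor.

```python
import re


def term_hit_ratio(field_text, topic_terms):
    """Return the fraction of words in field_text that match a topic term.

    Hypothetical helper for illustration only; not a Solr API.
    """
    # Split the field into lowercase word tokens (assumed tokenization).
    tokens = re.findall(r"\w+", field_text.lower())
    if not tokens:
        # An empty field has no words, so the ratio is defined as 0.
        return 0.0
    vocabulary = {term.lower() for term in topic_terms}
    # Count every token occurrence that belongs to the topic vocabulary.
    hits = sum(1 for token in tokens if token in vocabulary)
    return hits / len(tokens)


ratio = term_hit_ratio("solr ranks documents by score", ["solr", "score"])
print(ratio)
```

A long document that mentions a topic term only once gets a low ratio, which is the penalty the message is after.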

Re: Modify solr score

2017-04-22 Thread Erik Hatcher
This may be suggesting a solution that is too experimental or using the wrong hammer for the job, but to me it sounds like you could use “payloads” for this type of ranking of a term's relationship to a document. See SOLR-1485 for the recent work I’ve been doing (and aim to get committed soon).

Re: Modify solr score

2017-04-21 Thread Rick Leir
Ulf: Maybe there is a way you could filter out the unrelated documents. Qf?
Rick
On April 21, 2017 2:18:59 PM EDT, tstusr wrote:
> Well, I know they can change.
>
> I think, the main problem here it that (in this point) documents completely unrelated to a topic are being ranked as high as documen

Re: Modify solr score

2017-04-21 Thread tstusr
Well, I know they can change. I think the main problem here is that (at this point) documents completely unrelated to a topic are being ranked as high as related documents. So, in order to penalize them, we are trying to use the ratio term frequency/word length. Nevertheless we aren't able to

Re: Modify solr score

2017-04-21 Thread Walter Underwood
Using a minimum score cut off does not work. The score is not an absolute estimate of relevance. The idf component of the score is a whole-corpus metric. When you add or delete documents, the scores for the exact same query can change.
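To make the point about idf concrete, here is a small Python sketch using the textbook formula idf = ln(N / df). This is an assumption for illustration: Solr's similarity implementations use their own variants of the formula, and the corpus sizes below are made up.

```python
import math


def idf(num_docs, doc_freq):
    """Textbook inverse document frequency (not Solr's exact formula)."""
    return math.log(num_docs / doc_freq)


def tf_idf(term_count, num_docs, doc_freq):
    """Score one term in one document as tf * idf."""
    return term_count * idf(num_docs, doc_freq)


# Same document, same query term: the term occurs 3 times in the document
# and appears in 10 documents of the corpus.
before = tf_idf(term_count=3, num_docs=1000, doc_freq=10)

# Index 1000 more documents that do not contain the term. The document and
# the query are unchanged, but the score is different.
after = tf_idf(term_count=3, num_docs=2000, doc_freq=10)

print(before, after)
```

Because the score moves whenever the corpus changes, a fixed threshold chosen today can cut off different documents tomorrow.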

Re: Modify solr score

2017-04-21 Thread tstusr
Well, maybe I explained it wrong. We have entry points, each of which is related to a topic. This means that when we select the first topic, all information has to be related in some way to its vocabulary. So, it can work since we select documents not related to each vocabulary of every entry point. To

Re: Modify solr score

2017-04-21 Thread Walter Underwood
It isn’t going to work. The score is not an absolute relevance measurement. It only says that the first document is more relevant than the second, and so on. Scores are not comparable between different queries. The score cannot be used to say that the first hit for query A is a better match than

Re: Modify solr score

2017-04-21 Thread tstusr
Since we report the score, we think there will be some relation between them. As far as we know, scoring (and then ranking) is calculated based on tf-idf. What we want to do is to make a qualitative ranking; that is, according to one topic we will tag documents as "very related", "fairly related"

Re: Modify solr score

2017-04-21 Thread alessandro.benedetti
It has been discussed countless times: never rely on score values. Rely on the ranking of your results. It seems you model a topic as a list of keywords and then you just run a query for each topic. Essentially for you, a topic is a query. The ranking of your results will already be affected by how many ti