: 1. There are user records of type A, B, C etc. (userId field in index is
:    common to all records)
: 2. A user can have any number of A, B, C etc (e.g. think of A being a
:    language then user can know many languages like french, english, german etc)
: 3. Records are currently stored as a document in index.
: 4. A given query can match multiple records for the user
: 5. If for a user more records are matched (e.g. if he knows both french and
:    german) then he is more relevant and should come top in UI. This is the
:    reason I wanted to add lucene scores assuming the greater score means more
:    relevance.

if your goal is to get back "users" from each search, then you should probably change your indexing strategy so that each "user" has a single document -- fields like "language" can be multivalued, etc... then a search for "language:en language:fr" will return users who speak english or french, and the ones that speak both will score higher.

if you really can't change the index structure, then essentially what you are looking for is a "field collapsing" solution on the userId field, where you want each collapsed group to get a cumulative score. i don't know if the existing field collapsing patches support this -- if you are already willing/capable to do it in the client then that may be the simplest thing to support moving forward.

Adding the scores is certainly one metric you could use -- it's generally suspicious to try and read too much meaning into scores in lucene/solr, but that's because people typically try to imply broader absolute meaning. in the case of a single query the scores are relative to each other, and adding up all the scores for a given userId is approximately what would happen in my example above -- except that there is also a "coord" factor that would penalize documents that only match one clause ... it's complicated, but as an approximation adding the scores might give you what you are looking for -- only you can know for sure based on your specific data.

-Hoss
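
For what it's worth, the client-side version of "collapse on userId and add up the scores" could look roughly like the sketch below. It is only an illustration: the function name collapse_by_user and the sample (userId, score) pairs are made up, and it assumes you have already pulled every matching record for a single query out of whatever search client you use. It does not try to reproduce the "coord" factor.

```python
from collections import defaultdict


def collapse_by_user(hits):
    """Sum per-record scores for each userId and rank users by the total.

    `hits` is an iterable of (user_id, score) pairs, all taken from the
    results of ONE query, since scores are only comparable within a query.
    """
    totals = defaultdict(float)
    for user_id, score in hits:
        totals[user_id] += score
    # Highest cumulative score first; ties are broken by userId so the
    # ordering is deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical hits: u1 matched two records (e.g. french and german),
# u2 and u3 matched one record each.
hits = [("u1", 0.5), ("u2", 0.75), ("u1", 0.5), ("u3", 0.25)]
ranked = collapse_by_user(hits)
for user_id, total in ranked:
    print(user_id, total)
```

Note that u1 ends up ahead of u2 here even though u2 has the single best-scoring record; whether that is the ordering you want is exactly the judgment call about your own data mentioned above.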