I searched the list archives for prior discussions of normalized scores and the like. Please forgive me if I overlooked something relevant, but I didn't find anything that addresses exactly what I'm looking for.
I am building a replacement for our current text matching engine. It takes a list of documents from feed A and finds the best match for each of them in a list of documents from feed B. For purposes of this example, feeds A and B might have the fields:

  title; director; year

The people reviewing this matching process need some way of determining why a particular match was made, beyond the overall score. Was it because the title was a perfect match, or was it because the title wasn't that close but the director and year were dead on?

My current idea is to run my query four times (n + 1, where n is the number of scoring sections): once to find the overall best match (a regular query), and then once per section, each time grouping, requiring, and boosting a different section of the query. I would then store the rank of the "best" item returned by the overall query within each of the section-specific result lists. That rank can be used to indicate the relevance of that item for that section.

So, following the fields mentioned above, my queries would be:

The natural "overall" query:

  (title:"feed A item one title"^10 (+title:feed~ +title:A~ +title:item~ +title:one~ +title:title~)) director:"Director, Feed A." (year:1974^10 year:[1972 TO 1976])

The query for title relevance:

  +((title:"feed A item one title"^10 (+title:feed~ +title:A~ +title:item~ +title:one~ +title:title~)))^100 director:"Director, Feed A." (year:1974^10 year:[1972 TO 1976])

The query for director relevance:

  +(director:"Director, Feed A.")^100 (title:"feed A item one title"^10 (+title:feed~ +title:A~ +title:item~ +title:one~ +title:title~)) (year:1974^10 year:[1972 TO 1976])

The query for year relevance:

  +((year:1974^10 year:[1972 TO 1976]))^100 (title:"feed A item one title"^10 (+title:feed~ +title:A~ +title:item~ +title:one~ +title:title~)) director:"Director, Feed A."
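To make the n + 1 pattern concrete, here is a rough sketch of how those query strings could be assembled from the per-field clauses. It is Python purely for illustration and does not call Lucene at all; build_queries is a made-up helper name, and the boost of 100 is just the value used in the queries above.

```python
# Illustrative sketch only: assembles the n + 1 query strings as plain text.
# build_queries is a hypothetical helper, not part of any Lucene API.

def build_queries(clauses, boost=100):
    """clauses maps a field name to its query clause (insertion order is kept)."""
    # The "overall" query is simply every clause, all optional.
    queries = {"overall": " ".join(clauses.values())}
    for field, clause in clauses.items():
        # Group, require (+), and boost (^) the section under inspection ...
        required = f"+({clause})^{boost}"
        # ... and leave the remaining sections as optional clauses.
        others = [c for f, c in clauses.items() if f != field]
        queries[field] = " ".join([required] + others)
    return queries


clauses = {
    "title": '(title:"feed A item one title"^10 '
             "(+title:feed~ +title:A~ +title:item~ +title:one~ +title:title~))",
    "director": 'director:"Director, Feed A."',
    "year": "(year:1974^10 year:[1972 TO 1976])",
}

queries = build_queries(clauses)
for name, query in queries.items():
    print(f"{name}: {query}")
```

Each section query would then be run separately, and the position of the overall winner in that result list recorded as its rank for the section.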
If the #1 item returned by the overall query ranked 1st of 10 for title, 3rd of 10 for director, and 5th of 10 for year, and each of those three scoring sections mapped rank onto the same scale of 1.0 (rank 1) down to 0.1 (rank 10), then I would be able to display the following scores:

  title: 1.0
  director: 0.8
  year: 0.6
  overall: 2.4

I looked at the javadocs for the FunctionQuery class because it looked interesting, but the docs were a bit light and I wasn't able to determine whether it might help with this need.

Does this sound unreasonable to anyone? Is there a clearly better way I might have overlooked?

Thank you very much for your ideas and comments,
Daniel
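P.S. In case the arithmetic in the example is unclear, this is roughly the rank-to-score mapping I have in mind. It is a sketch only: it assumes a simple linear scale over a 10-item result list (rank 1 gives 1.0, rank 10 gives 0.1), and rank_to_score is a made-up name.

```python
# Illustrative sketch only: a linear rank-to-score mapping (an assumption,
# not something Lucene provides). rank_to_score is a hypothetical helper.

def rank_to_score(rank, list_size=10):
    """Map rank 1 to 1.0 and the last rank to 1/list_size, linearly."""
    if not 1 <= rank <= list_size:
        raise ValueError("rank must be between 1 and list_size")
    return (list_size - rank + 1) / list_size


# Rank of the overall best match within each section-specific result list.
ranks = {"title": 1, "director": 3, "year": 5}

scores = {field: rank_to_score(rank) for field, rank in ranks.items()}
overall = sum(scores.values())

for field, score in scores.items():
    print(f"{field}: {score:.1f}")
print(f"overall: {overall:.1f}")
```

With equal weights the overall figure is just the sum of the three section scores; unequal weights would only need a multiplier per section.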