I have an interesting Solr/Lucene question, and it's quite possible that some
new features in Solr make this much easier than what I am about to try. If
anyone has a clever idea on how to do this search, please let me know!

Basically, let's say that I have an index in which each document has a
content field and several metadata fields.

Document Fields:

content
metadata1
metadata2
.....
metadataN
allMetadatas (all the terms indexed in metadata1...N are concatenated in this 
field) 

Assuming that I am searching for documents that contain a certain number of
terms (term1 to termN) in their metadata fields, I would like to build a search
query that will return documents that satisfy these requirements:

a) Every search term must be present in at least one metadata field. This is
quite easy: we can simply search the allMetadatas field and that will work fine.

b) Now for the hard part: we prefer documents in which the search terms are
found in the *smallest number of different fields*. So if one document contains
all the search terms spread over 10 different fields, but another document
contains all the search terms in only 8 fields, we would like the second one to
sort first.
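
To make the criterion concrete, here is a minimal sketch in plain Python (not
Solr or Lucene code). It assumes that "number of fields" means the number of
metadata fields containing at least one search term; the function names and the
sample documents are made up for illustration.

```python
# Plain-Python model of the ranking criterion; this is not Solr/Lucene API code.
# Each document maps a metadata field name to the list of terms indexed in it.

def matched_field_count(doc, terms):
    """Return how many fields contain at least one search term,
    or None if some search term is missing from every field."""
    wanted = set(terms)
    found = set()
    fields_hit = 0
    for field_terms in doc.values():
        hits = wanted.intersection(field_terms)
        if hits:
            fields_hit += 1
            found |= hits
    return fields_hit if found == wanted else None


def rank(docs, terms):
    """Keep documents matching all terms (requirement a) and sort them by
    ascending number of matched fields (requirement b)."""
    scored = []
    for doc_id, doc in docs.items():
        count = matched_field_count(doc, terms)
        if count is not None:
            scored.append((count, doc_id))
    # Ties on the field count fall back to the document id.
    return [doc_id for count, doc_id in sorted(scored)]


# Hypothetical sample data.
docs = {
    "A": {"metadata1": ["red"], "metadata2": ["car"], "metadata3": ["fast"]},
    "B": {"metadata1": ["red", "car"], "metadata2": ["fast"]},
    "C": {"metadata1": ["red"], "metadata2": ["boat"]},
}
print(rank(docs, ["red", "car", "fast"]))  # prints ['B', 'A']
```

Document C is dropped because it lacks two of the terms, and B sorts ahead of A
because its terms are spread over two fields instead of three.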

My first idea was to index the terms in allMetadatas using payloads. Each
indexed term would carry, as its payload, the specific metadataN field from
which it originates. Then I can write a scorer that scores based on these
payloads.
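
Roughly, the payload idea would behave like the following simulation, again in
plain Python with no Lucene classes. The numeric field ids, the (term, payload)
pairs and the 1/N score are assumptions chosen for illustration; a real
implementation would attach the payloads at index time and read them in a
custom scorer.

```python
# Simulation of the payload idea in plain Python; no Lucene classes are used.
# FIELD_IDS and the (term, payload) layout are assumptions for illustration.

FIELD_IDS = {"metadata1": 1, "metadata2": 2, "metadata3": 3}


def build_all_metadatas(doc):
    """Concatenate every metadata field into one token stream, attaching the
    id of the source field to each token as its payload."""
    tokens = []
    for field, terms in doc.items():
        for term in terms:
            tokens.append((term, FIELD_IDS[field]))
    return tokens


def payload_score(tokens, terms):
    """Score 0.0 if a search term is missing; otherwise 1 / (number of
    distinct payloads seen on matching tokens), so fewer fields score higher."""
    wanted = set(terms)
    seen_terms = set()
    seen_fields = set()
    for term, field_id in tokens:
        if term in wanted:
            seen_terms.add(term)
            seen_fields.add(field_id)
    if not wanted or seen_terms != wanted:
        return 0.0
    return 1.0 / len(seen_fields)


# Hypothetical documents: same terms, spread over three fields versus two.
spread = build_all_metadatas(
    {"metadata1": ["red"], "metadata2": ["car"], "metadata3": ["fast"]}
)
compact = build_all_metadatas(
    {"metadata1": ["red", "car"], "metadata2": ["fast"]}
)
query = ["red", "car", "fast"]
assert payload_score(compact, query) > payload_score(spread, query)
```

The point of the sketch is only that the single allMetadatas field would be
enough at query time, since each matching token already says which field it
came from.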

However, if there is a way to do this without payloads I'm all ears!

-- 
Daniel Shane
Lexum (www.lexum.com)
sha...@lexum.com
