Hi Daniel,

You could also consider using negative boosts.
Solr does not support negative boosts directly, but you can get the same
effect by boosting the docs which don't match the metadata.

This might do what you want:
-metadata1:(term1 AND ... AND termN)^2
-metadata2:(term1 AND ... AND termN)^2
.....
-metadataN:(term1 AND ... AND termN)^2
allMetadatas:(term1 AND ... AND termN)^0.5
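
One caveat on the syntax above: in the standard Lucene query parser, a clause prefixed with "-" is prohibited, so it excludes matching documents and adds nothing to the score. The usual Solr workaround is to wrap it as (*:* -clause)^boost, which matches (and therefore boosts) every document the clause does not match. Below is a minimal Python sketch of building such a query string. The function name negative_boost_query and its parameters are hypothetical, made up for illustration. Making the allMetadatas clause required with "+" is also an assumption, so that requirement (a) still holds. Terms are assumed not to need escaping.

```python
def negative_boost_query(fields, terms, field_boost=2.0,
                         all_field="allMetadatas", all_boost=0.5):
    """Build a Lucene query string that boosts documents for each
    metadata field that does NOT contain all of the terms.

    Hypothetical helper for illustration only; terms are assumed to
    contain no characters that need escaping in Lucene syntax.
    """
    conj = " AND ".join(terms)
    # (*:* -field:(...))^boost matches every document except those in which
    # the field contains all terms, so the boost goes to the non-matching docs.
    clauses = [f"(*:* -{field}:({conj}))^{field_boost:g}" for field in fields]
    # Assumption: require all terms in the catch-all field, otherwise the
    # clauses above would let every document in the index match.
    clauses.append(f"+{all_field}:({conj})^{all_boost:g}")
    return " ".join(clauses)


if __name__ == "__main__":
    print(negative_boost_query(["metadata1", "metadata2"], ["term1", "term2"]))
```

The boost values are only the ones from the query above and will need tuning against real data.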


Franck Brisbart



On Wednesday, 22 January 2014 at 19:38 +0000, Petersen, Robert wrote:
> Hi Daniel,
> 
> How about trying something like this (you'll have to play with the boosts to 
> tune it): search all the fields with all the terms using edismax and the 
> minimum should match (mm) parameter, but require all terms to match in the 
> allMetadatas field.
> https://wiki.apache.org/solr/ExtendedDisMax#mm_.28Minimum_.27Should.27_Match.29
> 
> Lucene query syntax below to give you the general idea, but this query would 
> require all terms to be in one of the metadata fields to get the boost.
> 
> metadata1:(term1 AND ... AND termN)^2
> metadata2:(term1 AND ... AND termN)^2
> .....
> metadataN:(term1 AND ... AND termN)^2
> allMetadatas:(term1 AND ... AND termN)^0.5
> 
> That should do approximately what you want,
> Robi
> 
> -----Original Message-----
> From: Daniel Shane [mailto:sha...@lexum.com] 
> Sent: Tuesday, January 21, 2014 8:42 AM
> To: solr-user@lucene.apache.org
> Subject: Interesting search question! How to match documents based on the 
> least number of fields that match all query terms?
> 
> I have an interesting Solr/Lucene question, and it's quite possible that some 
> new features in Solr might make this much easier than what I am about to try. 
> If anyone has a clever idea on how to do this search, please let me know!
> 
> Basically, let's say that I have an index in which each document has a 
> content field and several metadata fields.
> 
> Document Fields:
> 
> content
> metadata1
> metadata2
> .....
> metadataN
> allMetadatas (all the terms indexed in metadata1...N are concatenated in this 
> field) 
> 
> Assuming that I am searching for documents that contain a certain number of 
> terms (term1 to termN) in their metadata fields, I would like to build a 
> search query that will return documents that satisfy these requirements:
> 
> a) All search terms must be present in a metadata field. This is quite easy, 
> we can simply search in the field allMetadatas and that will work fine.
> 
> b) Now for the hard part: we prefer documents in which the search terms are 
> found in the *least number of different fields*. So if one document contains 
> all the search terms spread over 10 different fields, but another document 
> contains all the search terms in only 8 fields, we would like the latter to 
> sort first.
> 
> My first idea was to index terms in the allMetadatas field using payloads. 
> Each indexed term would also carry, as its payload, the specific metadataN 
> field it originates from. Then I can write a scorer to score based on these 
> payloads.
> 
> However, if there is a way to do this without payloads I'm all ears!
> 

