: bq: we don't want to use either the primary key or the record's
: update date as the tie-breaker, as it may introduce a new bias into the
: ranking algorithm
:
: Are you thinking of adding something to your main clause to force this?
: If so, why not just use sorting by adding a sort clause like:
:
: &sort=score desc, datefield desc

i think that is what Gregg mentioned wanting to avoid -- because it will bias results in favor of documents with newer values in the date field.

i believe he wants a consistent ordering that resolves ties between docs with identical scores in some way that doesn't favor documents based on any externally visible property of the documents themselves.

hashing on the uniqueKey seems like it should work, since it would essentially be a random value generated with a consistent seed (the key) regardless of the shards or document addition order -- but depending on your hashing algorithm it could still introduce some bias, assuming your uniqueKeys have some semantic meaning to begin with (if they don't, you could just sort on them).

To be safe, you could generate the hash using more than just the uniqueKey ... why not use *all* of the fields in the document?

https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/SignatureUpdateProcessorFactory.html
http://wiki.apache.org/solr/Deduplication

-Hoss
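
To make the hash-as-tie-breaker idea concrete, here is a rough sketch in Python rather than anything Solr-specific. The function names (tie_breaker, rank), the sample field names, and the choice of SHA-256 are all assumptions made for illustration. In Solr itself the hash would presumably be computed at index time (for example with the SignatureUpdateProcessorFactory linked above) and stored in a field used as a secondary sort.

```python
import hashlib


def tie_breaker(doc, fields=None):
    """Return a stable hex digest built from the document's field values.

    `doc` is a plain dict standing in for a document (an assumption for this
    sketch). With fields=None every field is hashed, in sorted name order.
    """
    names = sorted(doc) if fields is None else fields
    h = hashlib.sha256()
    for name in names:
        for part in (name, str(doc[name])):
            data = part.encode("utf-8")
            # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
            h.update(len(data).to_bytes(4, "big"))
            h.update(data)
    return h.hexdigest()


def rank(scored_docs):
    """Sort (score, doc) pairs by score descending, then by hash ascending."""
    return sorted(scored_docs, key=lambda pair: (-pair[0], tie_breaker(pair[1])))


if __name__ == "__main__":
    hits = [
        (1.0, {"id": "doc-1", "title": "apples"}),
        (1.0, {"id": "doc-2", "title": "pears"}),
        (2.5, {"id": "doc-3", "title": "plums"}),
    ]
    for score, doc in rank(hits):
        print(score, doc["id"], tie_breaker(doc)[:8])
```

Because the digest depends only on the document's own content, the relative order of tied documents is the same no matter which shard returned them or in what order they were added. Passing fields=["id"] gives the uniqueKey-only variant, which carries the caveat about keys with semantic meaning mentioned above.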