Re: Customizing Solr Dedupe

2015-04-01 Thread Dan Davis
But you can potentially still use Solr dedupe if you do the upfront work (in RDBMS or NoSQL pre-index processing) to assign some sort of "Group ID". See OCLC's FRBR Work-Set Algorithm, http://www.oclc.org/content/dam/research/activities/frbralgorithm/2009-08.pdf?urlm=161376 , for some details on o
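
As a rough illustration of the "Group ID" idea, here is a minimal sketch of a pre-index step that derives a grouping key from normalized author and title values. It is not the OCLC algorithm itself, only a simplified stand-in; the function name `group_id` and the `author` and `title` field names are assumptions made for the example.

```python
import re
import unicodedata


def _normalize(text):
    """Lowercase, strip accents, and collapse punctuation and whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return " ".join(text.split())


def group_id(record):
    """Build a grouping key from hypothetical 'author' and 'title' fields.

    Records that differ only in case, accents, or punctuation get the same key.
    """
    author = _normalize(record.get("author", ""))
    title = _normalize(record.get("title", ""))
    return author + "/" + title
```

The resulting key could be stored in its own field before indexing, so that dedupe or grouping can be driven by that one field instead of by fuzzy comparison at index time.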

Re: Customizing Solr Dedupe

2015-04-01 Thread Jack Krupansky
Solr dedupe is based on the concept of a signature - some fields and rules that reduce a document into a discrete signature, and then checking if that signature exists as a document key that can be looked up quickly in the index. That's the conceptual basis. It is not based on any kind of field by
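
The signature concept described above can be sketched in a few lines. This is an illustration of the general idea only, not Solr's implementation: the in-memory dict stands in for the index, and the names `signature` and `add_document`, the MD5 choice, and the normalization rules are all assumptions made for the example.

```python
import hashlib


def signature(doc, fields):
    """Reduce a document to a hex digest built from the chosen fields."""
    h = hashlib.md5()
    for name in fields:
        # Assumed rule: trim and lowercase each value before hashing.
        value = str(doc.get(name, "")).strip().lower()
        h.update(name.encode("utf-8"))
        h.update(b"\x00")  # separator so field boundaries stay unambiguous
        h.update(value.encode("utf-8"))
        h.update(b"\x00")
    return h.hexdigest()


def add_document(index, doc, fields):
    """Add doc to a dict keyed by signature; report whether the key existed.

    A later document with the same signature overwrites the earlier one.
    """
    sig = signature(doc, fields)
    is_duplicate = sig in index
    index[sig] = doc
    return sig, is_duplicate
```

Because the check is a single key lookup, two documents are treated as duplicates only when their signatures match exactly; any near-match behavior has to come from the rules that produce the signature.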