Re: Grouping by simhash signature

2015-12-09 Thread Nickolay41189
Maybe there is some way to override the equals function of grouping (change "==" to strdist)?
--
View this message in context: http://lucene.472066.n3.nabble.com/Grouping-by-simhash-signature-tp4243236p4244541.html
Sent from the Solr - User mailing list archive at Nabble.com.
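To make the question concrete: grouping compares field values with exact equality, while near-duplicate detection needs "differs in at most k bits". The following is a minimal sketch of that idea done outside Solr, not a Solr feature or API. It assumes signatures are available as plain integers; the names hamming, group_near_dups and max_bits, and the threshold of 3 bits, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def group_near_dups(signatures: dict, max_bits: int = 3) -> list:
    """Group document ids connected by chains of signature pairs that
    differ in at most max_bits bits (single-linkage, via union-find).

    Assumption: signatures maps a document id to an integer signature.
    """
    parent = {doc_id: doc_id for doc_id in signatures}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ids = list(signatures)
    # Pairwise comparison is O(n^2); acceptable for a sketch, not for a large index.
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if hamming(signatures[a], signatures[b]) <= max_bits:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for doc_id in ids:
        groups[find(doc_id)].append(doc_id)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical 8-bit signatures, for illustration only.
sigs = {"doc1": 0b10110010, "doc2": 0b10110011, "doc3": 0b01001100}
print(group_near_dups(sigs))  # [['doc1', 'doc2'], ['doc3']]
```

One design point the sketch makes visible: "within k bits" is not transitive, so A can be near B and B near C while A is far from C. The sketch merges such chains into one group, which is one possible choice; an equality-based grouping key cannot express this at all.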

Re: Grouping by simhash signature

2015-12-07 Thread Toke Eskildsen
On Wed, 2015-12-02 at 13:00 -0700, Nickolay41189 wrote:
> I try to implement NearDup detection by SimHash
> algorithm in Solr. [...]
> How can I get groups of nearDup by /simhash_signature/?
You could follow the suggested recipe at the page y

Re: Grouping by simhash signature

2015-12-03 Thread Chris Hostetter
: I try to implement NearDup detection by SimHash
I'm not really familiar with simhash, but based on your description of it, I'm not sure that any of Solr's deduplication, grouping, or collapsing features will really help you here...
: 1) each document has a field /simhash_signature/ that sto

Re: Grouping by simhash signature

2015-12-03 Thread Nikola Smolenski
On Wed, Dec 2, 2015 at 9:00 PM, Nickolay41189 wrote:
> I try to implement NearDup detection by SimHash
> algorithm in Solr.
> Let's say:
> 1) each document has a field /simhash_signature/ that stores a sequence of
> bits.
> 2) that in order to

Grouping by simhash signature

2015-12-02 Thread Nickolay41189
--
View this message in context: http://lucene.472066.n3.nabble.com/Grouping-by-simhash-signature-tp4243236.html
Sent from the Solr - User mailing list archive at Nabble.com.