subject:"dealing with duplicates"

Re: dealing with duplicates

2009-08-10 Thread Avlesh Singh

o, > have you tried using http://wiki.apache.org/solr/Deduplication ? > >> > >> Otis > >> -- > >> Sematext is hiring -- http://sematext.com/about/jobs.html?mls > >> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR > >> > >

Re: dealing with duplicates

2009-08-10 Thread Joe Calderon

>> >> Otis >> -- >> Sematext is hiring -- http://sematext.com/about/jobs.html?mls >> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR >> >> >> >> - Original Message >>> From: Joe Calderon >>> To: solr-user@l

Re: dealing with duplicates

2009-08-01 Thread Joe Calderon

; > > > - Original Message >> From: Joe Calderon >> To: solr-user@lucene.apache.org >> Sent: Friday, July 31, 2009 5:06:48 PM >> Subject: dealing with duplicates >> >> hello all, i have a collection of a few million documents; i have many >>

Re: dealing with duplicates

2009-07-31 Thread Otis Gospodnetic

ucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR - Original Message > From: Joe Calderon > To: solr-user@lucene.apache.org > Sent: Friday, July 31, 2009 5:06:48 PM > Subject: dealing with duplicates > > hello all, i have a collection of a few million

dealing with duplicates

2009-07-31 Thread Joe Calderon

hello all, i have a collection of a few million documents, and many of them are duplicates. they have been clustered with a simple algorithm. i have a field called 'duplicate' which is 0 or 1, and fields called 'description', 'tags' and 'meta'. documents are clustered on different criteria and

5 matches
