Re: How to query for similar documents before indexing

2010-05-11 Thread Markus Jelsma
s > duplicates based on that signature and gather that information yourself > as long as such a feature isn't there." > > Can you explain more what you have in mind? > > Thank you for your help! > > matt > > --- On Mon, 5/10/10, Markus Jelsma wrote: >

RE: How to query for similar documents before indexing

2010-05-11 Thread Matthieu Labour
long as such a feature isn't there." Can you explain more what you have in mind? Thank you for your help! matt --- On Mon, 5/10/10, Markus Jelsma wrote: From: Markus Jelsma Subject: RE: How to query for similar documents before indexing To: solr-user@lucene.apache.org Date: Mond

Re: How to query for similar documents before indexing

2010-05-10 Thread Mark Miller
From: Matthieu Labour Sent: Mon 10-05-2010 23:30 To: solr-user@lucene.apache.org; Subject: RE: How to query for similar documents before indexing Markus, thank you for your response. That would be great if the index has the option to prevent duplicates from entering the index. But is it going to

Re: How to query for similar documents before indexing

2010-05-10 Thread Ken Krugler
--- From: Matthieu Labour Sent: Mon 10-05-2010 23:30 To: solr-user@lucene.apache.org; Subject: RE: How to query for similar documents before indexing Markus, thank you for your response. That would be great if the index has the option to prevent duplicates from entering the index. But is it goin

RE: How to query for similar documents before indexing

2010-05-10 Thread Markus Jelsma
rom: Matthieu Labour Sent: Mon 10-05-2010 23:30 To: solr-user@lucene.apache.org; Subject: RE: How to query for similar documents before indexing Markus, thank you for your response. That would be great if the index has the option to prevent duplicates from entering the index. But is it going to be

RE: How to query for similar documents before indexing

2010-05-10 Thread Matthieu Labour
? Cheers matt --- On Mon, 5/10/10, Markus Jelsma wrote: From: Markus Jelsma Subject: RE: How to query for similar documents before indexing To: solr-user@lucene.apache.org Date: Monday, May 10, 2010, 4:11 PM Hi, Deduplication [1] is what you're looking for. It can utilize different anal

RE: How to query for similar documents before indexing

2010-05-10 Thread Markus Jelsma
Hi, Deduplication [1] is what you're looking for. It can utilize different analyzers that will add one or more signatures or hashes to your document depending on exact or partial matches for configurable fields. Based on that, it should be able to prevent new documents from entering the
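
The signature idea described in the message above can be illustrated with a small sketch. This is not Solr's implementation: the `signature` function, the `DedupIndex` class, and the field names are hypothetical and only show the general technique of hashing configurable fields and rejecting documents whose hash is already known. It covers exact matches only; partial (near-duplicate) matching would need a fuzzy signature, which is not shown.

```python
import hashlib


def signature(doc, fields):
    """Hash the chosen fields of a document into a hex signature (hypothetical helper)."""
    h = hashlib.md5()
    for name in fields:
        h.update(str(doc.get(name, "")).encode("utf-8"))
        # Separator byte so that ("ab", "") and ("a", "b") hash differently.
        h.update(b"\x00")
    return h.hexdigest()


class DedupIndex:
    """Toy in-memory index that refuses documents with a known signature."""

    def __init__(self, fields):
        self.fields = fields        # fields that take part in the signature
        self.by_signature = {}      # signature -> stored document

    def add(self, doc):
        """Store doc and return True, or return False if it is a duplicate."""
        sig = signature(doc, self.fields)
        if sig in self.by_signature:
            return False
        # Keep the signature alongside the document, as a signature field would.
        self.by_signature[sig] = dict(doc, signature=sig)
        return True


index = DedupIndex(fields=["title", "body"])
index.add({"id": 1, "title": "a", "body": "b"})   # stored
index.add({"id": 2, "title": "a", "body": "b"})   # rejected as a duplicate
```

Storing the signature with each document also makes it possible to look up existing documents by signature before indexing, which is the kind of query the thread asks about.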