Re: Duplicates

Peter Karich Fri, 23 Jul 2010 02:02:50 -0700

Hi Pavel!

The patch can be applied to 1.4.
Performance is OK, but in some situations it could be worse than
without the patch.
For us it works well, but others have reported some exceptions
(see the patch site: https://issues.apache.org/jira/browse/SOLR-236)


> I need only to delete duplicates

Could you give us an example of what exactly you need?
(Maybe you could index one 'master' document for each group of duplicates
with an extra field and query for that field?)
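
A minimal sketch of that idea, assuming the first document seen for each
value of the duplicate field is treated as the master. The helper names
and the `is_master` field are hypothetical (they would have to be added to
your schema), and the code only prepares the documents and the request
parameters; it does not talk to Solr itself.

```python
from urllib.parse import urlencode


def mark_masters(docs, dedup_field, flag_field="is_master"):
    """Return copies of docs with flag_field set on the first doc per value.

    The first document seen for each value of dedup_field gets
    flag_field=True; later documents with the same value get False.
    Documents lacking dedup_field all share the key None, so only the
    first of them is flagged as a master.
    """
    seen = set()
    marked = []
    for doc in docs:
        key = doc.get(dedup_field)
        is_master = key not in seen
        seen.add(key)
        marked.append({**doc, flag_field: is_master})
    return marked


def master_only_params(user_query, flag_field="is_master"):
    """Build URL-encoded parameters that restrict results to master docs.

    Uses the standard q and fq request parameters; flag_field is the
    hypothetical extra field written by mark_masters.
    """
    return urlencode({"q": user_query, "fq": f"{flag_field}:true"})


docs = [
    {"id": "1", "title": "foo"},
    {"id": "2", "title": "foo"},
    {"id": "3", "title": "bar"},
]
marked = mark_masters(docs, "title")
params = master_only_params("title:foo")
```

All documents are still stored, so the duplicates remain available; only
queries that carry the filter hide them. The trade-off is that the flag is
decided at index time, so the choice of master cannot depend on the query.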

Regards,
Peter.

> Thanks.
>
> Does it work with Solr 1.4 (the article mentions Solr 4.0)?
> What about performance? I only need to remove duplicates (I don't need a count
> of duplicates or the ability to select a particular duplicate).
>
> 2010/7/23 Peter Karich <peat...@yahoo.de>
>
>   
>> Another possibility could be the well-known 'field collapse' ;-)
>>
>> http://wiki.apache.org/solr/FieldCollapsing
>>
>> Regards,
>> Peter.
>>
>>     
>>> Thanks.
>>>
>>> If I set uniqueKey on the field, can I still save duplicates?
>>> I need to remove duplicates only from search results. I still need to be
>>> able to save duplicates.
>>>
>>> 2010/7/23 Erick Erickson <erickerick...@gmail.com>
>>>
>>>
>>>       
>>>> If the field is a single token, just define the uniqueKey on it in your
>>>> schema.
>>>>
>>>> Otherwise, this may be of interest:
>>>> http://wiki.apache.org/solr/Deduplication
>>>>
>>>> Haven't used it myself though...
>>>>
>>>> best
>>>> Erick
>>>>
>>>> On Thu, Jul 22, 2010 at 6:14 PM, Pavel Minchenkov <char...@gmail.com>
>>>> wrote:
>>>>
>>>>
>>>>         
>>>>> Hi,
>>>>>
>>>>> Is it possible to remove duplicates in search results by a given field?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> Pavel Minchenkov
>>>>>           
>>
>>     
>
>   


-- 
http://karussell.wordpress.com/
