From: Steven D'Aprano
>To: Python Mailing List
>Sent: Sunday, June 3, 2012 4:00 AM
>Subject: Re: [Tutor] How to identify clusters of similar files
>
>Albert-Jan Roskam wrote:
>> Hi,
>>
>> I want to use difflib to compare a lot (tens of thousands) of text files. I
>> know that many files are quite similar as they are subsequent versions of the
>> same document (a primitive kind of version control). What would be a good
>> approach to cluster the files based on their likeness? I want t