Hello,

I have an index with 3 fields:

path_to_file - stored, not analyzed - unique path to the file
file_content - stored, not analyzed - the file's content
file_content_int - analyzed - the file's content

How can I find and delete duplicates in the file_content field? I found http://open.vinayras.com/lucene_duplicate_remover, but it does not work with Lucene 3.x.

Please excuse my English.
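
To make the goal concrete, here is a minimal sketch of the duplicate detection itself, written without any Lucene calls. It assumes the stored path_to_file and file_content values have already been read out of the index into a map. The class DuplicateFinder and its methods are hypothetical names used for illustration only; they are not part of Lucene or of the tool linked above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class DuplicateFinder {

    // Returns every path whose content was already seen under an earlier path.
    // The first path seen for each distinct content is kept (not reported).
    static List<String> findDuplicatePaths(Map<String, String> contentByPath) {
        Map<String, String> firstPathByDigest = new HashMap<>();
        List<String> duplicates = new ArrayList<>();
        for (Map.Entry<String, String> entry : contentByPath.entrySet()) {
            // Compare digests instead of full contents to keep memory use small.
            String digest = sha256Hex(entry.getValue());
            if (firstPathByDigest.putIfAbsent(digest, entry.getKey()) != null) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    // Hex-encoded SHA-256 digest of the UTF-8 bytes of the text.
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : hash) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException ex) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(ex);
        }
    }
}
```

Which copy counts as the original depends on the iteration order of the map passed in, so an insertion-ordered map such as LinkedHashMap gives predictable results. Since path_to_file is unique and not analyzed, each returned path could then presumably be removed from the index with IndexWriter.deleteDocuments(new Term("path_to_file", path)), but I have not verified this against my index.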
-- Best regards, ArtUrlWWW
