There is the stringMatch function in the MiscPsycho package.

> stringMatch('Hadley', 'Hadley Wickham', normalize = 'no')
[1] 8
> stringMatch('Hadley', 'Hadley Wickham', normalize = 'yes')
[1] 0.4285714

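As a rough cross-check, both numbers above can be reproduced with adist() from the utils package that ships with R. This is only a sketch: lev_distance and lev_similarity are hypothetical helper names, and the normalization (1 minus the distance divided by the length of the longer string) is an assumption that happens to match the output shown, not necessarily how MiscPsycho computes it.

```r
# Sketch only: these helpers are hypothetical and not part of MiscPsycho.
# adist() computes the generalized Levenshtein distance (utils package).

lev_distance <- function(a, b) {
  # adist() returns a matrix; for two single strings it is 1 x 1
  as.numeric(utils::adist(a, b))
}

lev_similarity <- function(a, b) {
  # Assumed normalization: 1 - distance / length of the longer string
  longest <- max(nchar(a), nchar(b))
  if (longest == 0) return(1)  # two empty strings agree perfectly
  1 - lev_distance(a, b) / longest
}

lev_distance("Hadley", "Hadley Wickham")    # 8
lev_similarity("Hadley", "Hadley Wickham")  # 0.4285714
```

Under this assumed normalization, identical strings score 1 and strings that share nothing score 0.
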
stringMatch uses Levenshtein distance to tell you how much the two strings differ, either normalized or not. So the first call above tells you the first string differs from the second by 8 insertions/deletions/substitutions. The second call normalizes the comparison so that 1 denotes perfect agreement and 0 denotes no agreement at all. Examples of an exact match are below.

> stringMatch('Hadley Wickham', 'Hadley Wickham', normalize = 'yes')
[1] 1
> stringMatch('Hadley Wickham', 'Hadley Wickham', normalize = 'n')
[1] 0

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hadley Wickham
Sent: Tuesday, August 24, 2010 10:17 AM
To: R-help
Subject: [R] Comparing/diffing strings

Hi all,

all.equal is generally very useful when you want to find the differences between two objects. It breaks down, however, when you have two long strings to compare:

> all.equal(a, b)
[1] "1 string mismatch"

Does anyone know of any good text diffing tools implemented in R?

Thanks,

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.