There is the stringMatch function in the MiscPsycho package. 

> stringMatch('Hadley', 'Hadley Wickham', normalize = 'no')
[1] 8
> stringMatch('Hadley', 'Hadley Wickham', normalize = 'yes')
[1] 0.4285714

It uses Levenshtein distance to tell you how much the two strings differ,
either normalized or not. So, the first result above tells you that the first
string differs from the second by 8 insertions/deletions/substitutions. The
second result normalizes the comparison such that 1 denotes perfect agreement
and 0 denotes no agreement.

Examples of an exact match are below.

> stringMatch('Hadley Wickham', 'Hadley Wickham', normalize = 'yes')
[1] 1
> stringMatch('Hadley Wickham', 'Hadley Wickham', normalize = 'n')
[1] 0
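
If you would rather not install a package, the same numbers can be reproduced
with adist() from base R's utils package. This is only a sketch: the helper
name string_similarity is made up for illustration, and the normalization
(1 - distance / length of the longer string) is my assumption, inferred from
the output above rather than taken from the MiscPsycho source.

```r
# Hypothetical helper, not part of MiscPsycho.
# Assumes normalization is 1 - (Levenshtein distance / longer string length).
string_similarity <- function(a, b) {
  d <- drop(utils::adist(a, b))        # generalized Levenshtein distance
  longest <- max(nchar(a), nchar(b))   # length of the longer string
  if (longest == 0) return(1)          # two empty strings agree perfectly
  1 - d / longest
}

# Raw edit distance: 8 insertions turn 'Hadley' into 'Hadley Wickham'
raw_distance <- drop(utils::adist("Hadley", "Hadley Wickham"))

# Normalized similarity: 1 - 8/14
partial <- string_similarity("Hadley", "Hadley Wickham")

# Identical strings: distance 0, similarity 1
exact <- string_similarity("Hadley Wickham", "Hadley Wickham")
```

Here raw_distance comes out as 8 and partial as 0.4285714, matching the
stringMatch results above.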

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Hadley Wickham
Sent: Tuesday, August 24, 2010 10:17 AM
To: R-help
Subject: [R] Comparing/diffing strings

Hi all,

all.equal is generally very useful when you want to find the
differences between two objects.  It breaks down however, when you
have two long strings to compare:

> all.equal(a, b)
[1] "1 string mismatch"

Does any one know of any good text diffing tools implemented in R?

Thanks,

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
