Re: [Rd] Bug in agrep computing edit distance?

2010-11-18 Thread Dickison, Daniel
A followup to this. I got R to compile, and the following patch seems to fix this issue (I don't think my previous attachment worked so it's pasted inline). There is still a quirk, where tail insertions seem to cost 1 extra and I'm not sure why. In the first example below, 3 and 5 should match,

Re: [Rd] Bug in agrep computing edit distance?

2010-11-17 Thread Dickison, Daniel
On 11/17/10 6:06 PM, "Joris Meys" wrote:
>Indeed, I get it. If the pattern is "xx", it is only matched against 2
>letters at the same time. All the rest doesn't matter. But still that
>doesn't explain
>
>>agrep("ANNTCG", "ANNXXTCG", max = list(ins=3))
>integer(0)
>>agrep("ANNTCG", "ANNXTCG", max

Re: [Rd] Bug in agrep computing edit distance?

2010-11-17 Thread Dickison, Daniel
> agrep("xx ",c("xx ","xyx ","xyzx ","xyzax",max=list(all=2)))
>[1] 1
>> agrep("xx ",c("xx ","xyx ","xyzx ","xyzax",max=list(all=3)))
>[1] 1
>
>If the sequences are made the s

[Rd] Bug in agrep computing edit distance?

2010-11-17 Thread Dickison, Daniel
I posted this yesterday to r-help and Ben Bolker suggested reposting it here...

Dickison, Daniel carnegielearning.com> writes:
>
> The documentation for agrep says it uses the Levenshtein edit distance,
> but it seems to get this wrong in certain cases when there is a
> combinati
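
For reference when reading the thread, here is a minimal sketch of the textbook unit-cost Levenshtein distance in base R. The helper name lev_dist is hypothetical and is not part of R or of agrep. It compares two whole strings, whereas agrep searches for an approximate match of the pattern within each string, so the numbers are only a baseline for what a given max setting might be expected to allow, not a reproduction of agrep's internals.

```r
# Textbook dynamic-programming Levenshtein distance with unit costs.
# lev_dist is a hypothetical helper for illustration; it is not agrep's code.
lev_dist <- function(a, b) {
  x <- strsplit(a, "", fixed = TRUE)[[1]]
  y <- strsplit(b, "", fixed = TRUE)[[1]]
  n <- length(x)
  m <- length(y)
  # d[i + 1, j + 1] is the distance between the first i characters of a
  # and the first j characters of b.
  d <- matrix(0L, nrow = n + 1L, ncol = m + 1L)
  d[, 1L] <- 0:n   # delete all i characters of a
  d[1L, ] <- 0:m   # insert all j characters of b
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      sub_cost <- if (x[i] == y[j]) 0L else 1L
      d[i + 1L, j + 1L] <- min(
        d[i, j + 1L] + 1L,   # deletion
        d[i + 1L, j] + 1L,   # insertion
        d[i, j] + sub_cost   # substitution, or a free match
      )
    }
  }
  d[n + 1L, m + 1L]
}

# Strings taken from the examples quoted in the thread.
print(lev_dist("ANNTCG", "ANNXXTCG"))  # two insertions apart
print(lev_dist("ANNTCG", "ANNXTCG"))   # one insertion apart
```

Under this definition the two quoted examples are 2 and 1 insertions away from the pattern, which is why a limit of ins=3 would be expected to admit both.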