Wacek Kusnierczyk wrote:
> William Dunlap wrote:
>
>> Would your patched code affect the following
>> use of regexpr's output as input to substr, to
>> pull out the matched text from the string?
>> > x<-c("ooo","good food","bad")
>> > r<-regexpr("o+", x)
>> > substring(x,r,attr(r,"match.length")+r-1)
>> [1] "ooo" "oo" ""
>>
>
> no; same output
>
>> > substr(x,r,attr(r,"match.length")+r-1)
>> [1] "ooo" "oo" ""
>>
>
> no; same output
>
>> > r
>> [1] 1 2 -1
>> attr(,"match.length")
>> [1] 3 2 -1
>> > attr(r,"match.length")+r-1
>> [1] 3 3 -3
>> attr(,"match.length")
>> [1] 3 2 -1
>>
>
> for the positive indices there is no change, as you might expect.
>
> if i understand your concern, the issue is that regexpr returns -1 (with
> the corresponding attribute -1) where there is no match. in this case,
> you expect "" as the substring.
>
> if there is no match, we have:
>
> start = r = -1 (the start index you provide)
> stop = attr(r) + r - 1 = -1 + -1 - 1 = -3 (the stop index you provide)
>
> for a string of length n, my patch computes the final indices as follows:
>
> start' = n + start - 1
> stop' = n + stop - 1
>
> whatever the value of n, stop' - start' = stop - start = -3 - 1 = -4.
>

except that stop - start = -3 - (-1) = -2, but that's still negative,
i.e., stop' < start'. silly me, sorry.

vQ

> that is, stop' < start', hence an empty string is returned, by virtue of
> the original code. (see the sources for details.)
>
> does this answer your question?
>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
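
The arithmetic in the exchange above can be checked with a short sketch in unpatched base R. The names first, last, first_shifted and last_shifted are illustrative only. The shift n + index - 1 is copied from the message as quoted and has not been checked against the patch itself, so treat it as an assumption.

```r
# Example data from the thread.
x <- c("ooo", "good food", "bad")
r <- regexpr("o+", x)

# Start and stop indices as passed to substring(); as.integer() drops
# the attributes that regexpr() attaches to its result.
first <- as.integer(r)
last  <- first + attr(r, "match.length") - 1L

# Unpatched behaviour: a stop index below the start index gives "".
matched <- substring(x, first, last)

# Shift as described in the message (an assumption, not the patch source).
n <- nchar(x)
first_shifted <- n + first - 1L
last_shifted  <- n + last - 1L

# Where there is no match, stop - start is -2, and adding the same
# constant to both indices leaves that difference unchanged.
no_match <- first == -1L
diff_original <- last - first
diff_shifted  <- last_shifted - first_shifted
```

Here diff_shifted equals diff_original for every element, which is the point of the correction: the no-match case keeps stop' < start' regardless of n.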