Hi Vikas,
You're overworking yourself here: gsub is vectorized!
df$V5 <- gsub("[^AaCcGgTt\\.,]", "", df$V5)
This will be *substantially* faster than looping (using apply) over
every row of your data frame, since you only care about the 5th column
anyway. Also, I switched your regexp for one th
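To make the vectorization point concrete, here is a minimal sketch comparing the single gsub call with an element-by-element version. The toy data frame and its values are made up for illustration; only the column name V5 and the pattern come from this thread.

```r
# Hypothetical toy data: the values are invented, only the V5 column
# name and the regular expression are taken from the thread.
df <- data.frame(
  V1 = c("r1", "r2"),
  V5 = c(".$.$,$,T..,", ",t.,,c^,.a$"),
  stringsAsFactors = FALSE
)

# Drop every character that is not a base letter, '.' or ','.
pattern <- "[^AaCcGgTt\\.,]"

# Vectorized: one call processes the whole column.
vectorized <- gsub(pattern, "", df$V5)

# Element-by-element version, shown only for comparison.
looped <- vapply(
  df$V5,
  function(x) gsub(pattern, "", x),
  character(1),
  USE.NAMES = FALSE
)

# Both approaches give the same result; the vectorized one avoids
# the per-element function-call overhead.
stopifnot(identical(vectorized, looped))

df$V5 <- vectorized
```

On a large data frame, wrapping both versions in system.time() should show the gap, though the exact speedup will depend on the data.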
Dear all,
The 5th column of my data frame is like this-
.$.$.$.$.$,$,$...,.,,.,,....,,,,,T...,,,.,,,...,,
,..,,,...,,,..,,..,,,,,,.,,...G....,,.,,
,t.,,c,,.a.,,,.A,,,...,..,,,.,,,,...,,,$
.