Re: [R] Selecting columns whose names contain "mutated" except when they also contain "non" or "un"

Rui Barradas Tue, 24 Apr 2012 10:26:26 -0700

Hello,


Greg Snow wrote
> 
> Here is a method that uses negative look behind:
> 
>> tmp <- c('mutation','nonmutated','unmutated','verymutated','other')
>> grep("(?<!un)(?<!non)muta", tmp, perl=TRUE)
> [1] 1 4
> 
> It looks for muta that is not immediately preceded by un or non (but
> it would match "unusually mutated", since the un is not immediately
> before the muta).
> 
> Hope this helps,
> 
> On Mon, Apr 23, 2012 at 10:10 AM, Paul Miller <pjmiller_57@> wrote:
>> Hello All,
>>
>> Started out awhile ago trying to select columns in a dataframe whose
>> names contain some variation of the word "mutant" using code like:
>>
>> names(KRASyn)[grep("muta", names(KRASyn))]
>>
>> The idea then would be to add together the various columns using code
>> like:
>>
>> KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])
>>
>> What I discovered though, is that this selects columns like "nonmutated"
>> and "unmutated" as well as columns like "mutated", "mutation", and
>> "mutational".
>>
>> So I'd like to know how to select columns that have some variation of the
>> word "mutant" without the "non" or the "un". I've been looking around for
>> an example of how to do that but haven't found anything yet.
>>
>> Can anyone show me how to select the columns I need?
>>
>> Thanks,
>>
>> Paul
>>
>> ______________________________________________
>> R-help@ mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> 538280@
> 
> 
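
To make the caveat in Greg's note concrete, here is a small check of the look-behind pattern. It is only a sketch: the names hits and loose are made up for illustration, and the string "unusually mutated" is the example from his message.

```r
tmp <- c('mutation','nonmutated','unmutated','verymutated','other')

# perl = TRUE is needed because look-behind is a PCRE feature
hits <- grep("(?<!un)(?<!non)muta", tmp, perl = TRUE)
hits        # 1 4

# "un" occurs in the string, but not immediately before "muta",
# so the look-behind does not block the match
loose <- grepl("(?<!un)(?<!non)muta", "unusually mutated", perl = TRUE)
loose       # TRUE
```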


Has anyone noticed that both 'non' and 'un' end with the same letter? That
final 'n' is the only character we really need to check.

(tmp <- c('mutation','nonmutated','unmutated','verymutated','other')) 

i1 <- grepl("muta", tmp)
i2 <- grepl("nmuta", tmp)

tmp[i1 & !i2]
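
Applied to the original question (summing the matching columns of a data frame), the same two-step test might look like the sketch below. The data frame demo and its values are invented stand-ins for KRASyn, which is not shown in the thread.

```r
# Hypothetical stand-in for KRASyn; column names and values are made up
demo <- data.frame(mutated = c(1, 0), mutation = c(0, 1),
                   nonmutated = c(1, 1), unmutated = c(0, 0),
                   other = c(5, 6))

i1 <- grepl("muta", names(demo))     # every column mentioning "muta"
i2 <- grepl("nmuta", names(demo))    # "nonmuta..." and "unmuta..." both contain "nmuta"
keep <- names(demo)[i1 & !i2]
keep                                 # "mutated" "mutation"

# Row-wise sum over the selected columns only
demo$Mutant_comb <- rowSums(demo[keep])
demo$Mutant_comb                     # 1 1
```

Note that this shortcut would also drop any other name in which an 'n' comes right before "muta", so it is worth checking the result against the actual column names.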


Now something that is not an answer to Greg's post, just a more convoluted variant.


(tmp <- c(tmp, 'permutation', 'commutation'))

cols <- list()
cols[[1]] <- grep("muta", tmp)
cols[[2]] <- grep("nmuta", tmp)
cols[[3]] <- grep("(per|com)muta", tmp)

Reduce(setdiff, cols)
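
For anyone following along, here is a self-contained version of the same idea, written with the alternation grouped as (per|com) so that "muta" is required after either prefix. The name idx is only for illustration.

```r
tmp <- c('mutation','nonmutated','unmutated','verymutated','other',
         'permutation','commutation')

cols <- list(
  grep("muta", tmp),            # 1 2 3 4 6 7
  grep("nmuta", tmp),           # 2 3
  grep("(per|com)muta", tmp)    # 6 7
)

# Reduce folds from the left:
#   setdiff(setdiff(cols[[1]], cols[[2]]), cols[[3]])
# so every later element is removed from the first one.
idx <- Reduce(setdiff, cols)
idx          # 1 4
tmp[idx]     # "mutation" "verymutated"
```

The first list element is the set to keep, and each following element is one more exclusion, so further exclusions can be added by appending another grep() result to the list.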

Rui Barradas


--
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-columns-whose-names-contain-mutated-except-when-they-also-contain-non-or-un-tp4580914p4584219.html
Sent from the R help mailing list archive at Nabble.com.

