Re: [R] Proper use of grep

Doran, Harold Thu, 15 Jul 2010 17:25:21 -0700

Thanks. Yes, I did that on a toy data set and with my real data. It *seems* to 
have worked. I just work with grep so rarely that I didn't want to miss 
something.


-----Original Message-----
From: Erik Iverson [mailto:er...@ccbr.umn.edu] 
Sent: Thursday, July 15, 2010 5:36 PM
To: Doran, Harold
Cc: r-help@r-project.org
Subject: Re: [R] Proper use of grep



Doran, Harold wrote:
> I just need to confirm something with pattern matching folks. I have
> a factor with the following levels in a very large data set:
> 
>> levels(all$Classical.Statistic)
>  [1] ""                        "AB;ABD"                  "CollapsedSteps"          "CR_P"                    "CR_Prop;CR_P;AB"
>  [6] "NMK"                     "NMK;P"                   "NMK;P;ABD"               "P"                       "ABD"
> [11] "CR_P;CollapsedSteps"     "NMK;AB;ABD"              "NMK;ABD"                 "NMK;P;AB"                "NMK;P;AB;ABD"
> [16] "AB"                      "CRT;CollapsedSteps"      "NMK;AB"                  "CR_P;CRT;CollapsedSteps" "CR_Prop;CR_P"
> 
> I need to subset the rows in which the term "CollapsedSteps" appears.
> So, it may appear as "CollapsedSteps" or may appear as
> "CR_P;CRT;CollapsedSteps" as you can see above. I'm using grep as
> follows:
> 
> all[grep('CollapsedSteps', all$Classical.Statistic),]
> 
> to find any row in which the term "CollapsedSteps" appears. Is this
> certain to catch all cases, or is there an intricacy that I may have
> missed?


Well, just try it for yourself on a data.frame that's small enough to 
verify 'manually'.  For instance, the data.frame that contains each 
level exactly once sounds like a good candidate.


test <- subset(all, !duplicated(Classical.Statistic))

and then try your line of code ...
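
A minimal sketch of that check, assuming a toy data.frame (called `toy` here, a hypothetical stand-in for `all`) with one row per level from the post above. It runs the same grep() call and cross-checks the result against an exact, term-by-term match obtained by splitting on ";". The `id` column and the names `hits`, `matched`, `terms` and `exact` are made up for illustration.

```r
# Hypothetical toy data: one row per level listed in the original post.
lv <- c("", "AB;ABD", "CollapsedSteps", "CR_P", "CR_Prop;CR_P;AB",
        "NMK", "NMK;P", "NMK;P;ABD", "P", "ABD",
        "CR_P;CollapsedSteps", "NMK;AB;ABD", "NMK;ABD", "NMK;P;AB",
        "NMK;P;AB;ABD", "AB", "CRT;CollapsedSteps", "NMK;AB",
        "CR_P;CRT;CollapsedSteps", "CR_Prop;CR_P")
toy <- data.frame(id = seq_along(lv), Classical.Statistic = factor(lv))

# grep() coerces the factor to character and returns the matching row indices.
# fixed = TRUE treats the pattern as a literal string, not a regular expression.
hits <- grep("CollapsedSteps", toy$Classical.Statistic, fixed = TRUE)
matched <- toy[hits, ]

# Cross-check: split each value on ";" and test for the exact term.
terms <- strsplit(as.character(toy$Classical.Statistic), ";", fixed = TRUE)
exact <- vapply(terms, function(x) "CollapsedSteps" %in% x, logical(1))

# The substring match and the exact-term match agree on these levels.
stopifnot(identical(hits, which(exact)))
```

One thing to keep in mind: grep() matches substrings, so a level such as "CollapsedSteps2" (not present in the levels shown) would also be picked up, while the split-based check would not. With the levels listed above the two approaches give the same rows.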

And do you really want "" as a level, or should those be NA?
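
If NA is what is wanted, one possible way to recode the empty strings is sketched below. The vector `x` is a made-up example, not the real data, and `x_chr` and `x_na` are hypothetical names.

```r
# Hypothetical example factor with "" standing in for missing values.
x <- factor(c("", "CollapsedSteps", "NMK;P", ""))

# Go through character, set the empty strings to NA, and rebuild the factor.
x_chr <- as.character(x)
x_chr[x_chr == ""] <- NA
x_na <- factor(x_chr)

levels(x_na)      # "" is no longer a level
sum(is.na(x_na))  # number of values recoded to NA

# grepl() returns FALSE for NA elements, so the subsetting still works.
grepl("CollapsedSteps", x_na, fixed = TRUE)
```

Whether this recoding is appropriate depends on what "" means in the real data, so treat it as an option rather than a recommendation.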

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
