I just need to confirm something with pattern matching folks. I have a factor with the following levels in a very large data set:
> levels(all$Classical.Statistic)
 [1] ""                        "AB;ABD"                  "CollapsedSteps"          "CR_P"                    "CR_Prop;CR_P;AB"
 [6] "NMK"                     "NMK;P"                   "NMK;P;ABD"               "P"                       "ABD"
[11] "CR_P;CollapsedSteps"     "NMK;AB;ABD"              "NMK;ABD"                 "NMK;P;AB"                "NMK;P;AB;ABD"
[16] "AB"                      "CRT;CollapsedSteps"      "NMK;AB"                  "CR_P;CRT;CollapsedSteps" "CR_Prop;CR_P"

I need to subset the rows in which the term "CollapsedSteps" appears. It may appear on its own as "CollapsedSteps" or as part of a longer value such as "CR_P;CRT;CollapsedSteps", as you can see above.

I'm using grep as follows:

all[grep('CollapsedSteps', all$Classical.Statistic),]

to find any row in which the term "CollapsedSteps" appears. Is this certain to catch all cases, or is there an intricacy that I may have missed?

Thank you,
Harold

> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] gdata_2.8.0

loaded via a namespace (and not attached):
[1] gtools_2.6.2 tools_2.10.1

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
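
A small reproducible sketch of the subsetting described above, on a made-up data frame. The object `toy` and its values are assumptions for illustration only, not the real data. It shows the grep() approach from the post next to a stricter variant that splits each value on ";" and requires an exact term match. The two agree here. They would differ only if some level contained "CollapsedSteps" inside a longer term, which none of the levels listed above does.

```r
# Toy data standing in for the real data set (values are made up).
toy <- data.frame(
  id = 1:6,
  Classical.Statistic = factor(c("CollapsedSteps", "CR_P;CRT;CollapsedSteps",
                                 "NMK;P", "", "CR_P;CollapsedSteps", "AB"))
)

# Which levels contain the term at all? Checking the levels is a quick sanity test.
matched_levels <- grep("CollapsedSteps", levels(toy$Classical.Statistic),
                       value = TRUE, fixed = TRUE)

# grep() works on the character form of the factor, so matching uses the labels.
# fixed = TRUE treats the pattern as a literal string rather than a regex.
hits <- grep("CollapsedSteps", toy$Classical.Statistic, fixed = TRUE)
sub_grep <- toy[hits, ]

# Stricter alternative: split on ";" and require the term to match exactly.
terms <- strsplit(as.character(toy$Classical.Statistic), ";", fixed = TRUE)
exact <- sapply(terms, function(x) "CollapsedSteps" %in% x)
sub_exact <- toy[exact, ]
```

Comparing `sub_grep` with `sub_exact` on the real data (for example with `identical()`) would be one way to confirm that the plain substring match is not picking up anything unintended.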