[R] subset dataset using factor levels instead of factor names

2010-06-22 Thread Hayes, Rachel M
Hi All, I have a factor variable with 52 levels, all with long, annoying names. I want to keep only the rows for some of those levels. I can do this using this code: test1 <- subset(nih2009, ic_name %in% c('NATIONAL EYE INSTITUTE', 'Veterans Affairs')) dim(test1) [1] 2396 38 But this doesn't work: t1 <
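
The preview above is cut off, so the failing attempt is not visible. Assuming the goal is to pick levels by their position rather than typing out the names, one possible sketch is to index into levels() and reuse the %in% test that already works. The data frame nih and its values below are made up for illustration; only ic_name and the two level names come from the post.

```r
# Toy stand-in for nih2009 (hypothetical data, not from the post).
nih <- data.frame(
  ic_name = factor(c("NATIONAL EYE INSTITUTE", "Veterans Affairs",
                     "NATIONAL CANCER INSTITUTE", "Veterans Affairs")),
  amount = c(10, 20, 30, 40)
)

# Inspect the level order first, so the positions are known.
levels(nih$ic_name)

# Pick levels by position, then filter with the same %in% test.
keep <- levels(nih$ic_name)[c(2, 3)]
test1 <- subset(nih, ic_name %in% keep)

# Optionally drop the levels that no longer occur in the subset.
test1$ic_name <- droplevels(test1$ic_name)
```

An equivalent filter is as.integer(ic_name) %in% c(2, 3), since a factor's integer codes are its level positions.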

[R] help sample from large dataset - misleading error?

2009-11-13 Thread Hayes, Rachel M
Hi All, I want to take a simple random sample from a large dataset, gly, but I'm getting an error message. Any help? dim(gly) [1] 112371 37 > s1 <- sample(gly,100) Error in `[.data.frame`(x, .Internal(sample(length(x), size, replace, : cannot take a sample larger than the popul
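
A likely explanation for the error: a data frame is a list of columns, so length(gly) is 37 and sample(gly, 100) asks for 100 draws without replacement from 37 items. A minimal sketch of sampling rows instead, using a made-up stand-in for gly:

```r
set.seed(1)  # only to make this sketch reproducible

# Toy stand-in for gly (hypothetical columns).
gly <- data.frame(id = 1:1000, x = rnorm(1000))

# Sample 100 row indices without replacement, then subset the rows.
s1 <- gly[sample(nrow(gly), 100), ]
dim(s1)
```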

[R] rcs fits in design package

2009-09-30 Thread Hayes, Rachel M
Hi all, I have a vector of proportions (post_op_prw) such that > summary(amb$post_op_prw) Min. 1st Qu. Median Mean 3rd Qu. Max. NA's 0. 0. 0. 0.3985 0.9134 0.9962 1. > summary(cut2(amb$post_op_prw,0.0001)) [0.,0.0001) [0.0001,0.9962]

[R] Stratified data summaries

2009-07-09 Thread Hayes, Rachel M
Hi All, I'm trying to automate a data summary using summary or describe from the Hmisc package. I want to stratify my data set by patient_type. I was hoping to do something like: Describe(myDataFrame ~ patient_type) I can create data subsets and run the describe function one at a time
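
One way to avoid building each subset by hand, sketched with base R only: split the data frame by the stratifying factor and apply the summary function to each piece. The columns age and los below are hypothetical; myDataFrame and patient_type are the names from the post.

```r
# Toy stand-in for the poster's data (hypothetical values).
myDataFrame <- data.frame(
  patient_type = factor(c("inpatient", "outpatient",
                          "inpatient", "outpatient")),
  age = c(34, 51, 47, 29),
  los = c(3, 0, 5, 0)
)

# split() returns one data frame per level of patient_type;
# lapply() then summarises each of them in turn.
by_type <- lapply(split(myDataFrame, myDataFrame$patient_type), summary)

by_type$inpatient  # summary for one stratum
```

If Hmisc is installed, Hmisc::describe can presumably be passed in place of summary in the lapply() call; that variant is not run here.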