Re: [R] Count of rows while looping through data

2011-05-27 Thread jim holtman
When you subset, the factors will carry along all the original levels. You can remove them in your processing by:

x$fac <- factor(x$fac)

> x <- data.frame(fam=c('a','a','b'), grp=c('1','2','3'))
> # split
> x.s <- split(x, x$fam)
> # notice additional levels
> str(x.s$b)
'data.frame': 1 obs. o
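
The behaviour described in this message can be reproduced with a small sketch. The column names `fam` and `grp` follow the snippet; `stringsAsFactors = TRUE` is an assumption added here, because R 4.0.0 and later no longer convert strings to factors by default. `droplevels()` is shown as an alternative to calling `factor()` column by column; it is not taken from the original message.

```r
# Assumption: strings are converted to factors explicitly (needed in R >= 4.0.0)
x <- data.frame(fam = c('a', 'a', 'b'), grp = c('1', '2', '3'),
                stringsAsFactors = TRUE)

# Subsetting keeps every original level, even those with no rows left
sub <- x[x$fam == 'b', ]
levels(sub$grp)

# Re-creating the factor keeps only the levels that are still present
sub$grp <- factor(sub$grp)
levels(sub$grp)

# droplevels() does the same for every factor column of a data frame at once
sub2 <- droplevels(x[x$fam == 'b', ])
levels(sub2$fam)
```

Either form should be applied after subsetting and before `split()`, so that no empty slices are created for families missing from the subset.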

Re: [R] Count of rows while looping through data

2011-05-27 Thread Phil Spector
Jeanna - The family variable is being stored as a factor. You could eliminate the NA values manually, or you could try something like x$family = as.character(x$family) before subsetting. If neither of these solutions is satisfactory, please follow the posting guide and provide a repr
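
A minimal sketch of the `as.character` suggestion, using made-up data (the family codes and the `xyz` column are hypothetical; only the column name `family` comes from the thread). It compares what `split()` returns for a subset when the family column is a factor and when it is a character vector.

```r
# Hypothetical data with alphanumeric family codes stored as a factor
x <- data.frame(family = factor(c('A1', 'A1', 'B2', 'C3')), xyz = 1:4)

# Factor version: the subset still carries the level 'C3',
# so split() returns an empty slice for it
sub_f <- x[x$family != 'C3', ]
slices_f <- split(sub_f, sub_f$family)
names(slices_f)
nrow(slices_f$C3)

# Character version: only families present in the subset produce a slice
x$family <- as.character(x$family)
sub_c <- x[x$family != 'C3', ]
slices_c <- split(sub_c, sub_c$family)
names(slices_c)
```

Empty slices like `slices_f$C3` are a likely source of NA rows when a row-picking function is applied to every slice.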

Re: [R] Count of rows while looping through data

2011-05-27 Thread Jeanna
I may have gotten excited prematurely... I ended up using the split method since my family indicators are alphanumeric. My issue is as follows: I'm applying this to different subsets of my main data set. The subsets do not contain all families. When I run the method on one of my subsets I get back a

Re: [R] Count of rows while looping through data

2011-05-26 Thread Jeanna
Thank you both. These solutions are far more elegant than anything I could have come up with, and I appreciate the opportunity to learn new commands within the context of my own data. I think I've got it working now. :)

Re: [R] Count of rows while looping through data

2011-05-25 Thread Kenn Konstabel
An alternative approach would be to `split` the data frame by family, then `lapply` a function selecting a random row from each slice, and then `rbind` it all together.

x = data.frame(family = rep(1:20,sample(2:5,20,replace=TRUE)), xyz=1)
randomrow <- function(x) x[sample(1:nrow(x),1),]
# step by s
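
The split / lapply / rbind sequence described in this message might look like the sketch below. The data frame and `randomrow` follow the snippet; `set.seed()`, the intermediate names `slices`, `picked` and `result`, and the use of `seq_len()` and `do.call()` are assumptions added for illustration, since the rest of the message is truncated.

```r
set.seed(1)  # assumption: only here so the sketch is repeatable
x <- data.frame(family = rep(1:20, sample(2:5, 20, replace = TRUE)), xyz = 1)

# Pick one random row from a data frame slice
randomrow <- function(d) d[sample(seq_len(nrow(d)), 1), , drop = FALSE]

slices <- split(x, x$family)          # one data frame per family
picked <- lapply(slices, randomrow)   # one randomly chosen row from each
result <- do.call(rbind, picked)      # bind the rows back into one data frame
```

`result` then holds exactly one row for each of the 20 families.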

Re: [R] Count of rows while looping through data

2011-05-24 Thread Phil Spector
Jeanna - I can't imagine how you could solve this problem with a loop, but here's one way to solve it using R: First, I'll create a data frame with a family variable:

x = data.frame(family = rep(1:20,sample(2:5,20,replace=TRUE)))

Next, I'll number each family member within each family:
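
The message is cut off before the numbering code, so the following is only one possible way to do this step, not necessarily the one posted. It assumes `ave()` for the within-family numbering; the column names `member` and `size`, the `pick` vector, and the final selection step are hypothetical.

```r
set.seed(1)  # assumption: only here so the sketch is repeatable
x <- data.frame(family = rep(1:20, sample(2:5, 20, replace = TRUE)))

# Number the members within each family: 1, 2, ... restarting for each family
x$member <- ave(seq_along(x$family), x$family, FUN = seq_along)

# Count of rows per family, repeated on every row of that family
x$size <- ave(seq_along(x$family), x$family, FUN = length)

# Hypothetical final step: draw one member number per family and keep that row
pick <- ave(x$member, x$family, FUN = function(m) sample(length(m), 1))
one_per_family <- x[x$member == pick, ]
```

Because everything is vectorised over the grouping variable, no explicit loop over families is needed.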