When you subset, factors keep all of their original levels, including ones that no longer occur.
You can drop the unused levels in your processing by re-creating the factor:
x$fac <- factor(x$fac)
> x <- data.frame(fam=c('a','a','b'), grp=c('1','2','3'))
> # split
> x.s <- split(x, x$fam)
> # notice additional levels
> str(x.s$b)
'data.frame': 1 obs. o
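To see the effect on a small scale, here is a minimal sketch. It assumes the columns are factors, which is spelled out with `stringsAsFactors = TRUE` because R 4.0.0 and later no longer convert strings to factors by default. It also shows `droplevels()` as an alternative to re-calling `factor()`; the name `b` is only for illustration.

```r
# Assumption: columns are factors, as in the example above.
x <- data.frame(fam = c("a", "a", "b"), grp = c("1", "2", "3"),
                stringsAsFactors = TRUE)

b <- split(x, x$fam)$b
levels(b$grp)            # all three original levels survive the split

# Option 1: rebuild a single factor, keeping only the levels in use
b$grp <- factor(b$grp)

# Option 2: drop unused levels from every factor column at once
b <- droplevels(b)
nlevels(b$grp)
nlevels(b$fam)
```

`factor()` is handy when only one column matters; `droplevels()` is probably less typing when a data frame has several factor columns.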
Jeanna -
The family variable is being stored as a factor.
You could eliminate the NA values manually, or you
could try something like
x$family = as.character(x$family)
before subsetting. If neither of these solutions is
satisfactory, please follow the posting guide and provide
a repr
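As a rough illustration of the `as.character` suggestion, the sketch below counts rows per family on a subset, first with a factor and then with a character column. The data and the names `sub_f`, `sub_c`, `counts_f` and `counts_c` are made up for the example, and the use of `tapply` is an assumption about where NA values might come from.

```r
x <- data.frame(family = factor(c("a1", "a1", "b2", "c3", "c3")))

# Subsetting a factor keeps every original level, so the absent
# family "b2" still gets a cell, and tapply() fills it with NA
sub_f <- x[x$family != "b2", , drop = FALSE]
counts_f <- tapply(seq_len(nrow(sub_f)), sub_f$family, length)
counts_f

# Converting to character first means only families present are counted
x$family <- as.character(x$family)
sub_c <- x[x$family != "b2", , drop = FALSE]
counts_c <- tapply(seq_len(nrow(sub_c)), sub_c$family, length)
counts_c
```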
I may have gotten excited prematurely...
I ended up using the split method, since my family indicators are
alphanumeric. My issue is as follows.
I'm applying this to different subsets of my main data set. The subsets do
not contain all families. When I run the method on one of my subsets I get
back a
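The message is cut off here, so the exact symptom is not known. One thing that can happen in this setup is that `split` returns an empty slice for every family level missing from the subset. A minimal sketch, using hypothetical data and the `drop` argument of `split`:

```r
# Hypothetical data: alphanumeric family IDs stored as a factor
x <- data.frame(family = factor(c("a1", "a1", "b2", "c3")), xyz = 1:4)
sub <- x[x$family != "b2", ]

# Default split: one slice per level, including an empty one for "b2"
all_slices <- split(sub, sub$family)
names(all_slices)
nrow(all_slices$b2)

# drop = TRUE skips levels that do not occur in the subset
slices <- split(sub, sub$family, drop = TRUE)
names(slices)
```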
Thank you both. These solutions are far more elegant than anything I could
have come up with, and I appreciate the opportunity to learn new commands
within the context of my own data.
I think I've got it working now. :)
An alternative approach would be to `split` the data frame by family,
then `lapply` a function selecting a random row from each slice, and
then `rbind` it all together.
x = data.frame(family = rep(1:20,sample(2:5,20,replace=TRUE)), xyz=1)
randomrow <- function(x) x[sample(1:nrow(x),1),]
# step by s
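The step-by-step code is cut off in the archive. The three steps described above can be sketched as follows; the intermediate names `slices`, `picked` and `result` are invented here, `set.seed` is only there to make the run repeatable, and `sample.int` is swapped in for `sample(1:nrow(x), 1)`.

```r
set.seed(1)  # only to make the sketch reproducible
x <- data.frame(family = rep(1:20, sample(2:5, 20, replace = TRUE)), xyz = 1)

# Pick one row at random from a data frame
randomrow <- function(d) d[sample.int(nrow(d), 1), , drop = FALSE]

slices <- split(x, x$family)          # one data frame per family
picked <- lapply(slices, randomrow)   # one random row from each slice
result <- do.call(rbind, picked)      # stack the rows back together

nrow(result)   # one row per family
```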
Jeanna -
I can't imagine how you could solve this problem with a
loop, but here's one way to solve it using R:
First, I'll create a data frame with a family variable:
x = data.frame(family = rep(1:20,sample(2:5,20,replace=TRUE)))
Next, I'll number each family member within each family:
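The code for this step is cut off in the archive. One way to do such numbering, not necessarily the one the author used, is with `ave()`; the column name `member` is invented for this sketch.

```r
set.seed(1)  # only to make the sketch reproducible
x <- data.frame(family = rep(1:20, sample(2:5, 20, replace = TRUE)))

# ave() applies seq_along() within each family, giving 1, 2, ... per group
x$member <- ave(seq_along(x$family), x$family, FUN = seq_along)

head(x)
```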