Hi everyone.
I am quite frustrated that this doesn't work, as all the functions within
work fine by themselves. I'd also like any pointers to how to avoid 'for'
loops in my code. I understand it's less than desirable, but I'm still quite
new and use them a lot.
I have a few wide datasets (90 to 120 columns) with long column names; each
name contains a number of different 'markers'. Each marker could be considered
a level of a factor variable encoded in the column name. There are two
categories of factors; we'll call them f1 and f2.
The column names look something like this:
'ace_van' , 'boy_van', 'car_xes' , 'ace_xes', 'dog_wall' , 'car_zounds'
f1 <- c('ace', 'boy', 'car', 'dog')
f2 <- c('van', 'wall', 'xes', 'zounds')
# The actual vectors have lengths 6 and 7, so I don't want to sum the 42
# combinations individually.
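To make the "markers as factors" idea concrete, here is a small sketch that splits the example names into their two markers and counts which combinations occur. It assumes each name is exactly an f1 marker, an underscore, and an f2 marker; the real names are longer, so the splitting step would need adapting. The names nms, parts, markers and present are made up for illustration.

```r
# Example column names from above.
nms <- c('ace_van', 'boy_van', 'car_xes', 'ace_xes', 'dog_wall', 'car_zounds')
f1 <- c('ace', 'boy', 'car', 'dog')
f2 <- c('van', 'wall', 'xes', 'zounds')

# Split each name at the underscore: column 1 is the f1 marker, column 2 the f2 marker.
# ASSUMPTION: every name has exactly two parts.
parts <- do.call(rbind, strsplit(nms, "_", fixed = TRUE))

# One row per column name, with its markers stored as factors.
markers <- data.frame(column = nms,
                      f1 = factor(parts[, 1], levels = f1),
                      f2 = factor(parts[, 2], levels = f2))

# Cross-tabulate to see which f1/f2 combinations are present among the columns.
present <- table(markers$f1, markers$f2)
print(present)
```

With these example names, several combinations (for instance 'boy' with 'xes') have no matching column at all.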
var.table <- function(data, vec1, vec2)
{
  # Result: one row per f1 marker, one column per f2 marker
  table <- as.data.frame(matrix(nrow = length(vec1), ncol = length(vec2)),
                         row.names = vec1)
  names(table) <- vec2
  for (i in seq_along(vec1))
  {
    for (j in seq_along(vec2))
    {
      # Names of the columns that contain both vec2[j] and vec1[i]
      indices <- intersect(grep(vec2[j], names(data), value = TRUE),
                           grep(vec1[i], names(data), value = TRUE))
      table[i, j] <- sum(data[ , indices])
    }
  }
  table
}
> var.table(mydf, f1, f2)
Output:
Error in FUN(X[[1L]], ...) :
only defined on a data frame with all numeric variables
Every entry in mydf is an integer with no missing values.
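A possible way to do this without explicit 'for' loops is sketched below; it is not a confirmed fix. It assumes the markers can be matched as plain substrings of the column names and that a combination with no matching columns should count as 0. The name var.table2 and the values in this toy mydf are made up for illustration. Summing each column once with colSums() and then adding up the matching totals avoids calling sum() on a data frame subset. That call may be where the error comes from when a combination matches no columns, though this is only a guess.

```r
# Toy data with made-up values; column names follow the pattern described above.
mydf <- data.frame(ace_van = 1:3, boy_van = 4:6, car_xes = 7:9,
                   ace_xes = 10:12, dog_wall = 13:15, car_zounds = 16:18)
f1 <- c('ace', 'boy', 'car', 'dog')
f2 <- c('van', 'wall', 'xes', 'zounds')

# Hypothetical helper: rows are f1 markers, columns are f2 markers.
var.table2 <- function(data, vec1, vec2) {
  totals <- colSums(data)   # one total per column, computed once
  nms <- names(data)
  sapply(vec2, function(b) {
    sapply(vec1, function(a) {
      # Columns whose names contain both markers (plain substring match)
      hit <- grepl(a, nms, fixed = TRUE) & grepl(b, nms, fixed = TRUE)
      sum(totals[hit])      # sum over an empty selection is 0
    })
  })
}

res <- var.table2(mydf, f1, f2)
print(res)
```

The sapply() calls still iterate internally, so this is tidier rather than necessarily faster than the nested loops.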
Thanks a ton.
-Ben