Hello list,

I have found a weird behavior of the aggregate() function when it is used with character vectors. I think the problem has to do with characters being converted to factors.
I'm trying to aggregate a character vector using a homemade function. My function gives me all the possible pairs of modalities observed within a group.

Reproducible code:

#######
### my grouping variable
gr <- c("A","A","B","B","C","C","C","D","D","E","E","E")

### my variable
vari <- c("rs2","rs2","mj2","mj1","rs1","rs1","rs2","mj1","mj1","rs1","mj1","mj2")

### what the table would look like
cbind(gr, vari)

### My function that gives every possible pair of values
### (my real function can go up to length(TE)==5, but for the sake of
### the example, I've reduced it here)
faire.paires <- function(TE){
  gg <- rbind(c(TE[1], TE[2]),
              c(TE[1], TE[3]))
  ### keep only the rows without NA (groups with fewer than 3 values)
  gg <- gg[rowSums(is.na(gg)) == 0, , drop = FALSE]
  gg
}

### The function gives exactly what I want when I run it on a specific entry
faire.paires(TE = vari[gr == "B"])

### But with aggregate(), it transforms everything into integers
res <- aggregate(list(TE = vari), by = list(gr), faire.paires)
res
str(res)

### it's as if it is using factors and then losing the key that tells me
### which integer means which modality

### if I give it factors directly:
res2 <- aggregate(list(TE = as.factor(vari)), by = list(gr), faire.paires)
res2
str(res2)
### it does not fix the problem.
############

Any idea? I know my function may not be the best or most efficient way to do this. However, I'm still puzzled as to why aggregate() gives me this weird output.

Best regards,

Bastien Ferland-Raymond, M.Sc. Stat., M.Sc. Biol.
Division des orientations et projets spéciaux
Direction des inventaires forestiers
Ministère des Forêts, de la Faune et des Parcs

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.