Thanks, Ista, for your help. It works and I understand why.

However, I'm still confused about why the previous code lost the "factor key".
It could have just converted to factors and output factors, but instead it
outputs integers...

I'm not a big fan of the default stringsAsFactors=TRUE, but that's another
debate.

Anyway, thanks again,

Bastien 

-----Original Message-----
From: Ista Zahn [mailto:istaz...@gmail.com]
Sent: January 26, 2015 11:51
To: Ferland-Raymond, Bastien (DIF)
Cc: r-help@r-project.org
Subject: Re: [R] Weird behavior of aggregate() function

?aggregate informs you that unless x is a time series it will be converted to a 
data.frame. data.frame will convert your character to a factor unless you tell 
it not to.
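
A minimal sketch of that conversion and of where the labels go. It passes stringsAsFactors = TRUE explicitly (the default before R 4.0.0) so that it behaves the same in any R version; the names x, df, m and labelled are illustrative only and are not from the original code. rbind() discards the factor class and keeps only the internal integer codes, which is why integers come out.

```r
# Illustrative character vector (hypothetical, not the poster's data)
x <- c("rs2", "mj1", "rs1")

# Explicitly request the pre-4.0.0 default so the column becomes a factor
df <- data.frame(TE = x, stringsAsFactors = TRUE)
class(df$TE)

# rbind() drops the factor class and keeps only the integer codes
m <- rbind(df$TE[1:2], df$TE[c(1, 3)])
typeof(m)

# The labels can still be recovered by indexing the levels with those codes
labelled <- matrix(levels(df$TE)[m], nrow = nrow(m))
labelled
```

So the levels are not really lost by aggregate() itself; they are dropped inside the function as soon as the factor values go through rbind().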

You can prevent this by converting vari to a data.frame yourself, passing the 
stringsAsFactors argument, like this:

aggregate(data.frame(TE = vari, stringsAsFactors = FALSE),
by=list(gr),faire.paires)
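
For convenience, here is a self-contained version of that call, with gr, vari and faire.paires copied from the quoted message. This is only a sketch; note that from R 4.0.0 onward stringsAsFactors already defaults to FALSE, so the explicit argument matters mostly on older versions.

```r
# Grouping variable and character variable from the original post
gr <- c("A","A","B","B","C","C","C","D","D","E","E","E")
vari <- c("rs2","rs2","mj2","mj1","rs1","rs1","rs2","mj1","mj1","rs1","mj1","mj2")

# Pairs built from the first element; rows containing NA are dropped
faire.paires <- function(TE) {
  gg <- rbind(c(TE[1], TE[2]),
              c(TE[1], TE[3]))
  gg[rowSums(is.na(gg)) == 0, , drop = FALSE]
}

# Keep TE as character so the function never sees factor codes
res <- aggregate(data.frame(TE = vari, stringsAsFactors = FALSE),
                 by = list(gr), faire.paires)
str(res)
```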

Best,
Ista

On Mon, Jan 26, 2015 at 11:30 AM,
<bastien.ferland-raym...@mffp.gouv.qc.ca> wrote:
>
> Hello list,
>
> I have found a weird behavior of the aggregate() function when used with 
> characters. I think the problem has to do with converting characters to 
> factors.
>
> I'm trying to aggregate a character vector using a homemade function.  My 
> function is giving me all the possible pairs of modalities observed.
>
>
> Reproducible code:
>
> #######
> ### my grouping variable
> gr <- c("A","A","B","B","C","C","C","D","D","E","E","E")
> ### my variable
> vari <- c("rs2","rs2","mj2","mj1","rs1","rs1","rs2","mj1","mj1","rs1","mj1","mj2")
>
> ### what the table would look like
> cbind(gr,vari)
>
> ###  My function that gives every possible pair of values (my real
> ###  function can go up to length(TE)==5, but for the sake of the example,
> ###  I've reduced it here)
> faire.paires <- function(TE){
>   gg <- rbind(c(TE[1],TE[2]),
>               c(TE[1],TE[3]))
>   gg <- gg[rowSums(is.na(gg))==0,,drop=FALSE]
>   gg
> }
>
> ###  The function gives exactly what I want when I run it on a
> ###  specific entry
> faire.paires(TE = vari[gr=="B"])
>
> ###  But with aggregate(), it transforms everything into integers
> res <- aggregate(list(TE = vari), by=list(gr),faire.paires)
> res
> str(res)
>
> ###  It's as if it converts to factors and then loses the key that tells
> ###  me which integer means which modality
>
>
> ###  if I give it directly factors:
> res2 <- aggregate(list(TE = as.factor(vari)), 
> by=list(gr),faire.paires)
> res2
> str(res2)
>
> ###  does not fix the problem.
> ############
>
> Any idea?
>
> I know my function may not be the best or most efficient way to 
> succeed. However, I'm still puzzled on why aggregate gives me this weird 
> output.
>
> Best regards,
>
> Bastien Ferland-Raymond, M.Sc. Stat., M.Sc. Biol.
> Division des orientations et projets spéciaux
> Direction des inventaires forestiers
> Ministère des Forêts, de la Faune et des Parcs
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
