On Jun 28, 2011, at 3:59 PM, Daniel Malter wrote:
Hi all,
I have two numeric variables that form combinations in a matched sample.
Let's say I have five levels of x and y. What I am seeking to create is a
factor variable that ignores the order of x and y, i.e., the factor should
indicate x=1, y=5, as the same factor as x=5, y=1. Obviously, this becomes
increasingly cumbersome to do by hand as the number of levels increases.
f<-1:5
x<-sample(f,100,replace=T)
y<-sample(f,100,replace=T)
d<-matrix(cbind(x,y),ncol=2)
# A working solution is to remove the order, multiply one column by a
# scaling constant, add the second column, and create the factor for this
# numeric value. However, I was wondering whether there is a less awkward,
# more direct way to do this.
i<-apply(t(apply(d,1,function(x) sort(x))),1,function(y) 10*y[1]+y[2])
i<-factor(i)
i
I came up with the same solution, but implemented it a bit differently:
> d <- pmin(x,y)+10*pmax(x,y)
> sort(unique(d))
[1] 11 21 22 31 32 33 41 42 43 44 51 52 53 54 55
> d <- factor(pmin(x,y)+10*pmax(x,y))
> unique(d)
[1] 41 42 32 54 51 21 22 33 53 11 31 44 43 52 55
Levels: 11 21 22 31 32 33 41 42 43 44 51 52 53 54 55
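The multiplier only has to exceed the largest level, so that two different
unordered pairs can never collapse onto the same number. A minimal sketch of
two variants follows, assuming the same x, y and f as above; the names lo, hi,
base, pair and code are made up for illustration, and set.seed() is there only
to make the run repeatable:

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
f <- 1:5
x <- sample(f, 100, replace = TRUE)
y <- sample(f, 100, replace = TRUE)

# Put each pair in a canonical order: smaller value first, larger second.
lo <- pmin(x, y)
hi <- pmax(x, y)

# Variant 1: numeric code, with the multiplier derived from the levels
# instead of hard-coded, so it stays collision-free if f grows.
base <- max(f) + 1
code <- factor(lo + base * hi)

# Variant 2: character labels such as "1-5"; no scaling constant needed.
pair <- factor(paste(lo, hi, sep = "-"))

# Cross-tabulate to see that the two encodings group the rows identically.
table(pair, code)
```

The paste() version is probably the easier one to read in model output, since
the level names show both members of the pair directly.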
Seems that you might find the BioC people doing something
isomorphic to this with gene allele pairs using their fancy S4 methods.
--
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help