On Apr 25, 2010, at 1:08 AM, burgundy wrote:


Hi,

I'm trying to assign a score to each row which allows me to identify which rows differ. In the example file below, I've used "," to indicate column separators. In this example, I'd like to identify that row 1 and row 5 are
the same, and row 2 and row 4 are the same.
Any help much appreciated. Also, any comments on what the command lines do
would be fantastic.
Thanks!!

example file:
0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0

example request output:
1
2
3
2
1
4

If you use apply() by rows with paste() and a collapse argument, you get one text value per row, which can be stored as a text column. Calling factor() on that text column with levels=unique(fac) numbers the distinct rows in order of first appearance, and as.numeric() then extracts those numbers.

On a dataframe, rrr,  with those elements and such a factor, fac:

> as.numeric(factor(rrr$fac, levels=unique(rrr$fac)))
[1] 1 2 3 2 1 4

One needs to call factor() a second time because the levels after the first call were set to an alpha-sorted version of fac, not to the order in which the rows first appear.
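
For anyone who wants to run the whole thing from scratch, here is a minimal sketch of the steps above. It assumes the example file is read without a header, reuses the names rrr and fac from the call above, and introduces key and score as illustrative names of my own; the match() line is an alternative I am adding as a cross-check, not something the steps above require.

```r
# Rebuild the example file from the question as a data frame (no header row).
rrr <- read.csv(text = "0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0", header = FALSE)

# Collapse each row into a single string such as "0,0,1,0,1,0,0".
# 'key' is a hypothetical helper name.
key <- apply(rrr, 1, paste, collapse = ",")

# First call to factor(): the levels come out alpha-sorted.
rrr$fac <- factor(key)

# Second call: reorder the levels by first appearance, then take the codes.
score <- as.numeric(factor(rrr$fac, levels = unique(rrr$fac)))
score

# Cross-check without factors: position of each row string among the
# unique row strings, which are also in order of first appearance.
score_alt <- match(key, unique(key))
```

Rows that are identical get the same string from paste(), so they end up with the same number. If the row strings are kept as a plain character vector rather than a factor, a single factor() call with levels=unique(key), or the match() form, should be enough.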

--

David Winsemius, MD
West Hartford, CT
