On Apr 25, 2010, at 1:08 AM, burgundy wrote:
Hi,
I'm trying to assign a score to each row that allows me to identify
which rows differ. In the example file below, I've used "," to indicate
column separators. In this example, I'd like to identify that row 1 and
row 5 are the same, and row 2 and row 4 are the same.
Any help much appreciated. Also, any comments on what the command lines
do would be fantastic.
Thanks!!
example file:
0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0
example requested output:
1
2
3
2
1
4
If you use apply by rows with paste and a collapse argument, you get
one text string per row, i.e. a text column. Calling factor on that
text column with levels=unique(fac) numbers the levels in order of
first appearance, and as.numeric() then extracts those numbers as the
row scores.
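A minimal sketch of the apply/paste step, assuming the example file is
read with read.table (the text= argument stands in for the real file,
and the name key is only for illustration):

```r
# Example data from the question, read as a data frame with columns V1..V7.
rrr <- read.table(text = "0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0", sep = ",")

# Paste the values of each row into one string such as "0,0,1,0,1,0,0".
# Identical rows produce identical strings.
key <- apply(rrr, 1, paste, collapse = ",")

# Store the row strings as a factor column; by default the levels are
# the sorted unique strings.
rrr$fac <- factor(key)
```

Using "," as the collapse string keeps the keys unambiguous if a column
ever holds values with more than one digit.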
On a data frame, rrr, holding those elements and such a factor, fac:
> as.numeric(factor(rrr$fac, levels=unique(rrr$fac)))
[1] 1 2 3 2 1 4
One needs to call factor a second time because the first call set the
levels to an alphabetically sorted version of the values in fac, so the
codes would follow sort order rather than order of first appearance.
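To illustrate the difference in level order, here is a small sketch on
the row strings alone (the vector key and the two result names are
hypothetical, built by hand from the example file):

```r
# Row strings for the six example rows, in file order.
key <- c("0,0,1,0,1,0,0", "0,1,0,0,0,0,1", "0,0,0,0,0,0,0",
         "0,1,0,0,0,0,1", "0,0,1,0,1,0,0", "0,0,0,1,0,0,0")

# Default levels are sorted, so the codes reflect sort order.
sorted_codes <- as.numeric(factor(key))
print(sorted_codes)      # 3 4 1 4 3 2

# Levels taken from unique(key) follow order of first appearance.
first_seen_codes <- as.numeric(factor(key, levels = unique(key)))
print(first_seen_codes)  # 1 2 3 2 1 4
```

Both results group identical rows together; only the numbering differs.
If the factor itself is not needed, match(key, unique(key)) gives the
same first-appearance numbering.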
--
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.