Greetings Everyone -

I have a data frame "x" that looks like this:

v1   v2
1     A
1     B
1     B
2     B
2     W
2     W
3     D
3     D
3     Z

What I would like to do is create a new data frame, "y", that has one row
for each unique value of v1, and returns the corresponding mode of v2.  If I
were to run it on the above data frame, it should therefore return:

v1   v2
1     B
2     W
3     D

I've been using the following code:

library(prettyR)  # provides Mode()
x <- data.frame(v1 = c(1,1,1,2,2,2,3,3,3),
                v2 = c("A","B","B","B","W","W","D","D","Z"))
y <- aggregate.data.frame(x, by = list(x$v1), FUN = "Mode")

which relies on the Mode function from package prettyR.  The above code
works for me.

My problem comes when I use my real database.  Running this produces many
warnings, because there are multiple modes of v2 for many values of v1.  My
database is also rather large (~700,000 rows), and I'm wondering if there is
a faster way to get R to process these data.
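
For reference, here is a minimal base-R sketch of the kind of result I am after. It assumes that ties can be broken by taking whichever tied value sorts first, which may not be the right rule for every data set. The helper name mode_first is hypothetical (it is not part of prettyR), and I have not timed this on the full data.

```r
# Toy data from the example above.
x <- data.frame(v1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                v2 = c("A", "B", "B", "B", "W", "W", "D", "D", "Z"),
                stringsAsFactors = FALSE)

# Hypothetical helper: most frequent value of v.
# table() orders its entries by sorted value and which.max() returns the
# first maximum, so a tie always resolves to the value that sorts first
# and no warning is raised.
mode_first <- function(v) {
  counts <- table(v)
  names(counts)[which.max(counts)]
}

# One row per unique v1, holding the mode of v2 within that group.
y <- aggregate(v2 ~ v1, data = x, FUN = mode_first)
print(y)
```

On the toy data this gives B, W and D for groups 1, 2 and 3. If a different tie rule is wanted (for example, the value that occurs first in the data), only mode_first would need to change.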

Thank you for your help and consideration,

Gabriel Yospin
Center for Ecology and Evolutionary Biology
University of Oregon


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
