Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Matthew
Thank you very much, Dan. These work great. Two more great answers to my question.
Matthew
On 5/24/2016 4:15 PM, Nordlund, Dan (DSHS/RDA) wrote:
You have several options.
1. You could use the aggregate function. If your data frame is called DF, you could do something like
   with(DF, aggreg

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Matthew
Thanks, Tom. I was making a mistake looking at your example and that's what my problem was. Cool answer, works great. Thank you very much.
Matthew
On 5/24/2016 4:23 PM, Tom Wright wrote:
> Don't see that as being a big problem. If your data grows then dplyr
> supports connections to external

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Tom Wright
Don't see that as being a big problem. If your data grows then dplyr supports connections to external databases. Alternately if you just want a mean, most databases can do that directly in SQL.
On Tue, May 24, 2016 at 4:17 PM, Matthew wrote:
> Thank you very much, Tom.
> This gets me thinking in

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Matthew
Thank you very much, Tom. This gets me thinking in the right direction. One thing I should have mentioned that I did not is that the number of rows in the data frame will be a little over 40,000 rows.
On 5/24/2016 4:08 PM, Tom Wright wrote:
> Using dplyr
>
> $ library(dplyr)
> $ x<-data.frame(Len

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Nordlund, Dan (DSHS/RDA)
You have several options.
1. You could use the aggregate function. If your data frame is called DF, you could do something like
   with(DF, aggregate(Length, list(Identifier), mean))
2. You could use the dplyr package like this
   library(dplyr)
   summarize(group_by(DF, Identifier), mean(Length)
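For readers who want to try the aggregate approach, here is a minimal sketch in base R. The sample values in DF are made up for illustration; only the column names Identifier and Length come from the message. The formula form of aggregate is shown as an alternative that keeps the original column names; it is not part of the quoted reply.

```r
# Hypothetical sample data; only the column names come from the message
DF <- data.frame(
  Identifier = c("A234", "A234", "A234", "B123", "B225"),
  Length     = c(321, 350, 340, 180, 198)
)

# Option 1 as written in the message: the result columns are named Group.1 and x
res1 <- with(DF, aggregate(Length, list(Identifier), mean))

# Formula interface: same per-group means, but the columns keep their names
res2 <- aggregate(Length ~ Identifier, data = DF, FUN = mean)
print(res2)
```

Both calls return one row per distinct Identifier, so duplicated identifiers collapse into a single row holding their mean Length.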

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Tom Wright
Using dplyr
$ library(dplyr)
$ x<-data.frame(Length=c(321,350,340,180,198), ID=c(rep('A234',3),'B123','B225') )
$ x %>% group_by(ID) %>% summarise(m=mean(Length))
On Tue, May 24, 2016 at 3:46 PM, Matthew wrote:
> I have a data frame with 10 columns.
> In the last colum
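If dplyr is not installed, the same example data can be handled in base R. This is only a sketch of one possible alternative, not something proposed in the thread: duplicated() flags the rows whose ID occurs more than once, and tapply() computes the mean Length per ID. The variable names is_dup and means are made up for illustration.

```r
# Same example data as in the dplyr message above
x <- data.frame(Length = c(321, 350, 340, 180, 198),
                ID = c(rep("A234", 3), "B123", "B225"))

# TRUE for every row whose ID appears more than once (checks both directions)
is_dup <- duplicated(x$ID) | duplicated(x$ID, fromLast = TRUE)

# Mean Length per ID, returned as a vector named by ID
means <- tapply(x$Length, x$ID, mean)
print(means)
```

At a little over 40,000 rows, as mentioned elsewhere in the thread, either approach should be quick on an ordinary machine.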