Here's another approach:

# First put the data in a format that is easier to transmit using dput():
dta <- structure(list(Country = structure(c(1L, 3L, 2L, 1L, 3L, 2L,
1L, 2L, 1L, 3L, 2L, 1L, 3L), .Label = c("AE", "CN", "DE"), class = "factor"),
    Product = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 1L, 1L, 1L, 2L, 2L),
    Price = c(20L, 20L, 28L, 28L, 28L, 22L, 28L, 28L, 20L, 20L, 28L,
    28L, 28L), Year_Month = c(201204L, 201204L, 201204L, 201204L,
    201204L, 201204L, 201204L, 201204L, 201205L, 201205L, 201205L,
    201205L, 201205L)), .Names = c("Country", "Product", "Price",
"Year_Month"), class = "data.frame", row.names = c(NA, -13L))

# Then
> grp <- aggregate(Price~Product+Year_Month, dta, function(x) c(sum(x),
+    length(x)))
> dtmrg <- merge(dta, grp, by=c("Product", "Year_Month"))
> newdta <- with(dtmrg, data.frame(Country, Product, Price=Price.x,
+    Year_Month, Price_average_in_other_area=(Price.y[,1]-Price.x)/
+    (Price.y[,2]-1)))
> newdta
   Country Product Price Year_Month Price_average_in_other_area
1       AE       1    20     201204                          24
2       DE       1    20     201204                          24
3       CN       1    28     201204                          20
4       AE       1    20     201205                          24
5       DE       1    20     201205                          24
6       CN       1    28     201205                          20
7       AE       2    28     201204                          25
8       DE       2    28     201204                          25
9       CN       2    22     201204                          28
10      AE       2    28     201205                          28
11      DE       2    28     201205                          28
12      AE       3    28     201204                          28
13      CN       3    28     201204                          28
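Since the original data has over a million rows, it may also be worth trying a version that skips the merge step. The following is only a sketch, not something timed on the full data: it uses base R's ave() to get the per-group sum and count in the original row order, then computes the leave-one-out mean directly. The names grp_sum and grp_n are made up for this example, and the data frame is rebuilt here so the block runs on its own.

```r
# Self-contained copy of the 13-row example data from this thread.
dta <- data.frame(
  Country    = c("AE", "DE", "CN", "AE", "DE", "CN", "AE", "CN",
                 "AE", "DE", "CN", "AE", "DE"),
  Product    = c(1, 1, 1, 2, 2, 2, 3, 3, 1, 1, 1, 2, 2),
  Price      = c(20, 20, 28, 28, 28, 22, 28, 28, 20, 20, 28, 28, 28),
  Year_Month = c(rep(201204, 8), rep(201205, 5)),
  stringsAsFactors = FALSE
)

# Per-group sum and row count for each Product/Year_Month combination,
# returned as vectors aligned with the rows of dta (no merge, no reordering).
grp_sum <- ave(dta$Price, dta$Product, dta$Year_Month, FUN = sum)
grp_n   <- ave(dta$Price, dta$Product, dta$Year_Month, FUN = length)

# Average price in the other areas: (group total - own price) / (group size - 1).
# Groups with a single row have no "other area", so they get NA instead of 0/0.
dta$Price_average_in_other_area <-
  ifelse(grp_n > 1, (grp_sum - dta$Price) / (grp_n - 1), NA)

# Row 1 (AE, product 1, 201204): (68 - 20) / 2 = 24
dta
```

Unlike the merge() version, the rows stay in their original order, so the new column can be attached to the input directly.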
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John McKown
Sent: Friday, August 1, 2014 9:07 AM
To: Lingyi Ma
Cc: r-help
Subject: Re: [R] How to avoid the three loops in R?

On Fri, Aug 1, 2014 at 6:41 AM, Lingyi Ma <lingyi...@gmail.com> wrote:
> I have the following data set:
>
> Country Product Price Year_Month
> AE      1       20    201204
> DE      1       20    201204
> CN      1       28    201204
> AE      2       28    201204
> DE      2       28    201204
> CN      2       22    201204
> AE      3       28    201204
> CN      3       28    201204
> AE      1       20    201205
> DE      1       20    201205
> CN      1       28    201205
> AE      2       28    201205
> DE      2       28    201205
>
> I want to create the one more column which is "The average price of the
> product in other areas".
> in other word, for each month, for each product, I calculate the average of
> such product in the other area.
>
> I want sth like:
>
> Country Product Price Year_Month Price_average_In_Other_area
> AE      1       20    201204     14
> AE      2       28    201204     25

The output above looks wrong. The Price_average_In_Other_area for AE,
product 1 should be 24?

My possible solution:

# Initialize data.frame & call it "x".
Country <- c("AE","DE","CN","AE","DE","CN","AE","CN","AE","DE","CN","AE","DE");
Product <- c(1,1,1,2,2,2,3,3,1,1,1,2,2);
Price <- c(20,20,28,28,28,22,28,28,20,20,28,28,28);
Year_Month <- c(201204,201204,201204,201204,201204,201204,201204,201204,201205,201205,201205,201205,201205);
x <- data.frame(Country,Product,Price,Year_Month,stringsAsFactors=FALSE);
#
#
library("dplyr");
#
# Get the total Price of all Products and number of Products for each Product & Year_Month
y <- summarize(group_by(x, Product, Year_Month),sumPrice=sum(Price),NoPrice=length(Price));
#
# Merge the above data back into the original data.frame, based on
# Product and Year_Month (similar to SQL inner join).
x <- merge(x=x,y=y);
#
# Now calculate the "other area" average by subtracting the cost in this area
# from the total cost in all areas and divide by the number of areas, minus one.
# Please note that if a Product and Year_Month is unique, i.e. no other areas
# for this Product & Year_Month, this will try to divide by zero.
# The numerator is also zero in that case, so this gives "NaN" (0/0) as an answer.
x$Price_average_In_Other_area <- (x$sumPrice-x$Price)/(x$NoPrice-1);
# Possible alternate to handle above consideration
x$Avg_other <- ifelse(x$NoPrice>1,(x$sumPrice-x$Price)/(x$NoPrice-1),NA);

>
> Please avoid the three for loop, I have tried and it never end. I have
> 1070427 rows. Is there better way to speed up my program?

--
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! <><
John McKown

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.