Is this the output you were looking for? You did not show what the output should look like:

> x
  var1 var2 X. varN
1  122 nnn1  …    1
2  213 nnn2  …    2
3  422 nnn4  …    2
4  432    …  …    3
5  441    …  …    4
6  500    …  …    4
7  550    …  …    4
> str(x)
'data.frame': 7 obs. of 4 variables:
 $ var1: int 122 213 422 432 441 500 550
 $ var2: Factor w/ 4 levels "…","nnn1","nnn2",..: 2 3 4 1 1 1 1
 $ X.  : Factor w/ 1 level "…": 1 1 1 1 1 1 1
 $ varN: int 1 2 2 3 4 4 4
> x$newCol <- ave(x$var1, x$varN, FUN=sum)
> x
  var1 var2 X. varN newCol
1  122 nnn1  …    1    122
2  213 nnn2  …    2    635
3  422 nnn4  …    2    635
4  432    …  …    3    432
5  441    …  …    4   1491
6  500    …  …    4   1491
7  550    …  …    4   1491
>

On Tue, Apr 26, 2011 at 6:31 PM, петрович <bist...@gmail.com> wrote:
> Hey everyone!
> I'm quite a new R user. I found a problem that I'd like to share with you,
> hoping you can help me find a solution.
> I have a large .txt file which I opened with the read.table command, and what
> I understood from many R manuals is that I have a kind of matrix read with
> read.table.
> I've used order() to sort my data and now my problem is: I have a variable
> that has many repeated values and I would like to operate on the row
> indexes of "these repeated values". For example, suppose I have:
>
> var1   var2   …   varN
> 122    nnn1   …   1
> 213    nnn2   …   2
> 422    nnn4   …   2
> 432    …      …   3
> 441    …      …   4
> 500    …      …   4
> 550    …      …   4
>
> So I want to obtain a new column where all elements of var1 are added at the
> places where varN is repeated. For varN=2 the corresponding element of the
> new column will be 213+422, and for varN=4 it will be 441+500+550. Where there
> are no such repeated values there is obviously nothing to do and varN is the
> unique value.
> I wrote a function to do this, but it is not very good (I have a database with
> around 1 million rows and 5 columns). It works only for data that is not too
> large:
>
> suma.rep=function(X,Y){
>   resp=numeric(0)
>   Z=unique(Y)
>   for (i in (1:length(Z)))
>     resp=c(resp,sum(X[which(Y==Z[i])]))
>   return(resp)}
>
> When I run this function on my large data, R appears to keep calculating, and
> I think it would take very long (maybe 4 days) to build the new column.
> Question 1: I "feel" that there may be a command that does this "simple"
> operation more elegantly. I googled it but couldn't find one. Is there such
> a command?
> Question 2: Is it a good idea to handle large database files with R, as in my
> example?
>
> Thank you so much for your help.
> Christian Paúl
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
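
For anyone who wants to try this without the original file, here is a minimal, self-contained sketch of the ave() approach from the reply, using only the var1 and varN values shown in the thread. It also shows rowsum() as one possible way to get one sum per distinct varN, which is the vector the quoted suma.rep() loop builds. The name group_sums is made up for illustration, and dropping the other columns is an assumption made to keep the example short.

```r
# Rebuild the two relevant columns of the example data (values from the thread).
x <- data.frame(
  var1 = c(122L, 213L, 422L, 432L, 441L, 500L, 550L),
  varN = c(1L, 2L, 2L, 3L, 4L, 4L, 4L)
)

# ave() returns a vector as long as its input: each row receives the sum of
# var1 over all rows that share its varN value.
x$newCol <- ave(x$var1, x$varN, FUN = sum)

# One sum per distinct varN, in order of first appearance (reorder = FALSE).
# group_sums is a hypothetical name; rowsum() returns a one-column matrix
# here, so [, 1] extracts it as a named vector.
group_sums <- rowsum(x$var1, x$varN, reorder = FALSE)[, 1]

print(x)
print(group_sums)
```

Both calls avoid growing resp with c() inside a loop and rescanning all of Y once per unique value, which is probably what makes the loop so slow when a million rows contain many distinct varN values.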