Can it be this:

foo <- tapply(d$tt, d$v, min)
data.frame(v = names(foo), tt = foo)

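For what it is worth, here is a self-contained sketch of that suggestion run on the example data from the original post, next to one possible alternative that sorts the rows and keeps the first row of each group. The names res1, res2 and o are only illustrative, and the second variant is an assumption of mine about what might suit data with many groups, not something proposed elsewhere in this thread.

```r
# Example data from the original question
v  <- c(rep("v1", 3), rep("v2", 4), rep("v3", 2), "v4", rep("v5", 6))
tt <- c(1, 2, 3, 3, 1, 2, 3, 4, 5, 2, 7, 9, 2, 3, 1, 4)
d  <- data.frame(v, tt, stringsAsFactors = FALSE)

# 1) tapply: one minimum per group, rebuilt into a two-column data frame
foo  <- tapply(d$tt, d$v, min)
res1 <- data.frame(v = names(foo), tt = as.vector(foo),
                   stringsAsFactors = FALSE)

# 2) Sort by group and then by value, and keep the first row of each group.
#    This keeps whole rows, so it also works when d has more than two columns.
o    <- d[order(d$v, d$tt), ]
res2 <- o[!duplicated(o$v), ]
rownames(res2) <- NULL

print(res1)
print(res2)
```

Both variants return one row per group. If several rows in a group share the minimum, only one of them is kept, whereas the original while loop keeps all tied rows. I have not timed either variant on data of the size described below.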
On Sat, May 17, 2008 at 10:56 PM, jim holtman <[EMAIL PROTECTED]> wrote:

> Is this what you want:
>
> > v<-c(rep("v1",3), rep("v2",4), rep("v3",2),"v4",rep("v5",6))
> > tt<-c(1,2,3,3,1,2,3,4,5,2,7,9,2,3,1,4)
> > d<-data.frame(v,tt)
> > do.call(rbind, lapply(split(d, d$v), function(x){
> +     x[which.min(x$tt),]
> + }))
>     v tt
> v1 v1  1
> v2 v2  1
> v3 v3  4
> v4 v4  2
> v5 v5  1
>
> On Sat, May 17, 2008 at 3:48 PM, souvik banerjee <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > I am facing a problem in data manipulation. Suppose a data frame
> > contains two columns. The first column consists of some repeated
> > characters and the second consists of some numerical values. The
> > problem is to extract, for each unique character in the first column,
> > the row with the minimum second-column entry, and to create a new data
> > frame from those rows. For example, if "d" is the data frame created
> > with the following R code
> >
> > v<-c(rep("v1",3), rep("v2",4), rep("v3",2),"v4",rep("v5",6))
> > tt<-c(1,2,3,3,1,2,3,4,5,2,7,9,2,3,1,4)
> > d<-data.frame(v,tt)
> >
> > then the answer would be
> >
> >  v tt
> > v1  1
> > v2  1
> > v3  4
> > v4  2
> > v5  1
> >
> > I have written a small piece of R code, given below, that does the job
> > (assuming "d" to be the initial data frame)
> >
> > b<-data.frame(NULL)
> > i<-1
> > x<-d[1,]
> > while(i<dim(d)[1])
> > {
> >   if(length(unique(x[,1]))==1)
> >   {
> >     x<-rbind(x,d[i+1,])
> >     i=i+1
> >   }
> >   if(length(unique(x[,1]))>1)
> >   {
> >     y<-x[1:(nrow(x)-1),]
> >     z<-which(y[,2]==min(y[,2]))
> >     b<-rbind(b,y[z,])
> >     x<-d[i,]
> >   }
> > }
> > z<-which(x[,2]==min(x[,2]))
> > b<-rbind(b,x[z,])
> > b
> >
> > The code works properly and gives me the desired result, but the
> > problem is that I have to repeat this procedure for many data frames,
> > and nearly all of the data frames contain approximately 15,000 repeated
> > characters with more than 12,500 unique characters. Using the above
> > code in a loop takes a considerable amount of time to compute.
> > Can anybody suggest a faster approach?
> >
> > Regards
> >
> > Souvik Bandyopadhyay
> > Research Fellow,
> > Dept Of Statistics
> > Calcutta University
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?