In reality, the values in a will not be integers but numeric. They will never be identical, but they could be pretty close - I don't know how many digits after the decimal point matter.

Dimitri
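With numeric, non-repeating values in a, one possible sketch is to find the row of the minimum once with which.min() and compare values rather than positions. The helper name replace_min_row and its tol argument are assumptions made for illustration only; they do not come from anyone in this thread, and a suitable tol depends on how many decimal places actually matter.

```r
# Hypothetical helper (not from the thread): replace the row of x that holds
# the smallest value of column `a` with the single-row data frame y, but only
# if y$a exceeds that minimum by more than `tol` (an assumed tolerance).
replace_min_row <- function(x, y, tol = 0) {
  i <- which.min(x$a)         # position of the (first) smallest value
  if (y$a - x$a[i] > tol) {   # compare values, not positions
    x[i, ] <- y[1, ]
  }
  x
}

# Made-up example data with non-integer values in `a`
x <- data.frame(item = c("a", "b", "c"), a = c(2.51, 0.37, 4.08),
                b = c(11, 12, 13), stringsAsFactors = FALSE)
y <- data.frame(item = "z", a = 3.2, b = 10, stringsAsFactors = FALSE)

x <- replace_min_row(x, y)
x[2, ]  # row 2 held the minimum (0.37) and now contains the values of y
```

Because which.min() returns a single position, this touches exactly one row even if two values happen to be very close, and it scans x$a once instead of calling min() and which() separately.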
On Wed, Jan 30, 2013 at 2:06 PM, arun <smartpink...@yahoo.com> wrote:

> Hi,
> Any chance x$a to have the same number repeated?
>
> If `Item` and `a` are unique, I guess both the solutions should work.
>
> set.seed(1851)
> x<- data.frame(item=sample(letters[1:20],20,replace=F),a=sample(1:45,20,replace=F),b=sample(20:50,20,replace=F),stringsAsFactors=F)
> y<- data.frame(item="z",a=3,b=10,stringsAsFactors=F)
>
> x[intersect(which(x$a < y$a),which.min(x$a)),]
> #   item a  b
> #17    c 1 48
> x[x$a==which.min(x$a[x$a<y$a]),]
> #   item a  b
> #17    c 1 48
> #or
>
> x[x$a%in%which.min(x$a[x$a<y$a]),]
> #   item a  b
> #17    c 1 48
>
> x[x$a%in%which.min(x$a[x$a<y$a]),]<-y
>
> tail(x)
> #   item  a  b
> #15    q 45 30
> #16    g 10 23
> #17    z  3 10
> #18    r 15 39
> #19    l 18 45
> #20    t 35 33
>
> #However, if `item` column is unique, but `a` is not, then the one I mentioned previously arise.
> set.seed(1851)
> x1<- data.frame(item=sample(letters[1:20],20,replace=F),a=sample(1:10,20,replace=T),b=sample(20:50,20,replace=F),stringsAsFactors=F)
> y1<- data.frame(item="z",a=3,b=10,stringsAsFactors=F)
>
>
> x1[intersect(which(x1$a < y1$a),which.min(x1$a)),]
> #  item a  b
> #3    s 1 41
> x1[x1$a==which.min(x1$a[x1$a<y1$a]),]
> #   item a  b
> #3     s 1 41
> #11    h 1 46
> #17    c 1 48
> x1[x1$a==which.min(x1$a[x1$a<y1$a]),]<- y1
> A.K.
>
>
> ________________________________
> From: Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com>
> To: arun <smartpink...@yahoo.com>
> Cc: R help <r-help@r-project.org>; Jessica Streicher <j.streic...@micromata.de>
> Sent: Wednesday, January 30, 2013 1:49 PM
> Subject: Re: [R] Fastest way to compare a single value with all values in one column of a data frame
>
>
> Sorry - I should have clarified:
> My identifiers (in column "item") will always be unique. In other words, one entry in column "item" will never be repeated - neither in x nor in y.
> Dimitri
>
>
> On Wed, Jan 30, 2013 at 1:27 PM, Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com> wrote:
>
> > Thank you, everyone! I'll try to test those different approaches. Really appreciate your help!
> >Dimitri
> >
> >
> >On Wed, Jan 30, 2013 at 11:03 AM, arun <smartpink...@yahoo.com> wrote:
> >
> >HI,
> >>
> >>Sorry, my previous solution doesn't work.
> >>This should work for your dataset:
> >>set.seed(1851)
> >>x<- data.frame(item=sample(letters[1:5],20,replace=TRUE),a=sample(1:15,20,replace=TRUE),b=sample(20:30,20,replace=TRUE),stringsAsFactors=F)
> >>y<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
> >> x[x$a%in%which.min(x[x$a<y$a,]$a),]<- y #if there are multiple minimum values
> >>
> >>set.seed(1241)
> >>x1<- data.frame(item=sample(letters[1:10],1e4,replace=TRUE),a=sample(1:30,1e4,replace=TRUE),b=sample(1:100,1e4,replace=TRUE),stringsAsFactors=F)
> >>y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
> >>length(x1$a[x1$a==1])
> >>#[1] 330
> >> system.time({x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1})
> >>#   user  system elapsed
> >> #  0.000   0.000   0.001
> >>length(x1$a[x1$a==1])
> >>#[1] 0
> >>
> >>
> >>#For some reason, it is not working when the multiple number of minimum values > some value
> >>
> >>set.seed(1241)
> >>x1<- data.frame(item=sample(letters[1:10],1e5,replace=TRUE),a=sample(1:30,1e5,replace=TRUE),b=sample(1:100,1e5,replace=TRUE),stringsAsFactors=F)
> >>y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
> >>length(x1$a[x1$a==1])
> >>#[1] 3404
> >>x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1
> >> length(x1$a[x1$a==1])
> >>#[1] 3404 #not getting replaced
> >>
> >>#However, if I try:
> >>set.seed(1241)
> >> x1<- data.frame(item=sample(letters[1:10],1e6,replace=TRUE),a=sample(1:5000,1e6,replace=TRUE),b=sample(1:100,1e6,replace=TRUE),stringsAsFactors=F)
> >> y1<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
> >> length(x1$a[x1$a==1])
> >>#[1] 208
> >> system.time(x1[x1$a%in%which.min(x1[x1$a<y1$a,]$a),]<- y1)
> >>#   user  system elapsed
> >> #  0.124   0.016   0.138
> >> length(x1$a[x1$a==1])
> >>#[1] 0
> >>
> >>
> >>#Tried Jessica's solution:
> >>set.seed(1851)
> >> x<- data.frame(item=sample(letters[1:5],20,replace=TRUE),a=sample(1:15,20,replace=TRUE),b=sample(20:30,20,replace=TRUE),stringsAsFactors=F)
> >> y<- data.frame(item="f",a=3,b=10,stringsAsFactors=F)
> >> x[intersect(which(x$a < y$a),which.min(x$a)),] <- y
> >>
> >> x
> >>#   item  a  b
> >>#1     a  8 25
> >>#2     a 10 26
> >>#3     f  3 10 #replaced
> >>#4     e 15 26
> >>#5     b 13 20
> >>#6     a  5 23
> >>#7     d  4 29
> >>#8     e  2 24
> >>#9     c  7 30
> >>#10    e 14 24
> >>#11    d  2 20
> >>#12    e 10 21
> >>#13    c 13 27
> >>#14    d 12 23
> >>#15    b 11 26
> >>#16    e  5 22
> >>#17    c  1 26 #it is not replaced
> >>#18    a  8 21
> >>#19    e 10 26
> >>#20    c  2 22
> >>
> >>
> >>A.K.
> >>
> >>
> >>----- Original Message -----
> >>From: Dimitri Liakhovitski <dimitri.liakhovit...@gmail.com>
> >>To: r-help <r-help@r-project.org>
> >>Cc:
> >>Sent: Tuesday, January 29, 2013 4:11 PM
> >>Subject: [R] Fastest way to compare a single value with all values in one column of a data frame
> >>
> >>
> >>Hello!
> >>
> >>I have a large data frame x:
> >>x<-data.frame(item=letters[1:5],a=1:5,b=11:15) # in actuality, x has 1000 rows
> >>x$item<-as.character(x$item)
> >>I also have a small data frame y with just 1 row:
> >>y<-data.frame(item="f",a=3,b=10)
> >>y$item<-as.character(y$item)
> >>
> >>I have to decide if y$a is larger than the smallest of all the values in x$a. If it is, I want y to replace the whole row in x that has the lowest value in column a.
> >>This is how I'd do it.
> >>
> >>if(y$a>min(x$a)){
> >>  whichmin<-which(x$a==min(x$a))
> >>  x[whichmin,]<-y[1,]
> >>}
> >>
> >>
> >>I am wondering if there is a faster way of doing it. What would be the fastest possible way? I'd have to do it, unfortunately, many-many times.
> >>
> >>Thank you very much!
> >>
> >>--
> >>Dimitri Liakhovitski
> >>
> >>gfk.com <http://marketfusionanalytics.com/>
> >>
> >>        [[alternative HTML version deleted]]
> >>
> >>______________________________________________
> >>R-help@r-project.org mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.