Hi David,

Thanks. That was enlightening.
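For anyone who finds this thread later, here is a minimal sketch of how the advice could be applied to the original problem. It is not taken from David's reply. The data frame is a hypothetical reconstruction of the first four rows from the post (with ABSDIFF recomputed), the name `key` is made up for illustration, and rounding to 6 digits is an assumed tolerance that should be chosen to suit the data.

```r
# Hypothetical reconstruction of the first rows shown in the original post.
# ABSDIFF is recomputed so each value carries its own rounding error.
df <- data.frame(
  POP1 = c(0.98484848, 0.01515152, 0.97727273, 0.02272727),
  POP2 = c(0.688118812, 0.311881188, 0.004424779, 0.995575221),
  row.names = c("L0005.01", "L0005.03", "L0008.02", "L0008.04")
)
df$ABSDIFF <- abs(df$POP1 - df$POP2)

# Exact comparison: may be FALSE even though both values print the same.
identical(df[1, 3], df[2, 3])

# Comparison within a numeric tolerance, as suggested in the reply.
# all.equal() returns TRUE or a description of the difference, so wrap
# it in isTRUE() before using it in a condition.
isTRUE(all.equal(df[1, 3], df[2, 3]))

# duplicated() compares exactly, so build a rounded key first
# (6 digits is an assumption) and deduplicate on that key instead.
key <- round(df$ABSDIFF, digits = 6)
df_subset <- df[!duplicated(key), ]
nrow(df_subset)  # one row per pair
```

Rounding can still separate two nearly equal values that fall on either side of a rounding boundary, so this is a convenience rather than a guarantee. Checking a few pairs with all.equal() is a useful cross-check.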
Whoop.

V

On Tue, Apr 14, 2015 at 3:53 PM, David L Carlson <dcarl...@tamu.edu> wrote:

> Try all.equal(df[1,3], df[2,3])
>
> This relates to how decimal numbers are stored in computers. It is not an
> R only issue, but it is described in the R-FAQ:
>
> From the R-FAQ - http://cran.r-project.org/doc/FAQ/R-FAQ.html
>
> 7.31 Why doesn't R think these numbers are equal?
>
> The only numbers that can be represented exactly in R's numeric type are
> integers and fractions whose denominator is a power of 2. Other numbers
> have to be rounded to (typically) 53 binary digits accuracy. As a result,
> two floating point numbers will not reliably be equal unless they have been
> computed by the same algorithm, and not always even then. For example
>
> R> a <- sqrt(2)
> R> a * a == 2
> [1] FALSE
> R> a * a - 2
> [1] 4.440892e-16
>
> The function all.equal() compares two objects using a numeric tolerance of
> .Machine$double.eps ^ 0.5. If you want much greater accuracy than this you
> will need to consider error propagation carefully.
>
> For more information, see e.g. David Goldberg (1991), "What Every Computer
> Scientist Should Know About Floating-Point Arithmetic", ACM Computing
> Surveys, 23/1, 5-48, also available via
> http://www.validlab.com/goldberg/paper.pdf.
>
> To quote from "The Elements of Programming Style" by Kernighan and Plauger:
>
> 10.0 times 0.1 is hardly ever 1.0.
>
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Vikram
> Chhatre
> Sent: Tuesday, April 14, 2015 2:40 PM
> To: r-help
> Subject: [R] Extracting unique entries by a column
>
> I have a data frame of dim 3x600. There are pairs of rows which have the
> exact same value in column 3.
>
> > head(df)
>                POP1        POP2   ABSDIFF
> L0005.01 0.98484848 0.688118812 0.2967297
> L0005.03 0.01515152 0.311881188 0.2967297
> L0008.02 0.97727273 0.004424779 0.9728479
> L0008.04 0.02272727 0.995575221 0.9728479
> L0012.03 0.98684211 0.004385965 0.9824561
> L0012.01 0.01315789 0.995614035 0.9824561
>
> I want to unique sort on df$ABSDIFF so that only one row per pair remains
> in the subset.
>
> >df_subset <- df[!duplicated(df$ABSDIFF), ]
>
> This does not work. So I literally checked:
>
> >identical(df[1,3], df[2,3])
> FALSE
>
> How is 0.2967297 different from 0.2967297? I am puzzled.
>
> Thanks for any insight.
>
> Vikram
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.