On Sep 8, 2011, at 19:03 , Albyn Jones wrote: > Look at > > area <- sum(a$y)*(a$x[1]-a$y[2]) > > The problem appears to be "a$x[1]-a$y[2]"; that is not the length of > the base of an approximating rectangle, whatever it is :-)
I would assume that that is just a typo for a$x[1]-a$x[2], which makes sense if the x grid is regular. The real issue is that from= and to= only restrict the range over which the estimated density is _calculated_ not the one over which it is defined. I.e. there are bits of the density "sticking out" of the interval from 0 to 1. So the integral is naturally less than 1. To see the point, try (with "a" from the original question). plot(a) lines(density(data, from=.4, to=.6), lwd=5) Greg Snow had the advice how really to estimate a density on a restricted region. > > albyn > > On Thu, Sep 08, 2011 at 11:36:23AM -0400, Gonçalo Ferraz wrote: >> Hi, I have a vector 'data' of 58 probability values (bounded between 0 and >> 1) and want to draw a probability density function of these values. For >> this, I used the commands: >> >> data <- runif(58) >> >> a <- density(data, from=0, to=1) >> plot(a, type="l",lwd=3) >> >> But then, when I try to approximate the area under the plotted curve with >> the command: >> >> area <- sum(a$y)*(a$x[1]-a$y[2]) >> >> I get an area that is clearly smaller than 1. >> >> Strangely, if I don't bound the density function with 'to=0,from=1' (which >> is against my purpose because it extends the pdf beyond the limits of a >> probability value), I get an area of 1.000. This suggests that I am >> computing the area well, but using the density function improperly. >> >> Why is this happening? Does anyone know how to constrain the density >> function while still getting a true pdf (summing to 1 under the curve) at >> the end? Should I use a different function? I read through the density >> function notes but could not figure out a solution. >> >> Thank you! >> >> Gonçalo >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Albyn Jones > Reed College > jo...@reed.edu > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd....@cbs.dk Priv: pda...@gmail.com "Døden skal tape!" --- Nordahl Grieg ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.