[R] Density function: Area under density plot is not equal to 1. Why?

Thu Sep 8 18:57:57 CEST 2011

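For bounded density estimation, look at the logspline package instead of the regular density() function. With from and to, density() only restricts the range over which the estimate is evaluated; it does not renormalize, so the kernel mass that falls outside [0, 1] is simply left out of the area.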
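
A minimal sketch, assuming simulated uniform data in place of the original 58 values: it compares the area under density() with and without from/to, then fits a bounded estimate with logspline(). The helper area(), the seed, and the variable names are illustrative only, and the logspline step runs only if that package is installed.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
x <- runif(58)     # stand-in for the 58 probability values

# Trapezoid-rule area under the curve returned by density().
area <- function(d) sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

d_full    <- density(x)                    # default grid extends past the data range
d_clipped <- density(x, from = 0, to = 1)  # same bandwidth, evaluated on [0, 1] only

area_full    <- area(d_full)     # close to 1
area_clipped <- area(d_clipped)  # smaller: mass outside [0, 1] is not counted

# Bounded estimate with logspline (assumes the logspline package is installed).
if (requireNamespace("logspline", quietly = TRUE)) {
  fit <- logspline::logspline(x, lbound = 0, ubound = 1)
  # Integrate the fitted density over its support as a check on the total mass.
  area_logspline <- integrate(function(q) logspline::dlogspline(q, fit), 0, 1)$value
}
```

Comparing area_full with area_clipped shows how much mass the Gaussian kernels place outside [0, 1]; the logspline fit is constructed on the bounded support, so no such truncation is involved.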

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Gonçalo Ferraz
> Sent: Thursday, September 08, 2011 9:36 AM
> To: r-help at r-project.org
> Subject: [R] Density function: Area under density plot is not equal to
> 1. Why?
> 
> Hi, I have a vector 'data' of 58 probability values (bounded between 0
> and 1) and want to draw a probability density function of these values.
> For this, I used the commands:
> 
> data <- runif(58)
> 
> a <- density(data, from=0, to=1)
> plot(a, type="l",lwd=3)
> 
> But then, when I try to approximate the area under the plotted curve
> with the command:
> 
> area <- sum(a$y)*(a$x[2]-a$x[1])
> 
> I get an area that is clearly smaller than 1.
> 
> Strangely, if I don't bound the density function with 'from=0, to=1'
> (which is against my purpose because it extends the pdf beyond the
> limits of a probability value), I get an area of 1.000. This suggests
> that I am computing the area well, but using the density function
> improperly.
> 
> Why is this happening? Does anyone know how to constrain the density
> function while still getting a true pdf (summing to 1 under the curve)
> at the end? Should I use a different function? I read through the
> density function notes but could not figure out a solution.
> 
> Thank you!
> 
> Gonçalo
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.