[R] Problem with R "density" function
Martyn Byng
martyn.byng at nag.co.uk
Wed May 14 11:57:39 CEST 2014
Hi,
Have you tried changing the bandwidth rather than the number of points? The default bandwidth gives:
x <- rnorm(10000)
dd <- density(x,kernel="epanechnikov",n=101)
sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 1.001014
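A rough sketch of what I think is going on, in case it helps. As far as I understand ?density, the estimate is computed on an internal grid of at least 512 points (rounded up to a power of 2 when n > 512), using a discretized version of the kernel. The Epanechnikov kernel with standard deviation bw is non-zero only on +/- sqrt(5)*bw, so with bw=0.001 it is much narrower than the grid step unless n is very large. The helper names (riemann, check) and the seed below are my own, and the internal step is only approximated from the output range:

```r
# Sketch: compare the (approximate) internal grid step with the kernel half-width.
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(10000)

# Riemann sum, as used in the original post
riemann <- function(d) sum(d$y) * (d$x[2] - d$x[1])

check <- function(n, bw = "nrd0") {
  d <- density(x, kernel = "epanechnikov", n = n, bw = bw)
  # density() works on at least 512 points, rounded up to a power of 2
  n_int <- 2^ceiling(log2(max(n, 512)))
  step  <- diff(range(d$x)) / (n_int - 1)   # approximate internal grid step
  c(n = n, bw = d$bw, step = step,
    half_width = sqrt(5) * d$bw,            # Epanechnikov support is +/- sqrt(5)*bw
    integral = riemann(d))
}

res <- rbind(check(101, 0.001), check(1001, 0.001),
             check(10001, 0.001), check(101))   # last row: default bandwidth
print(signif(res, 4))
```

If this reading is right, the integral should only come out near 1 in the rows where step is smaller than half_width, i.e. where the grid is fine enough to resolve the kernel.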
Martyn
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of DHIMAN BHADRA
Sent: 14 May 2014 10:36
To: r-help at r-project.org
Subject: [R] Problem with R "density" function
Hello,
My friend has the following issue with R. I will be glad to receive any response.
Thanks,
Dhiman Bhadra
Hello everyone,
I am trying to use the 'density' function that ships with base R to estimate the density of a data set for subsequent use. I just noticed that, even with 1000 evaluation points (n), the numerical integral of the estimated density using the Epanechnikov kernel is far from 1. I wonder if I am doing something wrong, or whether there is a bug:
> x=rnorm(10000)
> dd=density(x,kernel="epanechnikov",n=101,bw=0.001)
> sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 5.7245
> dd=density(x,kernel="epanechnikov",n=1001,bw=0.001)
> sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 2.870922
> dd=density(x,kernel="epanechnikov",n=10001,bw=0.001)
> sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 0.9989762
So unless I use around 10000 or more evaluation points (n), the integral is wrong: there seems to be a scaling factor creeping in. Am I missing something?
Best regards,
Apratim Guha
__________________________________________________________________________
Dr. Apratim Guha
Associate Professor, Production & Quantitative Methods Area, IIM Ahmedabad,
Vastrapur, Ahmedabad 380015, INDIA. Phone: (91) 79 6632 4803
Secretary: Ms. Sujatha Jayprakash: (91) 79 6632 4911
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.