[R] Problem with R "density" function

Martyn Byng martyn.byng at nag.co.uk
Wed May 14 11:57:39 CEST 2014


Hi,

Have you tried changing the bandwidth rather than the number of points? The default bandwidth gives ...

x <- rnorm(10000)
dd <- density(x,kernel="epanechnikov",n=101)
sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 1.001014
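
A rough sketch of how one might check this, not a definitive diagnosis: the help page for density() says the estimate is computed on a regular grid of at least 512 points and then interpolated to the n requested points. With bw=0.001 the spacing of the output grid can be much larger than the bandwidth, so the simple sum is sampling a very spiky curve too coarsely. The code below prints the grid step next to the bandwidth for each n, with and without the small bandwidth. The seed and the helper name `riemann` are my own choices for illustration.

```r
# Compare the output grid spacing with the bandwidth for several values of n.
set.seed(1)  # arbitrary seed, only so that the run is reproducible
x <- rnorm(10000)

# Riemann-sum approximation to the integral of a "density" object,
# the same calculation as sum(dd$y)*(dd$x[2]-dd$x[1]) above.
riemann <- function(d) sum(d$y) * (d$x[2] - d$x[1])

for (n in c(101L, 1001L, 10001L)) {
  d_small   <- density(x, kernel = "epanechnikov", n = n, bw = 0.001)
  d_default <- density(x, kernel = "epanechnikov", n = n)
  step <- d_small$x[2] - d_small$x[1]
  cat(sprintf("n = %5d  grid step = %.5f  bw = %.3f  sum = %.4f\n",
              n, step, d_small$bw, riemann(d_small)))
  cat(sprintf("           default bw = %.4f  sum = %.4f\n",
              d_default$bw, riemann(d_default)))
}
```

If the grid step comes out well above the bandwidth, the sum should not be expected to approximate the integral; a bandwidth closer to the default, or a much finer grid, would be the things to try.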

Martyn
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of DHIMAN BHADRA
Sent: 14 May 2014 10:36
To: r-help at r-project.org
Subject: [R] Problem with R "density" function

Hello,
My friend has the following issue with R. I will be glad to receive any response.
Thanks,
Dhiman Bhadra

Hello everyone,

I am trying to use the 'density' function available in base R (the 'stats' package) to estimate the density of a data set for subsequent use. I just noticed that even with about 1000 evaluation points (the n argument), the numerical integral of the estimated density using the Epanechnikov kernel is far from 1. I wonder if I am doing something wrong, or whether there is a bug:

> x=rnorm(10000)
> dd=density(x,kernel="epanechnikov",n=101,bw=0.001)
> sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 5.7245

> dd=density(x,kernel="epanechnikov",n=1001,bw=0.001)
> sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 2.870922

> dd=density(x,kernel="epanechnikov",n=10001,bw=0.001)
> sum(dd$y)*(dd$x[2]-dd$x[1])
[1] 0.9989762

So unless I use around 10000 or more evaluation points (n), the integral is wrong:
there seems to be a scaling factor creeping in. Am I missing something?


Best regards,
*Apratim Guha*

__________________________________________________________________________
*Dr. Apratim Guha*
*Associate Professor, Production & Quantitative Methods Area, IIM Ahmedabad, *

*Vastrapur, Ahmedabad 380015, INDIA. Phone: (91) 79 6632 4803*
*Secretary: Ms. Sujatha Jayprakash: (91) 79 6632 4911*
