[R] 2 D density plot interpretation and manipulating the data
Abby Spurdle
Sat Oct 10 02:22:25 CEST 2020
> SNP$density <- get_density(SNP$mean, SNP$var)
> > summary(SNP$density)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>       0     383     696     738    1170    1789
This doesn't look accurate.
The density values shouldn't all be integers.
And I wouldn't expect the smallest density to be zero, if using a
Gaussian kernel.
Are these values rounded or formatted?
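
As a rough check of what I'd expect, here is a minimal sketch using simulated data and MASS::kde2d, which uses a Gaussian kernel. I'm assuming (not asserting) that get_density does something comparable; the variables x, y and dens are made up for this illustration.

```r
# Illustrative only: simulated data, not the SNP data from this thread.
library(MASS)  # provides kde2d(), a 2D Gaussian kernel density estimator

set.seed(1)
x <- rnorm(500)
y <- rnorm(500)

# Evaluate the density estimate on a 50 x 50 grid over the data range
dens <- kde2d(x, y, n = 50)

# Inspect the estimated densities
summary(as.vector(dens$z))

# Is the smallest density strictly positive?
min(dens$z) > 0

# Are all the densities whole numbers?
all(dens$z == round(dens$z))
```

If your own values are all whole numbers with a minimum of exactly zero, that suggests rounding, rescaling, or formatting somewhere between the density estimate and the summary.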
(Recombined Excerpts)
> and keep only entries with density > 400
> a=SNP[SNP$density>400,]
> Any idea how do I interpret data points that are left contained within
> the ellipses?
Reiterating, they're contour lines, but they should *not* be ellipses.
You could work out the proportion of "densities" > 400.
d <- SNP$density
p.remain <- length (d [d > 400]) / length (d)
p.remain
Or a more succinct version:
p.remain <- sum (SNP$density > 400) / nrow (SNP)
Then you can say that you've plotted the proportion, p.remain, of the data with the highest densities.
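
If the density column could contain missing values, the versions above would need some care. A small sketch, using a hypothetical data frame (df and its values are invented for illustration, and the 400 cutoff is just the one from this thread):

```r
# Hypothetical density values, for illustration only
df <- data.frame (density = c (120, 383, 450, 696, 900, 1789, NA))

# Proportion of non-missing densities above the cutoff
p.sum <- sum (df$density > 400, na.rm = TRUE) / sum (!is.na (df$density) )

# The mean of a logical vector gives the same proportion
p.mean <- mean (df$density > 400, na.rm = TRUE)

# which () drops the NA comparisons, so no all-NA rows are kept
a <- df [which (df$density > 400), , drop = FALSE]
nrow (a)
```

Without which (), logical indexing with an NA in the condition keeps a row of NAs for each missing value.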