[R] Y axis of 1-D Linear Discriminant Histograms
Bob Farmer
farmerb at dal.ca
Wed Nov 18 18:00:34 CET 2009
Hi all.
I would like to understand what units are shown on the y-axis
when you plot the one-dimensional predictions (histograms) from lda()
(MASS) discriminant function objects.
While the help file suggests that a histogram is returned by default,
the presumably proportion-like values for each group seem to add up to
more than 1, and I'm not sure how to interpret the code in
ldahist(), which, I believe, defines the height of each bin as
est1/(diff(breaks) * length(data[g == grp]))
where est1 is (as far as I can tell) the frequency within the bin, and
the denominator is apparently the bin width multiplied by the total
sample size for that panel. It seems to me that a more intuitive
result would be returned for each bin if the diff(breaks) component
were removed entirely.
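For what it's worth, the expression above looks like the usual density scaling, the same one hist(..., freq = FALSE) uses, in which the bar areas rather than the bar heights sum to 1. A minimal sketch with made-up data (x and breaks are hypothetical and not taken from ldahist() itself):

```r
# Hypothetical data and breaks, for illustration only
set.seed(1)
x <- rnorm(50, sd = 0.5)
breaks <- seq(-3, 3, by = 0.25)

# Bin counts (what I take est1 to be)
counts <- hist(x, breaks = breaks, plot = FALSE)$counts

# Heights as in the expression above: count / (bin width * n)
dens <- counts / (diff(breaks) * length(x))

# Heights with diff(breaks) removed: plain within-sample proportions
prop <- counts / length(x)

sum(dens)                 # exceeds 1 whenever the bins are narrower than 1
sum(dens * diff(breaks))  # bar areas sum to 1
sum(prop)                 # proportions sum to 1
```

If that reading is right, the heights summing to more than 1 would simply reflect bins narrower than one unit on the discriminant axis.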
While I don't think my concern affects the shape of each group's
histogram, I'd much prefer to display a more intuitive y-axis.
Example:
library(MASS)
ld1<-lda(Species ~ Sepal.Length + Sepal.Width, iris)
plot(ld1, type = "histogram", dimen = 1)
# (eyeballing it suggests that the sum of the "frequencies" reported on
# the y-axis for each group exceeds 1)
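One possible workaround, sketched here under my own assumptions rather than as anything plot.lda() offers, would be to pull the first-discriminant scores out with predict() and draw each group's histogram by hand with proportions as bar heights. The bin width of 0.5 and the panel layout are arbitrary choices of mine:

```r
library(MASS)

ld1 <- lda(Species ~ Sepal.Length + Sepal.Width, iris)

# Scores on the first discriminant, one per observation
scores <- predict(ld1)$x[, 1]

# Common breaks for all groups (bin width 0.5 is an arbitrary choice)
breaks <- seq(floor(min(scores)), ceiling(max(scores)), by = 0.5)

# One panel per group, stacked vertically
op <- par(mfrow = c(nlevels(iris$Species), 1))
for (grp in levels(iris$Species)) {
  h <- hist(scores[iris$Species == grp], breaks = breaks, plot = FALSE)
  # Overwrite the densities with within-group proportions
  h$density <- h$counts / sum(h$counts)
  plot(h, freq = FALSE, main = grp, xlab = "LD1", ylab = "Proportion")
}
par(op)
```

Here the bar heights within each panel sum to 1, at the cost of no longer being a density in the strict sense.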
Thanks very much.
--Bob Farmer
Dalhousie University