[R] Estimation of skewness from quantiles of near-normal distribution
Leif Kirschenbaum
leif at reflectivity.com
Thu Mar 23 09:46:20 CET 2006
I have summary statistics from many sets (tens of thousands) of near-normal continuous data. From previously generated QQ plots of these data I can see that most of the sets are normal and a few are not. I have the raw data for a small number (700) of these sets. I have applied several tests of normality, skew, and kurtosis to those sets to see which test yields a parameter that identifies the sets that are visibly non-normal on the QQ plot. My conclusion thus far is that the skew is the best indicator of non-normality for these particular data.
Given that I do not have ready access to the raw data for all of the sets (tens of thousands), only to summary statistics that have been calculated on them, is there a method by which I may estimate the skew from the following summary statistics:
0.1% 1% 5% 10% 25% 75% 90% 95% 99% 99.9% mean median N sigma
N is usually about 900, and so I would discount the 0.1%, 1%, 99%, and 99.9% quantiles as unreliable due to noisiness in the distributions.
I know that, for instance, there are general rules for calculating the sigma of a normal distribution from its quantiles, so I am wondering whether there are any general rules for calculating skew given a set of quantiles, the mean, and sigma. I am currently thinking of trying polynomial fits on the QQ plots of the raw data I have and then empirically deriving a relationship between the quantiles and the skew.
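To make the kind of rule meant here concrete, the following is a minimal sketch in Python (standard library only), not a worked-out solution. It shows the usual normal-theory estimate of sigma from two quantiles, plus two standard quantile-based skewness measures that could be computed from the summary statistics listed above: the Bowley/Kelly-type coefficient and Pearson's median skewness. The function names are hypothetical. Neither measure equals the moment-based skew, so any mapping between them would still have to be calibrated empirically on the sets for which raw data exist.

```python
from statistics import NormalDist


def sigma_from_quantiles(q_lo, q_hi, p_lo=0.25, p_hi=0.75):
    """Estimate sigma from two quantiles, ASSUMING the data are normal.

    For a normal distribution, q_hi - q_lo = sigma * (z_hi - z_lo),
    where z_p is the standard normal quantile at probability p.
    """
    std_normal = NormalDist()
    z_span = std_normal.inv_cdf(p_hi) - std_normal.inv_cdf(p_lo)
    return (q_hi - q_lo) / z_span


def quantile_skew(q_lo, median, q_hi):
    """Bowley/Kelly-type skewness from a symmetric pair of quantiles.

    Use (25%, 75%) for Bowley's coefficient or (10%, 90%) for Kelly's.
    The result is 0 for a symmetric distribution and lies in [-1, 1].
    """
    return (q_hi + q_lo - 2.0 * median) / (q_hi - q_lo)


def pearson_median_skew(mean, median, sigma):
    """Pearson's second (median) skewness coefficient."""
    return 3.0 * (mean - median) / sigma
```

For one data set this would be called as, for example, `quantile_skew(q10, median, q90)` and `pearson_median_skew(mean, median, sigma)`, where `q10` and `q90` stand for that set's 10% and 90% quantiles. With N around 900, the 5%/95% or 10%/90% pairs are likely to be more sensitive to tail asymmetry than the quartiles while still being reasonably stable.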
Thank you for any ideas.
Leif Kirschenbaum
Senior Yield Engineer
Reflectivity, Inc.
(408) 737-8100 x307
leif at reflectivity.com