[R] Estimation of skewness from quantiles of near-normal distribution

Leif Kirschenbaum leif at reflectivity.com
Thu Mar 23 09:46:20 CET 2006

I have summary statistics from many sets (tens of thousands) of near-normal continuous data.  From previously generated QQ plots of these data I can see that most of the sets are normal and a few are not.  I have the raw data for a few (700) of these sets.  I have applied several tests of normality, skewness, and kurtosis to these sets to see which test yields a statistic that identifies the sets that are visibly non-normal on the QQ plot.  My conclusion thus far is that skewness is the best indicator of non-normality for these particular data.

Given that I do not have ready access to the tens of thousands of data sets, only to summary statistics that have already been calculated on them, is there a method by which I can estimate the skewness from the following summary statistics:
0.1%, 1%, 5%, 10%, 25%, 75%, 90%, 95%, 99%, 99.9% quantiles; mean; median; N; sigma

N is usually about 900, so I would discount the 0.1%, 1%, 99%, and 99.9% quantiles as unreliable: with so few points in the tails, those quantiles are noisy.
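
As a rough check on that judgment, the sketch below (Python, standard library only) prints how many of N = 900 observations are expected to fall below each lower quantile, together with the textbook large-sample standard error of a sample quantile, sqrt(p(1-p)/N) / f(x_p), expressed in units of sigma. It assumes exactly normal data, and the large-sample formula is itself only a crude guide in the far tails where N*p is near or below 1. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def quantile_se_in_sigmas(p, n):
    """Large-sample standard error of the sample p-quantile of normal data,
    in units of sigma: sqrt(p * (1 - p) / n) / pdf(z_p)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(p)      # standard normal quantile for probability p
    density = std_normal.pdf(z)    # normal density at that quantile
    return sqrt(p * (1 - p) / n) / density

n = 900  # typical set size quoted above
for p in (0.001, 0.01, 0.05, 0.10, 0.25):
    expected_below = n * p  # expected number of observations below the quantile
    se = quantile_se_in_sigmas(p, n)
    print(f"{p:>6.1%}  expected points below: {expected_below:6.1f}  SE ~ {se:.3f} sigma")
```

By symmetry the same figures apply to the upper quantiles (75% through 99.9%).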

I know that, for instance, there are general rules for calculating sigma of a normal distribution from its quantiles, so I am wondering whether there are any general rules for calculating skewness from a set of quantiles, the mean, and sigma.  I am currently thinking of fitting polynomials to the QQ plots of the raw data I have and then empirically deriving a relationship between the quantiles and the skewness.
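
For illustration, a minimal sketch of rules of this kind (Python, standard library only). For normal data, sigma is roughly (q75 - q25) / 1.349. A quantile-based skewness in the Bowley (quartiles) or Kelly (10% and 90%) style is (q_hi + q_lo - 2*median) / (q_hi - q_lo), which is 0 for a symmetric distribution. Pearson's second coefficient, 3 * (mean - median) / sigma, needs only the mean, median, and sigma. These are generic textbook measures of asymmetry, not estimates of the moment skewness itself, so they would still need calibrating against the raw-data sets. The function names and the summary values are made up for the example.

```python
from statistics import NormalDist

def sigma_from_quantiles(q_lo, q_hi, p_lo):
    """Estimate sigma from the symmetric quantile pair (p_lo, 1 - p_lo),
    assuming the data are normal."""
    z = NormalDist().inv_cdf(1 - p_lo)  # e.g. about 0.6745 for p_lo = 0.25
    return (q_hi - q_lo) / (2 * z)

def quantile_skew(q_lo, median, q_hi):
    """Bowley/Kelly-style skewness from a symmetric quantile pair and the median.
    Zero when the median sits midway between the two quantiles."""
    return (q_hi + q_lo - 2 * median) / (q_hi - q_lo)

def pearson_median_skew(mean, median, sigma):
    """Pearson's second skewness coefficient."""
    return 3 * (mean - median) / sigma

# Hypothetical summary statistics for one set (not real data).
s = {"q05": 8.40, "q10": 8.75, "q25": 9.34, "median": 10.00,
     "q75": 10.70, "q90": 11.40, "q95": 11.85, "mean": 10.04, "sigma": 1.05}

print("sigma from quartiles:", sigma_from_quantiles(s["q25"], s["q75"], 0.25))
print("sigma from 5%/95%:   ", sigma_from_quantiles(s["q05"], s["q95"], 0.05))
print("quartile skew:       ", quantile_skew(s["q25"], s["median"], s["q75"]))
print("10%/90% skew:        ", quantile_skew(s["q10"], s["median"], s["q90"]))
print("5%/95% skew:         ", quantile_skew(s["q05"], s["median"], s["q95"]))
print("Pearson median skew: ", pearson_median_skew(s["mean"], s["median"], s["sigma"]))
```

Comparing the sigma estimates from different quantile pairs against the reported sigma gives a second, cheap symptom of non-normality alongside the skewness measures.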

Thank you for any ideas.

Leif Kirschenbaum
Senior Yield Engineer
Reflectivity, Inc.
(408) 737-8100 x307
leif at reflectivity.com
