This pertains to the first paragraph: instead of relying on the skewness alone, you can use the D'Agostino test, an omnibus test that combines skewness and kurtosis and has high power. Try ?dagoTest (in the fBasics package).
Ahmed
Leif Kirschenbaum wrote:
I have summary statistics from many sets (10,000's) of near-normal continuous data. From previously generated QQ plots of these data I can see that most of the sets are normal and a few are not. I have the raw data for a few (700) of these sets. I have applied several tests of normality, skew, and kurtosis to these sets to see which test yields a parameter that identifies the sets which are visibly non-normal on the QQ plot. My conclusion thus far is that the skew is the best indicator of non-normality for these particular data.
Given that I do not have ready access to the sets (10,000's) of data, only to summary statistics which have been calculated on these sets, is there a method by which I may estimate the skew given the following summary statistics:
0.1% 1% 5% 10% 25% 75% 90% 95% 99% 99.9% mean median N sigma
N is usually about 900, and so I would discount the 0.1%, 1%, 99%, and 99.9% quantiles as unreliable due to noisiness in the distributions.
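As a rough check on discounting the extreme quantiles, the following is a minimal sketch in Python (standard library only) of the large-sample standard error of a sample quantile, sqrt(p(1-p)/N) / f(x_p), evaluated under the assumption that the data are exactly normal. The function name is illustrative only, and N = 900 is the typical value quoted above.

```python
from math import sqrt
from statistics import NormalDist


def quantile_se_in_sigmas(p, n):
    """Asymptotic standard error of the sample p-quantile, in units of sigma.

    Assumes the underlying distribution is normal; the formula is
    sqrt(p * (1 - p) / n) divided by the density at the p-quantile.
    """
    std_normal = NormalDist()           # mean 0, sigma 1
    z = std_normal.inv_cdf(p)           # location of the p-quantile
    return sqrt(p * (1.0 - p) / n) / std_normal.pdf(z)


n = 900  # typical sample size quoted above
for p in (0.001, 0.01, 0.05, 0.10, 0.25):
    expected_below = p * n              # expected number of points below the quantile
    se = quantile_se_in_sigmas(p, n)
    print(f"p = {p:<5}  points below: {expected_below:6.1f}  SE = {se:.3f} sigma")
```

With N = 900, fewer than one observation is expected below the 0.1% quantile, so that estimate is essentially the sample minimum, and under the normal assumption its standard error is several times that of the 25% quantile. By symmetry the same holds for the upper tail.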
I know that there are general rules for calculating the sigma of a normal distribution from its quantiles, so I am wondering whether there are similar rules for calculating the skew from a set of quantiles, the mean, and sigma. I am currently thinking of fitting polynomials to the QQ plots of the raw data I have and then trying to derive an empirical relationship between the quantiles and the skew.
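Two standard asymmetry measures can be computed from exactly these summary statistics: the quantile-based (Bowley-type) skewness, (Q_upper + Q_lower - 2 * median) / (Q_upper - Q_lower), and Pearson's second skewness coefficient, 3 * (mean - median) / sigma. The sketch below, in Python, shows both. Neither equals the moment-based skew, so any mapping between them would still have to be calibrated empirically on the sets with raw data. The function names and the numbers in the example are made up for illustration.

```python
def quantile_skewness(lower, median, upper):
    """Bowley-type skewness from a symmetric pair of quantiles and the median.

    Use matching pairs such as 25%/75%, 10%/90%, or 5%/95%.
    The result lies in [-1, 1] and is 0 for a symmetric distribution.
    """
    spread = upper - lower
    if spread <= 0:
        raise ValueError("upper quantile must exceed lower quantile")
    return (upper + lower - 2.0 * median) / spread


def pearson_median_skewness(mean, median, sigma):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 3.0 * (mean - median) / sigma


# Hypothetical summary statistics for one data set (not real data).
s = {"q05": 8.4, "q10": 8.75, "q25": 9.3, "median": 10.0,
     "q75": 10.9, "q90": 11.9, "q95": 12.6, "mean": 10.15, "sigma": 1.2}

for lo, hi in (("q25", "q75"), ("q10", "q90"), ("q05", "q95")):
    value = quantile_skewness(s[lo], s["median"], s[hi])
    print(f"{lo}/{hi} quantile skewness: {value:.3f}")

print(f"Pearson median skewness: "
      f"{pearson_median_skewness(s['mean'], s['median'], s['sigma']):.3f}")
```

Pairs further out in the tails respond more strongly to tail asymmetry but are also noisier, so it may be worth comparing several pairs against the moment skew on the raw data sets.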
Thank you for any ideas.
Leif Kirschenbaum
Senior Yield Engineer
Reflectivity, Inc.
(408) 737-8100 x307
leif@reflectivity.com
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html