[R] Normality tests
tobias.verbeke at telenet.be
Sat Aug 4 21:40:26 CEST 2007
I'm sorry, I pressed a wrong button and
sent an incomplete answer. Below follows
the completed e-mail.
> I am new to R, and I am writing to seek your advice on how best to use it to run
> R's various normality tests in an automated way.
> In a nutshell, my situation is as follows. I work in an investment bank, and my
> team and I are concerned that the assumption we make in our models that the
> returns of assets are normally distributed may not be justified for certain
> asset classes. We are keen to check this statistically.
> To this end, we have an Excel document which contains historical data on the
> returns of the asset classes we want to investigate, and we would like to run
> R's multiple normality tests on these data to check whether any asset classes
> are flagged up as being statistically non-normal.
> I see from the R documentation that there are several R commands to test for
> this, but is it possible to program a tool which can (i) convert the Excel data
> into a format which R can read, then (ii) run all the relevant tests from R,
> then (iii) compare the results (such as the p-values) with a user-defined
> benchmark, and (iv) output a file which shows for each asset class, which tests
> reveal that the null hypothesis of normality is rejected?
The short answer is "yes, this is perfectly possible": put all
the pieces in an R script file and either source it or process it in
batch mode.
ad (i): there are several ways of accessing Excel files.
Using RODBC is one of them. Section 8 of the R
Data Import/Export manual gives an overview of all options.
Here's a simple example for RODBC:
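library(RODBC)
z <- odbcConnectExcel("rexceltest.xls")  # open a connection to the workbook
dd <- sqlFetch(z, "Sheet1")              # read the sheet into a data frame
odbcClose(z)                             # release the connection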
ad (ii): this is a matter of conducting the tests and storing
(what you would like to keep from) the test results in an
appropriate data structure.
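
As a minimal sketch of ad (ii), assuming the returns sit in a data frame
with one column per asset class: the simulated data and the helper name
normality_pvalues are made up for illustration, and only two tests from
base R are used. Other tests (e.g. from add-on packages) could be added
to the same vector.

```r
# Simulated returns stand in for the imported spreadsheet (hypothetical data).
set.seed(1)
returns <- data.frame(
  equities = rnorm(250, mean = 0, sd = 0.01),  # roughly normal
  credit   = rt(250, df = 3) * 0.01            # heavy-tailed
)

# Run two base-R tests on one numeric vector and keep only the p-values.
# Note: the KS p-value is only approximate here, because the mean and sd
# are estimated from the same data.
normality_pvalues <- function(x) {
  x <- x[!is.na(x)]
  c(shapiro = shapiro.test(x)$p.value,
    ks      = ks.test(x, "pnorm", mean(x), sd(x))$p.value)
}

# One row per asset class, one column per test.
pvals <- t(sapply(returns, normality_pvalues))
print(round(pvals, 4))
```

A matrix of p-values is only one possible structure; a list of the full
test objects would keep the test statistics as well.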
ad (iii): comparing the stored p-values with your benchmark
should be straightforward as well.
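
For instance, assuming the p-values are held in a matrix (asset classes
in rows, tests in columns), the comparison is a single vectorised
expression. The numbers and the benchmark of 0.05 below are invented
for illustration.

```r
# Hypothetical p-values: rows are asset classes, columns are tests.
pvals <- matrix(c(0.62, 0.003, 0.48, 0.04),
                nrow = 2,
                dimnames = list(c("equities", "credit"),
                                c("shapiro", "ks")))

alpha <- 0.05              # user-defined benchmark (assumed value)
rejected <- pvals < alpha  # TRUE where normality is rejected

print(rejected)

# Asset classes flagged by at least one test.
flagged <- rownames(rejected)[rowSums(rejected) > 0]
print(flagged)
```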
ad (iv): you did not specify the output format, but R can
write to, among others, a text file, an HTML file, a LaTeX
file and, if needed, an Excel file. Relevant packages include
xtable, R2HTML and rcom.
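
A minimal sketch of the plain-text route, using only base R: the
results table and the file name are hypothetical, and the file goes to
a temporary directory. For HTML or LaTeX output one of the packages
named above would take the place of write.csv.

```r
# Hypothetical results table: TRUE means normality was rejected.
results <- data.frame(
  asset   = c("equities", "credit"),
  shapiro = c(FALSE, TRUE),
  ks      = c(FALSE, TRUE)
)

outfile <- file.path(tempdir(), "res_myassets.csv")  # assumed file name
write.csv(results, file = outfile, row.names = FALSE)

# Read the file back to confirm the round trip.
print(read.csv(outfile))
```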
P.S. It is always a good idea to define small functions for each step in
the process and then use these in the function definition of one big
function that would be something like
checkAssetNormality(file = "myassets.xls", otherarg1, otherarg2,
                    outfile = "res_myassets.html",
                    outdir = ".")
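
To illustrate the idea of composing small functions, here is a rough
sketch. It departs from the call above in two assumed ways so that it
runs anywhere: it takes an already imported data frame (dat) instead of
a file name, and it writes CSV rather than HTML. The helper names
run_tests, flag_rejections and write_report are hypothetical, and only
the Shapiro-Wilk test is used.

```r
# Step (ii): one p-value per asset class.
run_tests <- function(dat) {
  data.frame(
    asset     = names(dat),
    shapiro_p = vapply(dat, function(x) shapiro.test(x)$p.value, numeric(1)),
    row.names = NULL
  )
}

# Step (iii): compare with the user-defined benchmark.
flag_rejections <- function(res, alpha) {
  res$rejected <- res$shapiro_p < alpha
  res
}

# Step (iv): write the report and return its path invisibly.
write_report <- function(res, outfile, outdir) {
  path <- file.path(outdir, outfile)
  write.csv(res, path, row.names = FALSE)
  invisible(path)
}

# The "big" function only chains the small ones.
checkAssetNormality <- function(dat, alpha = 0.05,
                                outfile = "res_myassets.csv",
                                outdir = ".") {
  res <- flag_rejections(run_tests(dat), alpha)
  write_report(res, outfile, outdir)
  res
}

# Simulated data: 'a' is normal, 'b' is strongly skewed.
set.seed(42)
demo <- data.frame(a = rnorm(100), b = rexp(100))
out <- checkAssetNormality(demo, outdir = tempdir())
print(out)
```

Keeping each step separate makes it easy to swap the import or output
function later without touching the tests themselves.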
P.P.S. R has very neat and powerful graphical capabilities. It is
quite easy to rapidly produce large grids of QQ-plots for
all the assets concerned. This would give you additional
information about the nature of the deviation from normality.
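
A small sketch of such a grid, using simulated returns for four made-up
asset classes and writing the plots to a PDF in a temporary directory:

```r
# Simulated returns (hypothetical data); two normal, two non-normal.
set.seed(7)
returns <- data.frame(
  equities    = rnorm(250),
  bonds       = rnorm(250),
  credit      = rt(250, df = 3),  # heavy tails
  commodities = rexp(250) - 1     # skewed
)

outfile <- file.path(tempdir(), "qqplots.pdf")  # assumed file name
pdf(outfile, width = 8, height = 8)
op <- par(mfrow = c(2, 2))                      # 2 x 2 grid of panels
for (nm in names(returns)) {
  qqnorm(returns[[nm]], main = nm)              # sample vs. normal quantiles
  qqline(returns[[nm]])                         # reference line
}
par(op)
dev.off()
```

Heavy tails show up as an S-shaped departure from the reference line,
skewness as a curve bending to one side.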
> My team and I would be very grateful for your advice on this.
> Yours sincerely,