[R]: global and local variables
Uwe Ligges
ligges at statistik.uni-dortmund.de
Tue Dec 9 19:03:59 CET 2003
No direct way, I think. My idea:
Make preprocess() stand-alone. Then vif() and mci() require an object
returned by preprocess() as an argument.
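
A minimal sketch of that first idea, under some assumptions: the functions
below are simplified stand-ins for the ones in the quoted message (only the
"center and scale" case, no printing), and the example data are made up
rather than the a2 data set. The point is only that colldiag() calls
preprocess() once and hands the result to mci() and vif().

```r
# Sketch only: simplified stand-ins for the functions in the quoted message.
# preprocess() centers and scales once; the diagnostics take its result.
preprocess <- function(xdata) {
  xdata <- as.matrix(xdata)
  means <- colMeans(xdata)
  stds <- apply(xdata, 2, sd)
  scalefactor <- sqrt(nrow(xdata) - 1) * stds
  data <- sweep(sweep(xdata, 2, means, "-"), 2, scalefactor, "/")
  list(data = data, means = means, stds = stds)
}

# vif() and mci() expect the list returned by preprocess(), not raw data.
vif <- function(prep) {
  list(vif = diag(solve(cor(prep$data))))
}

mci <- function(prep) {
  d <- svd(prep$data)$d          # singular values, largest first
  ci <- d[1] / d                 # condition indices
  list(CN = ci[length(ci)], CI = ci)
}

# colldiag() preprocesses exactly once and passes the result on.
colldiag <- function(xdata) {
  prep <- preprocess(xdata)
  list(mci = mci(prep), vif = vif(prep))
}

# Hypothetical example data (not the a2 data set from the post).
set.seed(1)
x <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
res <- colldiag(x)
```

With this layout a user who wants a single diagnostic writes
vif(preprocess(x)) instead of vif(x).
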
Or make it object oriented and let vif() and friends decide whether their
first argument still needs preprocessing.
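
One way the object-oriented variant might look, sketched with S3 classes.
The class name "preprocessed" and the simplified function bodies are
assumptions made for illustration, not code from the quoted message. Raw
data is dispatched to the default method, which preprocesses first; an
object that already carries the class is used as it is.

```r
# Sketch only: a simplified preprocess() that tags its result with a class.
preprocess <- function(xdata) {
  xdata <- as.matrix(xdata)
  means <- colMeans(xdata)
  scalefactor <- sqrt(nrow(xdata) - 1) * apply(xdata, 2, sd)
  data <- sweep(sweep(xdata, 2, means, "-"), 2, scalefactor, "/")
  structure(list(data = data, means = means), class = "preprocessed")
}

# Generic plus two methods: raw input is preprocessed, tagged input is not.
vif <- function(x, ...) UseMethod("vif")
vif.default <- function(x, ...) vif(preprocess(x))
vif.preprocessed <- function(x, ...) {
  list(vif = diag(solve(cor(x$data))))
}

mci <- function(x, ...) UseMethod("mci")
mci.default <- function(x, ...) mci(preprocess(x))
mci.preprocessed <- function(x, ...) {
  d <- svd(x$data)$d
  list(CN = d[1] / d[length(d)], CI = d[1] / d)
}

colldiag <- function(xdata) {
  a <- preprocess(xdata)         # the only call to preprocess()
  list(mci = mci(a), vif = vif(a))
}

# Hypothetical example data (not the a2 data set from the post).
set.seed(1)
x <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
same <- all.equal(vif(x), vif(preprocess(x)))
res <- colldiag(x)
```

Here vif(x) and mci(x) keep working on raw data as before, while colldiag()
avoids the repeated preprocessing.
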
Uwe Ligges
allan clark wrote:
> Hi
>
> Thanks to those who responded to my problem. In my previous email I
> tried to ask a general question and probably did not explain myself
> clearly. I wanted to avoid sending this long email. My apologies.
>
> This is my actual problem.
>
> I have a regression problem. I am writing some R code in order to
> calculate some collinearity diagnostics. The diagnostics all rely on a
> function named preprocess. I've written the different diagnostics as
> separate functions so that they may be evaluated separately if
> required.
>
> The two functions are named mci and vif. (I will be writing some
> others later)
>
> mci calculates the mixed condition index as well as the condition
> indices of a given X matrix while
> vif calculates the variance inflation factors of the X matrix.
>
> Another function named colldiag has been written. This function will
> calculate all of the collinearity diagnostics by simply calling the
> separate functions defined previously.
>
> I've attached the code of the different functions as well as a data
> file (say a2) below.
>
> The functions mci and vif work perfectly.
>
> i.e.
>
> > mci(a2)
> [1] "DATA MATRIX CENTERED AND SCALED"
> [1] "CENTERED AND SCALED MATRIX = $data"
> [1] "MEANS OF XDATA = $means"
> [1] "STDS OF XDATA = $stds"
> [1] "THE CONDITION NUMBER AND THE CONDITION INDICES"
> $CN
> [1] 27.34412
>
> $CI
> [1] 1.000000 1.615690 27.344123
>
> $MCI
>   Principal.Component Singular.Values Condition.Index
> 1                   1       1.4720680        1.000000
> 2                   2       0.9111078        1.615690
> 3                   3       0.0538349       27.344123
>
> > vif(a2)
> [1] "DATA MATRIX CENTERED AND SCALED"
> [1] "CENTERED AND SCALED MATRIX = $data"
> [1] "MEANS OF XDATA = $means"
> [1] "STDS OF XDATA = $stds"
> [1] "THE VARIANCE INFLATION FACTORS"
> $vif
>       x1       x2       x3
> 169.3542 175.6667   1.6875
>
> The output from colldiag is as follows:
>
> > colldiag(a2)
> [1] "DATA MATRIX CENTERED AND SCALED"
> [1] "CENTERED AND SCALED MATRIX = $data"
> [1] "MEANS OF XDATA = $means"
> [1] "STDS OF XDATA = $stds"
> [1] "THE CONDITION NUMBER AND THE CONDITION INDICES"
> $CN
> [1] 27.34412
>
> $CI
> [1] 1.000000 1.615690 27.344123
>
> $MCI
>   Principal.Component Singular.Values Condition.Index
> 1                   1       1.4720680        1.000000
> 2                   2       0.9111078        1.615690
> 3                   3       0.0538349       27.344123
>
> [1] "DATA MATRIX CENTERED AND SCALED"
> [1] "CENTERED AND SCALED MATRIX = $data"
> [1] "MEANS OF XDATA = $means"
> [1] "STDS OF XDATA = $stds"
> [1] "THE VARIANCE INFLATION FACTORS"
> $vif
>       x1       x2       x3
> 169.3542 175.6667   1.6875
>
>
>
>
> If you check the colldiag code below you will see that it calls mci
> and vif, and that each of these functions calls preprocess. The second
> call is unnecessary. How can I write the code so that R calls
> preprocess only once?
>
> ONCE AGAIN I APOLOGIZE FOR THE LENGTH OF THIS EMAIL!!!
>
>
> Cheers
> Allan
>
>
>
>
>
> The data file:
>
>    x1 x2 x3
> 1  20 -4  5
> 2  21 -4  4
> 3  22 -3  3
> 4  23 -2  2
> 5  24 -1  1
> 6  25  0  2
> 7  26  1  3
> 8  27  2  4
> 9  28  3  5
> 10 29  4  6
> 11 20 -4  5
> 12 21 -4  4
> 13 22 -3  3
> 14 23 -2  2
> 15 24 -1  1
> 16 25  0  2
> 17 26  1  3
> 18 27  2  4
> 19 28  3  5
> 20 29  4  6
>
> preprocess<-function (xdata,center=1,scale=1)
> {
> if(center==1 && scale==1)
> {
> means<-apply(xdata,2,mean)
> stds<-apply(xdata,2, function(x) sqrt(var(x)))
> scalefactor<-((nrow(xdata)-1)^.5)*stds
> data.centsca<-sweep(sweep(xdata,2,means,"-"),2,scalefactor,"/")
> print("DATA MATRIX CENTERED AND SCALED")
> print("CENTERED AND SCALED MATRIX = $data")
> print("MEANS OF XDATA = $means")
> print("STDS OF XDATA = $stds")
> list(data=data.centsca,means=means,stds=stds,prep=1)
> }
>
> else if(center==1 && scale==0)
> {
> means<-apply(xdata,2,mean)
> data.cen<-sweep(xdata,2,means,"-")
> print("DATA MATRIX CENTERED")
> list(data=data.cen,means=means,prep=1)
> }
>
> else if(center==0 && scale==1)
> {
> stds<-apply(xdata,2, function(x) sqrt(var(x)))
> scalefactor<-((nrow(xdata)-1)^.5)*stds
> data.sca<-sweep(xdata,2,scalefactor,"/")
> print("DATA MATRIX SCALED")
> list(data=data.sca,stds=stds,prep=1)
> }
>
> else
> {
> print("YOU HAVE TO SPECIFY WHETHER YOU WANT TO SCALE OR CENTER THE MATRIX")
> print("THE preprocess FUNCTION HAS THREE ARGUMENTS. i.e. preprocess(xdata,center,scale)")
> print("xdata IS THE MATRIX TO BE TRANSFORMED")
> print("TO CENTER SPECIFY center=1")
> print("TO SCALE SPECIFY scale=1")
> }
>
> # A matrix is standardised as follows:
> # X*(i,j) = ( X(i,j)- XBAR(j) ) / ( sqrt(n-1)* STD(j) )
>
> }
>
> mci<-function (xdata)
> {
> a<-preprocess(xdata)
> b<-svd(a$data)
> cn<-(b$d)[1]/(b$d)[ncol(a$data)]
> ci<-(b$d)[1]/(b$d)[1:ncol(a$data)]
>
> #paste("THE CONDITION NUMBER = ",cn)
>
> #the following produces a table in order to output the mci values
> Principal.Component<-1:ncol(a$data)
> Singular.Values<-b$d
> Condition.Index<-ci
> mcitable<-data.frame(Principal.Component,Singular.Values,Condition.Index)
>
> print("THE CONDITION NUMBER AND THE CONDITION INDICES")
> d<-list(CN=cn,CI=ci,MCI=mcitable)
> print(d)
> }
>
> vif<-function (xdata)
> {
> a<-preprocess(xdata)
> vif<-diag(solve(cor(a$data)))
>
> print("THE VARIANCE INFLATION FACTORS")
> b<-list(vif=vif)
> b
> }
>
> colldiag<-function (xdata)
> {
> mci(xdata)
> vif(xdata)
> }
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help