[R] Box-Cox Transformation: Drastic differences when varying added constants
Holger Steinmetz
Holger.steinmetz at web.de
Sun May 16 14:22:18 CEST 2010
Dear experts,
I am trying to learn about the Box-Cox transformation and ran into the following:
When I have to add a constant to make all values of the original variable
positive, the lambda estimates (from the box.cox.powers function) differ
dramatically depending on the specific constant chosen.
In addition, the correlation between the transformed variable and the
original is not 1 (as I think it should be for the transformed variable to be
used meaningfully) but much lower.
With a larger added constant (and a right-skewed variable) the lambda estimate
was even negative, and the correlation between the transformed variable and
the original variable was -.91!?
I guess there is something fundamental missing in my current thinking about
Box-Cox...
Best,
Holger
P.S. Here is what I did:
# Create a skewed variable X (mixture of two normals)
x1 = rnorm(120,0,.5)
x2 = rnorm(40,2.5,2)
X = c(x1,x2)
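library(car) # provides box.cox.powers(); newer versions of car offer powerTransform() instead
# Adding a small constant
Xnew1 = X + abs(min(X)) + .1
box.cox.powers(Xnew1)
Xtrans1 = Xnew1^.2682 # (the lambda estimate from my run; no seed was set)
# Adding a larger constant
Xnew2 = X + abs(min(X)) + 1
box.cox.powers(Xnew2)
Xtrans2 = Xnew2^(-.2543) # (the lambda estimate from my run; no seed was set)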
#Plotting it all
par(mfrow=c(3,2))
hist(X)
qqnorm(X)
qqline(X,lty=2)
hist(Xtrans1)
qqnorm(Xtrans1)
qqline(Xtrans1,lty=2)
hist(Xtrans2)
qqnorm(Xtrans2)
qqline(Xtrans2,lty=2)
#correlation among original and transformed variables
round(cor(cbind(X,Xtrans1,Xtrans2)),2)
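
A minimal sketch that may help narrow down the two observations above. It assumes the usual scaled Box-Cox form, (x^lambda - 1)/lambda with log(x) at lambda = 0, and an intercept-only profile likelihood for estimating lambda. The helpers box_cox(), bc_loglik() and lambda_hat() are hypothetical names written for this illustration, not functions from car, and the seed is arbitrary. A bare power x^lambda with negative lambda reverses the ordering of the data, which would produce a negative correlation, while the scaled form keeps the ordering.

```r
# Scaled Box-Cox transform (hypothetical helper, not part of car)
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Profile log-likelihood of lambda for an intercept-only model (up to a constant)
bc_loglik <- function(lambda, x) {
  y <- box_cox(x, lambda)
  n <- length(x)
  -n / 2 * log(mean((y - mean(y))^2)) + (lambda - 1) * sum(log(x))
}

# Maximise the profile log-likelihood over an assumed search range of [-3, 3]
lambda_hat <- function(x) {
  optimize(bc_loglik, c(-3, 3), x = x, maximum = TRUE)$maximum
}

set.seed(1) # arbitrary seed so the sketch is repeatable
X <- c(rnorm(120, 0, .5), rnorm(40, 2.5, 2))
Xnew2 <- X + abs(min(X)) + 1

raw    <- Xnew2^(-.2543)         # bare power, as in the code above
scaled <- box_cox(Xnew2, -.2543) # scaled Box-Cox form with the same lambda

# Rank correlations: the bare power reverses the order, the scaled form keeps it
cor(X, raw, method = "spearman")
cor(X, scaled, method = "spearman")

# Pearson correlations stay below 1 in absolute value: the map is nonlinear
cor(X, raw)
cor(X, scaled)

# Lambda estimates for several added constants; Box-Cox is not invariant
# to shifting the data, so the estimates generally differ
shifts  <- c(.1, 1, 10)
lambdas <- sapply(shifts, function(s) lambda_hat(X + abs(min(X)) + s))
round(lambdas, 3)
```

Since any power transformation is nonlinear, a Pearson correlation of exactly 1 with the original variable is not to be expected; the rank correlation is the quantity that shows whether the ordering was preserved.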