[R-sig-ME] Variable transformation and back transformation

Juan Pedro Steibel steibelj at msu.edu
Thu Mar 19 13:54:01 CET 2009


Hello Christina,
Dr. Bates already gave an excellent explanation, with some examples where 
transformed variables are widely accepted.

My experience comes from working with gene expression data, where 
colleagues like to express comparisons as "Fold-change" or ratios. In 
that case, working with logs makes much more sense too. As stated before, 
concentrations tend to follow multiplicative models (that is implicit in 
the use of ratios for comparisons), so the log brings everything to the 
additive scale. Plus, the Gaussian mixed-effects model usually fits 
log-concentration reasonably well.
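
A small illustration of the multiplicative-to-additive point, using made-up 
numbers (the values and variable names below are hypothetical, not data from 
this thread): a ratio on the original scale is a difference on the log 
scale, and back-transforming that difference recovers the ratio.

```r
# Hypothetical concentrations for a control and a treated sample (made-up values).
control <- 8
treated <- 2

# On the original scale the comparison is a ratio (fold change).
fold_change <- treated / control

# On the log2 scale the same comparison is a difference.
log2_diff <- log2(treated) - log2(control)

# Back-transforming the difference recovers the ratio.
back <- 2^log2_diff

print(c(fold_change = fold_change, log2_diff = log2_diff, back = back))
```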

So I commonly analyze log-concentrations and provide all results in the 
log-scale.

Sometimes I am asked to report values in the Fold Change scale (ratios). 
What I do then is to plot everything in the log-scale and attach as a 
second scale the ratio or "back-transformed" scale (commonly to the 
right axis of the plot). That seems to appease even the most 
"anti-logarithmic people".

The good thing about this approach is that you do not have to deal with 
back-transforming the mean differences (and uncertainty measures), but 
only the scale. For example, on the left axis your (log) scale may be 
-2, -1, 0, 1, 2, and the corresponding back-transformed scale (right axis), 
for log2 (the one used in qPCR data, for example), would be 1/4, 1/2, 1, 2, 4. 
Moreover, using 1/4 and 1/2 instead of 0.25 and 0.5 seems more appealing 
to biologists, as they easily read it as "4-fold down or 2-fold down".

This type of plot is really straightforward to generate using R, 
because you create the plot and then add all the annotation needed (second 
scale, etc.).
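
A minimal base-graphics sketch of such a plot, assuming made-up log2 fold 
changes and standard errors for four hypothetical genes (none of these 
values come from a fitted model). The estimates are drawn on the log2 scale, 
and axis(4, ...) relabels the same tick positions as fold changes on the 
right-hand side.

```r
# Made-up log2 fold changes and standard errors for four hypothetical genes.
genes   <- paste0("gene", 1:4)
log2_fc <- c(-1.8, -0.6, 0.4, 1.5)
se      <- c(0.30, 0.25, 0.20, 0.35)

# Tick positions on the log2 scale and the matching fold-change labels.
ticks <- -2:2
fc_labels <- ifelse(ticks < 0,
                    paste0("1/", 2^(-ticks)),  # 1/4, 1/2 for negative ticks
                    as.character(2^ticks))     # 1, 2, 4 otherwise

pdf(NULL)                        # off-screen device; drop this line to draw on screen
op <- par(mar = c(5, 4, 2, 5))   # leave room for the right-hand axis

x <- seq_along(log2_fc)
plot(x, log2_fc, ylim = c(-2.5, 2.5), xaxt = "n", yaxt = "n",
     pch = 19, xlab = "", ylab = "log2 fold change")
# Error bars of +/- 1 SE, drawn on the log scale.
arrows(x, log2_fc - se, x, log2_fc + se, angle = 90, code = 3, length = 0.05)
abline(h = 0, lty = 2)

axis(1, at = x, labels = genes)
axis(2, at = ticks)                        # left axis: log2 scale
axis(4, at = ticks, labels = fc_labels)    # right axis: back-transformed scale
mtext("fold change", side = 4, line = 3)

par(op)
invisible(dev.off())
```

Only the axis labels are back-transformed here; the points and error bars 
stay on the log scale.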

Now, if you want to create a table and report the actual values, AND you 
are forced to go back to the original scale, then you need to 
back-transform the results. Back-transforming mean differences or means 
is not a big deal, but providing an uncertainty measure on the 
back-transformed scale may be. In this case I would recommend using 
mcmcsamp to tackle the problem. I've never done it, but it should be 
easy to generate a sample from the posterior distribution of the 
parameters, compute the contrast of interest (mean differences) for each 
draw, back-transform, and summarize the results (mean, CI, etc.).
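
The steps in that recipe can be sketched in base R. This is only an 
illustration under stated assumptions: it does not call mcmcsamp, and the 
"posterior sample" is a stand-in simulated with rnorm for two hypothetical 
group means on the natural-log scale. With a real model, the matrix of draws 
would come from whatever sampler is available.

```r
set.seed(42)
n_draws <- 10000

# Stand-in for posterior draws (assumption: simulated, not from a fitted model).
# Columns are hypothetical group means on the natural-log scale.
draws <- cbind(groupA = rnorm(n_draws, mean = 1.0, sd = 0.10),
               groupB = rnorm(n_draws, mean = 1.7, sd = 0.12))

# 1. Contrast of interest on the log scale, computed draw by draw.
log_diff <- draws[, "groupB"] - draws[, "groupA"]

# 2. Back-transform each draw: a difference of logs becomes a ratio.
ratio <- exp(log_diff)

# 3. Summarize on the back-transformed scale.
summary_ratio <- c(median = median(ratio),
                   quantile(ratio, probs = c(0.025, 0.975)))
print(round(summary_ratio, 2))
```

Summarizing after back-transforming each draw means the interval is 
obtained directly on the ratio scale, without needing a standard error 
there.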

In any case, you do not have to change your model. If you do change it and 
ignore the random effects, your SE and other uncertainty measures will be in 
the desired scale, but the inferences will most likely be incorrect.

Hope this helps,
Cheers,
JP



Christina Bogner wrote:
> Dear all,
>
> I have fitted a couple of mixed-effects models to environmental data 
> (chemical and physical soil parameters) with log-transformed dependent 
> variables. I tried generalized mixed-models, but the results were not 
> satisfactory (probably because I am a soil scientist and not a 
> statistician ;-)) Now, as log of concentrations are ecologically not 
> very informative, I would like to back-transform my model parameters. 
> Taking a Gaussian linear mixed-model:
>
> log(Mg2)=intercept+beta1*Silt+beta2*Soil.depth+beta3*Flow.region+b1*Plot+b2*/Soil.Depth%in%Plot+var 
>
> where Mg2 is the concentration of magnesium, betas are fixed-effects 
> and bs random ones. All independent variables except Silt are factors; 
> Silt is continuous.
>
> I would write:
> Mg2=exp(intercept+beta1*mean(Silt in respective 
> Soil.Depth)+beta3*Flow.region+estimate of b1*Plot + estimate of 
> b2*/Soil.Depth%in%Plot+0.5*var)
> to back-transform to the original scale on the Soil.Depth-level.
>
> To back-transform the fixed-effects only, I would drop the estimates 
> of the random-effects:
> Mg2=exp(intercept+beta1*mean(Silt in respective 
> Soil.Depth)+beta3*Flow.region+ 0.5*var)
>
>
> This approach treats the estimated random effects as dummies, not as 
> an additional variance. Is this right?
>
> Thanks a lot for your help
>
> Christina Bogner


-- 
=============================
Juan Pedro Steibel

Assistant Professor
Statistical Genetics and Genomics

Department of Animal Science & 
Department of Fisheries and Wildlife

Michigan State University
1205-I Anthony Hall
East Lansing, MI
48824 USA 

Phone: 1-517-353-5102
E-mail: steibelj at msu.edu



