[R-sig-ME] Variable transformation and back transformation
Juan Pedro Steibel
steibelj at msu.edu
Thu Mar 19 13:54:01 CET 2009
Hello Christina,
Dr. Bates already gave an excellent explanation with some examples where
transformed variables are widely accepted.
My experience comes from working with gene expression data, where
colleagues like to express comparisons as "Fold-change" or ratios. In
that case, working with logs makes much more sense too. As stated before,
concentrations tend to follow multiplicative models (that is implicit in
the use of ratios for comparisons), so the log brings everything to the
additive scale. Plus, the Gaussian mixed-effects model usually fits
log-concentrations reasonably well.
So I commonly analyze log-concentrations and provide all results in the
log-scale.
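A small illustration of why the log puts ratios on the additive scale,
sketched in Python rather than R. The two concentrations are made-up
numbers, not data from any experiment:

```python
import math

# Hypothetical concentrations for two conditions (illustrative values only)
treated = 8.0
control = 2.0

# The comparison on the original scale is a ratio (fold change) ...
fold_change = treated / control

# ... which becomes a plain difference on the log2 scale
log2_difference = math.log2(treated) - math.log2(control)

# Back-transforming the log2 difference recovers the fold change
assert math.isclose(2.0 ** log2_difference, fold_change)

print(fold_change, log2_difference)
```

So a difference of 2 on the log2 scale is the same statement as a
4-fold change on the original scale.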
Sometimes I am asked to report values in the Fold Change scale (ratios).
What I do then is to plot everything in the log-scale and attach as a
second scale the ratio or "back-transformed" scale (commonly to the
right axis of the plot). That seems to appease even the most
"anti-logarithmic people".
The good thing about this approach is that you do not have to deal with
back-transforming the mean differences (and uncertainty measures), but
only the scale. For example, on the left axis your (log2) scale may be
-2, -1, 0, 1, 2, and the corresponding back-transformed scale on the right
axis (base 2 is the one used for qPCR data, for example) would be
1/4, 1/2, 1, 2, 4.
Moreover, using 1/4 and 1/2 instead of 0.25 and 0.5 seems more appealing
to biologists as they easily read it as "4-fold down or 2-fold down".
This type of plot is really straightforward to generate in R,
because you create the plot and then add all the annotation needed
(second scale, etc.).
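The labels for the second axis can be derived directly from the log2
tick positions. A minimal sketch in Python (not R), assuming integer
log2 ticks; the helper name fold_change_label is only illustrative:

```python
from fractions import Fraction

def fold_change_label(log2_tick):
    """Return the ratio label for an integer log2 tick, e.g. -2 -> '1/4'."""
    # Fraction keeps 2**-2 as the exact ratio 1/4 instead of the float 0.25
    return str(Fraction(2) ** log2_tick)

# Tick positions on the left (log2) axis
log2_ticks = [-2, -1, 0, 1, 2]

# Labels to draw at the same positions on the right (fold-change) axis
fold_labels = [fold_change_label(t) for t in log2_ticks]
print(fold_labels)
```

Only the tick labels change; the plotted means and intervals stay where
they are on the log scale.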
Now if you want to create a table and report the actual values, AND you
are forced to go back to the original scale... then you need to
back-transform the results. Back-transforming means or mean differences
is not a big deal, but providing an uncertainty measure on the
back-transformed scale may be. In this case I would recommend using
mcmcsamp to tackle the problem. I've never done it, but it should be
easy to generate a sample from the posterior distribution of the
parameters, compute the contrast of interest (mean differences) for
each draw, back-transform those values, and summarize them (mean, CI,
etc.).
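The summarizing step could look roughly like the sketch below, written
in Python rather than R. It assumes you already have draws of the
contrast on the log2 scale; here they are simulated from a normal
distribution purely as a stand-in, and the function name
summarize_fold_change is hypothetical:

```python
import math
import random
import statistics

def summarize_fold_change(log2_draws, level=0.95):
    """Back-transform log2-scale draws and summarize them as fold changes."""
    # Back-transform every draw first, then summarize on the ratio scale
    fold = sorted(2.0 ** d for d in log2_draws)
    n = len(fold)
    # Simple percentile interval, rounding outward to the nearest draw
    lower_idx = math.floor((1.0 - level) / 2.0 * (n - 1))
    upper_idx = math.ceil((1.0 + level) / 2.0 * (n - 1))
    return {
        "mean": statistics.fmean(fold),
        "median": statistics.median(fold),
        "lower": fold[lower_idx],
        "upper": fold[upper_idx],
    }

# Stand-in for posterior draws of a mean difference on the log2 scale.
# ASSUMPTION: simulated values, not output from a fitted model.
random.seed(1)
simulated_draws = [random.gauss(1.0, 0.3) for _ in range(10000)]

summary = summarize_fold_change(simulated_draws)
print(summary)
```

Because the back-transformation is monotone, the percentile interval of
the back-transformed draws matches the back-transformed interval
endpoints. The mean does not behave that way, which is why it is
computed after back-transforming each draw.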
In any case, you do not have to change your model. If you do change it
and ignore the random effects, your SEs and other uncertainty measures
will be on the desired scale, but most likely the inferences will be
incorrect.
Hope this helps,
Cheers,
JP
Christina Bogner wrote:
> Dear all,
>
> I have fitted a couple of mixed-effects models to environmental data
> (chemical and physical soil parameters) with log-transformed dependent
> variables. I tried generalized mixed-models, but the results were not
> satisfactory (probably because I am a soil scientist and not a
> statistician ;-)) Now, as log of concentrations are ecologically not
> very informative, I would like to back-transform my model parameters.
> Taking a Gaussian linear mixed-model:
>
> log(Mg2)=intercept+beta1*Silt+beta2*Soil.depth+beta3*Flow.region+b1*Plot+b2*/Soil.Depth%in%Plot+var
>
> where Mg2 is the concentration of magnesium, betas are fixed-effects
> and bs random ones. All independent variables except Silt are factors;
> Silt is continuous.
>
> I would write:
> Mg2=exp(intercept+beta1*mean(Silt in respective
> Soil.Depth)+beta3*Flow.region+estimate of b1*Plot + estimate of
> b2*/Soil.Depth%in%Plot+0.5*var)
> to back-transform to the original scale on the Soil.Depth-level.
>
> To back-transform the fixed-effects only, I would drop the estimates
> of the random-effects:
> Mg2=exp(intercept+beta1*mean(Silt in respective
> Soil.Depth)+beta3*Flow.region+ 0.5*var)
>
>
> This approach treats the estimated random effects as dummies, not as
> an additional variance. Is this right?
>
> Thanks a lot for your help
>
> Christina Bogner
> ------------------------------------------------------------------------
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
--
=============================
Juan Pedro Steibel
Assistant Professor
Statistical Genetics and Genomics
Department of Animal Science &
Department of Fisheries and Wildlife
Michigan State University
1205-I Anthony Hall
East Lansing, MI
48824 USA
Phone: 1-517-353-5102
E-mail: steibelj at msu.edu