[R-sig-Geo] regression kriging in gstat with skewed distributions

Tomislav Hengl hengl at science.uva.nl
Wed Jan 16 11:08:28 CET 2008


Dear Giovanni,

The logit transformation can be applied to any variable that has lower and upper physical
limits (e.g. 0-100%). In R, you can transform a variable to logits by e.g.:

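> library(foreign)  # provides read.dbf()
> points = read.dbf("points.dbf")
> # SAND is in %; values of exactly 0 or 100 give infinite logits
> points$SANDt = log((points$SAND/100)/(1-(points$SAND/100)))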

After you interpolate your variable, you can back-transform the values by using:

> SAND.rk = krige(fsand$call$formula, points[sel,], SPC, sand.rvgm)

> SAND.rk$pred=exp(SAND.rk$var1.pred)/(1+exp(SAND.rk$var1.pred))*100
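
The forward and back-transformation can be tried end to end without any data file. What follows
is only a sketch: the `sand` vector is made up, and the small offset `eps` that keeps 0% and 100%
away from infinite logits is an assumption, not something prescribed above. Base R's qlogis() and
plogis() compute the same logit and inverse logit as the explicit formulas.

```r
# Hypothetical sand percentages, including the physical limits 0 and 100
sand <- c(0, 12.5, 48, 73.2, 100)

# Assumed safeguard: pull values away from the limits so the logit stays finite
eps <- 0.5
p <- pmin(pmax(sand, eps), 100 - eps) / 100

# Forward transform: same as log(p / (1 - p))
sand_t <- qlogis(p)

# Back-transform: same as exp(x) / (1 + exp(x)) * 100
sand_back <- plogis(sand_t) * 100

print(round(sand_back, 2))
```

The back-transformed values reproduce the clamped percentages, so only the observations that sat
exactly on a limit are changed by the round trip.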

The prediction variance cannot be back-transformed directly, but you can use the normalized
prediction variance, obtained by dividing it by the sample variance. See also section 4.2.1 of my
lecture notes (http://geostat.pedometrics.org/).
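
A rough sketch of that normalization, assuming the kriging variance sits in the `var1.var` column
that gstat returns and that the sample variance is taken over the transformed observations. Both
vectors below are made up for illustration.

```r
# Hypothetical logit-transformed observations at the sampled points
z_t <- c(-1.2, -0.4, 0.1, 0.6, 1.3, -0.8, 0.9)

# Hypothetical kriging variances at four prediction locations
# (with gstat these would come from the var1.var column of the krige() output)
krige_var <- c(0.15, 0.40, 0.62, 0.05)

# Normalized prediction variance: kriging variance relative to the sample variance
var_norm <- krige_var / var(z_t)

print(round(var_norm, 3))
```

Values close to 0 indicate predictions that are precise relative to the spread of the data;
values near 1 mean the prediction is hardly more certain than the overall variance.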

There are many transformations that can be applied to force normality of your target variable (see
e.g. http://en.wikipedia.org/wiki/Data_transformation_(statistics) ). The most generic
transformation is to work with the probability density function values (see e.g.
http://dx.doi.org/10.1016/j.jneumeth.2006.11.004 ); this way you do not have to think about what
the histogram looks like at all. But then the interpretation of the regression plots becomes
rather difficult.
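
One distribution-free option of this kind, and the one asked about in the original message, is a
rank-based normal score transform. The sketch below uses only base R (ppoints(), qnorm(),
approx()); the skewed sample `z` is simulated, and this is one common way to do it, not
necessarily the method described in the paper linked above.

```r
set.seed(1)
# Hypothetical positively skewed sample (log-normal)
z <- rlnorm(50, meanlog = 1, sdlog = 0.8)

# Forward: ranks -> plotting positions in (0, 1) -> standard normal quantiles
r <- rank(z, ties.method = "first")   # integer ranks, ties broken by order
z_ns <- qnorm(ppoints(length(z))[r])

# Back-transform: interpolate in the table of (normal score, original value) pairs;
# rule = 2 clamps scores outside the observed range to the sample extremes
ord <- order(z_ns)
back <- approx(x = z_ns[ord], y = z[ord], xout = z_ns, rule = 2)$y
```

In practice `xout` would be the kriged normal scores rather than `z_ns` itself; predictions
beyond the range of the observed scores are then clamped to the smallest or largest observation.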

In any case, you should apply the transformation to the target variable itself (not only to the
residuals), because linear regression also requires that the residuals are normally distributed
around the regression line.
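
One way to check this requirement is to compare the regression residuals before and after
transforming the target. This is a minimal sketch on simulated data: the covariate `elev`, the
coefficients and the choice of the Shapiro-Wilk test are all assumptions made for illustration.

```r
set.seed(42)
n <- 200
# Hypothetical covariate and a target bounded between 0 and 100 %
elev <- runif(n, 0, 10)
sand <- plogis(-2 + 0.4 * elev + rnorm(n, sd = 0.7)) * 100

# Regression on the raw target versus on the logit-transformed target
fit_raw   <- lm(sand ~ elev)
fit_logit <- lm(qlogis(sand / 100) ~ elev)

# Shapiro-Wilk test on the residuals: a small p-value suggests non-normality
p_raw   <- shapiro.test(residuals(fit_raw))$p.value
p_logit <- shapiro.test(residuals(fit_logit))$p.value
print(c(raw = p_raw, logit = p_logit))
```

A normal Q-Q plot of the residuals, e.g. qqnorm(residuals(fit_logit)), is a useful visual
complement to the test.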


see also:
FITTING DISTRIBUTIONS WITH R (by Vito Ricci)
http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf


Tom Hengl
http://spatial-analyst.net 


-----Original Message-----
From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of
G. Allegri
Sent: Tuesday 15 January 2008 15:28
To: r-sig-geo at stat.math.ethz.ch
Subject: [R-sig-Geo] regression kriging in gstat with skewed distributions

I'm trying to perform a regression kriging with the gstat package on my
soil sample data. The response variable (ECe measure) and covariates
appear positively skewed.
As Tomislav Hengl suggests in his "framework for RK" [1], a logistic
transformation is proposed as a generic way to reduce the skewness by
using the physical limits of the data.
Is it really a transformation that can be applied in the generic case
of skewed data? I mean, in my case I have non-normal residuals (from
the regression on the original data), and I'm trying to transform the
residuals (and not the original values) to do SK on them. Is this
approach correct?

A related question is how to do normal score transformations (for my
residuals) in R and gstat. I know gstat doesn't manage transformations
and back-transformations, so it should be done beforehand in R... but
I can't find any package that permits it in a straightforward way.
I've found something with qqnorm(ppoints(data)) and the approx()
function. Is that all?

Giovanni


[1] "A generic framework for spatial prediction of soil variables
based on regression-kriging" Geoderma 120 (1–2), 75–93.

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo



