[R] Need to fit a regression line using orthogonal residuals
Henrik Bengtsson
hb at stat.berkeley.edu
Wed Jan 31 03:18:56 CET 2007
The iwpca() in Bioconductor package aroma.light takes a matrix of
column vectors and "fits an R-dimensional hyperplane using iterative
re-weighted PCA". From ?iwpca.matrix:
"Arguments:
X N-times-K matrix where N is the number of observations and K is the
number of dimensions.
Details:
This method uses weighted principal component analysis (WPCA) to fit a
R-dimensional hyperplane through the data with initial internal
weights all equal. At each iteration the internal weights are
recalculated based on the "residuals". If method=="L1", the internal
weights are 1 / sum(abs(r) + reps). This is the same as
method=function(r) 1/sum(abs(r)+reps). The "residuals" are orthogonal
Euclidean distance of the principal components R,R+1,...,K. In each
iteration before doing WPCA, the internal weighted are multiplied by
the weights given by argument w, if specified."
Thus, in your case you want to do:
X <- cbind(x,y)
fit <- iwpca(X)
There are different ways of robustifying the estimate, cf. argument
'method'. For heteroscedastic noise, fitting in L1 is convenient since
there is no bandwidth parameter.
Hope this helps
Henrik
On 1/30/07, Bill.Venables at csiro.au <Bill.Venables at csiro.au> wrote:
> Jonathon Kopecky asks:
>
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Jonathon Kopecky
> Sent: Tuesday, 30 January 2007 5:52 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Need to fit a regression line using orthogonal residuals
>
> I'm trying to fit a simple linear regression of just Y ~ X, but both X
> and Y are noisy. Thus instead of fitting a standard linear model
> minimizing vertical residuals, I would like to minimize
> orthogonal/perpendicular residuals. I have tried searching the
> R-packages, but have not found anything that seems suitable. I'm not
> sure what these types of residuals are typically called (they seem to
> have many different names), so that may be my trouble. I do not want to
> use Principal Components Analysis (as was answered to a previous
> questioner a few years ago), I just want to minimize the combined noise
> of my two variables. Is there a way for me to do this in R?
> [WNV] There's always a way if you are prepared to program it. Your
> question is a bit like asking "Is there a way to do this in Fortran?"
> The most direct way to do it is to define a function that gives you the
> sum of the perpendicular distances and minimise it using, say, optim().
> E.g.
> ppdis <- function(b, x, y) sum((y - b[1] - b[2]*x)^2/(1+b[2]^2))
> b0 <- lsfit(x, y)$coef # initial value
> op <- optim(b0, ppdis, method = "BFGS", x=x, y=y)
> op # now to check the results
> plot(x, y, asp = 1) # why 'asp = 1'?? exercise
> abline(b0, col = "red")
> abline(op$par, col = "blue")
> There are a couple of things about this you should be aware of, though
> First, this is just a fiddly way of finding the first principal
> component, so your desire not to use Principal Component Analysis is
> somewhat thwarted, as it must be.
> Second, the result is sensitive to scale - if you change the scales of
> either x or y, e.g. from lbs to kilograms, the answer is different.
> This also means that unless your measurement units for x and y are
> comparable it's hard to see how the result can make much sense. A
> related issue is that you have to take some care when plotting the
> result or orthogonal distances will not appear to be orthogonal.
> Third, the resulting line is not optimal for either predicting y for a
> new x or x from a new y. It's hard to see why it is ever of much
> interest.
> Bill Venables.
>
>
> Jonathon Kopecky
> University of Michigan
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list