[R] anomalies with the loess() function
Gavin Simpson
gavin.simpson at ucl.ac.uk
Wed Oct 27 09:39:02 CEST 2010
On Tue, 2010-10-26 at 20:04 +0200, Federico Bonofiglio wrote:
> Hello Masters,
<snip />
> #I generate 2 fictious vectors
>
> a<-sample(c(sample(1:1000,100),rep(NA,50)))
>
> b<-sample(c(sample(1:1000,100),rep(NA,50)))
<snip />
> a<-na.omit(a)
> b<-na.omit(b)
>
> #check out the evidence.....something's wrong with loess()???
No, something is wrong with your assumptions. lowess() and loess() do
not return anything like the same object. lowess() returns ordered $x
and $y components. The key here is "ordered".
loess() returns a complex list:
> str(fit)
List of 18
$ n : int 66
$ fitted : num [1:66] 390 447 617 494 283 ...
$ residuals: Named num [1:66] 270 188 161 -288 104 ...
..- attr(*, "names")= chr [1:66] "1" "2" "4" "5" ...
$ enp : num 7.87
.... etc.
The fitted values etc are in the order of the data you supplied in a and
b. Hence when you plot these values you will get a cats cradle of lines
because you are asking R to join points in sequence that are *not*
ordered across the plot.
> par(mfrow=c(1,2))
> plot(a,b)
> lines(lowess(a,b),col="red")#if NAs are excluded lowess() runs regularly
> plot(a,b)
> lines(loess(b~a),col="red")#.....but loess() keeps messing all over...!!???
Not sure what you hoped that to do. You are assuming far too much here.
You need to extract the information you want from a loess() object. That
lowess() works here is because it returns components $x and $y and the
underlying plot infrastructure in R will look for this in object it is
supplied. loess() doesn't supply these $x and $y so what you are seeing
is just gibberish because you supplied garbage.
Two alternatives:
a<-sample(c(sample(1:1000,100),rep(NA,50)))
b<-sample(c(sample(1:1000,100),rep(NA,50)))
mod <- loess(b~a, na.action = na.exclude)
fit <- fitted(mod)
ord <- order(a)
plot(a, b)
lines(a[ord], fit[ord], col = "red")
The above respects your fitted values and the locations of missing
values. The next solution will plot the fitted smooth for a set of
regularly spaced values over the range of a
pdata <- data.frame(a = seq(min(a, na.rm = TRUE),
max(a, na.rm = TRUE), length = 100))
pred <- predict(mod, pdata)
plot(a, b)
lines(pdata$a, pred, col = "red")
HTH
G
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
More information about the R-help
mailing list