[R] problem with predict()

Czerminski, Ryszard ryszard at arqule.com
Fri Jun 21 20:21:04 CEST 2002


Thank you for the great support so far!
I think I am getting closer, but I still don't quite get it...

Two questions:

(1) What is the difference between lm(y~., ...) and lm(y~x, ...),
    given that the second form fails?

> train <- data.frame(y = yr, x = xr)
> test <- data.frame(y = ys, x = xs)
> model <- lm(y~., train)
> model <- lm(y~x, train)
Error in eval(expr, envir, enclos) : Object "x" not found

(2) The other problem seems to be data related.

Consider the following code:

:::
rm(list=ls())

train.data <- read.csv("train.csv", header = TRUE, row.names = "mol",
                       comment.char = "")
test.data <- read.csv("test.csv", header = TRUE, row.names = "mol",
                      comment.char = "")

#train.data <- matrix(rnorm(164*119), nrow = 164)
#test.data <- matrix(rnorm(35*119), nrow = 35)

yr <- train.data[,1]; xr <- train.data[,-1]
xr <- scale(xr)     # matrix <- scale(data.frame)
x.center <- attr(xr, "scaled:center"); x.scale <- attr(xr, "scaled:scale")
mask <- apply(xr, 2, function(x) any(is.na(x)))
xr <- xr[,!mask] # rm NA's
ys <- test.data[,1]; xs <- test.data[,-1]
xs <- scale(xs, center = x.center, scale = x.scale)
xs <- xs[,!mask]
train <- data.frame(y = yr, x = xr)
test <- data.frame(y = ys, x = xs)
model <- lm(y~., train)
length(predict(model, test))
::::

When I execute it twice, with (S) simulated data and (R) "real" data, I get:

::: for simulated data :::
dim(train) = 164 119 ; dim(test) = 35 119 
> length(predict(model, test))
[1] 35

::: for real data :::
dim(train) = 164 119 ; dim(test) = 35 119 
> length(predict(model, test))
Error in drop(X[, piv, drop = FALSE] %*% beta[piv]) : 
        subscript out of bounds

The shape of the data seems to be the same in both cases, and
the only difference (as far as I can tell) is in the actual values.
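
One thing that may be worth checking, offered as an assumption and not a
diagnosis: real data can contain columns that are exact linear combinations
of other columns, which rnorm() data essentially never do. In that case lm()
fits a rank-deficient model and reports NA for the redundant coefficients.
A small sketch with invented data (the fifth column is made redundant on
purpose; none of these values come from train.csv):

```r
set.seed(1)

# Invented predictors: 10 rows, 4 independent columns
x <- matrix(rnorm(40), nrow = 10)

# Add a fifth column that is an exact linear combination of the first two
x <- cbind(x, x[, 1] + x[, 2])

d <- data.frame(y = rnorm(10), x = x)   # columns y, x.1, ..., x.5
fit <- lm(y ~ ., data = d)

# Compare the numerical rank with the number of coefficients
fit$rank
length(coef(fit))

# Names of the coefficients that lm() could not estimate
names(coef(fit))[is.na(coef(fit))]
```

If fit$rank is smaller than length(coef(fit)), one way to proceed is to drop
the columns with NA coefficients from both train and test and refit.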

R

Ryszard Czerminski   phone: (781)994-0479
ArQule, Inc.         email:ryszard at arqule.com
19 Presidential Way  http://www.arqule.com
Woburn, MA 01801     fax: (781)994-0679


-----Original Message-----
From: Liaw, Andy [mailto:andy_liaw at merck.com]
Sent: Friday, June 21, 2002 1:06 PM
To: 'Peter Dalgaard BSA'
Cc: 'Czerminski, Ryszard'; r-help at stat.math.ethz.ch
Subject: RE: [R] problem with predict()


The problem is that xr and xs are both matrices in his example, not vectors.
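
To illustrate with made-up data (the 10 x 2 and 5 x 2 matrices below are
assumptions for this sketch, not the original files): when a matrix is passed
to data.frame() as x, its columns are expanded into separate variables named
x.1, x.2, ... (or x.<colname> if the matrix has column names). The data frame
then has no variable called x for y ~ x to find, whereas y ~ . uses whatever
columns exist.

```r
set.seed(1)

# Invented training and test data; xr and xs are matrices, not vectors
xr <- matrix(rnorm(20), nrow = 10)
yr <- rnorm(10)
xs <- matrix(rnorm(10), nrow = 5)
ys <- rnorm(5)

train <- data.frame(y = yr, x = xr)
test  <- data.frame(y = ys, x = xs)

# The matrix columns were expanded; there is no column named "x"
names(train)

# y ~ . regresses y on every other column in train
fit  <- lm(y ~ ., data = train)
pred <- predict(fit, newdata = test)

# One prediction per row of test
length(pred)
```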

Andy

> -----Original Message-----
> From: Peter Dalgaard BSA [mailto:p.dalgaard at biostat.ku.dk]
> Sent: Friday, June 21, 2002 1:03 PM
> To: Liaw, Andy
> Cc: 'Czerminski, Ryszard'; r-help at stat.math.ethz.ch
> Subject: Re: [R] problem with predict()
> 
> 
> "Liaw, Andy" <andy_liaw at merck.com> writes:
> 
> > You still don't get the point.  Please read Peter Dalgaard's reply and
> > the help page for predict.lm carefully, and try to understand the
> > `Details' section.  See the example below:
> [snip]
> 
> > > This looks promising; however I get an error:
> > > 
> > > > train <- data.frame(y=yr, x=xr)
> > > > test <- data.frame(y=ys, x=xs)
> > > > myfit <- lm(y ~ x, train)
> > > Error in eval(expr, envir, enclos) : Object "x" not found
> 
> But there's nothing wrong with that code as far as I can see?? I don't
> get an error from it:
> 
> > xr <- rnorm(10)
> > yr <- rnorm(10)
> > ys <- rnorm(5)
> > xs <- rnorm(5)
> > train <- data.frame(y=yr, x=xr)
> > test <- data.frame(y=ys, x=xs)
> > myfit <- lm(y ~ x, train)
> > predict(myfit,test)
>           1           2           3           4           5 
> -0.03809295  0.11422384  0.35570765  0.55436954  0.22979523 
> 
> 
> Something must have gone wrong with the creation of "train".
> 
> -- 
>    O__  ---- Peter Dalgaard             Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
> 
