[BioC] Help with lm and predict
Nathan Haigh
n.haigh at sheffield.ac.uk
Tue Aug 21 17:57:18 CEST 2007
I'm trying to predict some values given a linear model. And I'm pulling
my hair out trying to get it to work!
I have a data file (several rows shown here):
-- start data file --
Mean M/Z SD M/Z N Mean Int (Log) SD Int (Log)
59.99434 0.004952 100 7.5543 0.3918
67.95295 0.006149 100 4.4477 0.5896
68.95164 0.006361 100 7.1036 0.5675
72.0453 0.006725 100 6.4998 0.1351
71.95349 0.006769 99 4.2937 0.3235
74.02768 0.006432 100 8.7142 0.2509
74.06413 0.006723 100 6.4504 0.1741
-- end data file --
I then execute the following code:
-- start code --
data<-read.table("./mydata.txt", header=TRUE,sep="\t")
data.lm<-lm(data[,2]~data[,1], data=data)
xy<-data.frame(Mean.M.Z = pretty(data$Mean.M.Z, 20))
yhat<-predict(data.lm, newdata=xy, interval="confidence")
-- end code --
I get the following warning:
-- start warning --
Warning message:
'newdata' had 30 rows but variable(s) found have 46 rows
-- end warning --
>From a quick google, I found similar threads about matching up columns
in the newdata object and those used for the fitting. However, I can't
figure out why this isn't working! I've included info about the relevant
objects at the bottom, so hopefully someone can help me out.
Cheers,
Nath
-- some data structures --
> str(data)
'data.frame': 46 obs. of 5 variables:
$ Mean.M.Z : num 60 68 69 72 72 ...
$ SD.M.Z : num 0.00495 0.00615 0.00636 0.00673 0.00677 ...
$ N : num 100 100 100 100 99 100 100 100 99 95 ...
$ Mean.Int..Log.: num 7.55 4.45 7.10 6.50 4.29 ...
$ SD.Int..Log. : num 0.392 0.590 0.568 0.135 0.324 ...
> str(xy)
'data.frame': 30 obs. of 1 variable:
$ Mean.M.Z: num 40 60 80 100 120 140 160 180 200 220 ...
> str(data.lm)
List of 12
$ coefficients : Named num [1:2] 9.33e-04 7.77e-05
..- attr(*, "names")= chr [1:2] "(Intercept)" "data[, 1]"
$ residuals : Named num [1:46] -6.44e-04 -6.58e-05 6.85e-05
1.92e-04 2.43e-04 ...
..- attr(*, "names")= chr [1:46] "1" "2" "3" "4" ...
$ effects : Named num [1:46] -0.167936 0.103933 0.000149
0.000273 0.000324 ...
..- attr(*, "names")= chr [1:46] "(Intercept)" "data[, 1]" "" "" ...
$ rank : int 2
$ fitted.values: Named num [1:46] 0.00560 0.00621 0.00629 0.00653
0.00653 ...
..- attr(*, "names")= chr [1:46] "1" "2" "3" "4" ...
$ assign : int [1:2] 0 1
$ qr :List of 5
..$ qr : num [1:46, 1:2] -6.782 0.147 0.147 0.147 0.147 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:46] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:2] "(Intercept)" "data[, 1]"
.. ..- attr(*, "assign")= int [1:2] 0 1
..$ qraux: num [1:2] 1.15 1.15
..$ pivot: int [1:2] 1 2
..$ tol : num 1e-07
..$ rank : int 2
..- attr(*, "class")= chr "qr"
$ df.residual : int 44
$ xlevels : list()
$ call : language lm(formula = data[, 2] ~ data[, 1], data = data)
$ terms :Classes 'terms', 'formula' length 3 data[, 2] ~ data[, 1]
.. ..- attr(*, "variables")= language list(data[, 2], data[, 1])
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "data[, 2]" "data[, 1]"
.. .. .. ..$ : chr "data[, 1]"
.. ..- attr(*, "term.labels")= chr "data[, 1]"
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(data[, 2], data[, 1])
.. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:2] "data[, 2]" "data[, 1]"
$ model :'data.frame': 46 obs. of 2 variables:
..$ data[, 2]: num [1:46] 0.00495 0.00615 0.00636 0.00673 0.00677 ...
..$ data[, 1]: num [1:46] 60 68 69 72 72 ...
..- attr(*, "terms")=Classes 'terms', 'formula' length 3 data[, 2] ~
data[, 1]
.. .. ..- attr(*, "variables")= language list(data[, 2], data[, 1])
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "data[, 2]" "data[, 1]"
.. .. .. .. ..$ : chr "data[, 1]"
.. .. ..- attr(*, "term.labels")= chr "data[, 1]"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(data[, 2], data[, 1])
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:2] "data[, 2]" "data[, 1]"
- attr(*, "class")= chr "lm"
-- end some data structures --
More information about the Bioconductor
mailing list