[R] Help on predict.lm

Peter Ehlers ehlers at ucalgary.ca
Tue Mar 27 20:44:42 CEST 2012


R tries hard to keep you from committing scientific abuse.
As stated, your problem seems to me akin to

1. Given that a man's age can be modelled as a function
    of the grayness of his hair,
2. predict a man's age from the temperature in Barcelona.

Your calibration relates 'abs' and 'conc'. Now you want
to predict 'abs' from _'hours'_ (I think). I suspect that
concentration is actually related to time and this is
the missing link that you'll have to provide.
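
To see which way the calibration runs, the fitted model can be inspected
directly. This is only a minimal sketch: the data are retyped from the 'stds'
printout in the quoted message below, and the names 'vars', 'response' and
'predictors' are introduced here purely for illustration.

```r
# Calibration data retyped from the 'stds' printout in the quoted message
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))
linear1 <- lm(abs ~ conc, data = stds)

# The left-hand side of the formula is the response,
# the right-hand side holds the predictor(s).
vars       <- all.vars(formula(linear1))
response   <- vars[1]    # variable that predict() will return values for
predictors <- vars[-1]   # variable(s) that predict() needs as input

response
predictors
```

So predict() on this model gives 'abs' for supplied values of 'conc', and
nothing in the fit knows about 'hours'.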

BTW, I'm surprised that you didn't find the requirement
for 'newdata' to be a data frame on the predict.lm help
page - it's pretty clearly stated there.
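
For illustration, here is a sketch of a call that predict() accepts: 'newdata'
is a data frame whose column carries the predictor's name, 'conc' (which is
why a column called 'Abs' produces "object 'conc' not found"). The
concentrations 100 and 300 are made-up values. The last part is an assumption
about what may be wanted, not something established above: if the aim is to
turn measured absorbances into concentrations, one simple option is to invert
the fitted line by hand. The names 'new_conc', 'abs_measured' and 'conc_est'
are hypothetical.

```r
# Calibration data retyped from the 'stds' printout in the quoted message
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))
linear1 <- lm(abs ~ conc, data = stds)

# newdata: a data frame with a column named like the predictor ('conc').
# The values 100 and 300 are hypothetical concentrations.
new_conc <- data.frame(conc = c(100, 300))
pred_abs <- predict(linear1, newdata = new_conc)
pred_abs

# Assumption: the goal is concentration from absorbance.
# Invert the fitted line abs = b0 + b1 * conc by hand.
b <- coef(linear1)
abs_measured <- c(0.2686, 0.2446, 0.2274)   # three values from brom$abs
conc_est <- (abs_measured - b[["(Intercept)"]]) / b[["conc"]]
conc_est
```

The hand inversion gives point estimates only; it says nothing about their
uncertainty, and with four standards that uncertainty is probably not small.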

Peter Ehlers


On 2012-03-27 10:24, Nederjaard wrote:
> Hello,
>
> I'm new here, but will try to be as specific and complete as possible. I'm
> trying to use “lm” to first estimate parameter values from a set of
> calibration measurements, and then later to use those estimates to calculate
> another set of values with “predict.lm”.
>
> First I have a calibration dataset of absorbance values measured from
> standard solutions with known concentration of Bromide:
>
>> stds
>        abs conc
> 1 -0.0021    0
> 2  0.1003  200
> 3  0.2395  500
> 4  0.3293  800
>
> On this small calibration series, I perform a linear regression to find the
> parameter estimates of the relationship between absorbance (abs) and
> concentration (conc):
>
>> linear1<- lm(abs~conc, data=stds)
>> summary(linear1)
>
> Call:
> lm(formula = abs ~ conc, data = stds)
>
> Residuals:
>          1         2         3         4
> -0.012600  0.006467  0.020667 -0.014533
>
> Coefficients:
>               Estimate Std. Error t value Pr(>|t|)
> (Intercept) 1.050e-02  1.629e-02   0.645  0.58527
> conc        4.167e-04  3.378e-05  12.333  0.00651 **
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.02048 on 2 degrees of freedom
> Multiple R-squared: 0.987,      Adjusted R-squared: 0.9805
> F-statistic: 152.1 on 1 and 2 DF,  p-value: 0.00651
>
>
>
>
>
> Now I come with another dataset, which contains measured absorbance values
> of Bromide in solution:
>
>> brom
>      hours     abs
> 1    -1.0  0.0633
> 2     1.0  0.2686
> 3     5.0  0.2446
> 4    18.0  0.2274
> 5    29.0  0.2091
> 6    42.0  0.1961
> 7    53.0  0.1310
> 8    76.0  0.1504
> 9    91.0  0.1317
> 10   95.5  0.1169
> 11  101.0  0.0977
> 12  115.0  0.1023
> 13  123.5  0.0879
> 14  138.5  0.0724
> 15  147.5  0.0564
> 16  163.0  0.0495
> 17  171.0  0.0325
> 18  189.0  0.0182
> 19  211.0  0.0047
> 20  212.5      NA
> 21  815.5 -0.2112
> 22  816.5 -0.1896
> 23  817.5 -0.0783
> 24  818.5  0.2963
> 25  819.5  0.1448
> 26  839.5  0.0936
> 27  864.0  0.0560
> 28  888.0  0.0310
> 29  960.5  0.0056
> 30 1009.0 -0.0163
>
> The values in column brom$abs, measured at 30 successive points in time, need
> to be converted to Bromide concentrations, using the previously established
> relationship “linear1”.
> At first, I thought it could be done by:
>
>> predict.lm(linear1, brom$abs)
> Error in eval(predvars, data, env) :
>    numeric 'envir' arg not of length one
>
> But, R gives the above error message. Then, after some searching around on
> different fora and R-communities (including this one), I learned that the
> “newdata” in “predict.lm” actually needs to be coerced into a separate
> dataframe. Thus:
>
>> mabs<- data.frame(Abs = brom$abs)
>> predict.lm(linear1, mabs)
> Error in eval(expr, envir, enclos) : object 'conc' not found
>
> Again, R gives an error... probably because I made a mistake, but I truly fail
> to see where. I hope somebody can explain to me clearly what I'm doing wrong
> and what I should do instead.
> Any help is greatly appreciated, thanks!
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Help-on-predict-lm-tp4509586p4509586.html
> Sent from the R help mailing list archive at Nabble.com.
>


