[R] Help on predict.lm

Peter Ehlers ehlers at ucalgary.ca
Tue Mar 27 21:02:39 CEST 2012

```
R tries hard to keep you from committing scientific abuse.
As stated, your problem seems to me akin to

1. Given that a man's age can be modelled as a function
of the grayness of his hair,
2. predict a man's age from the temperature in Barcelona.

Your calibration relates 'abs' and 'conc'. Now you want
to predict 'abs' from 'hours' (I think). I suspect that
concentration is actually related to time and this is

BTW, I'm surprised that you didn't find the requirement
for 'newdata' to be a data frame on the predict.lm help
page - it's pretty clearly stated there.

Peter Ehlers

On 2012-03-27 10:24, Nederjaard wrote:
> Hello,
>
> I'm new here, but will try to be as specific and complete as possible. I'm
> trying to use “lm” to first estimate parameter values from a set of
> calibration measurements, and then later to use those estimates to calculate
> another set of values with “predict.lm”.
>
> First I have a calibration dataset of absorbance values measured from
> standard solutions with known concentration of Bromide:
>
>> stds
>        abs conc
> 1 -0.0021    0
> 2  0.1003  200
> 3  0.2395  500
> 4  0.3293  800
>
> On this small calibration series, I perform a linear regression to find the
> parameter estimates of the relationship between absorbance (abs) and
> concentration (conc):
>
>> linear1<- lm(abs~conc, data=stds)
>> summary(linear1)
>
> Call:
> lm(formula = abs ~ conc, data = stds)
>
> Residuals:
>          1         2         3         4
> -0.012600  0.006467  0.020667 -0.014533
>
> Coefficients:
>               Estimate Std. Error t value Pr(>|t|)
> (Intercept) 1.050e-02  1.629e-02   0.645  0.58527
> conc        4.167e-04  3.378e-05  12.333  0.00651 **
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.02048 on 2 degrees of freedom
> Multiple R-squared: 0.987,      Adjusted R-squared: 0.9805
> F-statistic: 152.1 on 1 and 2 DF,  p-value: 0.00651
>
> Now I come with another dataset, which contains measured absorbance values
> of Bromide in solution:
>
>> brom
>      hours     abs
> 1    -1.0  0.0633
> 2     1.0  0.2686
> 3     5.0  0.2446
> 4    18.0  0.2274
> 5    29.0  0.2091
> 6    42.0  0.1961
> 7    53.0  0.1310
> 8    76.0  0.1504
> 9    91.0  0.1317
> 10   95.5  0.1169
> 11  101.0  0.0977
> 12  115.0  0.1023
> 13  123.5  0.0879
> 14  138.5  0.0724
> 15  147.5  0.0564
> 16  163.0  0.0495
> 17  171.0  0.0325
> 18  189.0  0.0182
> 19  211.0  0.0047
> 20  212.5      NA
> 21  815.5 -0.2112
> 22  816.5 -0.1896
> 23  817.5 -0.0783
> 24  818.5  0.2963
> 25  819.5  0.1448
> 26  839.5  0.0936
> 27  864.0  0.0560
> 28  888.0  0.0310
> 29  960.5  0.0056
> 30 1009.0 -0.0163
>
> The values in column brom$abs, measured at 30 successive points in time, need
> to be converted to Bromide concentrations using the previously established
> relationship “linear1”.
> At first, I thought it could be done by:
>
>> predict.lm(linear1, brom$abs)
> Error in eval(predvars, data, env) :
>    numeric 'envir' arg not of length one
>
> But, R gives the above error message. Then, after some searching around on
> different fora and R-communities (including this one), I learned that the
> “newdata” in “predict.lm” actually needs to be coerced into a separate
> dataframe. Thus:
>
>> mabs<- data.frame(Abs = brom$abs)
>> predict.lm(linear1, mabs)
>
> Again, R gives an error...probably because I made an error, but I truly fail
> to see where. I hope somebody can explain to me clearly what I'm doing wrong
> and what I should do instead.
> Any help is greatly appreciated, thanks !
>

```
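A minimal sketch of the two points raised in the reply, assuming the goal is the one stated in the question (turning measured absorbances into concentrations). The model `linear1` has `conc` as its only predictor, so `predict()` can only accept a data frame with a column named `conc`; a column named `Abs` gives it nothing to work with. To go the other way, one option is to invert the fitted calibration line by hand. The names `abs_measured` and `conc_est` are illustrative and do not appear in the thread, and only the first three values of `brom$abs` are used.

```r
# Calibration standards, copied from the original post
stds <- data.frame(
  abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
  conc = c(0, 200, 500, 800)
)

# The poster's model: absorbance as a linear function of concentration
linear1 <- lm(abs ~ conc, data = stds)

# predict() looks up predictors by name, so 'newdata' must be a data frame
# containing a column called 'conc' (the right-hand side of the formula)
predict(linear1, newdata = data.frame(conc = c(100, 400)))

# To get concentration from a measured absorbance, invert the fitted line:
#   abs = intercept + slope * conc   =>   conc = (abs - intercept) / slope
b <- coef(linear1)
abs_measured <- c(0.0633, 0.2686, 0.2446)   # first three values of brom$abs
conc_est <- (abs_measured - b[["(Intercept)"]]) / b[["conc"]]
conc_est
```

An alternative is to fit `lm(conc ~ abs, data = stds)` and call `predict()` with a data frame that has an `abs` column; the two approaches generally give slightly different estimates. Either way, absorbances outside the range of the standards (roughly 0 to 0.33 here) are extrapolations and should be treated with caution.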