[R] predict.lm(...,type="terms") question

Rui Barradas ruipbarradas at sapo.pt
Wed Aug 29 20:45:19 CEST 2012


Hello,

Ok, got it, thanks.

Apparently I was right: the estimator is still x.pred = (y.new - b0)/b1,
with b0 and b1 taken from the fit of area on concn. Fitting the inverse
regression model formula with lm() and calling predict.lm() gives
essentially the same point estimate when the calibration fit is very
tight, although the two are not identical in general. According to the
literature, the main differences are in the confidence intervals for the
true value of x.pred, for which there are several ways of computing the
limits. So I'll simply say: use lm() and predict.lm(), but do not use
their standard errors if you want CIs.
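
A minimal sketch of that point estimate, using the calibration numbers
from the original post. The names calib, b0, b1, x.pred and x.inv are only
illustrative ('calib' stands in for 'data' so that the R function of that
name is not masked), and the comparison with the inverse regression is
there just to show how close the two estimates are for these data:

```r
# Calibration standards and unknowns, copied from the original post
calib <- data.frame(area  = c(4875, 8172, 18065, 34555),
                    concn = c(25, 50, 125, 250))
new <- data.frame(area = c(8172, 10220, 11570, 24150))

# Fit the response on the known concentrations, then solve for x
model <- lm(area ~ concn, data = calib)
b0 <- coef(model)[["(Intercept)"]]
b1 <- coef(model)[["concn"]]
x.pred <- (new$area - b0) / b1

# Point estimates from the inverse regression, for comparison
inv.model <- lm(concn ~ area, data = calib)
x.inv <- unname(predict(inv.model, newdata = new))

cbind(area = new$area, classical = x.pred, inverse = x.inv)
```

With a calibration line as tight as this one the two columns should be
nearly indistinguishable; with noisier data they drift apart.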

(The formulae for the CIs don't seem very hard to program, by the way.
But there are several of them.)
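
A sketch of one of them, under the assumption that the interval is
obtained by inverting the usual 95% prediction band of lm(area ~ concn)
for a single new measurement. It has not been checked against the
textbook formula, so treat it as an illustration only; all object names
(calib, tq, A, half, ci) are made up for the example:

```r
# Calibration standards and unknowns, copied from the original post
calib <- data.frame(area  = c(4875, 8172, 18065, 34555),
                    concn = c(25, 50, 125, 250))
new <- data.frame(area = c(8172, 10220, 11570, 24150))

model <- lm(area ~ concn, data = calib)
b1   <- coef(model)[["concn"]]
s    <- summary(model)$sigma          # residual standard error
n    <- nrow(calib)
xbar <- mean(calib$concn)
ybar <- mean(calib$area)
Sxx  <- sum((calib$concn - xbar)^2)
tq   <- qt(0.975, df = n - 2)         # two-sided 95% level

# Solve |y.new - b0 - b1*x| = tq * s * sqrt(1 + 1/n + (x - xbar)^2/Sxx) for x
u <- new$area - ybar
A <- b1^2 - tq^2 * s^2 / Sxx
stopifnot(A > 0)  # the limits form a finite interval only if the slope is clearly nonzero
half <- tq * s * sqrt(u^2 / Sxx + A * (1 + 1/n))

ci <- data.frame(area = new$area,
                 fit  = xbar + u / b1,            # same as (y.new - b0)/b1
                 lwr  = xbar + (b1 * u - half) / A,
                 upr  = xbar + (b1 * u + half) / A)
ci
```

The limits are not symmetric about the point estimate, which is one
reason the standard errors from predict.lm() on the inverse model are
not a substitute.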

Rui Barradas

On 29-08-2012 18:58, Peter Ehlers wrote:
> I think that what the OP is looking for comes under the heading of
> "inverse regression" or the "calibration" problem. One reference
> with a simple explanation including confidence intervals is "Applied
> regression analysis" by Draper and Smith. (It's in section 3.2 in
> my 3rd edition).
>
> Peter Ehlers
>
> On 2012-08-29 10:03, Rui Barradas wrote:
>> Hello,
>>
>> Inline.
>> On 29-08-2012 16:06, John Thaden wrote:
>>> Could it be that my newdata object needs to include a column for the
>>> concn term even though I'm asking for concn to be predicted? If so,
>>> what numbers would I fill it with? Or should my newdata object include
>>> the original data, too? Is there another mailing list I can ask?
>>
>> stackoverflow.com
>>
>> There's an R tag.
>>
>> Rui Barradas
>>> Thanks,
>>> -John
>>>
>>> On Wed, Aug 29, 2012 at 9:16 AM, John Thaden <johnthaden at gmail.com> wrote:
>>>> I think I may be misreading the help pages, too, but misreading how?
>>>>
>>>> I agree that inverting the fitted model is simpler, but I worry that I'm
>>>> misusing ordinary least squares regression by treating my response, with its
>>>> error distribution, as a predictor with no such error. In practice, with my
>>>> real data that includes about six independent peak area measurements per
>>>> known concentration level, the diagnostic plots from plot.lm(inv.model) look
>>>> very strange and worrisome.
>>>>
>>>> Certainly predict.lm(..., type = "terms") must be able to do what I need.
>>>>
>>>> -John
>>>>
>>>> On Wed, Aug 29, 2012 at 6:50 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>>> Hello,
>>>>>
>>>>> You seem to be misreading the help pages for lm and predict.lm, argument
>>>>> 'terms'.
>>>>> A much simpler way of solving your problem should be to invert the fitted
>>>>> model using lm():
>>>>>
>>>>>
>>>>> model <- lm(area ~ concn, data)  # Your original model
>>>>> inv.model <- lm(concn ~ area, data = data)  # Your problem's model.
>>>>>
>>>>> # predicts from original data
>>>>> pred1 <- predict(inv.model)
>>>>> # predict from new data
>>>>> pred2 <- predict(inv.model, newdata = new)
>>>>>
>>>>> # Let's see it.
>>>>> plot(concn ~ area, data = data)
>>>>> abline(inv.model)
>>>>> points(data$area, pred1, col="blue", pch="+")
>>>>> points(new$area, pred2, col="red", pch=16)
>>>>>
>>>>>
>>>>> Also, 'data' is a really bad variable name; it's already an R function.
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rui Barradas
>>>>>
>>>>> On 28-08-2012 23:30, John Thaden wrote:
>>>>>> Hello all,
>>>>>>
>>>>>> How do I actually use the output of predict.lm(..., type="terms") to
>>>>>> predict new term values from new response values?
>>>>>>
>>>>>> I'm a chromatographer trying to use R (2.15.1) for one of the most
>>>>>> common calculations in that business:
>>>>>>
>>>>>>        - Given several chromatographic peak areas measured for control
>>>>>>          samples containing a molecule at known (increasing) concentrations,
>>>>>>          first derive a linear regression model relating the known
>>>>>>          concentration (predictor) to the observed peak area (response)
>>>>>>        - Then, given peak areas from new (real) samples containing
>>>>>>          unknown amounts of the molecule, use the model to predict
>>>>>>          concentrations of the molecule in the unknowns.
>>>>>>
>>>>>> In other words, given y = mx + b, I need to solve x' = (y' - b)/m for new
>>>>>> data y'
>>>>>>
>>>>>> and in R, I'm trying something like this
>>>>>>
>>>>>> require(stats)
>>>>>> data <- data.frame(area = c(4875, 8172, 18065, 34555),
>>>>>>                    concn = c(25, 50, 125, 250))
>>>>>> new <- data.frame(area = c(8172, 10220, 11570, 24150))
>>>>>> model <- lm(area ~ concn, data)
>>>>>> pred <- predict(model, type = "terms")  # predicts from original data
>>>>>> pred <- predict(model, type = "terms", newdata = new)  # error
>>>>>> pred <- predict(model, type = "terms", newdata = new, se.fit = TRUE)  # error
>>>>>> pred <- predict(model, type = "terms", newdata = new,
>>>>>>                 interval = "prediction")  # error
>>>>>> new2 <- data.frame(area = c(8172, 10220, 11570, 24150), concn = 0)
>>>>>> new2
>>>>>> pred <- predict(model, type = "terms", newdata = new2)  # wrong results
>>>>>>
>>>>>> Can someone please show me what I'm doing wrong?
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>
>



