[R] survreg with measurement uncertainties
Kyle Penner
kpenner at as.arizona.edu
Fri Jun 14 20:24:14 CEST 2013
Hrm, thanks. The uncertainties are what they are, though (and the
model is what it is, too) -- is there an alternative to modifying
them? Maybe another type of analysis that handles upper limits?
Kyle
On Thu, Jun 13, 2013 at 5:46 AM, Andrews, Chris <chrisaa at med.umich.edu> wrote:
> It seems a line through the origin doesn't fit the data very well. That may be throwing off the fitting routine.
>
> data = c(144.53, 1687.68, 5397.91)
> err = c(8.32, 471.22, 796.67)
> model = c(71.60, 859.23, 1699.19)
> id = c(1, 2, 3)
>
> # display
> plot(data ~ model)
> library("Hmisc")
> errbar(model, add=TRUE, y = data, yplus= data+err, yminus=data-err, errbar.col=2)
> abline(lm(data ~ model - 1))
> abline(lm(data ~ model - 1, weights = 1/err^2), col=2)
>
> Making err[1] larger, allows convergence:
>
> err[1] <- 80.32
>
> survreg(Surv(time = data)~model-1+cluster(id), weights=1/(err^2), dist='gaussian', init=c(2.1))
>
> Call:
> survreg(formula = Surv(time = data) ~ model - 1 + cluster(id),
> weights = 1/(err^2), dist = "gaussian", init = c(2.1))
>
> Coefficients:
> model
> 2.6055
>
> Scale= 139.299
>
> Loglik(model)= 0 Loglik(intercept only)= 0
> n= 3
>
>
> And this matches the standard linear model call:
>
> lm(data ~ model - 1, weights = 1/err^2)
>
> Call:
> lm(formula = data ~ model - 1, weights = 1/err^2)
>
> Coefficients:
> model
> 2.606
>
>
> -----Original Message-----
> From: Kyle Penner [mailto:kpenner at as.arizona.edu]
> Sent: Wednesday, June 12, 2013 3:49 PM
> To: Terry Therneau
> Cc: r-help at r-project.org
> Subject: Re: [R] survreg with measurement uncertainties
>
> Hi Terry,
>
> Thanks for your quick reply. I am talking about uncertainty in the response. I have 2 follow up questions:
>
> 1) my understanding from the documentation is that 'id' in cluster(id) should be the same when the predictors are not independent. Is this correct? (To be more concrete: my data are brightnesses at different wavelengths. Each brightness is an independent measurement, so the elements of id should all be different?)
>
> 2) I tested survreg with uncertainties on an example where I already know the answer (and where I am not using limits), and it does not converge. Below is the code I used, does anything jump out as incorrect?
>
> data = c(144.53, 1687.68, 5397.91)
> err = c(8.32, 471.22, 796.67)
> model = c(71.60, 859.23, 1699.19)
> id = c(1, 2, 3)
>
> This works (2.9 is the answer from simple chi_sq fitting):
>
> survreg(Surv(time = data, event = c(1,1,1))~model-1, dist='gaussian',
> init=c(2.9))
>
> This does not converge (2.1 is the answer from chi_sq fitting):
>
> survreg(Surv(time = data, event = c(1,1,1))~model-1+cluster(id), weights=1/(err^2), dist='gaussian', init=c(2.1))
>
> And this does, but the answer it returns is wonky:
>
> data[2] = 3*err[2] # data[2] is very close to 3*err[2] already survreg(Surv(time = data, event = c(1,2,1))~model-1+cluster(id), weights=1/(err^2), dist='gaussian', init=c(2.1))
>
> Thanks,
>
> Kyle
>
> On Wed, Jun 12, 2013 at 6:51 AM, Terry Therneau <therneau at mayo.edu> wrote:
>> I will assume that you are talking about uncertainty in the response.
>> Then one simple way to fit the model is to use case weights that are
>> proprional to 1/variance, along with +cluster(id) in the model
>> statement to get a correct variance for this case. In linear models
>> this would be called the "White" or "Horvitz-Thompsen" or "GEE working
>> independence" variance estimate, depending on which literature you
>> happen to be reading (economics, survey sampling, or biostat).
>>
>> Now if you are talking about errors in the predictor variables, that
>> is a much harder problem.
>>
>> Terry Therneau
>>
>>
>>
>> On 06/12/2013 05:00 AM, Kyle Penner wrote:
>>>
>>> Hello,
>>>
>>> I have some measurements that I am trying to fit a model to. I also
>>> have uncertainties for these measurements. Some of the measurements
>>> are not well detected, so I'd like to use a limit instead of the
>>> actual measurement. (I am always dealing with upper limits, i.e.
>>> left censored data.)
>>>
>>> I have successfully run survreg using the combination of well
>>> detected measurements and limits, but I would like to include the
>>> measurement uncertainty (for the well detected measurements) in the
>>> fitting. As far as I can tell, survreg doesn't support this. Does
>>> anyone have a suggestion for how to accomplish this?
>>>
>>> Thanks,
>>>
>>> Kyle
>
>
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
More information about the R-help
mailing list