[R] survreg with measurement uncertainties
Andrews, Chris
chrisaa at med.umich.edu
Thu Jun 13 14:46:26 CEST 2013
It seems a line through the origin doesn't fit the data very well. That may be throwing off the fitting routine.
data = c(144.53, 1687.68, 5397.91)
err = c(8.32, 471.22, 796.67)
model = c(71.60, 859.23, 1699.19)
id = c(1, 2, 3)
# display
plot(data ~ model)
library("Hmisc")
errbar(model, add=TRUE, y = data, yplus= data+err, yminus=data-err, errbar.col=2)
abline(lm(data ~ model - 1))
abline(lm(data ~ model - 1, weights = 1/err^2), col=2)
Making err[1] larger, allows convergence:
err[1] <- 80.32
survreg(Surv(time = data)~model-1+cluster(id), weights=1/(err^2), dist='gaussian', init=c(2.1))
Call:
survreg(formula = Surv(time = data) ~ model - 1 + cluster(id),
weights = 1/(err^2), dist = "gaussian", init = c(2.1))
Coefficients:
model
2.6055
Scale= 139.299
Loglik(model)= 0 Loglik(intercept only)= 0
n= 3
And this matches the standard linear model call:
lm(data ~ model - 1, weights = 1/err^2)
Call:
lm(formula = data ~ model - 1, weights = 1/err^2)
Coefficients:
model
2.606
-----Original Message-----
From: Kyle Penner [mailto:kpenner at as.arizona.edu]
Sent: Wednesday, June 12, 2013 3:49 PM
To: Terry Therneau
Cc: r-help at r-project.org
Subject: Re: [R] survreg with measurement uncertainties
Hi Terry,
Thanks for your quick reply. I am talking about uncertainty in the response. I have 2 follow up questions:
1) my understanding from the documentation is that 'id' in cluster(id) should be the same when the predictors are not independent. Is this correct? (To be more concrete: my data are brightnesses at different wavelengths. Each brightness is an independent measurement, so the elements of id should all be different?)
2) I tested survreg with uncertainties on an example where I already know the answer (and where I am not using limits), and it does not converge. Below is the code I used, does anything jump out as incorrect?
data = c(144.53, 1687.68, 5397.91)
err = c(8.32, 471.22, 796.67)
model = c(71.60, 859.23, 1699.19)
id = c(1, 2, 3)
This works (2.9 is the answer from simple chi_sq fitting):
survreg(Surv(time = data, event = c(1,1,1))~model-1, dist='gaussian',
init=c(2.9))
This does not converge (2.1 is the answer from chi_sq fitting):
survreg(Surv(time = data, event = c(1,1,1))~model-1+cluster(id), weights=1/(err^2), dist='gaussian', init=c(2.1))
And this does, but the answer it returns is wonky:
data[2] = 3*err[2] # data[2] is very close to 3*err[2] already survreg(Surv(time = data, event = c(1,2,1))~model-1+cluster(id), weights=1/(err^2), dist='gaussian', init=c(2.1))
Thanks,
Kyle
On Wed, Jun 12, 2013 at 6:51 AM, Terry Therneau <therneau at mayo.edu> wrote:
> I will assume that you are talking about uncertainty in the response.
> Then one simple way to fit the model is to use case weights that are
> proprional to 1/variance, along with +cluster(id) in the model
> statement to get a correct variance for this case. In linear models
> this would be called the "White" or "Horvitz-Thompsen" or "GEE working
> independence" variance estimate, depending on which literature you
> happen to be reading (economics, survey sampling, or biostat).
>
> Now if you are talking about errors in the predictor variables, that
> is a much harder problem.
>
> Terry Therneau
>
>
>
> On 06/12/2013 05:00 AM, Kyle Penner wrote:
>>
>> Hello,
>>
>> I have some measurements that I am trying to fit a model to. I also
>> have uncertainties for these measurements. Some of the measurements
>> are not well detected, so I'd like to use a limit instead of the
>> actual measurement. (I am always dealing with upper limits, i.e.
>> left censored data.)
>>
>> I have successfully run survreg using the combination of well
>> detected measurements and limits, but I would like to include the
>> measurement uncertainty (for the well detected measurements) in the
>> fitting. As far as I can tell, survreg doesn't support this. Does
>> anyone have a suggestion for how to accomplish this?
>>
>> Thanks,
>>
>> Kyle
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
More information about the R-help
mailing list