[R] relative risk regression with survey data
Ravi Varadhan
rvaradhan at jhmi.edu
Wed Sep 15 15:37:59 CEST 2010
Dear Thomas,
You said, "the log-binomial model is very non-robust when the fitted values
get close to 1, and there is some controversy over the best approach."
Could you please point me to a paper that discusses the issues?
I have written some code to do maximum likelihood estimation for relative,
additive, and mixed risk regression models under a binomial model. I have been
able to obtain good convergence, and I have used the bootstrap to get standard
errors. However, I am not sure whether these standard errors are valid when
the fitted values are close to 0 or 1. It seems to me that when the fitted
probabilities are close to 0 or 1, there is no good way to estimate
standard errors.
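
For illustration, here is a minimal sketch of a log-binomial (relative risk) fit with nonparametric bootstrap standard errors. It is not the code referred to above: the simulated data, the names dat, fit_rr and B, and the starting values are all assumptions, and it covers only the log-link case. The true risks are deliberately kept well away from 0 and 1, so it says nothing about behaviour near the boundary.

```r
# Hypothetical simulated data; nothing here comes from the original thread.
set.seed(1)
n <- 500
x <- rbinom(n, 1, 0.5)
p <- exp(-2 + 0.5 * x)            # true risks of about 0.14 and 0.22
dat <- data.frame(y = rbinom(n, 1, p), x = x)

# Log-binomial fit. Starting at (log(0.1), 0) gives every observation an
# initial fitted value of 0.1, which is inside (0, 1).
fit_rr <- function(d) {
  coef(glm(y ~ x, family = binomial(link = "log"), data = d,
           start = c(log(0.1), 0)))
}

est <- fit_rr(dat)

# Nonparametric bootstrap: resample rows with replacement and refit.
B <- 200
boot <- replicate(B, fit_rr(dat[sample(nrow(dat), replace = TRUE), ]))
se <- apply(boot, 1, sd)          # bootstrap SEs on the log scale

cbind(estimate = est, boot_se = se)
```

With survey data the resampling would have to respect the design (for example, resampling whole clusters), which this sketch does not attempt.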
Thanks,
Ravi.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Thomas Lumley
Sent: Monday, September 13, 2010 10:41 PM
To: Daniel Nordlund
Cc: r-help at r-project.org
Subject: Re: [R] relative risk regression with survey data
On Mon, 13 Sep 2010, Daniel Nordlund wrote:
> I have been asked to look at options for doing relative risk regression on
> some survey data. I have a binary DV and several predictor / adjustment
> variables. In R, would this be as "simple" as using the survey package to
> set up an appropriate design object and then running svyglm with
> family=binomial(log) ? Any other suggestions for covariate adjustment of
> relative risk estimates? Any and all suggestions welcomed.
If the fitted values don't get too close to 1, then
svyglm(, family=quasibinomial(log)) will do it.
The log-binomial model is very non-robust when the fitted values get close
to 1, and there is some controversy over the best approach. You can still
use svyglm( ,family=quasibinomial(log)) but you will probably need to set
the number of iterations much higher (perhaps 200).
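
As a rough sketch of the call described above, assuming a simple one-stage cluster sample: the simulated data, the variable names (cluster, wt, exposed, z, y), the starting values and the design are all hypothetical. The iteration limit is raised through glm's control argument, which svyglm passes on to glm.

```r
library(survey)

# Hypothetical simulated survey: 40 clusters of 25 people, equal weights.
set.seed(2)
n_clus <- 40
m <- 25
svy <- data.frame(
  cluster = rep(seq_len(n_clus), each = m),
  wt      = 10,
  exposed = rbinom(n_clus * m, 1, 0.4),
  z       = runif(n_clus * m, -1, 1)   # bounded adjustment variable
)
# True risks are at most exp(-1.4), so fitted values stay far from 1.
svy$y <- rbinom(nrow(svy), 1, exp(-2 + 0.4 * svy$exposed + 0.2 * svy$z))

des <- svydesign(ids = ~cluster, weights = ~wt, data = svy)

# Log link with a quasibinomial variance; start and maxit are assumptions
# chosen for this example, not recommendations from the thread.
fit <- svyglm(y ~ exposed + z, design = des,
              family = quasibinomial(link = "log"),
              start = c(-2, 0, 0),
              control = list(maxit = 200))

exp(coef(fit))       # exponentiated slopes are adjusted relative risks
exp(confint(fit))    # design-based Wald intervals on the ratio scale
range(fitted(fit))   # check how close the fitted values get to 1
```

The starting values are given as a literal vector because svyglm re-evaluates the call internally, so variables local to a calling function may not be found.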
Alternatively, you can use nonlinear least squares [svyglm(,
family=gaussian(log))] or other quasilikelihood approaches, such as
family=quasipoisson(log). These are all consistent for the same parameter
if the model is correctly specified and are much more robust to x-outliers.
I rather like nonlinear least squares, because it's easy to explain.
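
The three estimating equations mentioned above can be compared side by side. This is only a sketch on simulated data where the log-linear mean model is correct by construction; the data, variable names and starting values are hypothetical. With a 0/1 outcome, gaussian(log) cannot choose its own starting values in glm, so start is supplied explicitly.

```r
library(survey)

# Hypothetical simulated survey data with a correctly specified log-linear mean.
set.seed(3)
n_clus <- 40
m <- 25
svy <- data.frame(
  cluster = rep(seq_len(n_clus), each = m),
  wt      = 10,
  exposed = rbinom(n_clus * m, 1, 0.4),
  z       = runif(n_clus * m, -1, 1)
)
svy$y <- rbinom(nrow(svy), 1, exp(-2 + 0.4 * svy$exposed + 0.2 * svy$z))
des <- svydesign(ids = ~cluster, weights = ~wt, data = svy)

# Same mean model, three different working variance functions.
fit_qb <- svyglm(y ~ exposed + z, design = des,
                 family = quasibinomial(link = "log"),
                 start = c(-2, 0, 0), control = list(maxit = 200))
fit_ls <- svyglm(y ~ exposed + z, design = des,
                 family = gaussian(link = "log"),      # nonlinear least squares
                 start = c(-2, 0, 0), control = list(maxit = 200))
fit_qp <- svyglm(y ~ exposed + z, design = des,
                 family = quasipoisson(link = "log"),
                 start = c(-2, 0, 0), control = list(maxit = 200))

# Exponentiated coefficients: baseline risk in row 1, relative risks below.
rr <- exp(cbind(quasibinomial = coef(fit_qb),
                least_squares = coef(fit_ls),
                quasipoisson  = coef(fit_qp)))
round(rr, 3)
```

If the mean model is right, the three columns should estimate the same quantities and differ only through sampling variability and efficiency.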
-thomas
Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.