[R] mgcv: beta coefficient and 95%CI

David Winsemius dwinsemius at comcast.net
Wed Feb 23 18:29:31 CET 2011


In addition to Simon Wood's reading of the methods I am adding a minor  
note on terminology:

What was reported was the relative change in risk for a difference in
the predictor variable equal to its interquartile range, together with
the 95% CI for that estimate. They were not reporting a 95% CI for the
interquartile range (of anything). Since the term was linear, the same
estimate applies to a difference of that size anywhere along the range
of observed values of the predictor.

Followed by a cautionary note on methods:

Further, I would question their approach to constructing that
estimate. They apparently multiplied the beta term by the IQR and
called that the "per cent change per IQR". They should have used
predict.gam() at the locations of the IQR to get risk predictions,
since the beta coefficient is estimated on the log-risk scale.
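
A minimal sketch of that suggestion, on simulated data rather than the
paper's. The variable names (day, vis, deaths), the simulated effect
sizes, and the model formula are all assumptions made up for
illustration; only predict.gam(), coef() and vcov() are real mgcv/R
calls. It contrasts the prediction at the lower quartile with the one
at the upper quartile on the link scale and back-transforms afterwards.

```r
library(mgcv)

set.seed(1)
n   <- 500
day <- seq_len(n)
vis <- runif(n, 2, 20)   # simulated visibility in km (hypothetical)
mu  <- exp(3 + 0.3 * sin(2 * pi * day / 365) - 0.01 * vis)
dat <- data.frame(day = day, vis = vis, deaths = rpois(n, mu))

# Visibility as a linear term, time trend as a penalized cubic regression spline
fit <- gam(deaths ~ vis + s(day, bs = "cr"), family = quasipoisson, data = dat)

q  <- unname(quantile(dat$vis, c(0.25, 0.75)))
# Row 1: upper quartile, row 2: lower quartile, same day in both rows
nd <- data.frame(day = mean(dat$day), vis = c(q[2], q[1]))

# The lpmatrix gives the contrast and its standard error on the log scale
X   <- predict(fit, newdata = nd, type = "lpmatrix")
d   <- X[2, ] - X[1, ]   # lower quartile minus upper quartile (IQR decrease)
est <- sum(d * coef(fit))
se  <- sqrt(drop(t(d) %*% vcov(fit) %*% d))

# Back-transform the estimate and the interval endpoints
er_exact <- 100 * (exp(est) - 1)
ci_exact <- 100 * (exp(est + c(-1, 1) * qnorm(0.975) * se) - 1)

# The shortcut described in the paper: 100% x IQR x beta (for a decrease)
er_linear <- 100 * (q[1] - q[2]) * unname(coef(fit)["vis"])

print(c(linear = er_linear, exact = er_exact))
print(ci_exact)
```

Because visibility enters as a linear term, the contrast on the link
scale is just beta times the difference in visibility, so the
back-transformed value equals 100 * (exp(beta * difference) - 1). The
two numbers differ only through the exponentiation step.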

I have seen similar errors in reporting results from Cox models and
the justification given was "the SAS manual said I could do that". It
is true that the effects of predictors can be linearly approximated
by simple transformations of beta coefficients, but it is not true
that the approximation can be extended across a range as large as
that offered by the IQR. Not everything that appears in the published
literature reflects good statistical practice.
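
To see how the linear shortcut drifts as the contrast grows, here is a
small base-R sketch. The grid of log-scale effects is hypothetical and
not taken from the paper; each value stands for beta multiplied by the
size of the difference in the predictor.

```r
# Hypothetical log-scale effects: beta times the size of the contrast
log_effect <- c(0.01, 0.05, 0.10, 0.25, 0.50)

linear <- 100 * log_effect              # shortcut: 100 * beta * difference
exact  <- 100 * (exp(log_effect) - 1)   # percent change after back-transforming

print(data.frame(log_effect, linear, exact, gap = exact - linear))
```

The gap column is what the shortcut leaves out: it is small while beta
times the difference is close to zero and widens as that product grows.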

-- 
David.

On Feb 23, 2011, at 9:40 AM, clc wrote:

>
> In one of the papers...
>
> We developed core models with a generalized additive Poisson regression
> allowing for over-dispersion in the model (Wood, 2006). For each mortality
> outcome, variations in seasonality, trends, mean temperature, and mean
> humidity of current and previous days (lag 0–1) were fitted with penalized
> cubic regression splines. Dummy variables were used to control the
> variations for days of the week, holidays, and influenza epidemics. We added
> a dummy variable for the 2003 severe acute respiratory syndrome (SARS)
> epidemic. We chose 4 degrees of freedom (df) per year for smoothing function
> of the trends and 3 df for temperature and humidity. The choice of df for
> each smoothing function in the core models was made on the basis of observed
> residual autocorrelations using partial autocorrelation function (PACF). For
> the core models fitted to the mortality data, time variant confounding
> factors were considered as adequately controlled if absolute values of PACF
> coefficients were <0.1 for the first two lag days and there were no
> systematic patterns in the PACF plots.
>
> Following the construction of an adequate core model for each mortality
> outcome, we entered visibility as a linear term into the regression model
> and examined the effects of visibility on mortality for single day lags 0–5
> days, lag 0–1, and distributed lag 0–4 days ([Schwartz, 2000] and [Zanobetti
> et al., 2000]). The distributed lag effect takes into account the possibility
> that visibility can affect deaths occurring on the same day and on several
> subsequent days. The net effect of visibility was the sum of the effect
> estimates for all six days. We expressed the effect of visibility as the
> percentage change in daily mortality with a decrease in the interquartile
> range (IQR) of visibility as 100%×IQR×β, where β is the estimated Poisson
> regression coefficient, and referred to as the excess risk (ER%).
>
>
>
> In one of the figures, they reported "Estimated excess risks (ER%) for daily
> mortality and associated 95% confidence intervals per interquartile range
> decrease in visibility (6.5 km) at single lags 0–5, mean lag 0–1 (0–1) and
> distributed lag (DL) for lag 0–4 days"
>
>
> What do they mean??! Thanks a lot!
> -- 
> View this message in context: http://r.789695.n4.nabble.com/mgcv-beta-coefficient-and-95-CI-tp3320491p3321099.html
> Sent from the R help mailing list archive at Nabble.com.
>

David Winsemius, MD
West Hartford, CT


