[R] Problem using GLM in a loop

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Fri Aug 24 14:35:15 CEST 2001


Isabelle ZABALZA <isabelle.zabalza-mezghani at ifp.fr> writes:

> Hello,
> 
> I am try to perform a modeling which is relevant in a strongly
> heteroscedastic context.
> So I perform a dual modeling (modeling of both mean and variance of a
> response) in using the following loop:
> 
> jointmod <- function(formula, data, itercrit=10,devcrit=0.0001)
> {
> #
> # Init step
> #
> init <- glm(formula=formula,family=gaussian, data=data)
> response <- resid(init,type="response") + predict(init,type="response")
> mu <- predict(init,type="response")
> iter <- 1
> dev <- init$deviance
> if (dev != 0) vardev <- 1
> else stop(message="Null deviance in initialisation step")
> endformula_ strsplit(formula,"~")[[3]]
> formulavar_paste("d","~")
> formulavar_as.formula(paste(formulavar,endformula))
> #
> # Modeling loop
> #
> while ((iter <= itercrit)| (vardev > devcrit)) {
> d_(response - mu)^2
> modvar_glm(formula=formulavar,family=Gamma(link=log),data=data)
> sig <- predict(modvar,type="response")
> weights <- 1/sig
> modesp <- glm(formula=formula,
> family=gaussian,data=data,weights=weights)
> mu <- predict(modesp,type="response")
> devold <- dev
> dev <- modesp$deviance
> vardev <- (devold -dev)/devold
> iter <- iter + 1
> }
> 
> list(modesp=modesp, modvar=modvar, iter=iter, vardev=vardev)
> }
> 
> This program evaluation always stops when evaluating the "modesp" line :
> 
>          modesp <- glm(formula=formula,
> family=gaussian,data=data,weights=weights)
> with the following error:
> error in
> model.frame(formula,rownames,variables,varnames,extras,extranames, :
>            invalid variable type.
> 
> The same program without specifying weights runs.
> I don't understand this behavior since when I execute the glm command
> with weights out of the program, it runs.

This has to do with a rather awkward environment issue, which I'm not
sure we have solved correctly. Your problem is really not the loop,
but the fact that you are using glm inside a function which is passed
the formula from another environment.

Variables in a formula are taken from the supplied data frame and if
not found in the environment of the formula. This is so that it would
work to say e.g. y^lambda ~ x, where lambda is a free variable.

However, the same treatment applies to special arguments like weight
and offset, i.e. they are treated as names and not subjected to the
usual parameter passing conventions. So the weights variable in the
current environment is never found. 

Probably the easiest way out is to set 

environment(formula) <- environment()

which might kill you if you're actually using a free variable as
indicated, but chances are that you're not. Alternatively, but more
than a bit icky, assign "weights" in the formula environment.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list