[R] weighted regression inside FOREACH loop

William Dunlap wdunlap at tibco.com
Fri Oct 7 17:18:25 CEST 2016


A more general way is to change the environment of your formula to
a child of its original environment and add variables like 'weights' or
'subset' to the child environment.  lm() looks up 'weights' and 'subset'
first in 'data' and then in the environment of the formula, so it will
find them there.  Since you change the environment inside a function
call, the formula outside of the function call is not affected.
E.g.,

library(doParallel)
cl <- makeCluster(4)
registerDoParallel(cl)
fmla <- as.formula("y ~ .")

models <- foreach(d=1:10, .combine=rbind, .errorhandling='remove') %dopar% {
  datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
  # child of the formula's original environment; it will hold the weights
  localEnvir <- new.env(parent=environment(fmla))
  environment(fmla) <- localEnvir  # changes only the local copy of fmla
  localEnvir$weights <- rep(c(1,2), 50)
  mod <- lm(fmla, data=datdf, weights=weights)
  return(mod$coef)
}
models
#          (Intercept)         x
#result.1  -0.16910860 1.0022022
#result.2   0.03326814 0.9968325
#result.3  -0.08177174 1.0022907
#...
environment(fmla)
#<environment: R_GlobalEnv>
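
The same lookup rule can be seen without foreach.  What follows is only a
minimal sketch: the helper names fit_naive and fit_weighted, the argument
name 'w', and the toy data are made up for illustration and do not come
from the original question.  It assumes nothing called 'w' is visible from
the formula's own environment.

```r
# Formula created outside the fitting functions, as in the thread.
fmla <- y ~ x
orig_env <- environment(fmla)

# Naive version: 'w' lives only in this function's frame, which is neither
# 'df' nor the formula's environment, so lm() is not expected to find it.
fit_naive <- function(fmla, df, w) {
  lm(fmla, data = df, weights = w)
}

# Child-environment version (hypothetical helper, not from the thread).
fit_weighted <- function(fmla, df, w) {
  env <- new.env(parent = environment(fmla))  # still sees the original scope
  env$w <- w
  environment(fmla) <- env                    # alters the local copy only
  lm(fmla, data = df, weights = w)
}

set.seed(1)
df  <- data.frame(x = 1:20, y = 1:20 + rnorm(20))
wts <- rep(c(1, 2), 10)

naive <- tryCatch(fit_naive(fmla, df, wts), error = function(e) e)
fit   <- fit_weighted(fmla, df, wts)

inherits(naive, "error")                # TRUE when no 'w' is visible from fmla
identical(environment(fmla), orig_env)  # the caller's formula is untouched
```

Using a child environment rather than overwriting the formula's environment
outright keeps anything the formula could already see still reachable
through the parent.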



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Oct 7, 2016 at 7:44 AM, Bos, Roger <roger.bos at rothschild.com> wrote:

> All,
>
> I figured out how to get it to work, so I am posting the solution in case
> anyone is interested.  I had to use attr to set the weights as an attribute
> of the data object for the linear model.  Seems convoluted, but anytime I
> tried to pass a named vector as the weights the foreach loop could not find
> the variable, even if I tried exporting it.  If anybody knows of a better
> way please let me know as this does not seem ideal to me, but it works.
>
> library(doParallel)
> cl <- makeCluster(4)
> registerDoParallel(cl)
> fmla <- as.formula("y ~ .")
> models <- foreach(d=1:10, .combine=rbind, .errorhandling='pass') %dopar% {
>   datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
>   attr(datdf, "weights") <- rep(c(1,2), 50)
>   mod <- lm(fmla, data=datdf, weights=attr(data, "weights"))
>   return(mod$coef)
> }
> models
>
>
>
>
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Bos, Roger
> Sent: Friday, October 07, 2016 9:25 AM
> To: R-help
> Subject: [R] weighted regression inside FOREACH loop
>
> I have a foreach loop that runs regressions in parallel and works fine,
> but when I try to add the weights parameter to the regression the
> coefficients don’t get stored in the “models” variable like they are
> supposed to.  Below is my reproducible example:
>
> library(doParallel)
> cl <- makeCluster(4)
> registerDoParallel(cl)
> fmla <- as.formula("y ~ .")
> models <- foreach(d=1:10, .combine=rbind, .errorhandling='remove') %dopar%
> {
>   datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
>   weights <- rep(c(1,2), 50)
>   mod <- lm(fmla, data=datdf, weights=weights)
>   #mod <- lm(fmla, data=datdf)
>   return(mod$coef)
> }
> models
>
> You can change the commenting on the two “mod <-” lines to see that the
> non-weighted one works and the weighted regression doesn’t work.  I tried
> using .export="weights" in the foreach line, but R says that weights is
> already being exported.
>
> Thanks in advance for any suggestions.
>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
