[R] Using 'sapply' and 'by' in one function

Gabor Grothendieck ggrothendieck at gmail.com
Sun Feb 10 15:25:43 CET 2008


Actually, thinking about this some more, you need neither sapply nor by.
A single model with Pred nested within sex gives a separate intercept
and slope for each sex:

new2 <- transform(new, sex = factor(sex))
coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))
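
As a rough check, and only a sketch, the single nested fit can be compared
with the per-sex fits from by(). The set.seed() call and the names
all_in_one and per_group are assumptions added for illustration; they are
not part of the original example.

```r
# Hypothetical check: the single nested model reproduces the per-sex fits.
set.seed(1)  # assumption: seed chosen only to make the example reproducible
new <- data.frame(Outcome.1 = rnorm(10), Outcome.2 = rnorm(10),
                  sex = rep(0:1, 5), Pred = rnorm(10))

# One model: a separate intercept and Pred slope for each level of sex
new2 <- transform(new, sex = factor(sex))
all_in_one <- coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))

# Reference: fit each sex separately with by()
per_group <- by(new, new$sex,
                function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))

all_in_one
per_group
```

The rows sex0 and sex0:Pred of all_in_one should match the (Intercept) and
Pred rows for sex 0 in per_group, and likewise sex1 and sex1:Pred for sex 1.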


On Feb 10, 2008 8:43 AM, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> Because new is passed to fxa as its second argument, lm sees the full
> data frame rather than the per-sex subset, hence the error.  Try this:
>
> by(new, new$sex, function(x) sapply(x[1:2], function(y) coef(lm(y ~ Pred, x))))
>
> Actually, you can do the above without sapply as lm can take a matrix
> for the dependent variable:
>
> by(new, new$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))
>
>
> On Feb 10, 2008 8:19 AM, David & Natalia <3.14david at gmail.com> wrote:
> > Greetings,
> >
> > I'm having a problem with something that I think is very simple - I'd
> > like to be able to use the 'sapply' and 'by' functions in 1 function
> > to be able (for example) to get regression coefficients from multiple
> > models by a grouping variable.  I think that I'm missing something
> > that is probably obvious to experienced users.
> >
> > Here's a simple (trivial) example of what I'd like to do:
> >
> > new <- data.frame(Outcome.1=rnorm(10),Outcome.2=rnorm(10),sex=rep(0:1,5),Pred=rnorm(10))
> > fxa <- function(x,data)   { lm(x~Pred,data=data)$coef }
> > sapply(new[,1:2],fxa,new)  # this yields coefficients for the predictor in separate models
> >
> > fxb <- function(x)   {lm(Outcome.1~Pred,data=x)$coef}
> > by(new,new$sex,fxb) #yields the coefficient for Outcome.1 for each sex
> >
> > ## I'd like to be able to combine 'sapply' and 'by' to be able to get
> > the regression coefficients for Outcome.1 and Outcome.2 by each sex,
> > rather than running fxb a second time predicting 'Outcome.2' or by
> > subsetting the data - by sex - before I run the function, but the
> > following doesn't work -
> >
> > by(new,new$sex,FUN=function(x)sapply(x[,1:2],fxa,new))
> > 'Error in model.frame.default(formula = x ~ Pred, data = data,
> > drop.unused.levels = TRUE) :
> >  variable lengths differ (found for 'Pred')'
> >
> > ##I understand the error message - the length of 'Pred' is 10 while
> > the length of each sex group is 5, but I'm not sure how to correctly
> > write the 'by' function to use 'sapply' inside it.   Could someone
> > please point me in the right direction?  Thanks very much in advance
> >
> > David S Freedman, CDC (Atlanta USA) [definitely not the well-known
> > statistician, David A Freedman, in Berkeley]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>


