[Rd] aggregate(as.formula("some formula"), data, function) error when called from in a function

Gabor Grothendieck ggrothendieck at gmail.com
Wed Jan 26 20:31:56 CET 2011


On Wed, Jan 26, 2011 at 2:04 PM, Paul Bailey <pdbailey at umd.edu> wrote:
> I'm having a problem with aggregate.formula when I call it in a function and the formula is converted from a string in the function
>
> I think my problem may also only occur when the left hand side of the formula is cbind(...)
>
> Here is example code that generates a dataset and then the error.
>
> The first function "agg2" fails
>
>> agg2(FALSE)
> do agg 2
> Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
>
> but if I have it return what it is going to pass to aggregate and then pass that myself, it works. I can use this as a workaround (agg3), where a second function makes the call itself.
>
> I'm confused by the behavior. Is there some way to not have to use a separate function to make the call ?
>
>
> ======================
> # start R code
> # idea: in a function, count the number of instances
> # of some factor (y) associated with another
> # factor (x). aggregate.formula appears to be
> # able to do this... but I have a problem if all of the following:
> # (1) It is called in a function
> # (2) the formula is created using as.formula(character)
> # calling aggregate with the same formula (created with as.formula)
> # outside the function works fine.
> agg2 <- function(test=FALSE) {
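>  # create a factor y (factor() is explicit because R >= 4.0.0 no
>  # longer converts strings to factors by default)
>  dat <- data.frame(y=factor(sample(LETTERS[1:3],100,replace=TRUE)))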
>  # create a factor x
>  dat$x <- sample(letters[1:4],100,replace=TRUE)
>  # make a column of 1s and zeros
>  # 1 when that row has that level of y
>  # 0 otherwise
>  lvls <- levels(dat$y)
>  dat$ya <- 1*(dat[,1] == lvls[1])
>  dat$yb <- 1*(dat[,1] == lvls[2])
>  dat$yc <- 1*(dat[,1] == lvls[3])
>  # this works fine if you give the exact function
>  agg1 <- aggregate(cbind(ya,yb,yc)~x,data=dat,sum)
>  # and fine if you accept
>  fo <- as.formula("cbind(ya,yb,yc)~x")
>  if(test) {
>        return(list(fo=fo,data=dat))
>  }
>  cat("do agg 2\n")
>  agg2 <- aggregate(fo,data=dat,sum)
>  list(agg1,agg2)
> }
> agg2(FALSE)
> ag <- agg2(TRUE)
> ag$fo
> aggregate(ag$fo,ag$data,sum)
>
>
> agg3 <- function() {
>  ag <- agg2(TRUE)
>  ag$fo
>  aggregate(ag$fo,ag$data,sum)
> }
> agg3()
>
> # end R code
> ==============
> Paul Bailey
> University of Maryland

The problem is that the aggregate statement:

agg2 <- aggregate(fo, data = dat, sum)

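is using non-standard evaluation: it inspects the unevaluated argument,
the symbol fo, rather than fo's value (hence the m[[2L]][[2L]] error on
an object of type 'symbol').  This may be a bug in aggregate.formula,
but at any rate you could try replacing that statement with the
following, which forces fo to be evaluated before aggregate sees it: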

agg2 <- do.call(aggregate, list(fo, data = dat, FUN = sum))
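
For reference, a minimal self-contained sketch of that workaround. The
helper name count_by_x and the seed are made up for illustration and are
not from the original post, and the original error may not reproduce on
newer R versions. do.call evaluates its argument list before building
the call, so aggregate receives the formula object rather than the
symbol fo:

```r
# Hypothetical helper (name is an assumption, not from the original post):
# sums the indicator columns ya, yb, yc within each level of x.
count_by_x <- function(dat) {
  # Formula built from a string inside the function, as in the report.
  fo <- as.formula("cbind(ya, yb, yc) ~ x")
  # list(...) is evaluated first, so the call handed to aggregate()
  # contains the formula itself, not the unevaluated name `fo`.
  do.call(aggregate, list(fo, data = dat, FUN = sum))
}

set.seed(1)  # arbitrary seed, only to make the sketch repeatable
dat <- data.frame(
  y = factor(sample(LETTERS[1:3], 100, replace = TRUE)),
  x = sample(letters[1:4], 100, replace = TRUE)
)
# Indicator columns: 1 when the row has that level of y, 0 otherwise.
dat$ya <- 1 * (dat$y == "A")
dat$yb <- 1 * (dat$y == "B")
dat$yc <- 1 * (dat$y == "C")

res <- count_by_x(dat)
print(res)
```

Each row of dat contributes exactly one 1 across ya, yb and yc, so the
three count columns of the result should add up to nrow(dat).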

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com