[R] problem with lapply(x, subset, ...) and variable select argument

Gabor Grothendieck ggrothendieck at gmail.com
Tue Oct 11 16:36:39 CEST 2005


Just one simple shortening of DR's solution:

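tt <- function (n) {
   x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
   # the anonymous wrapper is created inside tt, so the frame that subset
   # looks into is enclosed by tt's environment and n is tt's argument
   print(sapply(x, function(...) subset(...), select = n))
}

n <- "b"   # global n, which should not be picked up
tt("a")    # selects column a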
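
For comparison, here is a small sketch that puts the wrapper next to a
plain "[" call. The helper names pick_wrapped and pick_bracket are made up
for illustration, and the "[" form is only my guess at what a "["-based
solution would look like, since that code is not quoted in this message.
"[" takes n as an ordinary, already evaluated argument, so no frame lookup
is involved at all.

```r
x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))

# Hypothetical helper names, for illustration only.

# The wrapper is defined inside pick_wrapped(), so subset()'s parent
# frame is the wrapper's frame, whose enclosure is pick_wrapped()'s
# frame. The symbol n therefore resolves to the function argument.
pick_wrapped <- function(n) lapply(x, function(...) subset(...), select = n)

# "[" uses standard evaluation: n is evaluated in pick_bracket() before
# the column is extracted (assumed form of a "["-based solution).
pick_bracket <- function(n) lapply(x, "[", n)

n <- "b"                        # global n that should be ignored
names(pick_wrapped("a")[[1]])   # "a"
names(pick_bracket("a")[[1]])   # "a"
```

Both calls return one-column data frames holding column a, whatever the
global n is set to.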


On 10/11/05, Dimitris Rizopoulos <dimitris.rizopoulos at med.kuleuven.be> wrote:
> As Gabor said, the issue here is that subset.data.frame() evaluates
> the `select' argument in its parent.frame(). Thus, if you
> create a local function within lapply() (or sapply()), it works:
>
> tt <- function (n) {
>    x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))
>    print(lapply(x, function(y, n) subset(y, select = n), n = n))
>    print(sapply(x, function(y, n) subset(y, select = n), n = n))
> }
>
> tt("a")
>
>
> I hope it helps.
>
> Best,
> Dimitris
>
> ----
> Dimitris Rizopoulos
> Ph.D. Student
> Biostatistical Centre
> School of Public Health
> Catholic University of Leuven
>
> Address: Kapucijnenvoer 35, Leuven, Belgium
> Tel: +32/(0)16/336899
> Fax: +32/(0)16/337015
> Web: http://www.med.kuleuven.be/biostat/
>     http://www.student.kuleuven.be/~m0390867/dimitris.htm
>
>
>
> ----- Original Message -----
> From: "joerg van den hoff" <j.van_den_hoff at fz-rossendorf.de>
> To: "Gabor Grothendieck" <ggrothendieck at gmail.com>; "Thomas Lumley"
> <tlumley at u.washington.edu>
> Cc: "r-help" <r-help at stat.math.ethz.ch>
> Sent: Tuesday, October 11, 2005 10:18 AM
> Subject: Re: [R] problem with lapply(x, subset,...) and variable
> select argument
>
>
> > Gabor Grothendieck wrote:
> >> The problem is that subset looks into its parent frame, but in this
> >> case the parent frame is not the environment in tt but the
> >> environment in lapply, since tt does not call subset directly but
> >> rather lapply does.
> >>
> >> Try this which is similar except we have added the line beginning
> >> with environment before the print statement.
> >>
> >> tt <- function (n) {
> >>    x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
> >>    environment(lapply) <- environment()
> >>    print(lapply(x, subset, select = n))
> >> }
> >>
> >> n <- "b"
> >> tt("a")
> >>
> >> What this does is create a local copy of lapply whose
> >> enclosing environment is the environment in tt.
> >>
> >>
> >> On 10/10/05, joerg van den hoff <j.van_den_hoff at fz-rossendorf.de>
> >> wrote:
> >>
> >>>I need to extract identically named columns from several data frames
> >>>in a list. the column name is a variable (i.e. not known in advance).
> >>>the whole thing occurs within a function body. I'd like to use lapply
> >>>with a variable 'select' argument.
> >>>
> >>>
> >>>example:
> >>>
> >>>tt <- function (n) {
> >>>   x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
> >>>   for (xx in x) print(subset(xx, select = n))   ### works
> >>>   print (lapply(x, subset, select = a))   ### works
> >>>   print (lapply(x, subset, select = "a"))  ### works
> >>>   print (lapply(x, subset, select = n))  ### does not work as intended
> >>>}
> >>>n = "b"
> >>>tt("a")  #works (but selects not the intended column)
> >>>rm(n)
> >>>tt("a")   #no longer works in the lapply call including variable 'n'
> >>>
> >>>
> >>>question: how can I enforce evaluation of the variable n such that
> >>>the lapply call works? I suspect it has something to do with eval and
> >>>specifying the correct evaluation frame, but how? ....
> >>>
> >>>
> >>>many thanks
> >>>
> >>>joerg
> >>>
> >>>______________________________________________
> >>>R-help at stat.math.ethz.ch mailing list
> >>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>PLEASE do read the posting guide!
> >>>http://www.R-project.org/posting-guide.html
> >>>
> >>
> >>
> >
> > many thanks to thomas and gabor for their help. both solutions solve
> > my problem perfectly.
> >
> > but just as an attempt to improve my understanding of the inner
> > workings of R (similar problems are sure to come up ...) two more
> > questions:
> >
> > 1.
> > why does the call of the "[" function (thomas' solution) behave
> > differently from "subset" in that the lookup of the variable "n"
> > works without providing lapply with the current environment (which
> > is nice)?
> >
> > 2.
> > using 'subset' in this context becomes more cumbersome if sapply is
> > used. it seems that then I need
> > ...
> > environment(sapply) <- environment(lapply) <- environment()
> > sapply(x, subset, select = n)
> > ...
> > to get it working (and that means you must know that sapply uses
> > lapply). or can I somehow avoid the additional explicit definition
> > of the lapply-environment?
> >
> >
> > again: many thanks
> >
> > joerg
> >
> >
>
>
>
>



