[R] problem with lapply(x, subset, ...) and variable select argument

Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.be
Tue Oct 11 10:57:39 CEST 2005


As Gabor said, the issue here is that subset.data.frame() evaluates 
the `select' argument in its parent.frame(), which in your example is 
the frame of lapply() rather than that of tt(). Thus, if you wrap the 
call in a local function within lapply() (or sapply()), it works:

tt <- function (n) {
    x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))
    print(lapply(x, function(y, n) subset(y, select = n), n = n))
    print(sapply(x, function(y, n) subset(y, select = n), n = n))
}

tt("a")
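
To see why the wrapper helps, here is a rough sketch of the mechanism. 
Note that pick() is a hypothetical stand-in made up for illustration; it 
only mimics how subset.data.frame() treats `select' and is not the 
actual base R source:

```r
# Hypothetical helper (an assumption, not base R code): the unevaluated
# `select` expression is looked up first among the column names and then
# in the frame of whoever called pick().
pick <- function(df, select) {
    cols <- as.list(seq_along(df))
    names(cols) <- names(df)
    idx <- eval(substitute(select), cols, parent.frame())
    df[, idx, drop = FALSE]
}

direct <- function(n) {
    d <- data.frame(a = 1, b = 2)
    # parent.frame() of pick() is direct()'s frame, where n lives
    pick(d, select = n)
}

via_wrapper <- function(n) {
    x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))
    # parent.frame() of pick() is the anonymous function's frame; its
    # enclosure is via_wrapper()'s frame, so n is still found
    lapply(x, function(y) pick(y, select = n))
}

direct("a")
via_wrapper("b")
```

With lapply(x, pick, select = n) the caller of pick() would be lapply() 
itself, so the lookup of n would bypass the function that defines it.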

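Regarding your first question: "[" does no non-standard evaluation, so 
n is an ordinary argument that is evaluated where it was written, i.e., 
in the frame of the calling function. A minimal sketch (tt2 is just an 
illustrative name, and I am only assuming this resembles Thomas' 
suggestion, which is not quoted here):

```r
tt2 <- function(n) {
    x <- list(data.frame(a = 1, b = 2), data.frame(a = 3, b = 4))
    # n is passed through `...` as a normal promise, so it is evaluated
    # in tt2()'s frame no matter which function ends up calling "["
    lapply(x, "[", n)
}

res <- tt2("a")
print(res)
```

The same call works unchanged with sapply(), since nothing depends on 
the frame from which "[" happens to be called.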

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm



----- Original Message ----- 
From: "joerg van den hoff" <j.van_den_hoff at fz-rossendorf.de>
To: "Gabor Grothendieck" <ggrothendieck at gmail.com>; "Thomas Lumley" <tlumley at u.washington.edu>
Cc: "r-help" <r-help at stat.math.ethz.ch>
Sent: Tuesday, October 11, 2005 10:18 AM
Subject: Re: [R] problem with lapply(x, subset,...) and variable select argument


> Gabor Grothendieck wrote:
>> The problem is that subset looks into its parent frame but in this
>> case the parent frame is not the environment in tt but the environment
>> in lapply since tt does not call subset directly but rather lapply does.
>>
>> Try this which is similar except we have added the line beginning
>> with environment before the print statement.
>>
>> tt <- function (n) {
>>    x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
>>    environment(lapply) <- environment()
>>    print(lapply(x, subset, select = n))
>> }
>>
>> n <- "b"
>> tt("a")
>>
>> What this does is create a new version of lapply whose
>> parent is the environment in tt.
>>
>>
>> On 10/10/05, joerg van den hoff <j.van_den_hoff at fz-rossendorf.de> wrote:
>>
>>>I need to extract identically named columns from several data frames in
>>>a list. the column name is a variable (i.e. not known in advance). the
>>>whole thing occurs within a function body. I'd like to use lapply with a
>>>variable 'select' argument.
>>>
>>>
>>>example:
>>>
>>>tt <- function (n) {
>>>   x <- list(data.frame(a=1,b=2), data.frame(a=3,b=4))
>>>   for (xx in x) print(subset(xx, select = n))   ### works
>>>   print (lapply(x, subset, select = a))   ### works
>>>   print (lapply(x, subset, select = "a"))  ### works
>>>   print (lapply(x, subset, select = n))  ### does not work as intended
>>>}
>>>n = "b"
>>>tt("a")  #works (but selects not the intended column)
>>>rm(n)
>>>tt("a")   #no longer works in the lapply call including variable 'n'
>>>
>>>
>>>question: how can I enforce evaluation of the variable n such that
>>>the lapply call works? I suspect it has something to do with eval and
>>>specifying the correct evaluation frame, but how? ....
>>>
>>>
>>>many thanks
>>>
>>>joerg
>>>
>>>______________________________________________
>>>R-help at stat.math.ethz.ch mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide! 
>>>http://www.R-project.org/posting-guide.html
>>>
>>
>>
>
> many thanks to thomas and gabor for their help. both solutions solve my
> problem perfectly.
>
> but just as an attempt to improve my understanding of the inner workings
> of R (similar problems are sure to come up ...) two more questions:
>
> 1.
> why does the call of the "[" function (thomas' solution) behave
> differently from "subset" in that the look up of the variable "n" works
> without providing lapply with the current environment (which is nice)?
>
> 2.
> using 'subset' in this context becomes more cumbersome, if sapply is
> used. it seems that then I need
> ...
> environment(sapply) <- environment(lapply) <- environment()
> sapply(x, subset, select = n)
> ...
> to get it working (and that means you must know, that sapply uses
> lapply). or can I somehow avoid the additional explicit definition of
> the lapply-environment?
>
>
> again: many thanks
>
> joerg
>
> 

