[R] Write a function that allows access to columns of a passed dataframe.

Rui Barradas ruipbarradas at sapo.pt
Tue Dec 6 15:44:16 CET 2016


Ok, that's a way of seeing it.

Rui Barradas

On 06-12-2016 14:28, John Sorkin wrote:
> Over my almost 50 years programming, I have come to believe that if one
> wants a program to be useful, one should write the program to do as much
> work as possible and demand as little as possible from the user of the
> program. In my opinion, one should not ask the person who uses my
> function to remember to put the name of the data frame column in
> quotation marks. The function should be written so that all that needs
> to be passed is the name of the column; the function should take care of
> the quotation marks.
> John
>
>> John David Sorkin M.D., Ph.D.
>> Professor of Medicine
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology and
>> Geriatric Medicine
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above
>> prior to faxing)
>
> On Dec 6, 2016, at 3:17 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>
>> Hello,
>>
>> Just to say that I wouldn't write the function as John did. I would get
>> rid of all the deparse/substitute stuff and instinctively use a quoted
>> argument as a column name. Something like the following.
>>
>> myfun <- function(frame, var){
>>    [...]
>>    col <- frame[, var]  # or frame[[var]]
>>    [...]
>> }
>>
>> myfun(mydf, "age")  # much better, simpler, no promises.
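A minimal runnable sketch of the quoted-argument approach suggested above. The helper name `get_col` and the name check are assumptions added for illustration; they are not part of the original post.

```r
# Sample data, same shape as the data frame posted in the thread.
mydf <- data.frame(id  = 1:5,
                   sex = c("M", "M", "M", "F", "F"),
                   age = c(20, 34, 43, 32, 21))

# The column name is passed as a character string, so no
# substitute()/deparse() is needed. get_col is a hypothetical name.
get_col <- function(frame, var) {
  stopifnot(is.data.frame(frame), is.character(var), length(var) == 1L)
  if (!var %in% names(frame)) {
    stop("no column named '", var, "' in the data frame")
  }
  frame[[var]]   # extract the column as a plain vector
}

ages <- get_col(mydf, "age")
print(ages)
```

Using `frame[[var]]` rather than `frame[, var]` always returns the column itself, which may be the safer choice if the function could also be handed a tibble or a one-column result.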
>>
>> Rui Barradas
>>
>> On 05-12-2016 21:49, Bert Gunter wrote:
>>> Typo: "lazy evaluation" not "lay evaluation."
>>>
>>> -- Bert
>>>
>>>
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>>
>>> On Mon, Dec 5, 2016 at 1:46 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>> Sorry, hit "Send" by mistake.
>>>>
>>>> Inline.
>>>>
>>>>
>>>>
>>>> On Mon, Dec 5, 2016 at 1:34 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>>> Inline.
>>>>>
>>>>> -- Bert
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Dec 5, 2016 at 9:53 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Inline.
>>>>>>
>>>>>> On 05-12-2016 17:09, David Winsemius wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Dec 5, 2016, at 7:29 AM, John Sorkin
>>>>>>>> <jsorkin at grecc.umaryland.edu> wrote:
>>>>>>>>
>>>>>>>> Rui,
>>>>>>>> I appreciate your suggestion, but eliminating the deparse statement
>>>>>>>> does not solve my problem. Do you have any other suggestions? See
>>>>>>>> code below.
>>>>>>>> Thank you,
>>>>>>>> John
>>>>>>>>
>>>>>>>>
>>>>>>>> mydf <-
>>>>>>>> data.frame(id=c(1,2,3,4,5),sex=c("M","M","M","F","F"),age=c(20,34,43,32,21))
>>>>>>>> mydf
>>>>>>>> class(mydf)
>>>>>>>>
>>>>>>>>
>>>>>>>> myfun <- function(frame,var){
>>>>>>>>   call <- match.call()
>>>>>>>>   print(call)
>>>>>>>>
>>>>>>>>
>>>>>>>>   indx <- match(c("frame","var"),names(call),nomatch=0)
>>>>>>>>   print(indx)
>>>>>>>>   if(indx[1]==0) stop("Function called without sufficient arguments!")
>>>>>>>>
>>>>>>>>
>>>>>>>>   cat("I can get the name of the dataframe as a text string!\n")
>>>>>>>>   #xx <- deparse(substitute(frame))
>>>>>>>>   print(xx)
>>>>>>>>
>>>>>>>>
>>>>>>>>   cat("I can get the name of the column as a text string!\n")
>>>>>>>>   #yy <- deparse(substitute(var))
>>>>>>>>   print(yy)
>>>>>>>>
>>>>>>>>
>>>>>>>>   # This does not work.
>>>>>>>>   print(frame[,var])
>>>>>>>>
>>>>>>>>
>>>>>>>>   # This does not work.
>>>>>>>>   print(frame[,"var"])
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   # This does not work.
>>>>>>>>   col <- xx[,"yy"]
>>>>>>>>
>>>>>>>>
>>>>>>>>   # Nor does this work.
>>>>>>>>   col <- xx[,yy]
>>>>>>>>   print(col)
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> myfun(mydf,age)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> When you use that calling syntax, the system will supply the values of
>>>>>>> whatever the `age` variable contains. (And if there is no `age`-named
>>>>>>> object, you get an error at the time of the call to `myfun`.)
>>>>>>
>>>>>>
>>>>>> Actually, no, which was very surprising to me, but John's code worked
>>>>>> (not the function, the call). And with the change I've proposed, it
>>>>>> worked flawlessly. No errors. Why, I don't know.
>>>>
>>>> See ?substitute and in particular the example highlighted there.
>>>>
>>>> The technical details are explained in the R Language Definition
>>>> manual. The key here is the use of promises for lay evaluations. In
>>>> fact, the expression in the call *is* available within the functions,
>>>> as is (a pointer to) the environment in which to evaluate the
>>>> expression. That is how substitute() works. Specifically, quoting from
>>>> the manual,
>>>>
>>>> *****
>>>> It is possible to access the actual (not default) expressions used as
>>>> arguments inside the function. The mechanism is implemented via
>>>> promises. When a function is being evaluated the actual expression
>>>> used as an argument is stored in the promise together with a pointer
>>>> to the environment the function was called from. When (if) the
>>>> argument is evaluated the stored expression is evaluated in the
>>>> environment that the function was called from. Since only a pointer to
>>>> the environment is used any changes made to that environment will be
>>>> in effect during this evaluation. The resulting value is then also
>>>> stored in a separate spot in the promise. Subsequent evaluations
>>>> retrieve this stored value (a second evaluation is not carried out).
>>>> Access to the unevaluated expression is also available using
>>>> substitute.
>>>> ********
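The passage quoted from the manual can be illustrated with a short sketch. The function names `col_by_name` and `lazy` are hypothetical, chosen only for this example; the sketch assumes the column is given as a bare (unquoted) name.

```r
mydf <- data.frame(id  = 1:5,
                   sex = c("M", "M", "M", "F", "F"),
                   age = c(20, 34, 43, 32, 21))

# `var` is never evaluated here. substitute() only reads the expression
# stored in the promise, and deparse() turns that expression into a string.
col_by_name <- function(frame, var) {
  name <- deparse(substitute(var))   # "age" for the call col_by_name(mydf, age)
  frame[[name]]
}

# No object called `age` has to exist in the calling environment.
res <- col_by_name(mydf, age)
print(res)

# Lazy evaluation on its own: an argument that is never used is never
# evaluated, so passing an undefined name raises no error.
lazy <- function(x) "x was never evaluated"
msg <- lazy(no_such_object)
print(msg)
```

One caveat with this design: if the caller passes a quoted name, `deparse(substitute(var))` keeps the quotation marks in the string, so the lookup returns `NULL`. A function written this way should document that it expects a bare name.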
>>>>
>>>> -- Bert
>>>>
>>>>
>>>>
>>>>
>>>>>>
>>>>>> Rui Barradas
>>>>>>
>>>>>>> You need either to call it as:
>>>>>>>
>>>>>>>
>>>>>>> myfun( mydf , "age")
>>>>>>>
>>>>>>>
>>>>>>> # Or:
>>>>>>>
>>>>>>> age <- "age"
>>>>>>> myfun( mydf, age)
>>>>>>>
>>>>>>> Unless your value of the `age`-named variable was "age" in the calling
>>>>>>> environment (and you did not give us that value in either of your
>>>>>>> postings), you would fail.
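The two calling forms above, and the failing case, can be tried with a stripped-down `myfun`. This is a sketch only: the one-line function body is an assumption, not the code from the original post.

```r
mydf <- data.frame(id  = 1:5,
                   sex = c("M", "M", "M", "F", "F"),
                   age = c(20, 34, 43, 32, 21))

# Stripped-down stand-in for myfun: `var` is used directly as a column index.
myfun <- function(frame, var) frame[, var]

# Form 1: the column name as a string literal.
r1 <- myfun(mydf, "age")

# Form 2: a variable that holds the column name.
age <- "age"
r2 <- myfun(mydf, age)

# Failing case: with no usable `age` object visible, the error is raised
# when `var` is evaluated inside `[`, and tryCatch() captures its message.
rm(age)
r3 <- tryCatch(myfun(mydf, age), error = function(e) conditionMessage(e))

print(r1)
print(r2)
print(r3)
```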
>>>>>>>
>>>>>>


