[R] Write a function that allows access to columns of a passed dataframe.
Rui Barradas
ruipbarradas at sapo.pt
Tue Dec 6 16:33:40 CET 2016
Perhaps the best way is the one used by library(), where both
library(package) and library("package") work. It uses
as.character/substitute, not deparse/substitute, as follows.
mydf <-
data.frame(id=c(1,2,3,4,5),sex=c("M","M","M","F","F"),age=c(20,34,43,32,21))
mydf
class(mydf)
str(mydf)
myfun <- function(frame, var){
  yy <- as.character(substitute(var))
  frame[, yy]
}
myfun(mydf, age)
myfun(mydf, "age")
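
One caveat, shown as a minimal sketch that repeats the definitions above so it runs on its own (the variable name colname is made up for illustration): because substitute() captures the argument as typed, a variable that holds a column name is not looked up. Its own name is used instead, which is the iteration problem Bill describes below.

```r
mydf <- data.frame(id = c(1, 2, 3, 4, 5),
                   sex = c("M", "M", "M", "F", "F"),
                   age = c(20, 34, 43, 32, 21))

myfun <- function(frame, var) {
  yy <- as.character(substitute(var))
  frame[, yy]
}

# Both call forms select the same column.
same <- identical(myfun(mydf, age), myfun(mydf, "age"))
print(same)

# A variable holding the name is NOT evaluated: substitute() sees the
# symbol `colname`, so the function looks for a column called "colname"
# and the data frame subsetting fails.
colname <- "age"
res <- tryCatch(myfun(mydf, colname),
                error = function(e) conditionMessage(e))
print(res)
```

So this form is convenient at the prompt, but a caller who has the column name in a variable would still need a string-based entry point.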
Rui Barradas
Em 06-12-2016 15:03, William Dunlap escreveu:
> I basically agree with Rui - using substitute will cause trouble. E.g., how
> would the user iterate over the columns, calling your function for each?
> for(column in dataFrame) func(column)
> would fail because dataFrame$column does not exist. You need to provide
> an extra argument to handle this case. something like the following:
> func <- function(df,
>                  columnAsName,
>                  columnAsString = deparse(substitute(columnAsName))[1]) {
> ...
> }
> The default value of columnAsString should also deal with the case that
> the user supplied something like log(Conc.) instead of Conc.
>
> I think that using a formula for the lazily evaluated argument
> (columnAsName)
> works well. The user then knows exactly how it gets evaluated.
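
A possible completion of the skeleton above, as a sketch only. The argument names follow Bill's message, but the function body, the small data frame, and the formula variant func2 are assumptions made for illustration, not code from the thread.

```r
# Hypothetical body: simply return the selected column.
func <- function(df,
                 columnAsName,
                 columnAsString = deparse(substitute(columnAsName))[1]) {
  df[[columnAsString]]
}

mydf <- data.frame(id = 1:5, age = c(20, 34, 43, 32, 21))

# Interactive use: an unquoted name, converted by the default argument.
print(func(mydf, age))

# Programmatic use: iterate over the column names via the string argument.
means <- sapply(names(mydf),
                function(nm) mean(func(mydf, columnAsString = nm)))
print(means)

# One guess at what "using a formula" could look like: the user writes
# ~ age and all.vars() extracts the variable name from the formula.
func2 <- function(df, column) {
  df[[all.vars(column)[1]]]
}
print(func2(mydf, ~ age))
```

The extra string argument keeps the unquoted form available for interactive use while giving loops and apply-style calls a way in that does not depend on substitute().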
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com <http://tibco.com>
>
> On Tue, Dec 6, 2016 at 6:28 AM, John Sorkin <jsorkin at grecc.umaryland.edu
> <mailto:jsorkin at grecc.umaryland.edu>> wrote:
>
> Over my almost 50 years programming, I have come to believe that if
> one wants a program to be useful, one should write the program to do
> as much work as possible and demand as little as possible from the
> user of the program. In my opinion, one should not ask the person
> who uses my function to remember to put the name of the data frame
> column in quotation marks. The function should be written so that
> all that needs to be passed is the name of the column; the function
> should take care of the quotation marks.
> Jihny
>
> > John David Sorkin M.D., Ph.D.
> > Professor of Medicine
> > Chief, Biostatistics and Informatics
> > University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
> > Baltimore VA Medical Center
> > 10 North Greene Street
> > GRECC (BT/18/GR)
> > Baltimore, MD 21201-1524
> > (Phone)410-605-7119 <tel:410-605-7119>
> > (Fax)410-605-7913 <tel:410-605-7913> (Please call phone number above
> prior to faxing)
>
>
> > On Dec 6, 2016, at 3:17 AM, Rui Barradas <ruipbarradas at sapo.pt
> <mailto:ruipbarradas at sapo.pt>> wrote:
> >
> > Hello,
> >
> > Just to say that I wouldn't write the function as John did. I
> would get
> > rid of all the deparse/substitute stuff and instinctively use a
> quoted
> > argument as a column name. Something like the following.
> >
> > myfun <- function(frame, var){
> > [...]
> > col <- frame[, var] # or frame[[var]]
> > [...]
> > }
> >
> > myfun(mydf, "age") # much better, simpler, no promises.
> >
> > Rui Barradas
> >
> > Em 05-12-2016 21:49, Bert Gunter escreveu:
> >> Typo: "lazy evaluation" not "lay evaluation."
> >>
> >> -- Bert
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming
> along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >>> On Mon, Dec 5, 2016 at 1:46 PM, Bert Gunter
> <bgunter.4567 at gmail.com <mailto:bgunter.4567 at gmail.com>> wrote:
> >>> Sorry, hit "Send" by mistake.
> >>>
> >>> Inline.
> >>>
> >>>
> >>>
> >>>> On Mon, Dec 5, 2016 at 1:34 PM, Bert Gunter
> <bgunter.4567 at gmail.com <mailto:bgunter.4567 at gmail.com>> wrote:
> >>>> Inline.
> >>>>
> >>>> -- Bert
> >>>>
> >>>>
> >>>> Bert Gunter
> >>>>
> >>>> "The trouble with having an open mind is that people keep
> coming along
> >>>> and sticking things into it."
> >>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>>
> >>>>
> >>>>> On Mon, Dec 5, 2016 at 9:53 AM, Rui Barradas
> <ruipbarradas at sapo.pt <mailto:ruipbarradas at sapo.pt>> wrote:
> >>>>> Hello,
> >>>>>
> >>>>> Inline.
> >>>>>
> >>>>> Em 05-12-2016 17:09, David Winsemius escreveu:
> >>>>>>
> >>>>>>
> >>>>>>> On Dec 5, 2016, at 7:29 AM, John Sorkin
> <jsorkin at grecc.umaryland.edu <mailto:jsorkin at grecc.umaryland.edu>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> Rui,
> >>>>>>> I appreciate your suggestion, but eliminating the deparse
> statement does
> >>>>>>> not solve my problem. Do you have any other suggestions?
> See code below.
> >>>>>>> Thank you,
> >>>>>>> John
> >>>>>>>
> >>>>>>>
> >>>>>>> mydf <-
> >>>>>>>
> data.frame(id=c(1,2,3,4,5),sex=c("M","M","M","F","F"),age=c(20,34,43,32,21))
> >>>>>>> mydf
> >>>>>>> class(mydf)
> >>>>>>>
> >>>>>>>
> >>>>>>> myfun <- function(frame,var){
> >>>>>>> call <- match.call()
> >>>>>>> print(call)
> >>>>>>>
> >>>>>>>
> >>>>>>> indx <- match(c("frame","var"),names(call),nomatch=0)
> >>>>>>> print(indx)
> >>>>>>> if(indx[1]==0) stop("Function called without sufficient
> arguments!")
> >>>>>>>
> >>>>>>>
> >>>>>>> cat("I can get the name of the dataframe as a text
> string!\n")
> >>>>>>> #xx <- deparse(substitute(frame))
> >>>>>>> print(xx)
> >>>>>>>
> >>>>>>>
> >>>>>>> cat("I can get the name of the column as a text string!\n")
> >>>>>>> #yy <- deparse(substitute(var))
> >>>>>>> print(yy)
> >>>>>>>
> >>>>>>>
> >>>>>>> # This does not work.
> >>>>>>> print(frame[,var])
> >>>>>>>
> >>>>>>>
> >>>>>>> # This does not work.
> >>>>>>> print(frame[,"var"])
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> # This does not work.
> >>>>>>> col <- xx[,"yy"]
> >>>>>>>
> >>>>>>>
> >>>>>>> # Nor does this work.
> >>>>>>> col <- xx[,yy]
> >>>>>>> print(col)
> >>>>>>> }
> >>>>>>>
> >>>>>>>
> >>>>>>> myfun(mydf,age)
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> When you use that calling syntax, the system will supply the
> values of
> >>>>>> whatever the `age` variable contains. (And if there is no
> `age`-named
> >>>>>> object, you get an error at the time of the call to `myfun`.)
> >>>>>
> >>>>>
> >>>>> Actually, no, which was very surprising to me but John's code
> worked (not
> >>>>> the function, the call). And with the change I've proposed,
> it worked
> >>>>> flawlessly. No errors. Why I don't know.
> >>>
> >>> See ?substitute and in particular the example highlighted there.
> >>>
> >>> The technical details are explained in the R Language Definition
> >>> manual. The key here is the use of promises for lay evaluations. In
> >>> fact, the expression in the call *is* available within the
> functions,
> >>> as is (a pointer to) the environment in which to evaluate the
> >>> expression. That is how substitute() works. Specifically,
> quoting from
> >>> the manual,
> >>>
> >>> *****
> >>> It is possible to access the actual (not default) expressions
> used as
> >>> arguments inside the function. The mechanism is implemented via
> >>> promises. When a function is being evaluated the actual expression
> >>> used as an argument is stored in the promise together with a
> pointer
> >>> to the environment the function was called from. When (if) the
> >>> argument is evaluated the stored expression is evaluated in the
> >>> environment that the function was called from. Since only a
> pointer to
> >>> the environment is used any changes made to that environment
> will be
> >>> in effect during this evaluation. The resulting value is then also
> >>> stored in a separate spot in the promise. Subsequent evaluations
> >>> retrieve this stored value (a second evaluation is not carried
> out).
> >>> Access to the unevaluated expression is also available using
> >>> substitute.
> >>> ********
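
The behaviour described in the quoted passage can be seen in a small sketch. The function names show_expr, never_touches and forces, and the object name no_such_object, are made up for illustration. It assumes no object called no_such_object exists in the session.

```r
# Returns the unevaluated expression stored in the promise for `arg`.
show_expr <- function(arg) {
  substitute(arg)
}

# Never uses `arg`, so its promise is never forced.
never_touches <- function(arg) {
  "arg was never evaluated"
}

# Neither call fails, although `no_such_object` does not exist:
# the argument expression is stored but not evaluated.
e <- show_expr(no_such_object + 1)
print(deparse(e))
print(never_touches(no_such_object))

# Forcing the promise evaluates the expression in the caller's
# environment, and only then is the missing object reported.
forces <- function(arg) arg
msg <- tryCatch(forces(no_such_object),
                error = function(e) conditionMessage(e))
print(msg)
```

This is consistent with what Rui observed: the call myfun(mydf, age) does not fail at call time, and an error can only arise once the function body actually evaluates var.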
> >>>
> >>> -- Bert
> >>>
> >>>
> >>>
> >>>
> >>>>>
> >>>>> Rui Barradas
> >>>>>
> >>>>> You need either to call it as:
> >>>>>>
> >>>>>>
> >>>>>> myfun( mydf , "age")
> >>>>>>
> >>>>>>
> >>>>>> # Or:
> >>>>>>
> >>>>>> age <- "age"
> >>>>>> myfun( mydf, age)
> >>>>>>
> >>>>>> Unless your value of the `age`-named variable was "age" in
> the calling
> >>>>>> environment (and you did not give us that value in either of
> your postings),
> >>>>>> you would fail.
> >>>>>>
> >>>>>
> >>>>> ______________________________________________
> >>>>> R-help at r-project.org <mailto:R-help at r-project.org> mailing
> list -- To UNSUBSCRIBE and more, see
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> <https://stat.ethz.ch/mailman/listinfo/r-help>
> >>>>> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> <http://www.R-project.org/posting-guide.html>
> >>>>> and provide commented, minimal, self-contained, reproducible
> code.