[R] Accessing specific data.frame columns within function
Greg Snow
538280 at gmail.com
Fri Feb 5 19:42:07 CET 2016
You are trying to use shortcuts where shortcuts are not appropriate,
and as a result you are taking a much longer route than if you had
not used the shortcut; see fortune(312).
You should really reread the help page: help("[[") and section 6.1 of
An Introduction to R.
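To make the distinction concrete, here is a small sketch (the data
frame `df` and the variable `col` are invented for illustration) of why
`[[` works with a column name stored in a variable while `$` does not:

```r
df <- data.frame(k = c(1, 1, 1), m = c(4, 2, 4))
col <- "m"            # column name held in a character variable

by_name <- df[[col]]  # evaluates col, then extracts the column named "m"
literal <- df$col     # looks for a column literally named "col"

by_name               # 4 2 4
literal               # NULL, since df has no column called "col"
```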
Basically you should be able to do something like:
f <- function(data, oldnames) {
  data <- data[ data[[ oldnames[2] ]] == 4, ]
  data[['d']] <- data[[ oldnames[1] ]]^2 + data[[ oldnames[2] ]]
  data
}
Or maybe a little more readable (but not as good a golf score):
f <- function(data, oldnames) {
  aa <- oldnames[1]
  cc <- oldnames[2]
  data <- data[ data[[ cc ]] == 4, ]
  data[['d']] <- data[[ aa ]]^2 + data[[ cc ]]
  data
}
I could have used a and c instead of aa and cc, but the doubled
letters mean less confusion with the `c` function in R.
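As a quick check, here is a sketch that applies this second version to
the example data from the original question (both are repeated so the
block runs on its own; `res` is just a name chosen for the result):

```r
f <- function(data, oldnames) {
  aa <- oldnames[1]
  cc <- oldnames[2]
  data <- data[ data[[ cc ]] == 4, ]            # keep rows where the "c" column is 4
  data[['d']] <- data[[ aa ]]^2 + data[[ cc ]]  # add the derived column d
  data
}

y <- data.frame(k = c(1, 1, 1), l = c(2, 2, 5), m = c(4, 2, 4))
res <- f(y, c("k", "m"))
res
#   k l m d
# 1 1 2 4 5
# 3 1 5 4 5
```

The original column names are never touched, so there is nothing to
rename back afterwards.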
Also you should read (and heed) the Warning section on the help page
for subset (?subset).
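One way the non-standard evaluation in subset can bite inside a
function is sketched below. The helper names `keep_nse` and `keep_std`
are made up for this illustration; the point is only that a function
argument which happens to share a name with a column is silently
shadowed by that column.

```r
y <- data.frame(k = c(1, 1, 1), l = c(2, 2, 5), m = c(4, 2, 4))

# Intended: keep rows where column m equals the value passed as `l`.
# Inside subset(), `l` is looked up in the data frame first, so this
# actually compares column m with column l.
keep_nse <- function(data, l) subset(data, m == l)

# Standard evaluation: `l` can only mean the function argument.
keep_std <- function(data, l) data[ data[["m"]] == l, ]

nse <- keep_nse(y, 4)   # one row (the row where m == l, i.e. 2 == 2)
std <- keep_std(y, 4)   # two rows (the rows where m == 4)
```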
On Thu, Feb 4, 2016 at 9:13 PM, Clark Kogan <kogan.clark at gmail.com> wrote:
> Hello,
>
> I am trying to write a function that adds a few columns to a data.frame. The
> function uses the columns in a specific way. For instance, it might take a^2
> + c to produce a column d. Or it might do more complex manipulations that I
> don't think I need to discuss here. I want to keep x as a data.frame when I
> pass it into the function, as I want to use some data.frame functionality on
> x.
>
> Furthermore, I don't want the names in x to have to be specific. I want to
> be able to specify which columns the function should treat as "a" and "c".
>
> The way I am currently doing it, is that I pass the names of the columns
> that I want to treat as a and c.
>
> f <- function(data,oldnames) {
>   newnames <- c("a","c")
>   ix <- match(oldnames,names(data))
>   names(data)[ix] <- newnames
>   data <- subset(data,c==4)
>   data$d <- data$a^2 + data$c
>   ix <- match(newnames,names(data))
>   names(data)[ix] <- oldnames
>   data
> }
>
> y <- data.frame(k=c(1,1,1),l=c(2,2,5),m=c(4,2,4))
> f(y,c("k","m"))
>
> The way that I am doing it does not seem all that elegant, or like standard
> practice. My question is: are there potential problems programming with
> data.frames in this way, and are there standard-practice methods of
> referencing data.frame names that deal with these problems?
>
> Thanks!
>
> Clark
--
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com