[R] lapply to change variable names and variable values
Sarah Goslee
sarah.goslee at gmail.com
Mon Mar 12 19:52:53 CET 2012
Hi Simon,
On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss <sjkiss at gmail.com> wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)), df2<-data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)), df3<-data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
I tweaked this a bit: using = instead of <- inside list() names the list
elements without also creating df, df2 and df3 in the workspace, and dropping
the leading . from xx means it shows up with ls(). I don't need invisible
objects in my testing session.
xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004,
2005, 2006, 2007), 400)), df2=data.frame(Var1=rep('Tennessee', 400),
Var2=rep(c(2004,2005,2006,2007), 400)),
df3=data.frame(Var1=rep('Alaska', 400),
Var2=rep(c(2004,2005,2006,2007), 400)) )
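As a rough illustration of the tweak above, here is a minimal sketch of how <- and = behave differently inside list(). The object names (tmp_a, named_a and so on) and the tiny data frames are made up for the example; they are not from Simon's data.

```r
# Hypothetical names and toy data, for illustration only.

# With <-, each data frame is assigned in the workspace as a side effect,
# and the list elements themselves get no names.
l1 <- list(tmp_a <- data.frame(v = 1:2), tmp_b <- data.frame(v = 3:4))
names(l1)                           # NULL
exists("tmp_a", inherits = FALSE)   # TRUE

# With =, the left-hand side is just an element name; nothing else is created.
l2 <- list(named_a = data.frame(v = 1:2), named_b = data.frame(v = 3:4))
names(l2)                           # "named_a" "named_b"
exists("named_a", inherits = FALSE) # FALSE
```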
> I would like to accomplish the following three tasks.
> First, I'd like to go through and change the names of each of the data frames within the list
> to be 'State' and 'Year'
>
> Second, I'd like to go through and add one year to each of the 'Var2' variables.
>
> Third, I'd like to then delete those cases in the data frames that have values of Var2 (or Year) values of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd really like to learn. I've been trying to use lapply, but I can't get my head around how it works:
> .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
> just changes the actual list of data frames to a list of the character string ('State' and 'Year') How do I actually change the underlying variable names?
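Your function doesn't return the right thing: the value of an assignment such
as colnames(x)<-c('State', 'Year') is its right-hand side, so each call returns
the character vector rather than the modified data frame. To see how it works,
it's often a good idea to write a stand-alone function and see what it does.
For instance,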
rename <- function(x) {
    colnames(x) <- c('State', 'Year')
    x
}
To me at least, as soon as it's written as a stand-alone it's obvious that
you have to return x in the last line. You can either use rename() in your
lapply statement:
xx<- lapply(xx, rename)
or you can write the full function into the lapply statement:
> xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)), df2=data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)), df3=data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
> xx <- lapply(xx, function(x){ colnames(x)<-c('State', 'Year'); x} )
> colnames(xx[[1]])
[1] "State" "Year"
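To make the difference concrete, here is a small sketch comparing what the two versions of the function return. The one-row data frame d and the names bad and good are invented for the example.

```r
# Toy one-row data frame, for illustration only.
d <- data.frame(Var1 = "Alabama", Var2 = 2004)

# Last expression is the assignment, whose value is the right-hand side.
bad <- function(x) colnames(x) <- c("State", "Year")

# Last expression is x, so the modified data frame is returned.
good <- function(x) { colnames(x) <- c("State", "Year"); x }

class(bad(d))      # "character"
class(good(d))     # "data.frame"
colnames(good(d))  # "State" "Year"
colnames(d)        # "Var1" "Var2": the function worked on a copy
```

Note that d itself is never changed, which is why the result of lapply() has to be assigned back to xx.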
The same strategy should work for your other needs as well.
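For example, one possible way to handle the other two tasks (add one year, then drop the rows where the year is 2008) might look like the sketch below. It assumes the columns have already been renamed to State and Year; the helper name add_year_drop_2008 and the shortened data frames are made up for the example.

```r
# Shortened stand-in for the list of data frames, already renamed.
xx <- list(
    df  = data.frame(State = "Alabama",   Year = c(2004, 2005, 2006, 2007)),
    df2 = data.frame(State = "Tennessee", Year = c(2004, 2005, 2006, 2007))
)

# Hypothetical helper: modify a copy of the data frame and return it.
add_year_drop_2008 <- function(x) {
    x$Year <- x$Year + 1                     # add one year
    x <- x[x$Year != 2008, , drop = FALSE]   # drop the 2008 rows
    x                                        # return the data frame
}

xx <- lapply(xx, add_year_drop_2008)
xx$df$Year   # 2005 2006 2007
```

As with rename(), the last line of the function returns the whole data frame, so lapply() gives back a list of data frames.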
Sarah
--
Sarah Goslee
http://www.functionaldiversity.org