[R] lapply to change variable names and variable values

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Mar 12 19:49:19 CET 2012


Hi,

On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss <sjkiss at gmail.com> wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)), df2<-data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)), df3<-data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
>
> I would like to accomplish the following three tasks.
> First, I'd like to go through and change the column names of each of the data frames within the list
> to be 'State' and 'Year'.
>
> Second, I'd like to go through and add one year to each of the 'Var2'  variables.
>
> Third, I'd like to then delete those cases in the data frames where Var2 (or Year) has a value of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd really like to learn. I've been trying to use lapply, but I can't get my head around how it works:
>  .xx <- lapply(.xx, function(x) colnames(x) <- c('State', 'Year'))
> just changes the actual list of data frames to a list of the character strings ('State' and 'Year'). How do I actually change the underlying variable names?

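Almost there. The value of an assignment such as colnames(x) <- ... is the
assigned value, so your function returns the character vector of names.
You have to return the data.frame you've just changed, e.g.: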

xx <- lapply(.xx, function(x) {
  colnames(x) <- c('state', 'year')
  x
})
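
To see the difference side by side, here is a minimal sketch on a toy data frame. The names toy, rename_wrong and rename_right are made up for illustration and are not from the original post:

```r
# Toy data frame standing in for one element of .xx (hypothetical name)
toy <- data.frame(Var1 = c("Alabama", "Alabama"), Var2 = c(2004, 2005))

# The last expression is the assignment, so the function's value is the
# right-hand side: the character vector of new names.
rename_wrong <- function(x) {
  colnames(x) <- c("State", "Year")
}

# Ending with x returns the renamed data frame instead.
rename_right <- function(x) {
  colnames(x) <- c("State", "Year")
  x
}

wrong <- rename_wrong(toy)
right <- rename_right(toy)

class(wrong)  # character vector, not a data frame
class(right)  # data frame with the new column names
```

Note that toy itself is unchanged in both cases, because the function only modifies its local copy of x.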

If you want to remove the rows that correspond to 2008, you can do this:

xx <- lapply(.xx, function(x) {
  colnames(x) <- c('state', 'year')
  subset(x, year != 2008)
})
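
The question also asked to add one year to each value before dropping 2008. Putting the three steps into one function might look like the sketch below. It assumes the column names 'State' and 'Year' from the question and uses a smaller stand-in list; states, fix_one and cleaned are illustrative names only:

```r
# Small stand-in for the list of data frames in the question
states <- list(
  data.frame(Var1 = rep("Alabama", 4),   Var2 = c(2004, 2005, 2006, 2007)),
  data.frame(Var1 = rep("Tennessee", 4), Var2 = c(2004, 2005, 2006, 2007))
)

fix_one <- function(x) {
  colnames(x) <- c("State", "Year")   # 1. rename the columns
  x$Year <- x$Year + 1                # 2. add one year to every value
  x[x$Year != 2008, , drop = FALSE]   # 3. drop the 2008 rows; this is the return value
}

cleaned <- lapply(states, fix_one)
sapply(cleaned, nrow)   # rows left in each data frame (2005, 2006, 2007 remain)
```

Step 3 could just as well use subset(x, Year != 2008); plain indexing is shown only to avoid non-standard evaluation inside a function.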

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


