[R] Recoding Multiple Variables in a Data Frame in One Step

Peter Ehlers ehlers at ucalgary.ca
Tue Jul 26 07:06:11 CEST 2011


On 2011-07-25 15:48, William Dunlap wrote:
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
>> Sent: Monday, July 25, 2011 3:39 PM
>> To: Anthony Damico
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Recoding Multiple Variables in a Data Frame in One Step
>>
>>
>> On Jul 21, 2011, at 8:06 PM, Anthony Damico wrote:
>>
>>> Hi, I can't for the life of me find how to do this in base R, but I'd be
>>> surprised if it's not possible.
>>>
>>> I'm just trying to replace multiple columns at once in a data frame.
>>>
>>> #load example data (the api data sets ship with the survey package)
>>> library(survey)
>>> data(api)
>>>
>>> #this displays the three columns and eight rows i'd like to replace
>>> apiclus1[ apiclus1$meals > 98 , c( "pcttest" , "api00" , "sch.wide" ) ]
>>>
>>>
>>> #the goal is to replace pcttest with 100, api00 with NA, and
>>> #sch.wide with "Maybe"
>>>
>>> #this doesn't work--
>>> apiclus1[ apiclus1$meals > 98 , c( "pcttest" , "api00" , "sch.wide" ) ] <-
>>>     c( 100 , NA , "Maybe" )
>
> Try list(pcttest=100, api00=NA, sch.wide="Maybe") instead
> of c(100, NA, "Maybe") as the new value.
>
> Here is a self-contained example:
>   > df <- data.frame(Size=sin(1:10), Name=state.name[11:20], Value=11:20, stringsAsFactors=FALSE)
>   > df[df$Size<0, c("Name", "Value")] <- list(Name="JUNK", Value=-99)
>   > df
>            Size      Name Value
>   1   0.8414710    Hawaii    11
>   2   0.9092974     Idaho    12
>   3   0.1411200  Illinois    13
>   4  -0.7568025      JUNK   -99
>   5  -0.9589243      JUNK   -99
>   6  -0.2794155      JUNK   -99
>   7   0.6569866  Kentucky    17
>   8   0.9893582 Louisiana    18
>   9   0.4121185     Maine    19
>   10 -0.5440211      JUNK   -99
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com

Here's another solution, using within():

   within(df, {Name[Size<0] <- 'JUNK'
              Value[Size<0] <- -99})
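
Note that within() returns a modified copy and leaves df itself untouched,
so the result has to be assigned if you want to keep it. A minimal sketch,
rebuilding William's df from above (df2 is just a name picked for
illustration):

```r
# Rebuild William's example data frame
df <- data.frame(Size = sin(1:10), Name = state.name[11:20],
                 Value = 11:20, stringsAsFactors = FALSE)

# within() evaluates the block with the columns of df visible and
# returns a modified copy; df itself is not changed
df2 <- within(df, {
  Name[Size < 0]  <- "JUNK"
  Value[Size < 0] <- -99
})

# Show only the rows that were recoded
df2[df2$Size < 0, c("Name", "Value")]
```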

But I like William's simple list() solution.
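
For the record, my understanding of why the original c() attempt
misbehaves: c(100, NA, "Maybe") is coerced to a single character vector,
and a plain vector on the right-hand side is recycled down the selected
columns rather than across them. A list keeps one value (with its own
type) per column. A small sketch with a made-up data frame d standing in
for apiclus1 (d, cols, hit, bad and good are illustrative names only):

```r
# Made-up stand-in for the relevant apiclus1 columns
d <- data.frame(meals    = c(99, 50, 100, 99.5),
                pcttest  = c(90, 95, 80, 85),
                api00    = c(600, 700, 650, 620),
                sch.wide = c("Yes", "No", "Yes", "No"),
                stringsAsFactors = FALSE)
cols <- c("pcttest", "api00", "sch.wide")
hit  <- d$meals > 98

# c() yields a character vector; it is filled in column by column,
# so the numeric columns end up as character
bad <- d
bad[hit, cols] <- c(100, NA, "Maybe")
str(bad)

# list() supplies one replacement value per column and keeps the types
good <- d
good[hit, cols] <- list(pcttest = 100, api00 = NA, sch.wide = "Maybe")
str(good)
```

Comparing the two str() outputs should make the difference obvious.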

Peter Ehlers

>
>>>
>>> #the results replace downward instead of across
>>> apiclus1[ apiclus1$meals > 98 , c( "pcttest" , "api00" , "sch.wide" ) ]
>>
>> If I had noted that I would have tried this:
>>
>> apiclus1[ apiclus1$meals > 98 ,
>>           rep( c( "pcttest" , "api00" , "sch.wide" ) ,
>>                each = sum( apiclus1$meals > 98 ) ) ]
>>
>> Should be pretty easy to test, but since _you_ are the one responsible
>> for providing examples for testing when posting to rhelp,  I am going
>> to throw an untested theory back at you.
>>
>>
>>>
>>> I know I can do this with a few more steps (like one variable at a time,
>>> or by counting the number of rows to replace and then using rep()), but
>>> I'm hoping there's a quicker way?
>>>
>>>
>>> Thanks!!
>>>
>>> Anthony Damico
>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.