[R] Add column to dataframe based on code in other column
Bert Gunter
gunter.berton at gene.com
Thu Aug 8 17:06:36 CEST 2013
Dark:
1. In future, please use dput() to post your data so that we can
read them more easily from your email.
2. As Berend demonstrates, using a more appropriate data structure is
what's required. Here is a slightly shorter, but perhaps trickier,
alternative to his solution:
> df ## Your example data frame
   Name State_Code
1   Tom         20
2 Harry         56
3   Ben          5
4 Sally          4
> l <- list(MidWest = MidWest, South = South, NorthEast = NorthEast,
+           Other = Other, West = West)
> df <- within(df, regions <- rep(names(l), sapply(l, length))[match(State_Code, unlist(l))])
> df
   Name State_Code   regions
1   Tom         20 NorthEast
2 Harry         56     Other
3   Ben          5      West
4 Sally          4     South
3. Need I say that there may be other alternatives that might be better?
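
For anyone who finds the one-liner in point 2 hard to follow, here is a sketch that unpacks it into separate steps. The region vectors are copied from the original post and the data frame is retyped by hand; the intermediate names codes, labels and idx are mine, introduced only for illustration.

```r
# Region definitions copied from the original post
NorthEast <- c(7, 20, 22, 30, 31, 33, 39, 41, 47)
MidWest   <- c(14, 15, 16, 17, 23, 24, 26, 28, 35, 36, 43, 52)
South     <- c(1, 4, 8, 9, 10, 11, 18, 19, 21, 25, 34, 37, 42, 44, 45, 49, 51)
West      <- c(2, 3, 5, 6, 12, 13, 27, 29, 32, 38, 46, 50, 53)
Other     <- c(40, 48, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
               94, 98, 99)

# Example data frame, retyped from the post
df <- data.frame(Name = c("Tom", "Harry", "Ben", "Sally"),
                 State_Code = c(20, 56, 5, 4),
                 stringsAsFactors = FALSE)

l <- list(MidWest = MidWest, South = South, NorthEast = NorthEast,
          Other = Other, West = West)

# All state codes in one vector, in list order
codes <- unlist(l, use.names = FALSE)

# One region name per code, in the same order as 'codes'
# (illustrative name; the one-liner builds this inline)
labels <- rep(names(l), sapply(l, length))

# Position of each State_Code within 'codes' (NA if the code is not listed)
idx <- match(df$State_Code, codes)

# Indexing 'labels' by those positions gives the region for each row
df$regions <- labels[idx]
df
```

A state code that appears in none of the vectors gets NA in regions, because match() returns NA for it. Whether NA or an explicit label such as Berend's "UNKNOWN" is preferable depends on what you want to do with such rows.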
Cheers,
Bert
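
To illustrate point 1, here is a minimal sketch of how dput() might be used, assuming a data frame like the one in your example (values retyped by hand; the names txt and df2 are just for illustration).

```r
# Example data frame, retyped from the post
df <- data.frame(Name = c("Tom", "Harry", "Ben", "Sally"),
                 State_Code = c(20, 56, 5, 4),
                 stringsAsFactors = FALSE)

# Prints an R expression that recreates df; paste that text into the email
dput(df)

# A reader can assign the pasted text to a name to get the same object back.
# Here the round trip is simulated by capturing and re-evaluating the output.
txt <- capture.output(dput(df))
df2 <- eval(parse(text = txt))

# Check that the reconstructed object matches the original
isTRUE(all.equal(df, df2))
```

The advantage over pasting printed output is that column types (numeric versus character versus factor) survive the trip, so nobody has to guess them.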
On Thu, Aug 8, 2013 at 7:14 AM, Berend Hasselman <bhh at xs4all.nl> wrote:
>
> On 08-08-2013, at 11:33, Dark <info at software-solutions.nl> wrote:
>
>> Hi all,
>>
>> I have a dataframe of users which contain US-state codes.
>> Now I want to add a column named REGION based on the state code. I have
>> already done a mapping:
>>
>> NorthEast <- c(07, 20, 22, 30, 31, 33, 39, 41, 47)
>> MidWest <- c(14, 15, 16, 17, 23, 24, 26, 28, 35, 36, 43, 52)
>> South <- c(01, 04, 08, 09, 10, 11, 18, 19, 21, 25, 34, 37, 42, 44, 45, 49,
>> 51)
>> West <- c(02, 03, 05, 06, 12, 13, 27, 29, 32, 38, 46, 50, 53)
>> Other <- c(40, 48, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 94,
>> 98, 99)
>>
>> So for example:
>> Name State_Code
>> Tom 20
>> Harry 56
>> Ben 05
>> Sally 04
>>
>> Should become:
>> Name State_Code REGION
>> Tom 20 NorthEast
>> Harry 56 Other
>> Ben 05 West
>> Sally 04 South
>>
>
> dd <- read.table(text="Name State_Code
> Tom 20
> Harry 56
> Ben 05
> Sally 04", header=TRUE, stringsAsFactors=FALSE)
>
> # Create table for regions indexed by state_code
>
> region.table <- rep("UNKNOWN",99)
> region.table[NorthEast] <- "NorthEast"
> region.table[MidWest] <- "MidWest"
> region.table[South] <- "South"
> region.table[West] <- "West"
> region.table[Other] <- "Other"
> region.table
>
> # then this is easy
>
> dd[,"REGION"] <- region.table[dd$State_Code]
>
>
> Berend
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm