[R] Seeking to Dummify Categorical Variables
David Winsemius
dwinsemius at comcast.net
Sun Apr 2 21:19:15 CEST 2017
> On Apr 2, 2017, at 11:48 AM, BR_email <br at dmstat1.com> wrote:
>
> Hi R'ers:
> I need a jump start to obtain my objective.
> Assistance is greatly appreciated.
> Bruce
>
> *******
> #Given Gender Dataset
> r1 <- c( 1, 2, 3)
> c1 <- c( "male", "female", "NA")
> GENDER <- data.frame(r1,c1)
> names(d1_3) <- c("ID","Gender")
# I think you mean:
names(GENDER) <- c("ID","Gender")
> GENDER
> --------------
> _OBJECTIVE_: To dummify GENDER,
> i.e., to generate two new numeric columns,
> Gender_male and Gender_female,
> such that:
> when Gender="male" then Gender_male=1 and Gender_female=0
> when Gender="female" then Gender_male=0 and Gender_female=1
> when Gender="NA" then Gender_male=0 and Gender_female=0
>
> So, with the given dataset, the resultant dataset would be as follows:
> Desired Extended Gender Dataset
>   ID Gender Gender_male Gender_female
>    1   male           1             0
>    2 female           0             1
>    3     NA           0             0
With that correction I think you might want:
> model.matrix( ID ~ Gender+0, data=GENDER )
  Genderfemale Gendermale GenderNA
1            0          1        0
2            1          0        0
3            0          0        1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Gender
[1] "contr.treatment"
If you assigned that to an object name, say "obj", you could get your desired result with:
> obj <- model.matrix( ID ~ Gender+0, data=GENDER )
> cbind(GENDER[ , 1, drop=FALSE], obj[,-3] )
  ID Genderfemale Gendermale
1  1            0          1
2  2            1          0
3  3            0          0
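A minimal sketch of the same idea that picks the indicator columns by name rather than by position and renames them to the requested Gender_male and Gender_female. Selecting by name is a little safer than obj[,-3], because the order of factor levels (and so of the columns) depends on the collating sequence of the locale. The object names `dummies` and `result` are my own, and I set stringsAsFactors = TRUE explicitly only to mirror the default that was in effect for the original post; treat this as one possible way to do it, not the only one.

```r
# Rebuild the example data; "NA" is the literal string, as in the original post.
GENDER <- data.frame(ID = c(1, 2, 3),
                     Gender = c("male", "female", "NA"),
                     stringsAsFactors = TRUE)  # assumption: mirrors the 2017 default

# One indicator column per factor level; "+0" drops the intercept.
obj <- model.matrix(ID ~ Gender + 0, data = GENDER)

# Select the wanted columns by name, then rename them to the requested names.
dummies <- obj[, c("Gendermale", "Genderfemale"), drop = FALSE]
colnames(dummies) <- c("Gender_male", "Gender_female")

# Attach the indicators to the original data frame.
result <- cbind(GENDER, dummies)
print(result)
```

The rows with Gender equal to "NA" get 0 in both new columns, which is what the stated objective asks for.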
I get the sense that you are trying to replicate a workflow that you developed in some other language, and I think it would be more efficient for you to actually learn R rather than trying to write SAS or SPSS in R. If you like getting "into the weeds" of the language, then I suggest trying to read the code of the `lm` function. It might help to refer back to Venables and Ripley's "S Programming" or to read Wickham's "Advanced R" pages on the web.
--
> Bruce Ratner, Ph.D.
>
David Winsemius
Alameda, CA, USA