[R] reduce three columns to one with the colnames
William Dunlap
wdunlap at tibco.com
Mon May 13 18:13:30 CEST 2013
If the dataset is large you may prefer to process it by column instead of by row. E.g.,
> m <- matrix(0, nrow=1e6, ncol=3, dimnames=list(NULL,c("Red","Green","Blue")))
> m[cbind(seq_len(nrow(m)), sample(ncol(m), size=nrow(m), replace=TRUE))] <- 1
> d <- data.frame(m)
> head(d)
  Red Green Blue
1   0     0    1
2   0     1    0
3   1     0    0
4   0     0    1
5   0     1    0
6   0     0    1
> system.time(byRow <- colnames(d)[apply(d, 1, function(x)which(x==1))])
   user  system elapsed
  73.81    0.19   74.64
> system.time(byCol <- with(d, ifelse(Red==1, "Red", ifelse(Green==1, "Green", "Blue"))))
   user  system elapsed
   0.85    0.00    1.00
> identical(byRow, byCol)
[1] TRUE
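Another column-wise option, offered only as a sketch and not part of the timings above, is max.col(), which returns for each row of a matrix the position of its largest entry. It assumes every row holds exactly one 1. The small data frame d_small below is made up for illustration.

```r
# Hypothetical small example; d_small is not from the original post.
d_small <- data.frame(Red   = c(0, 0, 1),
                      Green = c(0, 1, 0),
                      Blue  = c(1, 0, 0))

# max.col() gives, for each row, the column index of the maximum value.
# With exactly one 1 per row, that is the column holding the 1.
byMaxCol <- colnames(d_small)[max.col(as.matrix(d_small), ties.method = "first")]

# The nested ifelse() approach, for comparison.
byCol <- with(d_small, ifelse(Red == 1, "Red", ifelse(Green == 1, "Green", "Blue")))

# Both approaches should agree on well-formed data.
identical(byMaxCol, byCol)
```

Unlike the nested ifelse(), this form does not need to be rewritten when the number of indicator columns changes.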
Also, you ought to add a check that the data looks the way you think it does, e.g.,
   stopifnot(all(as.matrix(d) %in% c(0, 1)), all(rowSums(d)==1))
or both of the above methods will silently give misleading results.
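To illustrate the kind of silent failure such a check guards against, here is a small sketch with deliberately malformed rows. The data frame bad and the helper name indicator_to_label are made up for this example, and the helper uses max.col() rather than either method timed above.

```r
# Hypothetical malformed data: row 2 has no 1 at all, row 3 has two 1s.
bad <- data.frame(Red   = c(1, 0, 1),
                  Green = c(0, 0, 1),
                  Blue  = c(0, 0, 0))

# The nested ifelse() still returns an answer for every row, with no warning:
# row 2 falls through to "Blue" and row 3 reports only "Red".
misleading <- with(bad, ifelse(Red == 1, "Red", ifelse(Green == 1, "Green", "Blue")))
misleading

# Hypothetical wrapper that validates first, then converts by column.
indicator_to_label <- function(d) {
  m <- as.matrix(d)
  # Every entry must be 0 or 1, and every row must contain exactly one 1.
  stopifnot(all(m %in% c(0, 1)), all(rowSums(m) == 1))
  colnames(m)[max.col(m, ties.method = "first")]
}

# With the check in place, the malformed data raises an error instead.
res <- try(indicator_to_label(bad), silent = TRUE)
inherits(res, "try-error")
```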
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of arun
> Sent: Monday, May 13, 2013 8:37 AM
> To: studerov at gmail.com
> Cc: R help
> Subject: Re: [R] reduce three columns to one with the colnames
>
> Hi,
> Maybe:
> dat1<- read.table(text="
> male female transsexuals
> 0 1 0
> 1 0 0
> 0 0 1
> 0 1 0
> 1 0 0
> 1 0 0
> 0 1 0
> ",sep="",header=TRUE)
>
> dat1$sex<-colnames(dat1)[apply(dat1,1,function(x) which(x==1))]
> dat1
> #  male female transsexuals          sex
> #1    0      1            0       female
> #2    1      0            0         male
> #3    0      0            1 transsexuals
> #4    0      1            0       female
> #5    1      0            0         male
> #6    1      0            0         male
> #7    0      1            0       female
>
>
> A.K.
>
>
>
> ----- Original Message -----
> From: David Studer <studerov at gmail.com>
> To: Bert Gunter <gunter.berton at gene.com>
> Cc: r-help at r-project.org
> Sent: Monday, May 13, 2013 11:22 AM
> Subject: Re: [R] reduce three columns to one with the colnames
>
> OK, seems like nobody understood my question ;-)
>
> Let's make another example:
>
> I have three variables:
> data$male and data$female and data$transsexuals
>
> All the three of them contain the values 0 and 1.
>
> Now I'd like to create another variable data$sex. In all cases where
> data$female==1 the variable data$sex should be set to 'female', in all
> cases where data$male==1 the variable data$sex should be set to 'male',
> and so on...
>
> Thank you!
>
> David
>
>
>
>
> 2013/5/13 Bert Gunter <gunter.berton at gene.com>
>
> > No -- my answer is wrong. I'll leave it to others to correct. Obvious
> > question to OP: What if more than one of your colors variables
> > simultaneously have a 1?
> >
> > -- Bert
> >
> > On Mon, May 13, 2013 at 8:09 AM, Bert Gunter <bgunter at gene.com> wrote:
> > > Cute answer, Pascal. It may even be the answer to the question the OP
> > > should have asked, but I don't think it answered the question that was
> > > asked. That might be:
> > >
> > > c("red"[red], "green"[green], "blue"[blue])
> > >
> > > Cheers,
> > > Bert
> > >
> > > On Mon, May 13, 2013 at 7:36 AM, Pascal Oettli <kridox at ymail.com> wrote:
> > >> Hi,
> > >>
> > >> ?rgb
> > >>
> > >> HTH
> > >> Pascal
> > >>
> > >>
> > >> 2013/5/13 David Studer <studerov at gmail.com>
> > >>
> > >>> Hello everybody,
> > >>>
> > >>> I have three variables "blue", "green" and "red" containing values 0 (no)
> > >>> and 1 (yes).
> > >>>
> > >>> How can I easily create another variable "colors" with the values "blue",
> > >>> "green" and "red"?
> > >>>
> > >>> I hope that you can understand my question and appreciate any solutions or
> > >>> hints!
> > >>>
> > >>> Thank you!
> > >>> David
> > >>>
> > >>> [[alternative HTML version deleted]]
> > >>>
> > >>> ______________________________________________
> > >>> R-help at r-project.org mailing list
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > >
> > > Bert Gunter
> > > Genentech Nonclinical Biostatistics
> > >
> > > Internal Contact Info:
> > > Phone: 467-7374
> > > Website:
> > >
> > > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> >
> >
> >
> >
>