[R] coded to categorical variables in a large dataset
Chuck Cleland
ccleland at optonline.net
Fri Dec 29 19:27:57 CET 2006
sj wrote:
> I am working with a dataset where there are 5 possible outcomes (coded 1:5),
> and I would like to create 5 categorical variables (event1...event5). I am
> using a for loop and if statements, but with a large dataset (approx. 100,000
> rows) it takes quite a bit of time. Is there a way to speed this up? Here is
> some sample code of what I am currently doing.
Here is one way you might do it:
X <- sample(1:5, 100, replace=TRUE)
# Your 5 event variables in a matrix
model.matrix(lm(rnorm(length(X)) ~ as.factor(X) - 1))
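The lm() fit on a random response is only a vehicle for building the design matrix. As a minimal sketch, assuming the same kind of X as above, model.matrix() also accepts a one-sided formula directly. The object name "events", the column names, and the set.seed() call are illustrative additions, not part of the original suggestion:

```r
set.seed(1)  # only so that this sketch is reproducible
X <- sample(1:5, 100, replace = TRUE)

# One-sided formula: no response and no model fit are needed.
# "- 1" drops the intercept so that each level gets its own 0/1 column.
# Fixing levels = 1:5 keeps a column for every code, even one that
# happens not to occur in X.
events <- model.matrix(~ factor(X, levels = 1:5) - 1)
colnames(events) <- paste0("event", 1:5)  # illustrative names

head(events)
```

Each row of the resulting matrix has exactly one 1, in the column matching that row's outcome code.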
Also, along the lines of your approach below, ifelse() works on the whole
vector at once, so no loop is needed:
event3 <- ifelse(test2 == 3, 1, 0)
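The same vectorized comparison can be extended to all five outcomes in one step. This is a rough sketch rather than a tuned solution; it assumes test2 is a plain vector of codes 1 to 5, and the names "events", "dat" and "outcome" are made up for illustration:

```r
test2 <- rep(1:5, 2000)

# A comparison on the whole vector returns TRUE/FALSE for every element;
# as.integer() turns that into 1/0.
event3 <- as.integer(test2 == 3)

# All five indicators at once: one column per outcome code.
events <- sapply(1:5, function(k) as.integer(test2 == k))
colnames(events) <- paste0("event", 1:5)

# Optionally keep the indicators next to the original codes.
dat <- data.frame(outcome = test2, events)
head(dat)
```

Because each comparison is done on the full vector, the work per outcome is a single pass over the data instead of one if test per row.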
I'm sure other people will post different solutions, probably more
elegant than these.
> test2 <- rep(1:5, 2000)
>
> # test2 is a vector, so use length(); nrow() returns NULL for a vector
> event1 <- rep(0, length(test2))
> event2 <- rep(0, length(test2))
> event3 <- rep(0, length(test2))
> event4 <- rep(0, length(test2))
> event5 <- rep(0, length(test2))
>
> for (i in seq_along(event1)) {
>   if (test2[i] == 1) {
>     event1[i] <- 1
>   }
>   if (test2[i] == 2) {
>     event2[i] <- 1
>   }
>   if (test2[i] == 3) {
>     event3[i] <- 1
>   }
>   if (test2[i] == 4) {
>     event4[i] <- 1
>   }
>   if (test2[i] == 5) {
>     event5[i] <- 1
>   }
> }
>
> thanks,
>
> Spencer
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894