[R] For-loop dummy variables?
Adrian Dusa
dusa.adrian at gmail.com
Tue Oct 19 09:02:34 CEST 2010
gravityflyer <gravityflyer <at> yahoo.com> writes:
>
> Hi everyone,
>
> I've got a dataset with 12,000 observations. One of the variables
> (cleary$D1) is for an individual's country, coded 1 - 15. I'd like to create
> a dummy variable for the Baltic states which are coded 4,6, and 7. In other
> words, as a dummy variable Baltic states would be coded 1, else 0. I've
> attempted the following for loop:
>
> dummy <- matrix(NA, nrow=nrow(cleary), ncol=1)
> for (i in 1:length(cleary$D1)){
> if (cleary$D1 == 4){dummy[i] = 1}
> else {dummy[i] = 0}
> }
>
> Unfortunately it generates the following error:
>
> 1: In if (cleary$D1 == 4) { ... :
> the condition has length > 1 and only the first element will be used
>
> Another option I've tried is the following:
>
> binary <- vector(length=length(cleary$D1))
> for (i in 1:length(cleary$D1)) {
> if (cleary$D1 == 4 | cleary$D1 == 6 | cleary$D1 == 7 ) {binary[i] = 1}
> else {binary[i] = 0}
> }
>
> Unfortunately it simply responds with "syntax error".
>
> Any thoughts would be greatly appreciated!
>
Be aware that R is a vectorised programming language, so your for loop is
completely unnecessary.
This is what I'd do:
dummy <- rep(0, nrow(cleary))
dummy[cleary$D1 %in% c(4,6,7)] <- 1
This is your dummy variable.
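To check the idea on something small, here is a minimal sketch on made-up data. Only cleary$D1 and the codes 4, 6 and 7 come from the question; the six example values and the column name "baltic" are assumptions for illustration. It also shows an equivalent one-step form, since %in% already returns one TRUE/FALSE per row:

```r
# Toy stand-in for the real data (values are invented for this example).
cleary <- data.frame(D1 = c(1, 4, 6, 7, 15, 4))

# Two-step version, as above: start with zeros, then set the matches to 1.
dummy <- rep(0, nrow(cleary))
dummy[cleary$D1 %in% c(4, 6, 7)] <- 1

# One-step version: as.integer() maps TRUE to 1 and FALSE to 0.
# "baltic" is a hypothetical column name.
cleary$baltic <- as.integer(cleary$D1 %in% c(4, 6, 7))

# Both versions should mark the same rows.
all(dummy == cleary$baltic)
```

Storing the result as a column of the data frame keeps the dummy variable aligned with the observations it describes.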
Below is your working (though VERY inefficient) version of the for loop:
binary <- vector(length=length(cleary$D1))
for (i in 1:length(cleary$D1)) {
    if (cleary$D1[i] == 4 | cleary$D1[i] == 6 | cleary$D1[i] == 7 ) {
        binary[i] = 1
    } else {
        binary[i] = 0
    }
}
Now try to figure out:
- what is the difference between your for() loop and mine?
- which code is simpler (and better), the vectorised version or the for() loop?
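As a hint for the first question, the sketch below compares the length of the two kinds of condition on a made-up vector (D1 here is an invented stand-in, not the real data). if() expects a single TRUE or FALSE; a comparison on the whole column produces one value per observation, which is what the warning is about. In recent versions of R (4.2.0 and later) a condition of length greater than one is an error rather than a warning.

```r
D1 <- c(1, 4, 6, 7, 15)   # hypothetical country codes

len_whole <- length(D1 == 4)      # one logical per element of D1
len_one   <- length(D1[2] == 4)   # a single logical, suitable for if()

# seq_along(D1) is a safer spelling of 1:length(D1): it gives an empty
# sequence when D1 has length zero, so the loop body is skipped.
binary <- numeric(length(D1))
for (i in seq_along(D1)) {
    binary[i] <- if (D1[i] %in% c(4, 6, 7)) 1 else 0
}

# The loop and the vectorised form should agree element by element.
same <- identical(binary, as.numeric(D1 %in% c(4, 6, 7)))
```

Indexing with [i] is what turns the condition into a single value on each pass through the loop.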
I hope it helps,
Adrian