[R] subset of factors in a regression
Ben Bolker
bbolker at gmail.com
Tue Jul 2 06:39:59 CEST 2013
Philip A. Viton <viton.1 <at> osu.edu> writes:
> suppose "state" is a variable in a dataframe containing abbreviations
> of the US states, as a factor. What I'd like to do is to include
> dummy variables for a few of the states, (say, CA and MA) among the
> independent variables in my regression formula. (This would be the
> equivalent of creating, eg, ca <- state=="CA" and then including
> that.) I know I can create all the necessary dummy variables by using
> the "outer" function on the factor and then renaming them
> appropriately; but is there a solution that's more direct, ie that
> doesn't involve a lot of new variables?
>
> Thanks!
You could use model.matrix(~state-1) and select the columns
you want, e.g.
state <- state.abb; m <- model.matrix(~state-1)
m[,colnames(m) %in% c("stateCA","stateMA")]
-- but this will actually create a bunch of vectors you
don't want before throwing them away.
More compactly:
cstates <- c("CA","MA")  ## the states you want dummies for
m <- sapply(cstates,"==",state)
storage.mode(m) <- "numeric"
## or m[] <- as.numeric(m)
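
To tie this back to the regression in the question, here is a minimal sketch of how such a dummy matrix might be dropped into a model formula. The data frame d and the variables x and y are made up purely for illustration; only state.abb and the sapply() idea come from the above.

```r
## Hypothetical data: 'x' and 'y' are invented for illustration only
set.seed(1)
d <- data.frame(state = factor(state.abb), x = rnorm(50))
d$y <- 1 + 2 * d$x + rnorm(50)

cstates <- c("CA", "MA")
## one 0/1 column per selected state, with the state as column name
dummies <- sapply(cstates, function(s) as.numeric(d$state == s))

## a matrix term in a formula contributes one coefficient per column
fit <- lm(y ~ x + dummies, data = d)
coef(fit)

## alternative without any intermediate object (same idea, inline):
fit2 <- lm(y ~ x + I(state == "CA") + I(state == "MA"), data = d)
```

The matrix version scales better if the list of states is long; the inline I() version is probably easier to read when there are only two or three.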