[R] interaction between categorical variables

peter dalgaard pdalgd at gmail.com
Thu Jun 23 10:46:59 CEST 2011


On Jun 21, 2011, at 08:39 , taby gathoni wrote:

> 
> Dear R-users,
> 
> I need some assistance.
> 
> I am fitting interaction terms between categorical variables.
> 
> I have dgen (2 levels, converted to dummy variables) and dtoe (4 levels, also converted to dummy variables). I have worked with them in two ways:
> First, I created a variable X1 = dgen*dtoe and got the error "Error in dgen * dtoe : non-conformable arrays".
> Then I ran a binomial glm using that interaction and got:
> 
> logit_x = glm(samp2$STATUS ~ dgen*dtoe, data=samp2,family = binomial("logit"))
>> summary(logit_x)
> 
> Call:
> glm(formula = samp2$STATUS ~ dgen * dtoe, family = binomial("logit"), 
>     data = samp2)
> 
> Deviance Residuals: 
>     Min       1Q   Median       3Q      Max  
> -2.6594   0.2431   0.2563   0.2563   0.2563  
> 
> Coefficients: (5 not defined because of singularities)
>                               Estimate Std. Error z value Pr(>|z|)
> (Intercept)                  1.857e+01  4.612e+03   0.004    0.997
> dgendfemale                  1.024e-09  3.766e+03   0.000    1.000
> dgendmale                           NA         NA      NA       NA
> dtoedpermanent1             -1.517e+01  4.612e+03  -0.003    0.997
> dtoedcontract1              -1.511e+01  4.612e+03  -0.003    0.997
> dtoedprobation1              2.229e-09  4.982e+03   0.000    1.000
> dgendfemale:dtoedpermanent1  1.069e-01  3.766e+03   0.000    1.000
> dgendmale:dtoedpermanent1           NA         NA      NA       NA
> dgendfemale:dtoedcontract1   1.511e+01  3.962e+03   0.004    0.997
> dgendmale:dtoedcontract1            NA         NA      NA       NA
> dgendfemale:dtoedprobation1         NA         NA      NA       NA
> dgendmale:dtoedprobation1           NA         NA      NA       NA
> 
> (Dispersion parameter for binomial family taken to be 1)
> 
>     Null deviance: 269.48  on 999  degrees of freedom
> Residual deviance: 266.56  on 993  degrees of freedom
> AIC: 280.56
> 
> Number of Fisher Scoring iterations: 17
> 
> The thing is, I need the coefficients, the p-values and the t-values of all the variables. In other words, I do not want NAs in the output. How can I achieve this?


Something is odd here. What do you mean by "converted to dummy variables"? Normally you would use factor variables and let the modelling machinery do the rest. Why do you have two dummies for dgen but only three for the four-level dtoe? Notice that you can't have, e.g., both a "female" and a "male" dummy when there is an intercept in the model (unless you have a third sex in your data), since the two dummies sum to 1 and are therefore collinear with the intercept column. That collinearity is what R reports as "not defined because of singularities", and it is where the NA coefficients come from.
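
To illustrate the point, here is a minimal sketch on simulated data, since the original samp2 is not shown in the thread. The column names gen and toe, the fourth level "other", and the success probability of 0.9 are assumptions made up for the example. It first shows that an intercept plus both sex dummies is rank-deficient and yields an NA coefficient, and then fits the interaction model with plain factors.

```r
# Simulated stand-in for samp2; the real data are not shown in the thread.
# The level "other" and the 0.9 success probability are made up for illustration.
set.seed(1)
cells <- expand.grid(gen = c("female", "male"),
                     toe = c("permanent", "contract", "probation", "other"))
samp2 <- cells[rep(seq_len(nrow(cells)), times = 125), ]  # 1000 rows, every cell populated
samp2$STATUS <- rbinom(nrow(samp2), size = 1, prob = 0.9)

# Hand-made dummies for both sexes plus an intercept: the columns are linearly dependent.
samp2$female <- as.numeric(samp2$gen == "female")
samp2$male   <- as.numeric(samp2$gen == "male")
X <- cbind(intercept = 1, female = samp2$female, male = samp2$male)
qr(X)$rank                      # rank 2, although X has 3 columns

bad <- glm(STATUS ~ female + male, data = samp2, family = binomial("logit"))
coef(bad)                       # the coefficient for 'male' is NA

# Factors instead: R builds the contrasts and drops one reference level per factor.
fit <- glm(STATUS ~ gen * toe, data = samp2, family = binomial("logit"))
summary(fit)                    # 8 coefficients, none NA
```

Note that the formula refers to STATUS directly rather than samp2$STATUS, because data = samp2 already tells glm where to look. If the dummies were built from underlying factor columns, passing those columns to the formula should be enough. Even with factors, an interaction coefficient can still come out as NA if some combination of levels never occurs in the data.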

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


