[R] interaction between categorical variables

Daniel Malter daniel at umd.edu
Tue Jun 21 09:18:36 CEST 2011


The NAs come from rank deficiency; the summary itself reports that five
coefficients are not defined because of singularities. Most likely certain
combinations of categories do not occur in your data. Note that in the example
below y is ALWAYS zero when x is zero, so the product x*y is identical to y.
That makes the interaction inestimable and produces the same kind of
singularity you are seeing. The answer to your question, then, is that you
cannot get estimates (or standard errors, or p-values) for these coefficients
from the data as they are.
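
A quick way to check whether some combinations of categories are missing is
to cross-tabulate the two factors. The sketch below uses a small made-up data
frame; the names d, gender and contract are placeholders and not the poster's
actual data, so substitute your own variables.

```r
# Hypothetical data: 'gender' and 'contract' only stand in for the real variables.
d <- data.frame(
  gender   = factor(c("female", "female", "male", "male", "male", "female")),
  contract = factor(c("permanent", "contract", "permanent", "contract",
                      "probation", "permanent"))
)

# Cross-tabulate the two factors; a zero marks a combination that never occurs.
cells <- table(d$gender, d$contract)
print(cells)

# List the empty combinations explicitly (row and column positions of the zeros).
empty <- which(cells == 0, arr.ind = TRUE)
data.frame(gender   = rownames(cells)[empty[, 1]],
           contract = colnames(cells)[empty[, 2]])
```

Any combination listed here has no observations, so an interaction
coefficient for it cannot be estimated.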

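set.seed(1)                          # make the simulated outcome reproducible
x<-rep(c(0,1),each=10)
y<-c(rep(0,10),rep(c(0,1),each=5))   # y is 0 whenever x is 0
data.frame(x,y)

p<-1/(1+exp(-x-y))                   # true success probabilities
z<-rbinom(20,1,p)

reg<-summary(glm(z~y*x,family=binomial))
reg                                  # the y:x coefficient is reported as NA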
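
To see the singularity directly, one can inspect the design matrix of this
toy model. The following is only a sketch; it rebuilds x, y and z with an
arbitrary seed (an assumption, so the simulated z will differ from any other
run) and compares the number of columns with the rank.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
x <- rep(c(0, 1), each = 10)
y <- c(rep(0, 10), rep(c(0, 1), each = 5))
z <- rbinom(20, 1, 1 / (1 + exp(-x - y)))

# Design matrix for main effects plus interaction.
X <- model.matrix(~ y * x)

# The interaction column duplicates y, because y is 0 whenever x is 0.
same_as_y <- identical(unname(X[, "y:x"]), y)
print(same_as_y)

# One column more than the rank: one coefficient cannot be estimated.
print(c(columns = ncol(X), rank = qr(X)$rank))

# glm() drops the redundant column and reports NA for its coefficient.
fit <- glm(z ~ y * x, family = binomial)
print(is.na(coef(fit)))
```

Whenever the rank is smaller than the number of columns, glm() sets the
surplus coefficients to NA, which is what the "not defined because of
singularities" note in the summary refers to.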

HTH,
Daniel




taby gathoni wrote:
> 
> Dear R-users,
> 
> I need some assistance.
> 
> I am fitting interaction terms between categorical variables.
> 
> I have dgen (2 levels, converted to dummy variables) and dtoe (4 levels,
> also converted to dummy variables). I have worked with them in two
> ways. First, I created a variable X1 = dgen*dtoe and got the error
> "Error in dgen * dtoe : non-conformable arrays". Then I ran a binomial
> glm using that interaction and got:
> 
>> logit_x = glm(samp2$STATUS ~ dgen*dtoe, data=samp2,family = binomial("logit"))
>> summary(logit_x)
> 
> Call:
> glm(formula = samp2$STATUS ~ dgen * dtoe, family = binomial("logit"), 
>     data = samp2)
> 
> Deviance Residuals: 
>     Min       1Q   Median       3Q      Max  
> -2.6594   0.2431   0.2563   0.2563   0.2563  
> 
> Coefficients: (5 not defined because of singularities)
>                               Estimate Std. Error z value Pr(>|z|)
> (Intercept)                  1.857e+01  4.612e+03   0.004    0.997
> dgendfemale                  1.024e-09  3.766e+03   0.000    1.000
> dgendmale                           NA         NA      NA       NA
> dtoedpermanent1             -1.517e+01  4.612e+03  -0.003    0.997
> dtoedcontract1              -1.511e+01  4.612e+03  -0.003    0.997
> dtoedprobation1              2.229e-09  4.982e+03   0.000    1.000
> dgendfemale:dtoedpermanent1  1.069e-01  3.766e+03   0.000    1.000
> dgendmale:dtoedpermanent1           NA         NA      NA       NA
> dgendfemale:dtoedcontract1   1.511e+01  3.962e+03   0.004    0.997
> dgendmale:dtoedcontract1            NA         NA      NA       NA
> dgendfemale:dtoedprobation1         NA         NA      NA       NA
> dgendmale:dtoedprobation1           NA         NA      NA       NA
> 
> (Dispersion parameter for binomial family taken to be 1)
> 
>     Null deviance: 269.48  on 999  degrees of freedom
> Residual deviance: 266.56  on 993  degrees of freedom
> AIC: 280.56
> 
> Number of Fisher Scoring iterations: 17
> 
> The thing is, I need the coefficients, the p-values and t-values of all the
> variables. In other words, I do not want an output of NAs. How can I
> achieve this?
> 
> Thanks a lot.
> 
> Taby
> 
> 
> An idea not coupled with action will never get any bigger than the brain
> cell it occupied.
> Arnold Glasgow
> ......
> "Attempt something large enough that failure is guaranteed…unless God
> steps in!"
> 
