[R] interaction between categorical variables
Daniel Malter
daniel at umd.edu
Tue Jun 21 09:18:36 CEST 2011
You get NAs because the model is rank deficient. The summary even says that
five coefficients are not defined because of singularities. Most likely,
certain combinations of categories do not occur in your data. Note that in the
example below y is ALWAYS zero when x is zero, so the product y*x is identical
to y. This makes the interaction inestimable and produces a singularity like
the ones you are seeing. The answer to your question, then, is that you cannot
get estimates for these coefficients.
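set.seed(1)                                 # make the simulated draws reproducible
x <- rep(c(0, 1), each = 10)
y <- c(rep(0, 10), rep(c(0, 1), each = 5))
data.frame(x, y)                            # y is 0 whenever x is 0
p <- 1 / (1 + exp(-x - y))                  # true success probabilities
z <- rbinom(20, 1, p)                       # simulated binary response
reg <- summary(glm(z ~ y * x, family = binomial))
reg                                         # the y:x coefficient is not defined (NA)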
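
To see where such a singularity comes from, it can help to look at the data and the design matrix directly. The following is only a minimal sketch on the toy data above, not a general recipe: it cross-tabulates the two predictors to find empty combinations, lists the coefficients that glm() could not estimate, and compares the rank of the design matrix with its number of columns. The seed and the object names (cells, empty_cells, fit, aliased, X) are my own additions for illustration.

```r
# Toy data from the example above: y is 0 whenever x is 0.
set.seed(1)  # assumption: seed added only so the run is reproducible
x <- rep(c(0, 1), each = 10)
y <- c(rep(0, 10), rep(c(0, 1), each = 5))
z <- rbinom(20, 1, 1 / (1 + exp(-x - y)))

# 1. Cross-tabulate the predictors; a zero count marks a combination
#    that never occurs in the data.
cells <- table(x = x, y = y)
print(cells)
empty_cells <- which(cells == 0, arr.ind = TRUE)
print(empty_cells)

# 2. Fit the model and list the coefficients that could not be estimated.
fit <- glm(z ~ y * x, family = binomial)
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# 3. Compare the rank of the design matrix with its number of columns;
#    a rank below the column count means some columns are redundant.
X <- model.matrix(fit)
print(c(rank = qr(X)$rank, columns = ncol(X)))
```

With real data the same three checks apply, using the actual factors in place of x and y. If a combination is empty, the corresponding interaction coefficient cannot be recovered by any refit of the same model.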
HTH,
Daniel
taby gathoni wrote:
>
> Dear R-users,
>
> I need some assistance.
>
> I am fitting interaction terms between categorical variables.
>
> I have dgen (2 levels, converted to dummy variables) and dtoe (4 levels,
> also converted to dummy variables). I have worked with them in two
> ways:
> First, I created a variable X1 = dgen*dtoe and I got the error "Error in
> dgen * dtoe : non-conformable arrays". Then I ran a binomial glm using
> that interaction and I got:
>
> logit_x = glm(samp2$STATUS ~ dgen*dtoe, data=samp2, family = binomial("logit"))
> > summary(logit_x)
>
> Call:
> glm(formula = samp2$STATUS ~ dgen * dtoe, family = binomial("logit"),
>     data = samp2)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -2.6594   0.2431   0.2563   0.2563   0.2563
>
> Coefficients: (5 not defined because of singularities)
>                               Estimate Std. Error z value Pr(>|z|)
> (Intercept)                  1.857e+01  4.612e+03   0.004    0.997
> dgendfemale                  1.024e-09  3.766e+03   0.000    1.000
> dgendmale                           NA         NA      NA       NA
> dtoedpermanent1             -1.517e+01  4.612e+03  -0.003    0.997
> dtoedcontract1              -1.511e+01  4.612e+03  -0.003    0.997
> dtoedprobation1              2.229e-09  4.982e+03   0.000    1.000
> dgendfemale:dtoedpermanent1  1.069e-01  3.766e+03   0.000    1.000
> dgendmale:dtoedpermanent1           NA         NA      NA       NA
> dgendfemale:dtoedcontract1   1.511e+01  3.962e+03   0.004    0.997
> dgendmale:dtoedcontract1            NA         NA      NA       NA
> dgendfemale:dtoedprobation1         NA         NA      NA       NA
> dgendmale:dtoedprobation1           NA         NA      NA       NA
>
> (Dispersion parameter for binomial family taken to be 1)
>
>     Null deviance: 269.48  on 999  degrees of freedom
> Residual deviance: 266.56  on 993  degrees of freedom
> AIC: 280.56
>
> Number of Fisher Scoring iterations: 17
>
> The thing is I need the coefficients, the p-values and t-values of all the
> variables. In other words, I do not want an output of NAs. How can I
> achieve this?
>
> Thanks a lot.
>
> Taby
>
>
> An idea not coupled with action will never get any bigger than the brain
> cell it occupied.
> Arnold Glasgow
> ......
> "Attempt something large enough that failure is guaranteed…unless God
> steps in!"