[R] interaction between categorical variables
peter dalgaard
pdalgd at gmail.com
Thu Jun 23 10:46:59 CEST 2011
On Jun 21, 2011, at 08:39 , taby gathoni wrote:
>
> Dear R-users,
>
> I need some assistance.
>
> I am fitting some interaction terms for categorical variables.
>
> I have dgen (2 levels, converted to dummy variables) and dtoe (4 levels, also converted to dummy variables). I have worked with them in two ways.
>
> First, I created a variable X1 = dgen*dtoe and got the error "Error in dgen * dtoe : non-conformable arrays".
>
> Then I ran a binomial glm with that interaction and got:
>
> > logit_x = glm(samp2$STATUS ~ dgen*dtoe, data=samp2, family = binomial("logit"))
> > summary(logit_x)
>
> Call:
> glm(formula = samp2$STATUS ~ dgen * dtoe, family = binomial("logit"),
>     data = samp2)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -2.6594   0.2431   0.2563   0.2563   0.2563
>
> Coefficients: (5 not defined because of singularities)
>                               Estimate Std. Error z value Pr(>|z|)
> (Intercept)                  1.857e+01  4.612e+03   0.004    0.997
> dgendfemale                  1.024e-09  3.766e+03   0.000    1.000
> dgendmale                           NA         NA      NA       NA
> dtoedpermanent1             -1.517e+01  4.612e+03  -0.003    0.997
> dtoedcontract1              -1.511e+01  4.612e+03  -0.003    0.997
> dtoedprobation1              2.229e-09  4.982e+03   0.000    1.000
> dgendfemale:dtoedpermanent1  1.069e-01  3.766e+03   0.000    1.000
> dgendmale:dtoedpermanent1           NA         NA      NA       NA
> dgendfemale:dtoedcontract1   1.511e+01  3.962e+03   0.004    0.997
> dgendmale:dtoedcontract1            NA         NA      NA       NA
> dgendfemale:dtoedprobation1         NA         NA      NA       NA
> dgendmale:dtoedprobation1           NA         NA      NA       NA
>
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 269.48 on 999 degrees of freedom
> Residual deviance: 266.56 on 993 degrees of freedom
> AIC: 280.56
>
> Number of Fisher Scoring iterations: 17
>
> The thing is, I need the coefficients, the p-values and t-values of all the variables. In other words, I do not want an output of NAs. How can I achieve this?

Something is odd here. What do you mean by "converted to dummy variables"? Normally you'd use factor variables and let the modelling machinery do the rest. Why do you have two dummies for the two-level dgen but only three for the four-level dtoe?

Notice that you can't have, e.g., both a "female" and a "male" dummy when there is an intercept in the model (unless you have a 3rd sex in your data), since the two dummies will sum to 1. That sum duplicates the intercept column, so one of the coefficients cannot be estimated and is reported as NA.
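
A minimal sketch of both points, on simulated data, since the real samp2 is not shown in this thread. The column names gen and toe, the fourth level "other", and the simulated STATUS values are assumptions made up for illustration. The first fit uses factors directly. The second deliberately includes both a "female" and a "male" dummy alongside the intercept, to reproduce the kind of NA coefficient seen above.

```r
# Hypothetical stand-in for samp2 (the real data are not available here).
set.seed(1)
n <- 1000
samp2 <- data.frame(
  STATUS = rbinom(n, 1, 0.9),
  gen    = factor(sample(c("female", "male"), n, replace = TRUE)),
  # "other" is an invented name for the unnamed fourth level of dtoe
  toe    = factor(sample(c("permanent", "contract", "probation", "other"),
                         n, replace = TRUE))
)

# Factors in the formula: R constructs the contrast (dummy) columns itself,
# dropping one reference level per factor so nothing is aliased.
fit <- glm(STATUS ~ gen * toe, data = samp2, family = binomial("logit"))
summary(fit)

# Hand-made dummies for BOTH levels: female + male equals 1 in every row,
# which is exactly the intercept column, so one coefficient is aliased.
samp2$female <- as.numeric(samp2$gen == "female")
samp2$male   <- as.numeric(samp2$gen == "male")
trap <- glm(STATUS ~ female + male, data = samp2, family = binomial("logit"))
coef(trap)  # the coefficient for 'male' comes out as NA
```

Note also that the response is written as STATUS rather than samp2$STATUS; with data = samp2 supplied, the variables are looked up in the data frame.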
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com