[R] Problems with weight
Milan Bouchet-Valat
nalimilan at club.fr
Tue Nov 27 22:54:48 CET 2012
On Tuesday 27 November 2012 at 18:33 -0300, Pablo Menese wrote:
> I can't ... I don't know why but I can't
>
> When I use it:
>
> logit <- glm(bach ~ egp4 + programa, weights=wst7,
> family=quasibinomial(link="logit"))
You were advised to use svyglm(), not glm(). It's usually considered
polite to carefully read the answers you get to your questions...
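
For reference, here is a minimal sketch of what the svyglm() call could
look like. It runs on simulated data, since the real data set is not
shown in this thread: the data frame name d and all values are made up,
and only the variable names (bach, egp4, programa, wst7) come from the
posts. The design (one stratum, each observation its own PSU) is an
assumption read off the Stata header quoted below, so adjust it if the
real survey has clusters or strata.

```r
library(survey)  # provides svydesign() and svyglm()

# Simulated stand-in data (hypothetical): names mirror the thread, values do not.
set.seed(1)
n <- 248
d <- data.frame(
  bach     = rbinom(n, 1, 0.5),
  egp4     = factor(sample(1:4, n, replace = TRUE)),
  programa = rbinom(n, 1, 0.3),
  wst7     = runif(n, 5, 40)
)

# Assumed design: no clustering (ids = ~1), sampling weights in wst7.
des <- svydesign(ids = ~1, weights = ~wst7, data = d)

# No weights argument here: the weights are taken from the design object.
fit_svy <- svyglm(bach ~ egp4 + programa, design = des,
                  family = quasibinomial(link = "logit"))

# The weighted glm() call from the question, for comparison.
fit_glm <- glm(bach ~ egp4 + programa, data = d, weights = wst7,
               family = quasibinomial(link = "logit"))

# Coefficients side by side with the two sets of standard errors.
print(round(cbind(coef      = coef(fit_svy),
                  se_svyglm = sqrt(diag(vcov(fit_svy))),
                  se_glm    = sqrt(diag(vcov(fit_glm)))), 4))
```

Both fits should give the same coefficients. The standard errors differ
because svyglm() computes design-based (linearized) ones, which is the
kind the Stata output labels "Linearized", while glm() treats the
weights as precision weights and scales by a dispersion estimate.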
Regards
> I get the same betas as in Stata, but the hypothesis tests, the t values,
> and the std. errors are different.
>
> I think the solution can't be far from this...
>
>
> On Fri, Nov 23, 2012 at 9:49 PM, Anthony Damico <ajdamico at gmail.com> wrote:
>
> > from your stata output, it looks like you need to use the survey package
> > in R
> >
> > for step-by-step instructions about how to do this (and comparisons to
> > stata), see
> >
> > http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Damico.pdf
> >
> > once you're ready to run the regression, use svyglm() instead of glm() and
> > drop the weights argument (since it will already be part of the survey
> > design) :)
> >
> >
> >
> > On Fri, Nov 23, 2012 at 3:13 PM, Pablo Menese <pmenese at gmail.com> wrote:
> >
> >> Until a few weeks ago I used Stata for everything.
> >> Now I'm learning R and trying to move over. At this stage I'm testing R
> >> by trying to do the same things I used to do in Stata, with the same
> >> outputs.
> >> I have a problem with the logit when applying weights.
> >>
> >> In Stata I have this output:
> >> . svy: logit bach job2 mujer i.egp4 programa delay mdeo i.str evprivate
> >> (running logit on estimation sample)
> >>
> >> Survey: Logistic regression
> >>
> >> Number of strata   =         1        Number of obs      =        248
> >> Number of PSUs     =       248        Population size    =  5290.1639
> >>                                       Design df          =        247
> >>                                       F( 11, 237)        =       4.39
> >>                                       Prob > F           =     0.0000
> >>
> >>
> >>                           Linearized
> >>         bach       Coef.   Std. Err.       t   P>|t|     [95% Conf. Interval]
> >>
> >>         job2   -.4437446    .4385934   -1.01   0.313    -1.307605    .4201154
> >>        mujer    1.070595    .4169919    2.57   0.011     .2492812    1.891908
> >>
> >>         egp4
> >>            2   -.4839342     .539808   -0.90   0.371    -1.547148    .5792796
> >>            3   -1.288947    .5347344   -2.41   0.017    -2.342168   -.2357263
> >>            4   -.8569793    .5106425   -1.68   0.095    -1.862748    .1487898
> >>
> >>     programa    .9694352    .5677642    1.71   0.089    -.1488415    2.087712
> >>        delay   -1.552582    .5714967   -2.72   0.007    -2.678211    -.426954
> >>         mdeo   -.7938904    .3727571   -2.13   0.034    -1.528078   -.0597025
> >>
> >>          str
> >>            2   -1.122691    .5731879   -1.96   0.051     -2.25165    .0062682
> >>            3   -2.056682    .6350485   -3.24   0.001    -3.307483   -.8058812
> >>
> >>    evprivate   -1.962431    .5674143   -3.46   0.001    -3.080018   -.8448431
> >>        _cons    2.308699    .7274924    3.17   0.002     .8758187    3.741578
> >>
> >>
> >> The best I could get in R was:
> >>
> >> glm(formula = bach ~ job2 + mujer + egp4 + programa + delay +
> >> mdeo + str + evprivate, family = quasibinomial(link = "logit"),
> >> weights = wst7)
> >>
> >> Deviance Residuals:
> >>      Min        1Q    Median        3Q       Max
> >> -12.5951   -3.9034   -0.9412    3.8268   11.2750
> >>
> >> Coefficients:
> >>                            Estimate Std. Error t value Pr(>|t|)
> >> (Intercept)                  2.3087     0.7173   3.218  0.00147 **
> >> job2                        -0.4437     0.4355  -1.019  0.30926
> >> mujer                        1.0706     0.3558   3.009  0.00290 **
> >> egp4intermediate (iii, iv)  -0.4839     0.4946  -0.978  0.32890
> >> egp4skilled manual workers  -1.2889     0.5268  -2.447  0.01514 *
> >> egp4working class           -0.8570     0.4625  -1.853  0.06514 .
> >> programa                     0.9694     0.4951   1.958  0.05141 .
> >> delay                       -1.5526     0.4878  -3.183  0.00166 **
> >> mdeo                        -0.7939     0.4207  -1.887  0.06037 .
> >> strest. ii                  -1.1227     0.4809  -2.334  0.02042 *
> >> strestr. iii                -2.0567     0.5134  -4.006 8.28e-05 ***
> >> evprivate                   -1.9624     0.6490  -3.024  0.00277 **
> >> ---
> >> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >>
> >> (Dispersion parameter for quasibinomial family taken to be 23.14436)
> >>
> >> Null deviance: 7318.5 on 246 degrees of freedom
> >> Residual deviance: 5692.8 on 235 degrees of freedom
> >> (103 observations deleted due to missingness)
> >> AIC: NA
> >>
> >> Number of Fisher Scoring iterations: 6
> >>
> >> Warning message:
> >> In summary.glm(logit) :
> >> observations with zero weight not used for calculating dispersion
> >>
> >> This gives the same betas, but the hypothesis tests have different values...
> >>
> >>
> >> HELP!!!!
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.