[R] Clogit R and Stata
peter dalgaard
pdalgd at gmail.com
Sat Jun 8 10:28:19 CEST 2013
The "n = 1404" vs. "Number of obs = 468" looks like the giveaway.
You are passing the subset selection logic as the 3rd positional argument, but according to the documentation, that is the weights argument.
So, clogit(..., data = dframe, subset = sample==1 & glb_ind=="Y")
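
A minimal sketch of the named-argument form, using the infert data that ships with R as a stand-in for dframe. The data set, the parity <= 2 condition, and the object names are assumptions for illustration only and are not from the original post. The point is that n in the fit should equal the number of rows the subset selects:

```r
library(survival)  # provides clogit()

# Show the start of the argument list, so the position of
# 'weights' relative to 'subset' can be checked directly.
print(names(formals(clogit))[1:4])

# Fit on all rows of the stand-in data.
fit_all <- clogit(case ~ spontaneous + induced + strata(stratum),
                  data = infert)

# Fit on a subset, naming both 'data' and 'subset' so that
# nothing is matched by position.
fit_sub <- clogit(case ~ spontaneous + induced + strata(stratum),
                  data = infert, subset = parity <= 2)

# The number of rows actually used is stored in the fit.
n_all      <- fit_all$n
n_sub      <- fit_sub$n
n_expected <- sum(infert$parity <= 2)

cat("rows in data:      ", nrow(infert), "\n")
cat("n, full fit:       ", n_all, "\n")
cat("n, subset fit:     ", n_sub, "\n")
cat("rows in the subset:", n_expected, "\n")
```

If n in the subset fit still equals the full row count, the condition did not reach the subset argument.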
On Jun 7, 2013, at 18:51 , Richard Beckett wrote:
>
>
> From: peter dalgaard <pdalgd at gmail.com>
> To: Richard Beckett <rbeckett81 at yahoo.com>
> Cc: "r-help at r-project.org" <r-help at r-project.org>
> Sent: Friday, June 7, 2013 11:12 AM
> Subject: Re: [R] Clogit R and Stata
>
> Here is the R output:
>
> Call:
> coxph(formula = Surv(rep(1, 1404L), sftpcons) ~ sftptv2a3 + sftptv2a4 +
> sftptv2a5 + sftptv2a2 + sftptv2a6 + logim + maccat + disp4cat +
> strata(stratida), data = dframe, method = "exact")
>
> n= 1404, number of events= 351
>
>             coef exp(coef) se(coef)      z Pr(>|z|)
> sftptv2a3 1.4552    4.2852   0.2273  6.401 1.54e-10 ***
> sftptv2a4 3.1118   22.4609   0.2265 13.739  < 2e-16 ***
> sftptv2a5 1.0717    2.9204   0.2522  4.249 2.15e-05 ***
> sftptv2a2 0.7185    2.0514   0.3300  2.177   0.0295 *
> sftptv2a6 2.7341   15.3965   0.5050  5.414 6.17e-08 ***
> logim     0.7579    2.1338   0.1347  5.625 1.85e-08 ***
> maccat    3.0809   21.7771   0.4005  7.693 1.43e-14 ***
> disp4cat  0.7061    2.0261   0.1524  4.634 3.59e-06 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>           exp(coef) exp(-coef) lower .95 upper .95
> sftptv2a3     4.285    0.23336     2.745     6.691
> sftptv2a4    22.461    0.04452    14.409    35.013
> sftptv2a5     2.920    0.34241     1.781     4.788
> sftptv2a2     2.051    0.48747     1.074     3.917
> sftptv2a6    15.397    0.06495     5.722    41.429
> logim         2.134    0.46866     1.639     2.779
> maccat       21.777    0.04592     9.934    47.739
> disp4cat      2.026    0.49355     1.503     2.731
>
> Rsquare= 0.239 (max possible= 0.623 )
> Likelihood ratio test= 383.2 on 8 df, p=0
> Wald test = 264.7 on 8 df, p=0
> Score (logrank) test = 396.2 on 8 df, p=0
>
>
> And the STATA output:
>
> Iteration 0: log likelihood = -95.537697
> Iteration 1: log likelihood = -91.465581
> Iteration 2: log likelihood = -91.402366
> Iteration 3: log likelihood = -91.402264
> Iteration 4: log likelihood = -91.402264
>
> Conditional (fixed-effects) logistic regression   Number of obs   =        468
>                                                   LR chi2(8)      =     141.59
>                                                   Prob > chi2     =     0.0000
> Log likelihood = -91.402264                       Pseudo R2       =     0.4365
>
> ------------------------------------------------------------------------------
>     sftpcons |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>    sftptv2a3 |   2.042827   .4741327     4.31   0.000     1.113544     2.97211
>    sftptv2a4 |    4.10828   .5593723     7.34   0.000      3.01193    5.204629
>    sftptv2a5 |   1.766492   .5585173     3.16   0.002     .6718177    2.861165
>    sftptv2a2 |   1.366568   .6540307     2.09   0.037      .084691    2.648444
>    sftptv2a6 |   2.307152   .8225835     2.80   0.005     .6949178    3.919386
>        logim |   1.404135   .3480976     4.03   0.000     .7218764    2.086394
>       maccat |     2.8423   .7008588     4.06   0.000     1.468642    4.215958
>     disp4cat |   .6347805   .2872258     2.21   0.027     .0718283    1.197733
> ------------------------------------------------------------------------------
>
> I also tried changing to method="approximate", with no noticeable change.
>
> On Jun 7, 2013, at 15:34 , Richard Beckett wrote:
>
> > Sorry to once again write a message but I'm once again stumped and am having no luck finding a solution anywhere else.
> >
> >
> > This question requires some finesse in both R and STATA so hopefully I will be able to get an answer here. I am much more adept in R and am trying to replicate the results of a STATA file in R. Hopefully this is a proper forum for such questions.
> >
> >
> > This is the code for the clogit in STATA
> > clogit sftpcons sftptv2a3 sftptv2a4 sftptv2a5 sftptv2a2 sftptv2a6 logim maccat disp4cat if sample==1 & glb_ind=="Y", group(stratida)
> > and I tried to replicate it using
> > clogit1<-clogit(sftpcons~sftptv2a3+sftptv2a4+sftptv2a5+sftptv2a2+sftptv2a6+logim+maccat+disp4cat+strata(stratida), dframe, sample==1 | glb_ind=="Y")
> > but got different results
> > What did I do wrong here? I interpreted the STATA clogit as: run this logit only for rows where sample==1 and glb_ind=="Y". What should I be doing instead?
>
>
> An "&" rather than "|" in the R version might help. Other than that, we're a bit short on clues unless you provide some output.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com