[R] Clogit R and Stata

Sat Jun 8 10:28:19 CEST 2013

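The "n= 1404" in the R output vs. "Number of obs = 468" in the Stata output looks like the giveaway: the R fit is using three times as many rows, so the subset is not being applied.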

You are passing the subset selection logic as the 3rd positional argument, but according to the documentation, that is the weights argument.
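
One quick way to confirm the argument order is to inspect the function's formals. This is a minimal sketch, assuming the survival package is installed and that its clogit() still lists weights ahead of subset, as the documentation describes. The names arg_names, third_arg and subset_pos are just illustrative.

```r
library(survival)

# Names of the formal arguments of clogit(), in positional order.
arg_names <- names(formals(clogit))
print(arg_names)

# An unnamed third argument is matched to whichever name sits in slot 3,
# so check that slot and where subset actually lives.
third_arg  <- arg_names[3]
subset_pos <- match("subset", arg_names)
print(c(third = third_arg, subset_position = subset_pos))
```

If the third slot is weights, an unnamed logical condition placed there is never treated as a row filter.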

So, name the argument:  clogit(..., data = dframe, subset = sample==1 & glb_ind=="Y")
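
To see the effect on made-up data, here is a small sketch. Everything in it is hypothetical (the toy data frame and its columns set, case, x, samp and ind are invented for illustration and have nothing to do with dframe); it only assumes the survival package is available. It compares how many rows "&" and "|" select, then checks that a fit with a named subset= uses the "&" rows.

```r
library(survival)

# Hypothetical toy data: 30 matched sets of 4 rows, one case per set.
set.seed(1)
toy <- data.frame(
  set  = rep(1:30, each = 4),
  case = rep(c(1, 0, 0, 0), times = 30),
  x    = rnorm(120),
  samp = rep(c(1, 1, 0), each = 40),       # constant within each set
  ind  = rep(c("Y", "N", "Y"), each = 40)  # constant within each set
)

# "&" keeps rows meeting both conditions; "|" keeps rows meeting either.
n_and <- sum(toy$samp == 1 & toy$ind == "Y")
n_or  <- sum(toy$samp == 1 | toy$ind == "Y")

# Naming subset= keeps the condition out of the weights slot.
fit <- clogit(case ~ x + strata(set), data = toy,
              subset = samp == 1 & ind == "Y")

# Rows selected by each condition, and rows actually used in the fit.
print(c(and = n_and, or = n_or, used = fit$n))
```

Comparing the "n=" reported by R with the row count of the intended subset is a cheap sanity check before comparing coefficients across packages.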

On Jun 7, 2013, at 18:51 , Richard Beckett wrote:

> 
> 
> From: peter dalgaard <pdalgd at gmail.com>
> To: Richard Beckett <rbeckett81 at yahoo.com> 
> Cc: "r-help at r-project.org" <r-help at r-project.org> 
> Sent: Friday, June 7, 2013 11:12 AM
> Subject: Re: [R] Clogit R and Stata
> 
> Here is the R output:
> 
> Call:
> coxph(formula = Surv(rep(1, 1404L), sftpcons) ~ sftptv2a3 + sftptv2a4 + 
>     sftptv2a5 + sftptv2a2 + sftptv2a6 + logim + maccat + disp4cat + 
>     strata(stratida), data = dframe, method = "exact")
> 
>   n= 1404, number of events= 351 
> 
>              coef exp(coef) se(coef)      z Pr(>|z|)    
> sftptv2a3  1.4552    4.2852   0.2273  6.401 1.54e-10 ***
> sftptv2a4  3.1118   22.4609   0.2265 13.739  < 2e-16 ***
> sftptv2a5  1.0717    2.9204   0.2522  4.249 2.15e-05 ***
> sftptv2a2  0.7185    2.0514   0.3300  2.177   0.0295 *  
> sftptv2a6  2.7341   15.3965   0.5050  5.414 6.17e-08 ***
> logim      0.7579    2.1338   0.1347  5.625 1.85e-08 ***
> maccat     3.0809   21.7771   0.4005  7.693 1.43e-14 ***
> disp4cat   0.7061    2.0261   0.1524  4.634 3.59e-06 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> 
>           exp(coef) exp(-coef) lower .95 upper .95
> sftptv2a3     4.285    0.23336     2.745     6.691
> sftptv2a4    22.461    0.04452    14.409    35.013
> sftptv2a5     2.920    0.34241     1.781     4.788
> sftptv2a2     2.051    0.48747     1.074     3.917
> sftptv2a6    15.397    0.06495     5.722    41.429
> logim         2.134    0.46866     1.639     2.779
> maccat       21.777    0.04592     9.934    47.739
> disp4cat      2.026    0.49355     1.503     2.731
> 
> Rsquare= 0.239   (max possible= 0.623 )
> Likelihood ratio test= 383.2  on 8 df,   p=0
> Wald test            = 264.7  on 8 df,   p=0
> Score (logrank) test = 396.2  on 8 df,   p=0
> 
> 
> And the STATA output:
> 
> Iteration 0:   log likelihood = -95.537697  
> Iteration 1:   log likelihood = -91.465581  
> Iteration 2:   log likelihood = -91.402366  
> Iteration 3:   log likelihood = -91.402264  
> Iteration 4:   log likelihood = -91.402264  
> 
> Conditional (fixed-effects) logistic regression   Number of obs   =        468
>                                                   LR chi2(8)      =     141.59
>                                                   Prob > chi2     =     0.0000
> Log likelihood = -91.402264                       Pseudo R2       =     0.4365
> 
> ------------------------------------------------------------------------------
>     sftpcons |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>    sftptv2a3 |   2.042827   .4741327     4.31   0.000     1.113544     2.97211
>    sftptv2a4 |    4.10828   .5593723     7.34   0.000      3.01193    5.204629
>    sftptv2a5 |   1.766492   .5585173     3.16   0.002     .6718177    2.861165
>    sftptv2a2 |   1.366568   .6540307     2.09   0.037      .084691    2.648444
>    sftptv2a6 |   2.307152   .8225835     2.80   0.005     .6949178    3.919386
>        logim |   1.404135   .3480976     4.03   0.000     .7218764    2.086394
>       maccat |     2.8423   .7008588     4.06   0.000     1.468642    4.215958
>     disp4cat |   .6347805   .2872258     2.21   0.027     .0718283    1.197733
> ------------------------------------------------------------------------------
> 
> I also tried changing to method="approximate", with no noticeable change.
> 
> On Jun 7, 2013, at 15:34 , Richard Beckett wrote:
> 
> > Sorry to write yet another message, but I'm stumped again and am having no luck finding a solution anywhere else.
> > 
> > 
> > This question requires some finesse in both R and STATA so hopefully I will be able to get an answer here. I am much more adept in R and am trying to replicate the results of a STATA file in R. Hopefully this is a proper forum for such questions. 
> > 
> > 
> > This is the code for the clogit in STATA:
> > clogit sftpcons sftptv2a3 sftptv2a4 sftptv2a5 sftptv2a2 sftptv2a6 logim maccat disp4cat if sample==1 & glb_ind=="Y", group(stratida)
> > and I tried to replicate it in R using:
> > clogit1<-clogit(sftpcons~sftptv2a3+sftptv2a4+sftptv2a5+sftptv2a2+sftptv2a6+logim+maccat+disp4cat+strata(stratida), dframe, sample==1 | glb_ind=="Y")
> > but got different results.
> > What did I do wrong here? I read the STATA command as: run this logit only on rows where sample==1 and glb_ind=="Y". What should I be doing instead?
> 
> 
> An "&" rather than "|" in the R version might help. Other than that, we're a bit short on clues unless you provide some output.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com