[R] Error when running Conditional Logit Model

Hien Nguyen hien.nmsu at gmail.com
Fri Dec 18 20:46:52 CET 2009


Dear Drs Winsemius and Berry,

Thanks a lot for your comments and suggestions on running my model. I am 
not just new to R but new to CLM as well. :( With your suggestions, I 
figured out that I had seriously misunderstood both the model and the 
data arrangement.

After my finals, I reread the related material on CLM and rearranged 
the data appropriately before running the model in R. This time, I have 
a data set of more than 250,000 observations (created from more than 
4000 responses) and a model with 15 predictors.
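
A minimal sketch of the kind of check that can be run on rearranged (long-format) data before calling clogit, assuming one row per alternative and a 0/1 indicator for the chosen one. The data frame `toy` and the column names `set_id`, `chosen` and `x` are made up for illustration; they are not the actual variables discussed in this thread.

```r
# Toy long-format data: 3 choice sets (strata) with 4 alternatives each.
# All names here (toy, set_id, chosen, x) are hypothetical.
toy <- data.frame(
  set_id = rep(1:3, each = 4),
  chosen = c(0, 1, 0, 0,   0, 0, 0, 1,   1, 0, 0, 0),
  x      = c(0.2, 1.5, 0.3, 0.9,   1.1, 0.4, 0.8, 2.0,   1.7, 0.6, 0.1, 0.5)
)

# The status passed to clogit() must be 0/1 (or logical); NA or fractional
# values trigger the "Invalid status value" warning seen earlier.
status_ok <- all(toy$chosen %in% c(0, 1))

# Count events per stratum. A stratum with no events, or one where every
# row is an event, contributes nothing to the conditional likelihood.
events_per_set <- tapply(toy$chosen, toy$set_id, sum)
rows_per_set   <- tapply(toy$chosen, toy$set_id, length)
informative    <- events_per_set > 0 & events_per_set < rows_per_set

print(status_ok)
print(table(events_per_set))
print(all(informative))
```

If `status_ok` is FALSE, the response still needs to be recoded to a 0/1 or logical status before fitting.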

My question is how long the clogit command should take to run. It has 
been running for more than 10 hours on a quad-core computer and still 
shows no sign of finishing. Is this normal, or is my command simply not 
working?
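
One way to get a feel for the run time is to fit a subset of strata first and compare the `method` options of clogit. The sketch below uses the infert data from earlier in this thread as a stand-in for the real data set. The exact likelihood is documented as potentially very slow when a stratum contains many events, because the denominator sums over every way of choosing that many rows; whether that is the cause of the long run described here cannot be told from the thread.

```r
library(survival)

# infert ships with R and was used earlier in this thread as a test case.
data(infert)

# Fit a subset of the strata first; the cut-off 40 is arbitrary.
small <- subset(infert, stratum <= 40)

# Time the exact conditional likelihood ...
t_exact <- system.time(
  fit_exact <- clogit(case ~ spontaneous + induced + strata(stratum),
                      data = small, method = "exact")
)

# ... and the Breslow approximation, which avoids the combinatorial sum.
t_approx <- system.time(
  fit_approx <- clogit(case ~ spontaneous + induced + strata(stratum),
                       data = small, method = "approximate")
)

print(rbind(exact = t_exact[1:3], approximate = t_approx[1:3]))
print(cbind(exact = coef(fit_exact), approximate = coef(fit_approx)))
```

In infert every stratum has a single case, so the two likelihoods coincide and the coefficients should agree. With several events per stratum the approximation trades some accuracy for speed, so repeating the timing on progressively larger subsets gives a rough idea of how the full fit will scale.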

Thanks a lot for your response

Hien


Charles C. Berry wrote:
> On Fri, 4 Dec 2009, David Winsemius wrote:
>
>>
>> On Dec 4, 2009, at 5:49 PM, Hien Nguyen wrote:
>>
>>> Dear Dr. Winsemius,
>>>
>>> Thank you very much for your reply.
>>>
>>> I have tried many possible combinations (even with the model of only 
>>> 2 predictors) but it produces the same message. With more than 4000 
>>> observations, I think 14 predictors might not be too many.
>>
>> It is what happens in the factor combinations that concern me. I am 
>> guessing that some of those predictors are factors. You really should 
>> not ask r-help questions without providing better descriptions of 
>> both the outcomes and the predictor variables.
>>
>>>
>>> Although my dependent variable (Pin) is not discrete  (it ranges 
>>> from 0 to 1), I do not think it will create problems to the 
>>> estimation but I'm not sure
>>
>> I would think it _would_ cause problems. As I understand it, 
>> conditional methods create contingency tables. Why are you using an 
>> outcome type that is not consistent with the fundamental regression 
>> assumptions of the clogit function?
>>
>> I do not get that particular error when I munge the infert dataset to 
>> have case be a random uniform value, but I do get an error.
>>>  infert$case <- runif(nrow(infert))
>>>  clogit(case~spontaneous+induced+strata(stratum),data=infert)
>> Error in Surv(rep(1, 248L), case) : Invalid status value
>>
>
> David, I think you were on the right track. I get this:
>
> -----------
>> clogit(I(case*runif(length(case)))~spontaneous+induced+strata(ifelse(stratum>40,NA,stratum)),data=infert) 
>>
> Error in fitter(X, Y, strats, offset, init, control, weights = 
> weights,  :
>   NA/NaN/Inf in foreign function call (arg 6)
> In addition: Warning messages:
> 1: In Surv(rep(1, 248L), I(case * runif(length(case)))) :
>   Invalid status value, converted to NA
> 2: In fitter(X, Y, strats, offset, init, control, weights = weights,  :
>   Ran out of iterations and did not converge
>>
> ------------
>
> which looks pretty much the same as Hien's error msg
>
> So Hien needs to create a logical status value.
>
> Chuck
>
> p.s.
>
>> sessionInfo()
> R version 2.10.0 (2009-10-26)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] splines   stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] survival_2.35-7
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.0
>>
>
>
>> So I certainly would not have proceeded to submit a full analysis to 
>> clogit if I could not get a test case to run under the situation you 
>> propose.
>>
>> -- 
>> David
>>
>>>
>>> I have checked the collinearity among predictors and they are all < 
>>> 0.5 (which I think is OK). Do you know what else could make this 
>>> errors?
>>>
>>> Thanks a lot
>>>
>>> Hien Nguyen
>>>
>>> David Winsemius wrote:
>>> > On Dec 4, 2009, at 9:22 AM, Hien Nguyen wrote:
>>> >
>>> > > Dear R-helpers,
>>> > >
>>> > > I am very new to R and trying to run the conditional logit model
>>> > > using "clogit " command.
>>> > > I have more than 4000 observations in my dataset and try to
>>> > > predict the dependent variable from 14 independent variables. My
>>> > > command is as follows
>>> > >
>>> > > clmtest1 <-
>>> > > clogit(Pin~Income+Bus+Pop+Urbpro+Health+Student+Grad+NE+NW+NCC+SCC+CH+SE+MRD+strata(IDD),data=clmdata)
>>> > >
>>> > > However, it produces the following errors:
>>> > >
>>> > > Error in fitter(X, Y, strats, offset, init, control, weights = weights,  :
>>> > >   NA/NaN/Inf in foreign function call (arg 6)
>>> > > In addition: Warning messages:
>>> > > 1: In Surv(rep(1, 4096L), Pinmig) : Invalid status value, converted to NA
>>> > > 2: In fitter(X, Y, strats, offset, init, control, weights = weights,  :
>>> > >   Ran out of iterations and did not converge
>>> > >
>>> > > I search the error message from R forums but it does not say
>>> > > anything for Conditional Logit Model.
>>> >
>>> > With that many predictors in a small dataset, you may have created
>>> > matrix singularities. Perhaps you created a stratum where all of the
>>> > subjects experience the event and others where none did so. The
>>> > coefficients might be driven to infinities. Try simplifying the model.
>>> >
>>> > > Please check for me what it says and what should I do to solve it.
>>> >
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive 
> Medicine
> E mailto:cberry at tajo.ucsd.edu                UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 
> 92093-0901
>
>



