[R] Error when running Conditional Logit Model
Hien Nguyen
hien.nmsu at gmail.com
Sat Dec 19 01:39:37 CET 2009
Thanks a lot for answering my questions.
I have tried running clogit with only 64 observations and 4
independent variables, and the results come back instantly. However,
when I run the same command (still with only 4 independent variables)
on the full data, it has now been running for 50 minutes. :(
Thomas, what do you mean by "maximizing the unconditional likelihood is
fine when the stratum sizes are large"? What I put in "strata (__)" is
actually the possible choices (1-64). Each choice is recorded more
than 4000 times (which means I have more than 4000 values of 1, 4000
values of 2 and so on).
Does it sound right?
Thanks a lot
Hien
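
The "unconditional likelihood" Thomas mentions can be sketched as an
ordinary logistic regression with one intercept per stratum. The example
below is only an illustration on simulated data: the names (sim, x,
stratum, case), the four strata of 2000 records each, and the true
coefficient of 1 are all assumptions, not the data from this thread.

```r
# Hypothetical simulated data: a few large strata, many case = 1 records in each
set.seed(1)
n_strata <- 4
n_per <- 2000
stratum <- rep(seq_len(n_strata), each = n_per)
x <- rnorm(n_strata * n_per)

# Assumed stratum-specific intercepts and a common slope for x
alpha <- c(-1, -0.5, 0, 0.5)
beta_true <- 1
p <- plogis(alpha[stratum] + beta_true * x)
case <- rbinom(length(p), size = 1, prob = p)  # status must be 0/1
sim <- data.frame(case, x, stratum)

# How many case = 1 records does each stratum hold?
cases_per_stratum <- tapply(sim$case, sim$stratum, sum)

# Unconditional fit: the stratum enters as a factor, so each stratum
# gets its own intercept and the likelihood is maximized by glm()
fit_uncond <- glm(case ~ x + factor(stratum), family = binomial, data = sim)
beta_hat <- unname(coef(fit_uncond)["x"])
```

With a handful of large strata, glm() should return quickly, and
cases_per_stratum shows the count of case = 1 records per stratum, which
is the quantity Thomas says makes clogit slow.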
tlumley at u.washington.edu wrote:
> On Fri, 18 Dec 2009, Hien Nguyen wrote:
>
>> Dear Drs Winsemius and Berry,
>>
>> Thanks a lot for your comments and suggestions on running my model. I
>> am not just new to R but new to CLM as well. :( Thanks to your
>> suggestions, I figured out that I had seriously misunderstood the
>> model and the data arrangement.
>>
>> After my finals, I reread the related materials on CLM and
>> rearranged the data appropriately before running the model in R. This
>> time, I have a dataset of more than 250,000 observations (created from
>> more than 4000 responses) and a model with 15 predictors.
>>
>> My question is how long the clogit command should take to run. It has
>> been running for more than 10 hours on a quad-core computer and still
>> shows no sign of being done or almost done. Is that OK, or does my
>> command just not work?
>
> If you have a lot of records with case=1 in a stratum, conditional
> logistic regression will be extremely slow. And unnecessary:
> maximizing the unconditional likelihood is fine when the stratum sizes
> are large.
>
> Note that a quad-core computer won't help. Only one core will be used
> in the computations.
>
> -thomas
>
>> Thanks a lot for your response
>>
>> Hien
>>
>>
>> Charles C. Berry wrote:
>>> On Fri, 4 Dec 2009, David Winsemius wrote:
>>>
>>>>
>>>> On Dec 4, 2009, at 5:49 PM, Hien Nguyen wrote:
>>>>
>>>>> Dear Dr. Winsemius,
>>>>>
>>>>> Thank you very much for your reply.
>>>>>
>>>>> I have tried many possible combinations (even with the model of
>>>>> only 2 predictors) but it produces the same message. With more
>>>>> than 4000 observations, I think 14 predictors might not be too many.
>>>>
>>>> It is what happens in the factor combinations that concerns me. I am
>>>> guessing that some of those predictors are factors. You really
>>>> should not ask r-help questions without providing better
>>>> descriptions of both the outcomes and the predictor variables.
>>>>
>>>>>
>>>>> Although my dependent variable (Pin) is not discrete (it ranges
>>>>> from 0 to 1), I do not think it will create problems to the
>>>>> estimation but I'm not sure
>>>>
>>>> I would think it _would_ cause problems. As I understand it,
>>>> conditional methods create contingency tables. Why are you using an
>>>> outcome type that is not consistent with the fundamental regression
>>>> assumptions of the clogit function?
>>>>
>>>> I do not get that particular error when I munge the infert dataset
>>>> to have case be a random uniform value, but I do get an error.
>>>> > infert$case <- runif(nrow(infert))
>>>> > clogit(case~spontaneous+induced+strata(stratum),data=infert)
>>>> Error in Surv(rep(1, 248L), case) : Invalid status value
>>>>
>>>
>>> David, I think you were on the right track. I get this:
>>>
>>> -----------
>>> > clogit(I(case*runif(length(case)))~spontaneous+induced+strata(ifelse(stratum>40,NA,stratum)),data=infert)
>>>
>>> Error in fitter(X, Y, strats, offset, init, control, weights = weights, :
>>>   NA/NaN/Inf in foreign function call (arg 6)
>>> In addition: Warning messages:
>>> 1: In Surv(rep(1, 248L), I(case * runif(length(case)))) :
>>>   Invalid status value, converted to NA
>>> 2: In fitter(X, Y, strats, offset, init, control, weights = weights, :
>>>   Ran out of iterations and did not converge
>>> >
>>> ------------
>>>
>>> which looks pretty much the same as Hien's error msg
>>>
>>> So Hien needs to create a logical status value.
>>>
>>> Chuck
>>>
>>> p.s.
>>>
>>> > sessionInfo()
>>> R version 2.10.0 (2009-10-26)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> [1] LC_COLLATE=English_United States.1252
>>> [2] LC_CTYPE=English_United States.1252
>>> [3] LC_MONETARY=English_United States.1252
>>> [4] LC_NUMERIC=C
>>> [5] LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] splines stats graphics grDevices utils datasets methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] survival_2.35-7
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.10.0
>>> >
>>>
>>>
>>>> So I certainly would not have proceeded to submit a full analysis
>>>> to clogit if I could not get a test case to run under the situation
>>>> you propose.
>>>>
>>>> --
>>>> David
>>>>
>>>>>
>>>>> I have checked the collinearity among the predictors and the
>>>>> correlations are all < 0.5 (which I think is OK). Do you know what
>>>>> else could cause this error?
>>>>>
>>>>> Thanks a lot
>>>>>
>>>>> Hien Nguyen
>>>>>
>>>>> David Winsemius wrote:
>>>>> > On Dec 4, 2009, at 9:22 AM, Hien Nguyen wrote:
>>>>> >
>>>>> >> Dear R-helpers,
>>>>> >>
>>>>> >> I am very new to R and trying to run the conditional logit model
>>>>> >> using the "clogit" command.
>>>>> >> I have more than 4000 observations in my dataset and am trying to
>>>>> >> predict the dependent variable from 14 independent variables. My
>>>>> >> command is as follows:
>>>>> >>
>>>>> >> clmtest1 <- clogit(Pin~Income+Bus+Pop+Urbpro+Health+Student+Grad+NE+NW+NCC+SCC+CH+SE+MRD+strata(IDD),data=clmdata)
>>>>> >>
>>>>> >> However, it produces the following errors:
>>>>> >>
>>>>> >> Error in fitter(X, Y, strats, offset, init, control, weights = weights, :
>>>>> >>   NA/NaN/Inf in foreign function call (arg 6)
>>>>> >> In addition: Warning messages:
>>>>> >> 1: In Surv(rep(1, 4096L), Pinmig) :
>>>>> >>   Invalid status value, converted to NA
>>>>> >> 2: In fitter(X, Y, strats, offset, init, control, weights = weights, :
>>>>> >>   Ran out of iterations and did not converge
>>>>> >>
>>>>> >> I searched the R forums for the error message, but found nothing
>>>>> >> about it for the Conditional Logit Model.
>>>>> >
>>>>> > With that many predictors in a small dataset, you may have created
>>>>> > matrix singularities. Perhaps you created a stratum where all of
>>>>> > the subjects experience the event and others where none did so.
>>>>> > The coefficients might be driven to infinities. Try simplifying
>>>>> > the model.
>>>>> >
>>>>> >> Please check for me what it says and what I should do to solve it.
>>>>> >
>>>>
>>>> David Winsemius, MD
>>>> Heritage Laboratories
>>>> West Hartford, CT
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> Charles C. Berry                            (858) 534-2098
>>>                                             Dept of Family/Preventive Medicine
>>> E mailto:cberry at tajo.ucsd.edu            UC San Diego
>>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>>>
>>>
>>
>
> Thomas Lumley                    Assoc. Professor, Biostatistics
> tlumley at u.washington.edu      University of Washington, Seattle
>