[R] logistic regression

ripley at stats.ox.ac.uk
Fri Mar 14 16:11:13 CET 2003


On Fri, 14 Mar 2003, orkun wrote:

> 1*
> I need to use logistic regression, but
> my data file is very large (approx. 4 million lines).
> R doesn't handle such a file.
> What can I do?

R does handle such files (which are tiny by data-mining standards): you
just need to put 1GB or 2GB of memory in your computer.

> ------------------------
> 2*
> So I wondered whether I could perform statistical analyses on summarised
> data (counts of yes/no values) from the huge file. Normally, a summarised
> data file is short and R can handle it.
> Then I used this command:
>  > lo <- glm(hey.count ~ as.factor(jeo) + as.factor(eg) + as.factor(kon) +
>        as.factor(yol) + as.factor(aks) + as.factor(fay),
>        family = poisson, data = dt2)
> 
> As you can see, I used the count of yes/no values as independent data.
> 
> Is it a good idea to use this method instead of binomial logistic regression?

No, but it would be a good idea to use binomial logistic regression (and 
not Bernoulli logistic regression).  That is, to collapse the data to 
success/failure counts over the cross-classification of the factors,
and use family=binomial.
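
A minimal sketch of that collapsing step, under stated assumptions: the raw
file is taken to have one row per case with a 0/1 response, here given the
hypothetical name hey, and only two of the factors (jeo and eg) are used to
keep it short. The data are simulated stand-ins, not the poster's file. The
Bernoulli fit on the raw rows is included only to illustrate that the
collapsed binomial fit should give the same coefficient estimates up to
numerical tolerance.

```r
# Simulated stand-in for the large file: one row per case, a binary
# outcome `hey` (hypothetical name) and two factors. Sizes are arbitrary.
set.seed(1)
n <- 10000
raw <- data.frame(
  jeo = factor(sample(1:3, n, replace = TRUE)),
  eg  = factor(sample(1:2, n, replace = TRUE))
)
eta <- -0.5 + 0.4 * (raw$jeo == "2") + 0.8 * (raw$jeo == "3") -
  0.3 * (raw$eg == "2")
raw$hey <- rbinom(n, 1, plogis(eta))

# Collapse to success/failure counts over the cross-classification
# of the factors: one row per (jeo, eg) cell.
raw$yes <- raw$hey
raw$no  <- 1 - raw$hey
dt2 <- aggregate(cbind(yes, no) ~ jeo + eg, data = raw, FUN = sum)

# Binomial logistic regression on the collapsed table, using a
# two-column (successes, failures) response.
fit_grouped <- glm(cbind(yes, no) ~ jeo + eg, family = binomial, data = dt2)

# Bernoulli logistic regression on the raw rows, for comparison only.
fit_raw <- glm(hey ~ jeo + eg, family = binomial, data = raw)

# Compare the coefficient estimates from the two fits.
all.equal(coef(fit_grouped), coef(fit_raw), tolerance = 1e-4)
```

The collapsed table has at most one row per combination of factor levels, so
its size depends on the number of levels and not on the number of cases in
the original file.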

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


