[R] logistic regression
ripley at stats.ox.ac.uk
Fri Mar 14 16:11:13 CET 2003
On Fri, 14 Mar 2003, orkun wrote:
> I need to use logistic regression, but
> my data file is huge (approx. 4 million lines).
> R doesn't handle such a file.
> What can I do?
R does handle such files (which are tiny by data-mining standards): you
just need to put 1GB or 2GB of memory in your computer.
> So I wondered whether I could perform statistical analyses on summarised
> data (counts of yes/no values) from the huge file. The summarised
> data file is short, and R can handle it.
> So I used this command:
> > lo <- glm(hey.count ~ as.factor(jeo) + as.factor(eg) + as.factor(kon) +
>       as.factor(yol) + as.factor(aks) + as.factor(fay), family = poisson, data = dt2)
> As you see, I used the count of yes/no values as the dependent variable.
> Is it a good idea to use this method instead of binomial logistic regression?
No, but it would be a good idea to use binomial logistic regression (and
not Bernoulli logistic regression). That is, to collapse the data to
success/failure counts over the cross-classification of the factors,
and use family=binomial.
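
A minimal sketch of that collapse, on simulated data rather than the poster's dt2. The data frame dt, the 0/1 outcome y, the helper columns yes and no, and the use of only two of the factors are all assumptions made for illustration. It aggregates the rows to success/failure counts per cell and checks that the grouped binomial fit gives the same coefficients as the row-level (Bernoulli) fit.

```r
# Simulated stand-in for the large file: one row per case, y is 0/1.
# (Hypothetical data; only the factor names jeo and eg come from the post.)
set.seed(1)
n <- 2000
dt <- data.frame(
  jeo = factor(sample(1:3, n, replace = TRUE)),
  eg  = factor(sample(1:2, n, replace = TRUE)),
  y   = rbinom(n, 1, 0.3)
)

# Success and failure indicators, so that both can be summed per cell.
dt$yes <- dt$y
dt$no  <- 1 - dt$y

# Collapse to one row per combination of the factors,
# holding the number of successes and failures in that cell.
cells <- aggregate(cbind(yes, no) ~ jeo + eg, data = dt, FUN = sum)

# Binomial logistic regression on the collapsed counts.
fit_cells <- glm(cbind(yes, no) ~ jeo + eg, family = binomial, data = cells)

# Bernoulli logistic regression on the individual rows, for comparison.
fit_rows <- glm(y ~ jeo + eg, family = binomial, data = dt)

# The two fits should agree on the coefficients up to numerical tolerance.
print(all.equal(coef(fit_cells), coef(fit_rows), tolerance = 1e-4))
```

The collapsed table has at most one row per cell of the cross-classification, however many rows the original file has, so the glm() call on it needs very little memory.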
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595