[R] logistic regression

Frank E Harrell Jr fharrell at virginia.edu
Fri Mar 14 17:42:02 CET 2003


On Fri, 14 Mar 2003 16:48:37 +0200
orkun <temiz at deprem.gov.tr> wrote:

> Hello
> 1*
> I need to use logistic regression, but
> my data file is very large (approx. 4 million lines).
> R cannot handle such a file.
> What can I do?
> ------------------------
> 2*
> So I wondered whether I could perform the statistical analysis on a
> summarised version of the huge file (counts of yes/no values). The
> summarised data file is short, so R can handle it.
> I then used this command:
>  > lo <- glm(hey.count ~ as.factor(jeo) + as.factor(eg) + as.factor(kon) +
>      as.factor(yol) + as.factor(aks) + as.factor(fay), family = poisson, data = dt2)
> 
> As you can see, I used the count of yes/no values as the dependent (response) variable.
> 
> Is it a good idea to use this method instead of binomial logistic regression?
> 
> What else would you suggest?
> 
> thanks in advance
> 
> -- 
> Ahmet Temiz
> Geological Engineer
> General Directorate
> of Disaster Affairs
> TURKEY

If you have no more than one continuous variable, you can pre-process the data (outside of R) to collapse it into frequency counts, with one row per distinct combination of predictor values.  I did not check whether glm handles frequency case weights.  The lrm function in the Design package (http://hesweb1.med.virginia.edu/biostat/s/Design.html) does.
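
A minimal sketch of the collapsing idea, under stated assumptions: the data below are simulated stand-ins (only two of the poster's factors, jeo and eg, with made-up levels and effects), and the collapsing is done inside R with aggregate() purely for illustration, whereas for 4 million lines it would be done outside R as described above. It fits a binomial logistic model to the yes/no counts via glm's two-column response, and also via glm's weights argument, which in this sketch reproduces the same coefficients.

```r
# Simulated stand-in for the large file: one row per observation.
# Factor levels and effect sizes are hypothetical, chosen only for illustration.
set.seed(1)
n <- 20000
raw <- data.frame(
  jeo = factor(sample(1:3, n, replace = TRUE)),
  eg  = factor(sample(1:4, n, replace = TRUE))
)
eta <- -1 + 0.5 * (raw$jeo == "2") - 0.7 * (raw$eg == "3")
raw$hey <- rbinom(n, 1, plogis(eta))   # 0/1 response

# Collapse to one row per covariate pattern, with yes and no counts.
raw$yes <- raw$hey
raw$no  <- 1 - raw$hey
agg <- aggregate(cbind(yes, no) ~ jeo + eg, data = raw, FUN = sum)

# Logistic regression on the raw 0/1 data (what we want to avoid at full size).
fit_raw <- glm(hey ~ jeo + eg, family = binomial, data = raw)

# Same model on the collapsed table: two-column (successes, failures) response.
fit_agg <- glm(cbind(yes, no) ~ jeo + eg, family = binomial, data = agg)

# Same model again using frequency case weights: one row per
# (covariate pattern, outcome) with the count as the weight.
long <- rbind(
  data.frame(agg[c("jeo", "eg")], y = 1, freq = agg$yes),
  data.frame(agg[c("jeo", "eg")], y = 0, freq = agg$no)
)
long <- long[long$freq > 0, ]          # drop empty cells
fit_wt <- glm(y ~ jeo + eg, family = binomial, data = long, weights = freq)

# Compare coefficient estimates from the three fits side by side.
print(cbind(raw = coef(fit_raw), counts = coef(fit_agg), weights = coef(fit_wt)))
```

The collapsed table here has at most 12 rows regardless of n, which is why this approach scales when all predictors are categorical.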
-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat


