[R] regression on large file

Benilton Carvalho bcarvalh at jhsph.edu
Wed Oct 28 15:32:58 CET 2009


The bigmemory and biglm packages may be of interest to you.
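
Independently of those packages, part of the slowness in the quoted loop probably comes from calling scan() with skip=(i-1) on a file name, which reopens the file and reads past all earlier lines on every iteration. Below is a minimal base-R sketch of reading the rows through a single open connection instead. The helper name process_rows, its arguments, and the toy data are made up for illustration; it assumes each row is comma-separated with the variable name in the first field, as in the quoted code.

```r
# Hypothetical helper: apply FUN(name, values) to every row of a
# comma-separated file, reading through ONE open connection so that
# earlier lines are never re-read.
process_rows <- function(path, FUN, skip = 0) {
  con <- file(path, open = "r")
  on.exit(close(con))                      # always release the connection
  if (skip > 0) readLines(con, n = skip)   # discard leading lines once
  results <- list()
  repeat {
    line <- readLines(con, n = 1)          # continues where the last read stopped
    if (length(line) == 0) break           # end of file reached
    fields <- strsplit(line, ",", fixed = TRUE)[[1]]
    results[[length(results) + 1]] <- FUN(fields[1], as.numeric(fields[-1]))
  }
  results
}

# Toy usage with made-up data: one variable per row, name in the first field
tmp <- tempfile(fileext = ".csv")
writeLines(c("v1,1,2,3", "v2,4,5,6"), tmp)
row_means <- process_rows(tmp, function(name, d) mean(d))
```

In the real loop, FUN would fit the models for one row and return the p-values. Any model that does not involve d could also be fitted once, before the loop, rather than once per row.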

b

On Oct 28, 2009, at 8:50 AM, Georg Ehret wrote:

> Dear R community,
>   I have a fairly large file with variables in rows. Every variable
> (thousands) needs to be regressed on a reference variable. The file is too
> big to load into R (or R gets too slow having done it), so I now read it in
> line by line with "scan" (see below) and write the results to out. Although
> improved, this is still very slow... Can someone please help me and suggest
> how I can make this faster?
>
> Thank you and best regards, Georg.
> *******************************************
> Georg Ehret, Johns Hopkins U, Baltimore MD, USA
>
>
> for (i in 16:nmax){
>        # read row i only; the first field is the variable name
>        line <- scan(file=paste(file),nlines=1,skip=(i-1),what="character",sep=",")
>        d<-as.numeric(line[-1])
>        name<-line[1]
>        modela <- lm(s1~a+a2+b+s+M+W)
>        modelb <- lm(s2~a+a2+b+s+M+W+d)
>        # "a+2" in the original post is assumed to be a typo for "a2"
>        modelc <- lm(s3~a+a2+b+s+M+W+d+d*s)
>        # p-value of each model comparison, column named in full
>        p_main <- anova(modela,modelb)$"Pr(>F)"[2]
>        p_main_i <- anova(modela,modelc)$"Pr(>F)"[2]
>        p_i <- anova(modelb,modelc)$"Pr(>F)"[2]
>        # append one result line per variable
>        cat(c(name,p_main,p_main_i,p_i),file="out.txt",append=TRUE)
>        cat("\n",file="out.txt",append=TRUE)
> }



