[R] regression on large file
Benilton Carvalho
bcarvalh at jhsph.edu
Wed Oct 28 15:32:58 CET 2009
The bigmemory and biglm packages may be of interest to you.
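Separately from those packages, part of the slowness in the quoted loop probably comes from calling scan() with skip=(i-1), which starts from the top of the file on every iteration. Below is a minimal base-R sketch of reading the rows through a single open connection instead. The function name process_rows, the toy data, and the single stand-in model lm(ref ~ d) are assumptions made for illustration and are not the poster's actual models.

```r
# Sketch: read a comma-separated file row by row through ONE open connection,
# so each read continues where the previous one stopped instead of skipping
# i - 1 lines from the top of the file again.
# process_rows, ref, and the toy data below are hypothetical names.

process_rows <- function(path, outpath, ref, skip = 0) {
  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)
  out <- file(outpath, open = "w")       # opened once, not re-appended per row
  on.exit(close(out), add = TRUE)

  if (skip > 0) readLines(con, n = skip) # discard leading rows a single time

  repeat {
    txt <- readLines(con, n = 1)
    if (length(txt) == 0) break          # end of file reached
    fields <- strsplit(txt, ",", fixed = TRUE)[[1]]
    name <- fields[1]
    d <- as.numeric(fields[-1])
    fit <- lm(ref ~ d)                   # stand-in for the real models
    p <- summary(fit)$coefficients["d", "Pr(>|t|)"]
    writeLines(paste(name, p), out)
  }
  invisible(NULL)
}

# Toy demonstration: 5 variables in rows, 20 observations each.
set.seed(1)
ref <- rnorm(20)
tmp_in <- tempfile(fileext = ".csv")
tmp_out <- tempfile(fileext = ".txt")
rows <- sapply(1:5, function(i) {
  paste(c(paste0("var", i), round(rnorm(20), 3)), collapse = ",")
})
writeLines(rows, tmp_in)

process_rows(tmp_in, tmp_out, ref)
res <- readLines(tmp_out)
print(res)
```

In the same spirit, a model that does not depend on the row just read (modela in the quoted code) could be fitted once before the loop rather than on every iteration.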
b
On Oct 28, 2009, at 8:50 AM, Georg Ehret wrote:
> Dear R community,
> I have a fairly large file with variables in rows. Every variable
> (there are thousands) needs to be regressed on a reference variable.
> The file is too big to load into R (or R becomes too slow once it is
> loaded), so I now read it line by line with "scan" (see below) and
> write the results to "out.txt". Although this is an improvement, it is
> still very slow. Can someone please help me and suggest how I can make
> this faster?
>
> Thank you and best regards, Georg.
> *******************************************
> Georg Ehret, Johns Hopkins U, Baltimore MD, USA
>
>
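> for (i in 16:nmax) {
>   # Read row i only; skip = i - 1 makes scan() start from the top each time
>   line <- scan(file = file, nlines = 1, skip = i - 1, what = "character", sep = ",")
>   d <- as.numeric(line[-1])   # remaining fields: the values
>   name <- line[1]             # first field: the variable name
>   modela <- lm(s1 ~ a + a2 + b + s + M + W)
>   modelb <- lm(s2 ~ a + a2 + b + s + M + W + d)
>   # "a+2" in the original post is read here as a typo for "a+a2"
>   modelc <- lm(s3 ~ a + a2 + b + s + M + W + d + d * s)
>   p_main <- anova(modela, modelb)$P[2]
>   p_main_i <- anova(modela, modelc)$P[2]
>   p_i <- anova(modelb, modelc)$P[2]
>
>   cat(c(name, p_main, p_main_i, p_i), file = "out.txt", append = TRUE)
>   cat("\n", file = "out.txt", append = TRUE)
> }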
>