[R] big data

Dirk Eddelbuettel edd at debian.org
Wed Sep 8 14:30:41 CEST 2010


On 8 September 2010 at 13:26, André de Boer wrote:
| I searched the internet but didn't find an answer to the following problem:
| I want to fit a glm on a CSV file with 25 columns and 4 million rows.
| Not all the columns are relevant. My problem is reading the data into R,
| manipulating it, and then fitting the glm.
| 
| I've tried with:
| 
| dd<-scan("myfile.csv",colClasses=classes)
| dat<-as.data.frame(dd)
| 
| My question is: what is the right way to do this?
| Can someone give me a hint?

Look at the biglm package by Thomas Lumley, which will allow you to fit glm
models in "chunks", so the full data set never has to sit in memory at once.
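
As a rough sketch of what chunked fitting could look like (not tested against
your file): bigglm() in biglm accepts, in place of a data frame, a function
that returns the next chunk when called with reset=FALSE, returns NULL when
the data are exhausted, and rewinds when called with reset=TRUE. Everything
specific below is an assumption made for illustration: the simulated CSV, the
column names (x1, x2, unused, y), the helper make_chunk_reader(), the chunk
size and the binomial family. Setting a column's class to "NULL" makes
read.csv() skip it, which is one way to leave out irrelevant columns.

```r
library(biglm)  # assumes the biglm package is installed

# Hypothetical stand-in for the real file: a small simulated CSV.
set.seed(1)
n <- 1000
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), unused = runif(n))
sim$y <- rbinom(n, 1, plogis(0.5 * sim$x1 - 0.25 * sim$x2))
csv_file <- tempfile(fileext = ".csv")
write.csv(sim, csv_file, row.names = FALSE)

# Hypothetical helper: builds a reader in the form bigglm() expects.
#   reset = TRUE  -> reopen the file and skip past the header line
#   reset = FALSE -> return the next chunk as a data frame, or NULL at the end
make_chunk_reader <- function(path, col_classes, chunk_rows) {
  con <- NULL
  col_names <- NULL
  function(reset = FALSE) {
    if (reset) {
      if (!is.null(con)) close(con)
      con <<- file(path, open = "r")
      # Read the header once; scan() strips the quotes around the names.
      col_names <<- scan(text = readLines(con, n = 1), what = "",
                         sep = ",", quiet = TRUE)
      return(invisible(NULL))
    }
    if (is.null(con)) return(NULL)
    # read.csv() raises an error once no lines are left; that is treated
    # as end of data here (note that this also hides other read errors).
    chunk <- tryCatch(
      read.csv(con, header = FALSE, nrows = chunk_rows,
               col.names = col_names, colClasses = col_classes),
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) {
      close(con)
      con <<- NULL
      return(NULL)
    }
    chunk
  }
}

# Column classes in file order; "NULL" drops the irrelevant column.
classes <- c("numeric", "numeric", "NULL", "integer")
reader <- make_chunk_reader(csv_file, classes, chunk_rows = 250)

# bigglm() makes repeated passes over the chunks until the fit converges.
fit <- bigglm(y ~ x1 + x2, data = reader, family = binomial())
print(summary(fit))
```

For the real file you would pass your own path, the 25 column classes in file
order, and a chunk size that fits comfortably in memory. Any manipulation of
the data can be done on each chunk inside the reader before it is returned.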

Dirk

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com


