[R] Need help: Loading a large data set.

jim holtman jholtman at gmail.com
Sun Nov 23 22:04:26 CET 2008


Which matrix package are you using?  I have not used sparse matrices,
but a quick look at the help for the Matrix package shows a file format
for reading in a sparse matrix.  I would assume that all you need to
do is read in your files and write them out in that format.  You
can do this by using 'list.files' to get the file names, reading each
file in, and using 'cat' (or any other command that will write an ASCII
file) to output the data in the correct format.
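
The steps above could be sketched roughly as follows.  This is only an
illustration under several assumptions: that the file format in question
is the MatrixMarket coordinate format read by 'readMM' in the Matrix
package, that each input file holds one "column value" pair per line,
and that the row number is the file's position in the sorted listing.
The directory, file names and sample values are made up for the demo.

```r
library(Matrix)

# Hypothetical layout: one file per row, each line is "column value".
# Three tiny demo files stand in for the real 17000+ files.
data_dir <- file.path(tempdir(), "rows_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("1 2.5", "4 1.0"), file.path(data_dir, "row0001.txt"))
writeLines(c("2 3.0"),          file.path(data_dir, "row0002.txt"))
writeLines(c("3 7.0", "4 0.5"), file.path(data_dir, "row0003.txt"))

# list.files() only returns the (sorted) file names; reading is separate.
files <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)

# Collect (row, column, value) triplets.  Assumption: no file is empty
# (read.table() stops with an error on an empty file).
triplets <- do.call(rbind, lapply(seq_along(files), function(i) {
  rec <- read.table(files[i], col.names = c("j", "x"))
  data.frame(i = i, j = rec$j, x = rec$x)
}))

# Write a MatrixMarket coordinate file: a banner line, then
# "nrow ncol nnz", then one "i j x" line per stored entry.
mm_file <- file.path(tempdir(), "rows_demo.mtx")
cat("%%MatrixMarket matrix coordinate real general\n", file = mm_file)
cat(sprintf("%d %d %d\n", length(files), max(triplets$j), nrow(triplets)),
    file = mm_file, append = TRUE)
write.table(triplets, mm_file, append = TRUE,
            row.names = FALSE, col.names = FALSE)

# Read the file back as a sparse matrix.
m <- readMM(mm_file)
```

If the intermediate file is not needed, the same triplets can be passed
straight to 'sparseMatrix(i = , j = , x = )' from the Matrix package,
which skips the write/read round trip.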

On Sun, Nov 23, 2008 at 3:19 PM, Atul Kulkarni <atulskulkarni at gmail.com> wrote:
> Hi All,
>
> I am dealing with a large data set that translates into a sparse matrix. I
> want to load this data, which is spread across more than 17000 files. Each
> file defines one row and holds a variable number of records, each giving a
> column number and the value stored there.
>
> I wish to load this data into memory from these files one by one. Is there
> any way I can do this in R before I start processing? I am sure this is not
> the first time R or the community has been confronted with this kind of
> problem, but I could not find documentation for loading data into a sparse
> matrix. I found quite a few packages for sparse matrices, but they all
> concentrate on how to do various operations with the matrix once it
> is loaded. I need to first load the data into the system before I can
> think about analysing it.
>
> Regards,
> Atul.
>
> Graduate Student,
> Department of Computer Science,
> University of Minnesota Duluth,
> Duluth, MN, 55812.
> --------
> www.d.umn.edu/~kulka053
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


