[R] Matrix Multiplication using R.

Praveen Surendran ps629 at medschl.cam.ac.uk
Thu Aug 15 12:30:29 CEST 2013


Dear Doran, Bert and Roger,

Thank you for attending to my query and for your valuable responses.

The task is slightly more complex. Here's the real case... I have genetic variation data (40,000 single nucleotide polymorphisms) from 90,000 individuals. This gives a matrix with 90,000 rows (samples) and 40,000 columns (SNPs). The matrix entries are genotype codes with values 0, 1, 2 or 3, where 0 means missing. There will be very few individuals with missing data.

The task is to identify the relatedness between these 90,000 individuals using their genetic data (0, 1, 2 or 3). These values need to be standardised before matrix multiplication. Standardisation will make the matrix much larger in memory than the 0/1/2/3 matrix, because most of the entries will become real numbers with decimals.
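A minimal sketch of the standardisation step, under stated assumptions: rows are samples, columns are SNPs, each SNP is centred and scaled separately, and missing calls (code 0) are set to the SNP mean, which is 0 after centring. The helper name `standardise_genotypes` and the mean imputation are illustrative choices only, not a prescribed method.

```r
# Hypothetical helper: standardise a genotype matrix column by column (per SNP).
# Assumptions: rows = samples, columns = SNPs, codes 1/2/3, code 0 = missing.
standardise_genotypes <- function(G) {
  storage.mode(G) <- "double"              # standardised values are real numbers
  G[G == 0] <- NA                          # treat code 0 as missing
  mu <- colMeans(G, na.rm = TRUE)          # per-SNP mean over observed calls
  G <- sweep(G, 2, mu, "-")                # centre each SNP
  s <- sqrt(colMeans(G^2, na.rm = TRUE))   # per-SNP standard deviation (divisor n)
  s[!is.finite(s) | s == 0] <- 1           # monomorphic or empty SNPs: avoid dividing by 0
  G <- sweep(G, 2, s, "/")                 # scale each SNP
  G[is.na(G)] <- 0                         # missing calls add nothing to cross-products
  G
}

# Small simulated example (20 samples, 5 SNPs)
set.seed(1)
G <- matrix(sample(0:3, 20 * 5, replace = TRUE, prob = c(0.05, 0.3, 0.4, 0.25)),
            nrow = 20)
Z <- standardise_genotypes(G)
```

Note that the integer matrix uses 4 bytes per entry in R, whereas the standardised matrix uses 8 bytes per entry, so it takes twice the memory.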

Bert, I will not be doing a 90,000 x 40,000 %*% 40,000 x 90,000 in a single step. The plan is to load this 90,000 x 40,000 matrix into R, standardise it, and then multiply it in batches: all 90,000 samples against 500 samples at a time, using all 40,000 variants. These batches would be processed in parallel to get the 90,000 x 90,000 comparisons. Does that clarify the situation?
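The batching plan could be sketched as follows. This is only an illustration on a small random matrix: the helper name `blockwise_tcrossprod` is hypothetical, and `Z` stands for an already standardised samples-by-SNPs matrix.

```r
# Hypothetical helper: compute Z %*% t(Z) one batch of samples at a time.
# Z is a standardised samples-by-SNPs matrix; block_size would be 500 in the plan above.
blockwise_tcrossprod <- function(Z, block_size = 500) {
  n <- nrow(Z)
  starts <- seq(1, n, by = block_size)
  lapply(starts, function(s) {
    idx <- s:min(s + block_size - 1, n)
    # n x length(idx) block: all samples against this batch of samples
    tcrossprod(Z, Z[idx, , drop = FALSE])
  })
}

# Small example: 12 samples, 7 variants, batches of 5 samples
set.seed(1)
Z <- matrix(rnorm(12 * 7), nrow = 12)
blocks <- blockwise_tcrossprod(Z, block_size = 5)
K <- do.call(cbind, blocks)   # binding the blocks is only feasible for small n
```

At full scale each 90,000 x 500 block of doubles is about 360 MB, while the complete 90,000 x 90,000 result would be about 64.8 GB, so each block would probably have to be written to disk or summarised rather than kept in memory. The `lapply` call could in principle be swapped for `parallel::mclapply` to process batches in parallel, at the cost of more memory per worker.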

I tried loading the 90,000 x 40,000 data as a matrix in R this morning on the cluster, with the specifications described in my previous e-mail. This crashed because R ran out of memory. I am looking into other possibilities.
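A back-of-envelope calculation of the memory involved, using only the dimensions given above and R's standard storage sizes (4 bytes per integer, 8 bytes per double). The variable names are illustrative.

```r
# Rough memory estimate for a 90,000 x 40,000 matrix held in RAM
n_samples <- 90000
n_snps    <- 40000
n_cells   <- n_samples * n_snps      # 3.6e9 entries

bytes_integer <- n_cells * 4         # raw 0/1/2/3 codes stored as integers
bytes_double  <- n_cells * 8         # standardised values stored as doubles

gib <- function(bytes) bytes / 2^30  # convert bytes to GiB
gib_integer <- gib(bytes_integer)    # roughly 13.4 GiB
gib_double  <- gib(bytes_double)     # roughly 26.8 GiB

# The matrix has more entries than the largest 32-bit integer, so it needs
# long-vector support, which was introduced in R 3.0.0
needs_long_vector <- n_cells > .Machine$integer.max
```

A single copy of the standardised matrix should therefore fit in 64 GB, but any temporary copy made while reading or standardising the data can push the total past that limit.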

Any comments or thoughts will be greatly appreciated.

Regards,

Praveen.

-----Original Message-----
From: Roger Koenker [mailto:rkoenker at illinois.edu] 
Sent: 14 August 2013 23:06
To: Praveen Surendran
Cc: r-help at r-project.org
Subject: Re: [R] Matrix Multiplication using R.

In the event that these are moderately sparse matrices, you could try Matrix or SparseM.


Roger Koenker
rkoenker at illinois.edu




On Aug 14, 2013, at 10:40 AM, Praveen Surendran wrote:

> Dear all,
> 
> I am exploring ways to perform multiplication of a 90000 x 40000 matrix with its transpose.
> As expected, even a 40000 x 100 %*% 100 x 40000 didn't work on my desktop... giving the error "Error: cannot allocate vector of length 1600000000"
> 
> However I am trying to run this on one node (64GB RAM; 2.60 GHz processor) of a high performance computing cluster.
> I would appreciate any comments on whether it's advisable to perform a matrix multiplication of this size using R, and also on any better ways to handle this task.
> 
> Kind Regards,
> 
> Praveen.
> 
> 


