[R] Error during working with wgcna and R

Peter Langfelder peter.langfelder at gmail.com
Fri May 9 20:25:05 CEST 2014


WGCNA maintainer here. When working with a large data set, you have a
few options.

1. Without being snarky, the best option is to get (or get access to)
a computer with large-enough RAM. Many universities, departments, and
other research institutes have computer clusters with nodes with at
least 64GB of memory. If you are working on your own computer under
Windows, make sure you run the 64-bit version of R and consider buying
additional RAM if you can.

2. Reduce the number of features (usually probes/probesets). This is
not always possible in general applications, but in gene expression
studies of a single tissue one would not expect more than about 10k
genes to be expressed, so using all probes of a modern array is
probably overkill - most of them will not be expressed. If you do
reduce the data, I recommend filtering out probes whose expression
values are low in a suitable fraction of the samples (the fraction
depending on the experiment design).
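
For illustration, a minimal (untested) sketch of such a filter, assuming
datExpr is the usual samples-by-genes expression matrix and that the
cutoff values are placeholders you would adapt to your platform:

  minExpr     <- 5      # hypothetical expression cutoff (log2 scale assumed)
  minFraction <- 0.25   # keep probes expressed in at least 25% of samples

  keep    <- colMeans(datExpr > minExpr) >= minFraction
  datExpr <- datExpr[, keep]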

If you can't get a computer big enough to handle even a reduced data
set, the next options are these:

3. Use blockwiseModules with an appropriately set maxBlockSize
argument. See WGCNA tutorial I, section 2c, at
http://labs.genetics.ucla.edu/horvath/htdocs/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html
and pay careful attention to the paragraphs discussing the choice of
maxBlockSize. The function blockwiseModules can be instructed to save
the TOM matrix for each block to disk; you can load the blocks later
one by one if you need to.
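
A rough sketch of such a call; the power, block size, module size, and
file base below are placeholders to be tuned to your data and hardware
(see help(blockwiseModules) for the full list of arguments):

  library(WGCNA)

  net <- blockwiseModules(datExpr,
                          power           = 8,     # soft-thresholding power
                          maxBlockSize    = 5000,  # choose to fit your RAM
                          minModuleSize   = 30,
                          saveTOMs        = TRUE,  # write per-block TOMs to disk
                          saveTOMFileBase = "riceTOM",
                          verbose         = 3)

  moduleLabels <- net$colors     # module assignment for each gene
  tomFiles     <- net$TOMFiles   # files holding each block's TOM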

4. If you want to do the analysis yourself, use the function
projectiveKMeans to pre-cluster the genes into blocks, then run the
analysis in each block separately. Remember to remove all large
objects from memory and call garbage collection to free up enough
memory before moving on to the next block.
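
A rough outline of that workflow; the preferredSize value, the power,
and the file names are placeholders, and the exact form of the
projectiveKMeans return value should be checked against its help page:

  library(WGCNA)

  # Pre-cluster genes into blocks of roughly 5000 genes each
  blockLabels <- projectiveKMeans(datExpr, preferredSize = 5000)$clusters

  for (b in sort(unique(blockLabels))) {
    blockExpr <- datExpr[, blockLabels == b]
    dissTOM   <- 1 - TOMsimilarityFromExpr(blockExpr, power = 8)

    # ... cluster the TOM, cut the dendrogram, save results for this block ...
    save(dissTOM, file = paste0("dissTOM-block", b, ".RData"))

    # Free memory before moving to the next block
    rm(blockExpr, dissTOM)
    gc()
  }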

HTH,

Peter

On Fri, May 9, 2014 at 5:37 AM, KK <kidist.kibret at gmail.com> wrote:
> I am also working on co-expression analysis. It seems like there is no way
> to use TOMsimilarityFromExpr for large datasets.
> The option 'maxBlockSize' exists for module detection but not for
> TOMsimilarity? The only solution seems to be to reduce the dataset.
>
> On Sunday, July 8, 2012 1:02:54 PM UTC+2, deeksha.malhan wrote:
>>
>> Hi
>> I am working on co-expression analysis of a rice dataset with the help of
>> WGCNA and R, but now I am stuck at a point that produces the error shown
>> below:
>>
>>
>>  dissTOM = 1-TOMsimilarityFromExpr(datExpr, power = 8);
>> Error: cannot allocate vector of size 2.8 Gb
>> In addition: Warning messages:
>> 1: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>> 2: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>> 3: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>> 4: In matrix(0, nGenes, nGenes) :
>>   Reached total allocation of 2550Mb: see help(memory.size)
>>
>> Please help me resolve this problem.
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


