[BioC] WGCNA RNAseq

Anna Esteve Codina aesteve at pcb.ub.cat
Wed Feb 26 18:15:20 CET 2014


Dear Peter, 

I successfully ran the WGCNA software with RNAseq data after TMM
normalization and log-transformation. But I have some questions:

1.- Is it necessary to clean the input dataset of genes before running the
analysis? I have around 20,000 expressed genes, whereas in the Tutorial
there are only ~3000 genes. Should I take only the most variable ones?
2.- I correlated the modules with two continuous traits and did not obtain
any significant correlation at the module level, but I did at the gene
level. Is it possible to have, within the same module, some genes that are
significantly associated with a trait and others that are not?
3.- I have modules with up to 4000 genes. How can that be?

Thanks for the explanations,

Anna Esteve Codina, PhD
Functional Bioinformatics Team
Centre Nacional d’Anàlisi Genòmica (Parc Científic de Barcelona)
Baldiri Reixac 4-6, Torre I
www.cnag.cat
aesteve at pcb.ub.cat


-----Original Message-----
From: bioconductor-bounces at r-project.org
[mailto:bioconductor-bounces at r-project.org] On Behalf Of Peter Langfelder
Sent: Monday, February 17, 2014 4:02
To: Martin Morgan
CC: bioconductor at r-project.org
Subject: Re: [BioC] WGCNA

Hi Martin,

If you simply run pickSoftThreshold without first calling
enableWGCNAThreads, the function runs in single-worker mode. To reproduce
the error, you have to call enableWGCNAThreads with an argument of 2 or more.
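
As a rough sketch of that call order (not a tested reproduction of the
error): the random matrix below is only a placeholder assumed for
illustration, and datExpr is a hypothetical name for an expression matrix
with samples in rows and genes in columns.

```r
library(WGCNA)

# Placeholder data (assumption): 50 samples x 200 genes of random noise,
# standing in for a real expression matrix such as the Tutorial I data.
set.seed(1)
datExpr <- matrix(rnorm(50 * 200), nrow = 50, ncol = 200)

# Without this call pickSoftThreshold runs in single-worker mode;
# an argument of 2 or more requests the multi-worker code path.
enableWGCNAThreads(nThreads = 2)

# Candidate soft-thresholding powers to evaluate.
powers <- c(1:10, seq(12, 20, by = 2))

# Compute the scale-free topology fit for each candidate power.
sft <- pickSoftThreshold(datExpr, powerVector = powers, verbose = 5)
print(sft$fitIndices)
```

Commenting out the enableWGCNAThreads line should give the single-worker
behaviour for comparison.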

For a reproducible example, you can run the first two sections of WGCNA
Tutorial I at
http://labs.genetics.ucla.edu/horvath/htdocs/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html.
Sorry, I don't have a quick simulated example ready but could cook one up.

I am actually not sure whether this problem also occurs on Windows, since
the cluster parallelization there is very different from the forking on
Linux. I am only aware of it through bug reports from WGCNA users (who
usually email me directly).

Peter



On Sun, Feb 16, 2014 at 6:30 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
> but from my original mail pickSoftThreshold 'worked for me' under 
> Rstudio on Windows (output below, again); maybe I didn't invoke it in 
> a way that would trigger the error, so a reproducible illustration 
> would help (me). Also on Windows (platform of the original post) the 
> parallel package doesn't fork, e.g., from ?mcfork
>
>      These are low-level functions, not available on Windows, and not
>      exported from the namespace.
>
> or ?mclapply
>
>      It relies on forking and hence is not available on Windows unless
>      'mc.cores = 1'.
>
> Do you mean something less literally forking, like spawning a new process?
> Maybe from
>
>   http://stackoverflow.com/questions/20704235
>
> the solution is as simple as adding .packages="WGCNA" to your foreach
> calls?
>
>
>> x = pickSoftThreshold(data)
> trace: .C("checkAvailableMemoryForR", size = as.double(size),
>     PACKAGE = "WGCNA")
> trace: .C("corFast", x = as.double(x), nrow = as.integer(nrow(x)),
>     ncolx = as.integer(ncol(x)),
>     y = as.double(y), ncoly = as.integer(ncol(y)), quick = as.double(quick),
>     cosineX = as.integer(cosineX), cosineY = as.integer(cosineY),
>     res = as.double(bi), nNA = as.integer(nNA), err = as.integer(err),
>     nThreads = as.integer(nThreads), verbose = as.integer(verbose),
>     indent = as.integer(indent), DUP = FALSE, NAOK = TRUE, PACKAGE = "WGCNA")
> trace: .C("corFast", x = as.double(x), nrow = as.integer(nrow(x)),
>     ncolx = as.integer(ncol(x)),
>     y = as.double(y), ncoly = as.integer(ncol(y)), quick = as.double(quick),
>     cosineX = as.integer(cosineX), cosineY = as.integer(cosineY),
>     res = as.double(bi), nNA = as.integer(nNA), err = as.integer(err),
>     nThreads = as.integer(nThreads), verbose = as.integer(verbose),
>     indent = as.integer(indent), DUP = FALSE, NAOK = TRUE, PACKAGE = "WGCNA")
>    Power SFT.R.sq  slope truncated.R.sq  mean.k. median.k.   max.k.
> 1      1    0.173 -11.00        0.98400 4.02e+01  4.02e+01 4.45e+01
> 2      2    0.326  -8.45        0.95700 5.04e+00  5.04e+00 6.12e+00
> 3      3    0.241  -4.30        0.95300 8.01e-01  8.02e-01 1.05e+00
> 4      4    0.348  -4.04        0.95000 1.49e-01  1.49e-01 2.25e-01
> 5      5    0.513  -3.74        0.90400 3.15e-02  3.09e-02 5.77e-02
> 6      6    0.697  -3.72        0.92600 7.32e-03  6.99e-03 1.70e-02
> 7      7    0.811  -3.26        0.93300 1.85e-03  1.70e-03 5.55e-03
> 8      8    0.893  -2.92        0.95700 4.99e-04  4.32e-04 1.97e-03
> 9      9    0.923  -2.67        0.94400 1.43e-04  1.15e-04 7.35e-04
> 10    10    0.958  -2.38        0.96500 4.34e-05  3.15e-05 2.85e-04
> 11    12    0.910  -2.07        0.88600 4.50e-06  2.59e-06 4.55e-05
> 12    14    0.325  -3.02        0.21800 5.31e-07  2.29e-07 7.58e-06
> 13    16    0.219  -3.01        0.00942 6.88e-08  2.06e-08 1.29e-06
> 14    18    0.206  -3.06        0.03550 9.53e-09  1.93e-09 2.29e-07
> 15    20    0.227  -2.97        0.02620 1.39e-09  1.85e-10 4.11e-08
>
>
>
>
>>
>> Hope this clears up any lingering confusion.
>>
>> Also see inline below
>>
>> On Sun, Feb 16, 2014 at 12:26 PM, Martin Morgan <mtmorgan at fhcrc.org>
>> wrote:
>>
>>>
>>>
>>> Also I don't see where the error message is coming from (who is 
>>> printing "task 1 failed - " ?)
>>
>>
>>
>> I believe this is printed by foreach/doParallel but here I could be
>> wrong.
>>
>>
>>
>> HTH,
>>
>> Peter
>>
>>>
>>> I have a vague recollection that the x[["Version"]] (which comes
>>> from getAnywhere("print.sessionInfo")) has to do with multiple
>>> installed versions of a package, but again it would be good to get
>>> to the bottom of this problem.
>>>
>>> Martin
>
>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
>
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793

