[BioC] #Identify differentially expressed genes
Paolo Kunderfranco
paolo.kunderfranco at gmail.com
Tue Aug 28 17:22:04 CEST 2012
Hi,
Thanks for the fast reply,
Just another curiosity, I finally obtained list of genes significantly
differentially expressed between my conditions,
I would like now to build for each condition an heatmap. I constructed
for each one a data matrix based on a lumi.N.Q object (after VST
variance stabilizing transformation and normalization).
>dataMatrix <- exprs(lumi.N.Q)
>head(dataMatrix)
CMe_2 mES_2 CMa_1 CMp_1 CMe_1 mES_1 CMa_3
CMp_3 CMe_3 mES_3 CMa_2 CMp_2
T.kvUVfFyruaCeqZVs 7.306945 7.312214 7.302783 7.302670 7.294789
7.273550 7.381502 7.312916 7.301191 7.268140 7.272041 7.278205
Q96MoiqM_z49FU37tE 7.257576 7.278205 7.300322 7.226146 7.281574
7.281210 7.294743 7.393729 7.340853 7.315099 7.312916 7.270886
>heatmap(dataMatrix[1:5,])
I was wondering on the scale of the heatmap, usually values ranges
from -3 to 3, but data VST transformed in my example ranges from 7 to
14, with a median value of 7.5...
is it correct to plot a heatmap with VST transfmed data? or is
preferable to perfrom variance stabilization using log2 transform?
If is correct to use vst transformed data how do I refer to the scale?
Thanks for any suggestion,
Cheers,
Paolo
2012/8/23 Pan Du <dupan.mail at gmail.com>:
> Hi Paolo
>
> It's pretty normal that some probes cannot map to any genes because
> Human genome annotation keeps updating but the probe design is
> unchanged. So usually you can ignore those NA probes. If you are
> really interested in them, you can easily convert the nuID to probe
> sequence (use id2seq function) and map them to genome or refseq.
> I will update the vignette later to avoid such confusion.
>
> Pan
>
> On Thu, Aug 23, 2012 at 3:26 AM, Paolo Kunderfranco
> <paolo.kunderfranco at gmail.com> wrote:
>> Dear All,
>> I am working with lumi / limma package to detect differentially expressed
>> genes between two or more samples.
>> I was wondering why when I add geneSymbol and geneName to my Illumina
>> probelist most of them (around 10%)are not called and remained NA, for
>> instance (last row):
>>
>>
>> if (require(lumiMouseAll.db) & require(annotate)) {
>> geneSymbol <- getSYMBOL(probeList, 'lumiMouseAll.db')
>> geneName <- sapply(lookUp(probeList, 'lumiMouseAll.db',
>> 'GENENAME'), function(x) x[1])
>> fit1_2$genes <- data.frame(ID= probeList,
>> geneSymbol=geneSymbol, geneName=geneName, stringsAsFactors=FALSE)
>> }
>>
>>
>> 7671 69fpKOOuFduFbAjNVU Dppa5a developmental pluripotency
>> associated 5A 7.828381 9.333743 149.31710 2.773571e-18 6.144846e-14
>> 29.80858
>> 16014 QpWgiAmByT4gW7iui0 Pou5f1 POU domain, class 5, transcription
>> factor 1 5.305532 8.633706 103.85793 1.098423e-16 8.143726e-13
>> 27.72832
>> 20450 HlUzpCHheswfSZNdQo Trh thyrotropin
>> releasing hormone 5.603441 8.761965 103.81774 1.102739e-16
>> 8.143726e-13 27.72571
>> 7670 o7Ah_nzF7JdZOTtd9U Dppa4 developmental pluripotency
>> associated 4 5.300619 8.626239 99.82457 1.640729e-16 9.087587e-13
>> 27.45790
>> 7672 xjn0tTp4isUXmUkAKI Dppa5a developmental pluripotency
>> associated 5A 7.663922 9.439668 97.09091 2.173661e-16 9.631491e-13
>> 27.26346
>> 17719 ZXvxHuC6s3xogRFJfo Sall4 sal-like 4
>> (Drosophila) 4.456642 8.585243 90.39110 4.484584e-16 1.655932e-12
>> 26.74469
>> 14084 06jqfFxe5_X97NRXuk Myl3 myosin, light
>> polypeptide 3 -7.736059 13.128014 -88.39591 5.622067e-16 1.779384e-12
>> 26.57755
>> 8757 oii7mSFyrr_AMWODH0 <NA>
>> <NA> 4.608167 8.438770 78.65631 1.833459e-15 5.077535e-12
>> 25.66512
>>
>> any ideas?
>> thanks
>> paolo
>>
More information about the Bioconductor
mailing list