Hi Alejandro,

I just reran the code with only ENSG00000009780 in the dataset. The odd
thing is that this time it worked perfectly.

Regards,
Stefan


## Only ENSG00000009780

> counts(cds)       sample1 sample2 sample3 sample4 sample5 sample6 sample7 sample8 sample9 sample10 sample11 sample12 sample13 sample14
1           11       8       4       7       3       1       8       3
      1        1        0        3        4        1
1100         1       0       1       1       0       0       0       1
      2        0        0        0        0        1
13665       91     145      86      47     169      36     231     136
     78       64      130       55       66       27
13666       19      62      32      18      44      23      61      38
      7       20       51        6       43        8
13670       77     139      73      66      78      30     168      88
     75       85      186       79       86       58
22095       78      94      52      47      40      41     121      60
     64       64      149       51       91       37
22096       36      61      50      63      67      10     129      69
     90       79      287       66       97       29
108973       3       5       3       0       1       0       4       2
      1        0        0        1        1        0
136687       2      12       5       0      19       0      34      10
      5       28       31        6       27        5
138629      42      96      50      24      98      23     117      79
    119      111      233      100      129       59
       sample15 sample16 sample17 sample18 sample19 sample20 sample21
sample22 sample23
1             1        0        3        7        0        0        1
      0        0
1100          1        0        0        3        1        0        0
      0        0
13665        75       53       49      126       36       37       48
     53       24
13666        15       19       11       26        7       15       11
      5        7
13670        70       86       47      149       82       49       66
     38       27
22095        67       54       58      148       54       41       64
     43       25
22096        75       92       58      126       67       55       71
     51       21
108973        1        1        0        4        0        0        0
      0        1
136687        4       12        5       32        6        7        9
     16        1
138629      141      103       70      219       92       60       74
     67       54


> fData(cds)                geneID exonID testable dispBeforeSharing dispFitted dispersion       pvalue      padjust chr    start      end
1      ENSG00000009780      4     TRUE        0.40135710 0.40580351
0.40580351 4.256220e-04 8.512441e-04   1 28059114 28059168
1100   ENSG00000009780      3     TRUE        0.57127358 1.98755125
1.98755125 9.715722e-01 9.715722e-01   1 28056743 28056844
13665  ENSG00000009780      9     TRUE        0.01991574 0.03806432
0.03806432 2.537419e-09 1.268709e-08   1 28086037 28086138
13666  ENSG00000009780      7     TRUE        0.09257216 0.07072531
0.09257216 6.740254e-07 2.246751e-06   1 28075579 28075665
13670  ENSG00000009780      6     TRUE        0.01517262 0.03752086
0.03752086 7.138709e-01 7.931899e-01   1 28071165 28071322
22095  ENSG00000009780      8     TRUE        0.03299040 0.04028325
0.04028325 5.589838e-01 6.987297e-01   1 28081706 28081841
22096  ENSG00000009780      5     TRUE        0.04659993 0.03930375
0.04659993 1.632722e-04 4.081804e-04   1 28060542 28060694
108973 ENSG00000009780      2     TRUE        0.01382188 0.99934256
0.99934256 6.159293e-02 1.026549e-01   1 28053983 28054047
136687 ENSG00000009780      1     TRUE        0.38138204 0.12609379
0.38138204 1.302911e-01 1.861302e-01   1 28052568 28052672
138629 ENSG00000009780     10     TRUE        0.01782266 0.03578012
0.03578012 7.407408e-12 7.407408e-11   1 28087006 28087092
       strand transcripts log2fold(data/control)
1        <NA>        <NA>            -1.89528866
1100     <NA>        <NA>            -0.01821944
13665    <NA>        <NA>            -0.79494559
13666    <NA>        <NA>            -1.13738728
13670    <NA>        <NA>            -0.06639675
22095    <NA>        <NA>             0.07349758
22096    <NA>        <NA>             0.52608193
108973   <NA>        <NA>            -1.59373125
136687   <NA>        <NA>             0.56880558
138629   <NA>        <NA>             0.85158758



On Wed, Sep 19, 2012 at 5:20 PM, Alejandro Reyes <alejandro.reyes@embl.de>wrote:

> Dear Stefan Dentro,
>
> Thanks for your email.  It actually looks strange, could you send the
> output of the function "counts", and subset for this specific gene? Could
> you also do it for the "fData" function??
>
> Alejandro
>
>
>  Hello,
>>
>> I'm using DEXSeq to obtain differentially expressed exons between my 15
>> samples and 8 controls. But for some reason most exons are not testable
>> and
>> are not assigned a p-value. Do you know what I'm doing wrong?
>>
>> First I've created a conservative list of exons per gene in the human
>> genome through the canonical transcripts table in UCSC (194511 in total).
>> Next I've obtained read counts for each exon in each sample which are read
>> into one big data.frame. Then the following list of commands are executed:
>>
>> metadata = data.frame(
>>    row.names=colnames(dat)[7:29],
>>    condition=c(rep("control", 8), rep("data", 15)),
>>    replicate=c(c(1:8), c(1:15)),
>>    libType=rep("paired-end", 23))
>>
>> ## Add strand information that is not available in the read counts
>> data.frame
>> dat = cbind(dat, strand=rep(NA, nrow(dat)))
>>
>> cds = newExonCountSet(countData = dat[,c(7:29)], design = metadata,
>>                         geneIDs = dat[,5], exonIDs = dat[,6],
>>                         exonIntervals=dat[,c(2,3,4, 30)])
>>
>> cds = estimateSizeFactors(cds)
>> cds = estimateDispersions(cds, minCount=2, maxExon=90)
>> cds = fitDispersionFunction(cds)
>> cds = testForDEU(cds)
>> cds = estimatelog2FoldChanges(cds)
>>
>> Now lets see how many exons are expressed significantly different between
>> data and controls. Surprisingly only 1850 exons are described here, not
>> the
>> full 194511:
>>
>> res1 = DEUresultTable(cds)
>> table(res1$padjust < 0.05)
>>
>> FALSE  TRUE
>>   1424   426
>>
>> So I've zoomed in to an example gene. Below it can be seen that indeed all
>> exons are marked as not testable. But looking at the individual read
>> counts
>> per sample (see further below) it does not make sense that they are not
>> testable. Some exons indeed have a low read count and are excluded
>> rightfully, but others have enough reads to base dispersion on I would
>> think.
>>
>> If I have understood correctly testable can be false in either of these
>> cases:
>> - Gene has only one exon, dispersion cannot be calculated.
>> - Readcounts for an exon are too low.
>> - A combination of the above.
>>
>> But all do not apply here. Any ideas?
>>
>>
>> fData for the gene:
>>                  geneID exonID testable dispBeforeSharing dispFitted
>> dispersion pvalue padjust chr    start      end strand transcripts
>> 136687 ENSG00000009780      1    FALSE                NA  0.2026585
>> 0.2026585     NA      NA   1 28052568 28052672   <NA> <NA>
>> 108973 ENSG00000009780      2    FALSE                NA  0.9549670
>> 0.9549670     NA      NA   1 28053983 28054047   <NA> <NA>
>> 1100   ENSG00000009780      3    FALSE                NA  1.9560912
>> 1.9560912     NA      NA   1 28056743 28056844   <NA> <NA>
>> 1      ENSG00000009780      4    FALSE                NA  0.4722995
>> 0.4722995     NA      NA   1 28059114 28059168   <NA> <NA>
>> 22096  ENSG00000009780      5    FALSE                NA  0.1146813
>> 0.1146813     NA      NA   1 28060542 28060694   <NA> <NA>
>> 13670  ENSG00000009780      6    FALSE                NA  0.1127563
>> 0.1127563     NA      NA   1 28071165 28071322   <NA> <NA>
>> 13666  ENSG00000009780      7    FALSE                NA  0.1450013
>> 0.1450013     NA      NA   1 28075579 28075665   <NA> <NA>
>> 22095  ENSG00000009780      8    FALSE                NA  0.1160550
>> 0.1160550     NA      NA   1 28081706 28081841   <NA> <NA>
>> 13665  ENSG00000009780      9    FALSE                NA  0.1129432
>> 0.1129432     NA      NA   1 28086037 28086138   <NA> <NA>
>> 138629 ENSG00000009780     10    FALSE                NA  0.1112855
>> 0.1112855     NA      NA   1 28087006 28087092   <NA> <NA>
>>
>> Read counts for the gene per sample:
>>                 exon_id chr    start      end         gene_id exon_nr
>> sample1 sample2 sample3 sample4 sample5 sample6 sample7 sample8
>> 1      ENSE00000327880   1 28059114 28059168 ENSG00000009780 4
>> 11       8       4       7       3       1       8       3
>> 1100   ENSE00000571221   1 28056743 28056844 ENSG00000009780 3
>> 1       0       1       1       0       0       0       1
>> 13665  ENSE00000761751   1 28086037 28086138 ENSG00000009780 9      91
>> 145      86      47     169      36     231     136
>> 13666  ENSE00000761753   1 28075579 28075665 ENSG00000009780 7      19
>> 62      32      18      44      23      61      38
>> 13670  ENSE00000761780   1 28071165 28071322 ENSG00000009780 6      77
>> 139      73      66      78      30     168      88
>> 22095  ENSE00000866714   1 28081706 28081841 ENSG00000009780 8      78
>> 94      52      47      40      41     121      60
>> 22096  ENSE00000866716   1 28060542 28060694 ENSG00000009780 5      36
>> 61      50      63      67      10     129      69
>> 108973 ENSE00001625313   1 28053983 28054047 ENSG00000009780 2
>> 3       5       3       0       1       0       4       2
>> 136687 ENSE00001909353   1 28052568 28052672 ENSG00000009780 1       2
>> 12       5       0      19       0      34      10
>> 138629 ENSE00001955917   1 28087006 28087092 ENSG00000009780 10
>> 42      96      50      24      98      23     117      79
>>         sample9 sample10 sample11 sample12 sample13 sample14 sample15
>> sample16 sample17 sample18 sample19 sample20 sample21 sample22
>> 1            1        1        0        3        4        1 1
>> 0        3        7        0        0        1        0
>> 1100         2        0        0        0        0        1 1
>> 0        0        3        1        0        0        0
>> 13665       78       64      130       55       66       27 75
>> 53       49      126       36       37       48       53
>> 13666        7       20       51        6       43        8 15
>> 19       11       26        7       15       11        5
>> 13670       75       85      186       79       86       58 70
>> 86       47      149       82       49       66       38
>> 22095       64       64      149       51       91       37 67
>> 54       58      148       54       41       64       43
>> 22096       90       79      287       66       97       29 75
>> 92       58      126       67       55       71       51
>> 108973       1        0        0        1        1        0 1
>> 1        0        4        0        0        0        0
>> 136687       5       28       31        6       27        5 4
>> 12        5       32        6        7        9       16
>> 138629     119      111      233      100      129       59 141
>> 103       70      219       92       60       74       67
>>         sample23 strand
>> 1             0     NA
>> 1100          0     NA
>> 13665        24     NA
>> 13666         7     NA
>> 13670        27     NA
>> 22095        25     NA
>> 22096        21     NA
>> 108973        1     NA
>> 136687        1     NA
>> 138629       54     NA
>>
>> Design of the experiment:
>>           condition replicate    libType
>> sample1    control         1 paired-end
>> sample2    control         2 paired-end
>> sample3    control         3 paired-end
>> sample4    control         4 paired-end
>> sample5    control         5 paired-end
>> sample6    control         6 paired-end
>> sample7    control         7 paired-end
>> sample8    control         8 paired-end
>> sample9       data         1 paired-end
>> sample10      data         2 paired-end
>> sample11      data         3 paired-end
>> sample12      data         4 paired-end
>> sample13      data         5 paired-end
>> sample14      data         6 paired-end
>> sample15      data         7 paired-end
>> sample16      data         8 paired-end
>> sample17      data         9 paired-end
>> sample18      data        10 paired-end
>> sample19      data        11 paired-end
>> sample20      data        12 paired-end
>> sample21      data        13 paired-end
>> sample22      data        14 paired-end
>> sample23      data        15 paired-end
>>
>>  sessionInfo()
>>>
>> R version 2.15.0 (2012-03-30)
>> Platform: i686-pc-linux-gnu (32-bit)
>>
>> locale:
>>   [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C LC_TIME=en_GB.UTF-8
>> LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8
>>   [6] LC_MESSAGES=C              LC_PAPER=C LC_NAME=C
>> LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods base
>>
>> other attached packages:
>> [1] plyr_1.7.1          DEXSeq_1.2.1        BiocInstaller_1.4.3
>> DESeq_1.8.3         locfit_1.5-7        Biobase_2.16.0
>> [7] BiocGenerics_0.2.0
>>
>> loaded via a namespace (and not attached):
>>   [1] annotate_1.34.0      AnnotationDbi_1.18.0 biomaRt_2.12.0
>> DBI_0.2-5            genefilter_1.38.0    geneplotter_1.34.0
>>   [7] grid_2.15.0          hwriter_1.3          IRanges_1.14.2
>> lattice_0.20-6       RColorBrewer_1.0-5   RCurl_1.91-1
>> [13] RSQLite_0.11.1       splines_2.15.0       statmod_1.4.15
>> stats4_2.15.0        stringr_0.6          survival_2.36-12
>> [19] tools_2.15.0         XML_3.9-4            xtable_1.7-0
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________**_________________
>> Bioconductor mailing list
>> Bioconductor@r-project.org
>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https://stat.ethz.ch/mailman/listinfo/bioconductor>
>> Search the archives: http://news.gmane.org/gmane.**
>> science.biology.informatics.**conductor<http://news.gmane.org/gmane.science.biology.informatics.conductor>
>>
>
>

	[[alternative HTML version deleted]]

