Hi Alejandro,
Yes, there is a batch effect. There are two experiments, exp1 with Controls
and Hypoxia samples at 24h and a second at 48h. However the controls were
always at 24h. In DESeq2 (and a limma/voom on exons I have to fetch the
plot, I'll send it to you later) pca, you clearly see that the controls
cluster well together(N1,N2,CTRL2,CTRL3) while 24 and 48 hours were
separated. A DESeq2 analysis of the kind ~condition + experiment (but also
~condition ) gave good results. Also DEXSeq CTRL vs HYPOXIA (not including
experiment as other factor)
But in anycase, I would have expected higher p.values rather than lower.
I will explore potential differences in terms of GC and other and will send
you a dropbox link for the RData of the ecs and res objects.
Thanks a lot!
J


2014-04-03 14:47 GMT+02:00 Alejandro Reyes <alejandro.reyes@embl.de>:

> Hi Jose,
>
> 98,000 hits!!?? Would if be possible for you to send me your raw input
> files
> offline? (via e.g. dropbox, ftp, etc: count files and DEXSeq flattened
> gtf file), so I can
> have a closer look at your data?
>
> Best regards,
> Alejandro
>
>
>
>
>
>
> > Hi Alejandro,
> > I apologize, I did not see the answer in
> >
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/53937
> > I was waiting for a Bioc... sorry about that.
> >
> > Ok, then those exons with very low log2FC and low p.value would
> > belong, if I got it right, to genes with many differentially used
> > exons and some with a very high log2FC, in that case, the linear model
> > would recognise wrongly as DU the complementary set of exons that
> > actually are not. Since you say that luckily these cases are
> > exceptions but I have ~98000 exons with p.adjust<0.05, I could have
> > something really interesting or a terrible flaw in my 48h compared to
> > my 24 h treated samples.
> > If the former case, I could run DEXSeq at the gene-level to identify
> > the genes and trust, which log2FC? or which p.values? to detect
> > interesting exons?
> > I first thought to put a threshold of log2FC, but the "volcano" was
> > strange with few volcano-like behaviour.
> > or better should I make gene-level DEXSeq and then filter out those
> > genes with very huge log2FC exons.
> > Thanks again for your help
> > Jose
> >
> >
> >
> > 2014-04-03 13:15 GMT+02:00 Alejandro Reyes <alejandro.reyes@embl.de
> > <mailto:alejandro.reyes@embl.de>>:
> >
> >     Hi Jose,
> >
> >     I have an e-mail answering to this thread on the 24.03.2014, maybe
> you
> >     missed it or did I write your e-mail wrong?
> >
> >
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/53937
> >
> >     Your concern is answered by the second point that I describe
> >     there. If you
> >     look at your "fitted splicing" plot, you can see this.  The
> >     extreme case
> >     is the coefficients fitted
> >     for your exon E032, it has a value of ~40,000 in one of your
> >     conditions
> >     and a value of ~800 on
> >     your other conditions.  This will affect the estimation of
> >     relative exon
> >     abundances from
> >     your other exons. As I mentioned before, this is a limitation of the
> >     DEXSeq model,
> >     but luckily, genes like this cases seem to be exceptions rather
> >     than the
> >     rule
> >     (at least in my experience!).
> >
> >     About using the output from voom to test for DEU, I have not explored
> >     that option,
> >     but maybe the maintainers/authors of that package are able to
> >     guide you
> >     better.
> >
> >     Hope it is useful,
> >     Alejandro
> >
> >
> >
> >
> >     > Dear Alejandro,
> >     > Have you had time to take a look at my problem (please see below)?
> >     > I am now using DEXSeq 1.9 to analyze the same ecs objects I had
> >     > analyzed with 1.8 but it produces the very same results. The
> problem
> >     > was regarding too many exons with very low log2FC and very low
> >     > p.values. I send here an object with the subsetByGene (ecs.one)
> with
> >     > one particular gene. The E029 has a very low p.value with a very
> low
> >     > log2FC. Either the log2FCs are not OK or the p.values. I cannot
> >     > understand how such low log2FC for the DEU analysis can give
> >     those low
> >     > p.values. Indeed the complete original ecs gave me 98000 exons with
> >     > table(res.48$padjust<0.05).
> >     > However the same analysis (ecs object 4CTRLS vs 4 TREATED) gave me
> >     > nice results when analysed with DEXSeq 1.6 on the complete design
> >     > without splitting into two.
> >     > Here's the picture with expression and splicing values:
> >     > Immagine in linea 2
> >     >
> >     > Here's the design of the ecs object created with CTRL vs HYPOXIA
> >     (only
> >     > at 48h):
> >     > > design(ecs.one)
> >     >          sampleName    fileName condition
> >     > N1               N1 Exon_Martelli_Sample_Martelli_N_1.bam      CTRL
> >     > N2               N2 Exon_Martelli_Sample_Martelli_N_2.bam      CTRL
> >     > CTRL2         CTRL2  Exon_Martelli_Sample_Martelli_CTRL_2.bam
> >        CTRL
> >     > CTRL3         CTRL3  Exon_Martelli_Sample_Martelli_CTRL_3.bam
> >        CTRL
> >     > HYPOXIA2   HYPOXIA2 Exon_Martelli_Sample_Martelli_HYPOXIA_2.bam
> >       HYPOXIA
> >     > HYPOXIA3   HYPOXIA3 Exon_Martelli_Sample_Martelli_HYPOXIA_3.bam
> >       HYPOXIA
> >     >
> >     > Maybe the sampleNames?? N1andN2 come from another batch but it is
> >     > still a CTRL. If they were different I would expect higher
> >     dispersions
> >     > and hence higher p.values not lower ones, wouldn't I?
> >     > I have tried to trace the problem a bit with these gene:
> >     >
> >     > modelFrame<-constructModelFrame(ecs.one)
> >     > formula0 = ~sample + exon
> >     > formula1 = ~sample + exon + condition:exon
> >     > mm0<-DEXSeq:::rmDepCols(model.matrix(formula0,modelFrame))
> >     > mm1<-DEXSeq:::rmDepCols(model.matrix(formula1,modelFrame))
> >     >
> >
> count<-DEXSeq:::getCountVector(ecs=ecs.one,geneID="ENSG00000170017","E029")
> >     >
> >     > > mm0
> >     >    (Intercept) sampleCTRL3 sampleHYPOXIA2 sampleHYPOXIA3 sampleN1
> >     > sampleN2 exonothers
> >     > 1            1     0              0  0        1        0
> >     >    0
> >     > 2            1     0              0  0        0        1
> >     >    0
> >     > *3            1       0              0              0        0
> >     >  0          0*
> >     > 4            1     1              0              0        0
>  0
> >     >    0
> >     > 5            1     0              1  0        0        0
> >     >    0
> >     > 6            1     0              0  1        0        0
> >     >    0
> >     > 7            1     0              0  0        1        0
> >     >    1
> >     > 8            1     0              0  0        0        1
> >     >    1
> >     > 9            1     0              0  0        0        0
> >     >    1
> >     > 10           1     1              0  0        0        0
> >     >    1
> >     > 11           1     0              1  0        0        0
> >     >    1
> >     > 12           1     0              0  1        0        0
> >     >    1
> >     > attr(,"assign")
> >     > [1] 0 1 1 1 1 1 2
> >     > attr(,"contrasts")
> >     > attr(,"contrasts")$sample
> >     > [1] "contr.treatment"
> >     >
> >     > attr(,"contrasts")$exon
> >     > [1] "contr.treatment"
> >     >
> >     > Does it seem OK to you? I guess the intercept is CTRL2 (in bold)
> but
> >     > why? Does it have to do with the 'CTRL' string in the sampleName? I
> >     > tried to change the sample names to CTRL1,CTRL2... but the
> >     result was
> >     > the same.
> >     >
> >     > Here's the mm1
> >     > > mm1
> >     >  (Intercept) sampleCTRL3 sampleHYPOXIA2 sampleHYPOXIA3 sampleN1
> >     > sampleN2 exonothers exonthis:conditionHYPOXIA
> >     > 1  1           0              0              0      1    0
> >     >  0                         0
> >     > 2  1           0              0              0      0    1
> >     >  0                         0
> >     > 3  1           0              0              0      0    0
> >     >  0                         0
> >     > 4  1           1              0              0      0    0
> >     >  0                         0
> >     > 5  1           0              1              0      0    0
> >     >  0                         1
> >     > 6  1           0              0              1      0    0
> >     >  0                         1
> >     > 7  1           0              0              0      1    0
> >     >  1                         0
> >     > 8  1           0              0              0      0    1
> >     >  1                         0
> >     > 9  1           0              0              0      0    0
> >     >  1                         0
> >     > 10 1           1              0              0      0  0          1
> >     >                         0
> >     > 11 1           0              1              0      0  0          1
> >     >                         0
> >     > 12 1           0              0              1      0  0          1
> >     >                         0
> >     > > sessionInfo()
> >     > R Under development (unstable) (2014-02-09 r64949)
> >     > Platform: x86_64-apple-darwin10.8.0 (64-bit)
> >     >
> >     > locale:
> >     > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >     >
> >     > attached base packages:
> >     > [1] parallel  stats   graphics  grDevices utils   datasets  methods
> >     >   base
> >     >
> >     > other attached packages:
> >     > [1] edgeR_3.5.28   limma_3.19.28  DEXSeq_1.9.5
> >     > Biobase_2.23.6 BiocGenerics_0.9.3 vimcom.plus_0.9-93 setwidth_1.0-3
> >     > colorout_1.0-2
> >     >
> >     > loaded via a namespace (and not attached):
> >     >  [1] AnnotationDbi_1.25.15 biomaRt_2.19.3  Biostrings_2.31.15
> >     >  bitops_1.0-6          DBI_0.2-7 GenomeInfoDb_0.99.19
> >     >  GenomicRanges_1.15.39
> >     >  [8] hwriter_1.3       IRanges_1.21.34 RCurl_1.95-4.1
> >     >  Rsamtools_1.15.33     RSQLite_0.11.4  statmod_1.4.18
> >      stats4_3.1.0
> >     > [15] stringr_0.6.2       tools_3.1.0 XML_3.98-1.1
> >     >  XVector_0.3.7         zlibbioc_1.9.0
> >     >
> >     >
> >     > I hope you can give me some hints since I am a bit confused and
> >     stuck
> >     > with these results.
> >     > By the way, for the other Bioc, I know limma/voom used on exons can
> >     > also work nicely. Is there an easy way to implement a sort of
> >     DEU test
> >     > with limma voom counts? I guess the annotation gtf used to count
> >     them
> >     > should be used to construct models and include it in a similar way
> >     > with formulae in linearmodels as DEXSeq does with glmnb.fit
> >     function.
> >     > It would be perfect to have it straight as a function also in
> >     limma to
> >     > compare results.
> >     >
> >     > Thanks again for your efforts,
> >     >
> >     > Looking forward to hearing to your comments.
> >     > Jose
> >     >
> >     >
> >     >
> >     >
> >     > 2014-03-20 16:07 GMT+01:00 Jose Garcia
> >     > <garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>
> >     > <mailto:garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>>>:
> >     >
> >     >     Hi Alejandro,
> >     >     I solved the problem by re-creating the object ecs.24. I had
> >     made
> >     >     one DEXSeq analysis up to the end by first creating an ecs
> >     object.
> >     >     Then I just split the ecs object, which already had p.value and
> >     >     other info, and re-run the analysis from sizeFactors on onto
> the
> >     >     new split ecs.24 object.
> >     >     Now it has worked.
> >     >     However, I have obtained a much harder to interpret result
> which
> >     >     points to something wrong I do not know why. And it is
> >     present in
> >     >     both the split and the original ecs.24 and ecs objects.
> >     >     From scratch:
> >     >     I made dexseq_prepare_annotation.py with the script from DEXSeq
> >     >     1.6 which contained the '-r' parameter in order to avoid
> >     counting
> >     >     exons overlapping different genes. Then I tried to count
> >     using the
> >     >     new dexseq_count.py in the same package but it gave me an error
> >     >     because it had been introduced a check for NH tag in the bam
> >     that
> >     >     I do not have because I use SOAPSplice. You suggested to use
> the
> >     >     old dexseq_count.py whithout the check (from DEXSeq 1.4).
> >     >     It worked and then I used the following script:
> >     >
> >     >     sampleFiles.R_ExonOUT<-Files
> >     >     sampleName.R_ExonOUT<-Names
> >     >
> >
> sampleCondition.R_ExonOUT<-c(rep("HYPOXIA",2),rep("CTRL",4),rep("HYPOXIA",2))
> >     > sampleExperiment.R_ExonOUT<-c(rep("RUN_2",4),rep("RUN_1",4))
> >     >     sampleTable.R_ExonOUT <- data.frame(sampleName =
> >     sampleName.R_ExonOUT,
> >     >                                  fileName = sampleFiles.R_ExonOUT,
> >     >                                  condition =
> >     sampleCondition.R_ExonOUT,
> >     >                                  experiment =
> >     sampleExperiment.R_ExonOUT)
> >     >     inDir = getwd()
> >     >     annotationfile = file.path
> >     >
> >
> ("/lustre1/genomes/hg19/annotation","Homo_sapiens.ensembl72.DEXSeq.gff")
> >     >
> >     >     ecs = read.HTSeqCounts(countfiles = file.path(inDir,
> >     >     sampleTable.R_ExonOUT$fileName),design = sampleTable.R_ExonOUT,
> >     >     flattenedfile = annotationfile)
> >     >
> >     >     sampleNames(ecs) = sampleTable.R_ExonOUT$sampleName
> >     >     ecs <- estimateSizeFactors(ecs)
> >     >     library(parallel)
> >     >     ecs <- estimateDispersions(ecs,nCores=8)
> >     >     ecs <- fitDispersionFunction(ecs)
> >     >     ecs <- testForDEU(ecs, nCores=8)
> >     >     ecs <- estimatelog2FoldChanges(ecs, nCores=8)
> >     >     res<- DEUresultTable(ecs)
> >     >
> >     >     The problem is that I have some exons with a ridiculous
> >     log2FC but
> >     >     with a very good p.adjust.
> >     >     Same thing with the ecs.24 or ecs.48 split objects. Here an
> >     example:
> >     >
> >     > head(res.48[which(res.48$geneID=="ENSG00000148516"),])
> >     >
> >     >     geneID exonID  dispersion       pvalue  padjust  meanBase
> >     >     log2fold(HYPOXIA/CTRL)
> >     >
> >     >     ENSG00000148516:E036 ENSG00000148516   E036 0.014798679
> >     >     2.873434e-16 5.646223e-14  171.5313  -6.075811e-01
> >     >
> >     >     ENSG00000148516:E049 ENSG00000148516   E049 0.011425856
> >     >     2.461690e-14 2.846653e-12  414.4351  -1.907197e-01
> >     >
> >     >     ENSG00000148516:E039 ENSG00000148516   E039 0.014486497
> >     >     2.332678e-13 2.043916e-11  181.3705  -4.226252e-01
> >     >
> >     >     *ENSG00000148516:E050 ENSG00000148516   E050 0.009733072
> >     >     1.131825e-12 8.326638e-11 1432.6492  -1.278668e-05*
> >     >
> >     >     ENSG00000148516:E033 ENSG00000148516   E033 0.037143254
> >     >     3.483915e-12 2.236853e-10  514.5010  -5.273017e-01
> >     >
> >     >     ENSG00000148516:E034 ENSG00000148516   E034 0.019826955
> >     >     4.660942e-12 2.896722e-10  113.6851  -6.541261e-01
> >     >
> >     >
> >     >     If you look at the plot (just a few exons around E50)
> >     >
> >     >     plotDEXSeq(ecs.48,"ENSG00000148516",splicing=T)
> >     >
> >     >
> >     >     Immagine in linea 3
> >     >
> >     >     It seems clear that all those p-values cannot come from those
> >     >     log2FC that are adjusted for expression of all exons coming
> from
> >     >     the same gene.
> >     >
> >     >     I have checked the design and formula:
> >     >
> >     >     design(ecs.48)
> >     >
> >     >      sampleName  fileName condition experiment
> >     >
> >     >     N1     N1 Exon_Martelli_Sample_Martelli_N_1.bam      CTRL
>  RUN_2
> >     >
> >     >     N2     N2 Exon_Martelli_Sample_Martelli_N_2.bam      CTRL
>  RUN_2
> >     >
> >     >     CTRL2 CTRL2  Exon_Martelli_Sample_Martelli_CTRL_2.bam  CTRL
> >      RUN_1
> >     >
> >     >     CTRL3 CTRL3  Exon_Martelli_Sample_Martelli_CTRL_3.bam  CTRL
> >      RUN_1
> >     >
> >     >     HYPOXIA2 HYPOXIA2 Exon_Martelli_Sample_Martelli_HYPOXIA_2.bam
> >     >     HYPOXIA      RUN_1
> >     >
> >     >     HYPOXIA3 HYPOXIA3 Exon_Martelli_Sample_Martelli_HYPOXIA_3.bam
> >     >     HYPOXIA      RUN_1
> >     >
> >     >
> >     >     formula(ecs.48)
> >     >
> >     >     $formulaDispersion
> >     >
> >     >     [1] "~sample + exon + condition:exon"
> >     >
> >     >
> >     >     $formula0
> >     >
> >     >     [1] "~sample + exon"
> >     >
> >     >
> >     >     $formula1
> >     >
> >     >     [1] "~sample + exon + condition:exon"
> >     >
> >     >
> >     >     So, I am a bit stuck with it. I guess everything comes from
> >     having
> >     >     used different versions but I cannot come across it.
> >     Summarizing:
> >     >
> >     >     SOASPSplice
> >     >
> >     >     dexseq_prepare_annotation.py (From DEXSeq 1.6) with Ensembl72
> >     >     (hg19) -r no
> >     >
> >     >     dexseq_count.py (From DEXSeq 1.4)
> >     >
> >     >     Analysis (DEXSeq 1.8)
> >     >
> >     >     Thanks for the help,
> >     >
> >     >
> >     >     Jose
> >     >
> >     >
> >     >
> >     >     sessionInfo()
> >     >
> >     >     R version 3.0.1 (2013-05-16)
> >     >
> >     >     Platform: x86_64-unknown-linux-gnu (64-bit)
> >     >
> >     >
> >     >     locale:
> >     >
> >     >      [1] LC_CTYPE=en_US       LC_NUMERIC=C LC_TIME=en_US
> >     >
> >     >      [4] LC_COLLATE=en_US     LC_MONETARY=en_US LC_MESSAGES=en_US
> >     >
> >     >      [7] LC_PAPER=C           LC_NAME=C LC_ADDRESS=C
> >     >
> >     >     [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
> >     >
> >     >
> >     >     attached base packages:
> >     >
> >     >     [1] parallel stats     graphics  grDevices utils datasets
> >      methods
> >     >
> >     >     [8] base
> >     >
> >     >
> >     >     other attached packages:
> >     >
> >     >     [1] DEXSeq_1.8.0       Biobase_2.22.0 BiocGenerics_0.8.0
> >     >
> >     >
> >     >     loaded via a namespace (and not attached):
> >     >
> >     >      [1] biomaRt_2.18.0       Biostrings_2.30.1 bitops_1.0-6
> >     >
> >     >      [4] GenomicRanges_1.14.3 hwriter_1.3 IRanges_1.20.6
> >     >
> >     >      [7] RCurl_1.95-4.1       Rsamtools_1.14.2 statmod_1.4.18
> >     >
> >     >     [10] stats4_3.0.1         stringr_0.6.2 tools_3.0.1
> >     >
> >     >     [13] XML_3.98-1.1         XVector_0.2.0 zlibbioc_1.8.0
> >     >
> >     >
> >     >
> >     >     2014-03-19 13:18 GMT+01:00 Jose Garcia
> >     >     <garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>
> >     >     <mailto:garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>>>:
> >     >
> >     >         Hi Alejandro,
> >     >         I am analyzing with DEXSeq my data. 4 CTRLs and 2 Treated
> >     >         samples. My design is the following:
> >     >
> >     >         design(ecs.24)
> >     >
> >     >               sampleName fileName condition experiment
> >     >
> >     >         H1            H1 Exon_Martelli_Sample_Martelli_H_1.bam
> >     >         HYPOXIA      RUN_2
> >     >
> >     >         H2            H2 Exon_Martelli_Sample_Martelli_H_2.bam
> >     >         HYPOXIA      RUN_2
> >     >
> >     >         N1            N1 Exon_Martelli_Sample_Martelli_N_1.bam
> >     CTRL
> >     >             RUN_2
> >     >
> >     >         N2            N2 Exon_Martelli_Sample_Martelli_N_2.bam
> >     CTRL
> >     >             RUN_2
> >     >
> >     >         CTRL2      CTRL2 Exon_Martelli_Sample_Martelli_CTRL_2.bam
> >     >         CTRL      RUN_1
> >     >
> >     >         CTRL3      CTRL3 Exon_Martelli_Sample_Martelli_CTRL_3.bam
> >     >         CTRL      RUN_1
> >     >
> >     >         When I follow the vignette:
> >     >
> >     >         ecs.24 <- estimateDispersions(ecs.24,nCores=8)
> >     >
> >     >         ....Done
> >     >
> >     >         ecs.24 <- fitDispersionFunction(ecs.24)
> >     >
> >     >         ....Done
> >     >
> >     >         ecs.24 <- testForDEU(ecs.24, nCores=8)
> >     >
> >     >         .....
> >     >
> >     >         ecs.24 <- estimatelog2FoldChanges(ecs.24, nCores=8)
> >     >
> >     >         *Error in `row.names<-.data.frame`(`*tmp*`, value =
> >     >         c("geneID", "exonID",  : *
> >     >
> >     >         *  duplicate 'row.names' are not allowed*
> >     >
> >     >         *Calls: estimatelog2FoldChanges ... pData<- -> pData<- ->
> >     >         row.names<- -> row.names<-.data.frame*
> >     >
> >     >         *In addition: Warning message:*
> >     >
> >     >         *non-unique value when setting 'row.names':
> >     >         'log2fold(CTRL/HYPOXIA)' *
> >     >
> >     >
> >     >         I checked for duplication as you had suggested elsewhere
> >     >
> >     >         any(duplicated(featureNames(ecs.24)))
> >     >
> >     >         [1] FALSE
> >     >
> >     >  any(duplicated(paste(geneIDs(ecs.24),exonIDs(ecs.24),sep=":")))
> >     >
> >     >         [1] FALSE
> >     >
> >     >
> >     >         I also checked that the condition in design where factors:
> >     >
> >     >
> >     >         is.factor(pData(ecs.24)$condition)
> >     >
> >     >         [1] TRUE
> >     >
> >     >
> >     >         The only explanation I could come to is the fact that I
> have
> >     >         an even number of samples for control and treated? or that
> I
> >     >         have the 'experiment' column in the design, but it would be
> >     >         irrelevant since the default formula is only taking
> >     condition
> >     >         into consideration, isn't it?
> >     >
> >     >         What could be the origin of the error?
> >     >
> >     >         Thanks again!
> >     >
> >     >         Jose
> >     >
> >     >
> >     >
> >     >         > sessionInfo()
> >     >
> >     >         R version 3.0.1 (2013-05-16)
> >     >
> >     >         Platform: x86_64-unknown-linux-gnu (64-bit)
> >     >
> >     >
> >     >         locale:
> >     >
> >     >          [1] LC_CTYPE=en_US       LC_NUMERIC=C     LC_TIME=en_US
> >     >
> >     >          [4] LC_COLLATE=en_US LC_MONETARY=en_US  LC_MESSAGES=en_US
> >     >
> >     >          [7] LC_PAPER=C           LC_NAME=C LC_ADDRESS=C
> >     >
> >     >         [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US
> LC_IDENTIFICATION=C
> >     >
> >     >
> >     >         attached base packages:
> >     >
> >     >         [1] parallel  stats     graphics grDevices utils
> >     datasets
> >     >         methods
> >     >
> >     >         [8] base
> >     >
> >     >
> >     >         other attached packages:
> >     >
> >     >         [1] DEXSeq_1.8.0       Biobase_2.22.0 BiocGenerics_0.8.0
> >     >
> >     >
> >     >         loaded via a namespace (and not attached):
> >     >
> >     >          [1] biomaRt_2.18.0 Biostrings_2.30.1  bitops_1.0-6
> >     >
> >     >          [4] GenomicRanges_1.14.3 hwriter_1.3   IRanges_1.20.6
> >     >
> >     >          [7] RCurl_1.95-4.1 Rsamtools_1.14.2 statmod_1.4.18
> >     >
> >     >         [10] stats4_3.0.1         stringr_0.6.2     tools_3.0.1
> >     >
> >     >         [13] XML_3.98-1.1         XVector_0.2.0     zlibbioc_1.8.0
> >     >
> >     >
> >     >
> >     >         2014-03-13 16:32 GMT+01:00 Alejandro Reyes
> >     >         <alejandro.reyes@embl.de
> >     <mailto:alejandro.reyes@embl.de> <mailto:alejandro.reyes@embl.de
> >     <mailto:alejandro.reyes@embl.de>>>:
> >     >
> >     >             Dear Xiayu Rao,
> >     >
> >     >             Thanks for your interest in DEXSeq.
> >     >             That looks strange, does it happen when you use files
> >     >             different from the
> >     >             one you generated by your own? Could you maybe send me
> >     >             (offline) your
> >     >             gtf file and the first 1000 lines of one of your sam
> >     files?
> >     >
> >     >             Bests,
> >     >             Alejandro
> >     >
> >     >             > Hello,
> >     >             >
> >     >             > DEXSeq is a great tool. Thank you for that. I
> recently
> >     >             generate my own gtf file with the same format as the
> >     >             exon.gff generated by dexseq_prepare_annotation.py.
> >     >             However, the output is strange. I tried to find the
> >     reason
> >     >             with no success. Could you please provide any idea
> about
> >     >             that problem? Thank you very much in advance!
> >     >             >
> >     >             > Note: I used the latest dexseq, and the sam files had
> >     >             been sorted.
> >     >             >
> >     >             > 1       genes.gtf exonic_part   12228   12594   .
> >     >               +       . exonic_part_number "001"; gene_id
> >     >             "ENSG00000223972"
> >     >             > 1       genes.gtf exonic_part   12722   12974   .
> >     >               +       . exonic_part_number "002"; gene_id
> >     >             "ENSG00000223972"
> >     >             > 1       genes.gtf exonic_part   13053   13220   .
> >     >               +       . exonic_part_number "003"; gene_id
> >     >             "ENSG00000223972"
> >     >             > 1       genes.gtf exonic_part   14830   14969   .
> >     >               -       . exonic_part_number "001"; gene_id
> >     >             "ENSG00000223972+ENSG00000227232"
> >     >             > .............
> >     >             >
> >     >             >
> >     >             > ==> SRR791043_counts.txt <==
> >     >             > :001G00027000003"
> >     >             > :002G00021000003"
> >     >             > :003G00070000003"
> >     >             > :004G00040000003"
> >     >             > :005G00060000003"
> >     >             > :006G00030000003"
> >     >             > :007G00019000003"
> >     >             > :008G00045600003"
> >     >             > :009G00020400003"
> >     >             > :001G00000000005"
> >     >             >
> >     >             >
> >     >             > Thanks,
> >     >             > Xiayu
> >     >
> >     > _______________________________________________
> >     >             Bioconductor mailing list
> >     > Bioconductor@r-project.org <mailto:Bioconductor@r-project.org>
> >     <mailto:Bioconductor@r-project.org
> >     <mailto:Bioconductor@r-project.org>>
> >     > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >     >             Search the archives:
> >     > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >     >
> >     >
> >     >
> >     >
> >     >         --
> >     >         Jose M. Garcia Manteiga PhD
> >     >         Research Assistant - Data Analysis in Functional Genomics
> >     >         Center for Translational Genomics and BioInformatics
> >     >         Dibit2-Basilica, 3A3
> >     >         San Raffaele Scientific Institute
> >     >         Via Olgettina 58, 20132 Milano (MI), Italy
> >     >
> >     >         Tel: +39-02-2643-4751 <tel:%2B39-02-2643-4751>
> >     >         e-mail: garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>
> >     >         <mailto:garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>>
> >     >
> >     >
> >     >
> >     >
> >     >     --
> >     >     Jose M. Garcia Manteiga PhD
> >     >     Research Assistant - Data Analysis in Functional Genomics
> >     >     Center for Translational Genomics and BioInformatics
> >     >     Dibit2-Basilica, 3A3
> >     >     San Raffaele Scientific Institute
> >     >     Via Olgettina 58, 20132 Milano (MI), Italy
> >     >
> >     >     Tel: +39-02-2643-4751 <tel:%2B39-02-2643-4751>
> >     >     e-mail: garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>
> >     >     <mailto:garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>>
> >     >
> >     >
> >     >
> >     >
> >     > --
> >     > Jose M. Garcia Manteiga PhD
> >     > Research Assistant - Data Analysis in Functional Genomics
> >     > Center for Translational Genomics and BioInformatics
> >     > Dibit2-Basilica, 3A3
> >     > San Raffaele Scientific Institute
> >     > Via Olgettina 58, 20132 Milano (MI), Italy
> >     >
> >     > Tel: +39-02-2643-4751 <tel:%2B39-02-2643-4751>
> >     > e-mail: garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>
> >     > <mailto:garciamanteiga.josemanuel@hsr.it
> >     <mailto:garciamanteiga.josemanuel@hsr.it>>
> >
> >
> >
> >
> > --
> > Jose M. Garcia Manteiga PhD
> > Research Assistant - Data Analysis in Functional Genomics
> > Center for Translational Genomics and BioInformatics
> > Dibit2-Basilica, 3A3
> > San Raffaele Scientific Institute
> > Via Olgettina 58, 20132 Milano (MI), Italy
> >
> > Tel: +39-02-2643-4751
> > e-mail: garciamanteiga.josemanuel@hsr.it
> > <mailto:garciamanteiga.josemanuel@hsr.it>
>
>


-- 
Jose M. Garcia Manteiga PhD
Research Assistant - Data Analysis in Functional Genomics
Center for Translational Genomics and BioInformatics
Dibit2-Basilica, 3A3
San Raffaele Scientific Institute
Via Olgettina 58, 20132 Milano (MI), Italy

Tel: +39-02-2643-4751
e-mail: garciamanteiga.josemanuel@hsr.it

	[[alternative HTML version deleted]]