Dear Bioconductor mailing list,

I am trying to load and summarize about 60 paired-end BAM files from human
cancer RNA-seq experiments (Illumina 2x50 protocol) for downstream use
with edgeR and DESeq.

First question: several posts have already asked how to resolve the
warning "There are [any number] synthetic exons as determined from your
annotation that overlap! This implies that some reads will be counted more
than once! Is that really what you want?" when using easyRNASeq.

So far, I haven't seen any answer that doesn't rely on GenomicRanges, and
some well-written posts received no answer at all. The easyRNASeq vignette
is not entirely clear on that point. I'm therefore wondering whether anybody
has come up with a solution and posted it in a plain and reproducible
fashion.

Second question: does anybody know whether easyRNASeq takes the "properly
paired" flag into account when summarizing reads? I couldn't find anything
on that either, even after a month of searching.

This is what I've done so far:

countTable <- easyRNASeq(filesDirectory=getwd(),
                         organism="Hsapiens",
                         annotationMethod="rda",
                         annotationFile="gAnnot.rda",
                         gapped=TRUE, count="genes",
                         summarization="geneModels",
                         filenames=BAM_files,
                         outputFormat="RNAseq",
                         nBcores=4)
Checking arguments...
Fetching annotations...
Computing gene models...
Summarizing counts...
Processing sample1.bam
Updating the read length information.
The alignments are gapped.
Minimum length of 1 bp.
Maximum length of 51 bp.
[...]
Preparing output

Warning messages:
[...]
2: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens",
annotationMethod = "rda",  :
  There are 16816 synthetic exons as determined from your annotation that
overlap! This implies that some reads will be counted more than once! Is
that really what you want?
[...]

## The rda file is derived from a previous run of the same command using
## annotationMethod="biomaRt", followed by:

gAnnot <- genomicAnnotation(count.genes)
gAnnot <- gAnnot[space(gAnnot) %in%
                 paste("chr", c(1:22, "X", "Y", "M"), sep=""), ]
save(gAnnot, file="gAnnot.rda")

as suggested by Nicolas Delhomme.

Thanks in advance!

sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] easyRNASeq_1.6.0       ShortRead_1.18.0       latticeExtra_0.6-24
 [4] RColorBrewer_1.0-5     Rsamtools_1.12.3       DESeq_1.12.0
 [7] lattice_0.20-15        locfit_1.5-9.1         BSgenome_1.28.0
[10] GenomicRanges_1.12.4   Biostrings_2.28.0      IRanges_1.18.1
[13] edgeR_3.2.3            limma_3.16.5           biomaRt_2.16.0
[16] Biobase_2.20.0         genomeIntervals_1.16.0 BiocGenerics_0.6.0
[19] intervals_0.14.0

loaded via a namespace (and not attached):
 [1] annotate_1.38.0      AnnotationDbi_1.22.6 bitops_1.0-5
 [4] DBI_0.2-7            genefilter_1.42.0    geneplotter_1.38.0
 [7] grid_3.0.1           hwriter_1.3          RCurl_1.95-4.1
[10] RSQLite_0.11.4       splines_3.0.1        stats4_3.0.1
[13] survival_2.37-4      XML_3.96-1.1         xtable_1.7-1
[16] zlibbioc_1.6.0




-- 
Gabriele Zoppoli, MD
Ph.D., Clinical and Experimental Oncology and Hematology
Visiting Researcher, BCTRL J.C. Heuson, Institut J. Bordet, Bruxelles BE
Internal Medicine Resident, DiMI, IRCCS AOU San Martino IST, Genova, IT
Former Guest Researcher, LMP, CCR, NCI, NIH, Bethesda MD


Tel: +39 010 353 7968
Mobile 1: +32 478 337 942
Mobile 2: +39 349 617 0129
Email:           gabriele.zoppoli@unige.it
Alt. Email:     zoppoli@gmail.com
Alt. Email 2:  gzoppoli@libero.it
Alt. Email 3:  gabriele.zoppoli@bordet.be