[BioC] RNASeq:- getting Zero Count
Valerie Obenchain
vobencha at fhcrc.org
Mon Aug 5 17:51:52 CEST 2013
Hi Reema,
To perform overlap or matching operations the seqlevels (chromosome
names) of the objects must match. The error message is telling you that
some of these do not match. It's reasonable that a few names may not
match (maybe a chromosome is present in one object and not the other)
but the majority should.
Check the seqlevels:
seqlevels(aln)
seqlevels(gnModel)
Which names are common to both:
intersect(seqlevels(gnModel), seqlevels(aln))
You can rename seqlevels in several different ways. See
?renameSeqlevels or ?seqlevels for examples.
Valerie
On 08/04/2013 06:35 AM, Reema Singh wrote:
> Dear All,
>
> I am trying to extract the read count from three .bam files. But I am
> getting Zero count entries.
>
> I am using Mycobacterium Tuberculosis H37Rv gtf file (
> ftp://ftp.ensemblgenomes.org/pub/release-19/bacteria//gtf/bacteria_1_collection/mycobacterium_tuberculosis_h37rv/Mycobacterium_tuberculosis_h37rv.GCA_000277735.1.19.gtf.gz)
> and RNASeq data used here were downloaded from (
> http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE40846) and aligned
> with bowtie2.
>
>
> library(GenomicFeatures)
> txdb <-
> makeTranscriptDbFromGFF(file="Mycobacterium_tuberculosis_h37rv.GCA_000277735.1.19.gtf",format="gtf")
> saveDb(txdb,file="MycoTubeH37Rv.sqlite")
> load("MycoTubeH37Rv.sqlite")
> gnModel <- exonsBy(txdb,"gene") ### *also tried with "transcripts", "cds",
> but getting same *
>
> bamFiles <- list.files(".", "bam$", full=TRUE)
> names(bamFiles) <- sub("\\..*","",basename(bamFiles))
> counter <- function(fl, gnModel){
> aln <- GenomicRanges::readGappedAlignments(fl)
> strand(aln)
> hits <- countOverlaps(aln,gnModel)
> counts <- countOverlaps(gnModel,aln[hits==5])
> names(counts) <- names(gnModel)
> counts
> }
>
> counts <- sapply(bamFiles,counter,gnModel)
>
> Note: method with signature ‘Vector#GRangesList’ chosen for function
> ‘countOverlaps’,
> target signature ‘GappedAlignments#GRangesList’.
> "GappedAlignments#Vector" would also be valid
> Note: method with signature ‘GRangesList#Vector’ chosen for function
> ‘countOverlaps’,
> target signature ‘GRangesList#GappedAlignments’.
> "Vector#GappedAlignments" would also be valid
> Warning messages:
> 1: In .Seqinfo.mergexy(x, y) :
> Each of the 2 combined objects has sequence levels not in the other:
> - in 'x': gi|448814763|ref|NC_000962.3|
> - in 'y': Chromosome
> Make sure to always combine/compare objects based on the same reference
> genome (use suppressWarnings() to suppress this warning).
> 2: In .Seqinfo.mergexy(x, y) :
> Each of the 2 combined objects has sequence levels not in the other:
> - in 'x': Chromosome
> - in 'y': gi|448814763|ref|NC_000962.3|
> Make sure to always combine/compare objects based on the same reference
> genome (use suppressWarnings() to suppress this warning).
> 3: In .Seqinfo.mergexy(x, y) :
> Each of the 2 combined objects has sequence levels not in the other:
> - in 'x': gi|448814763|ref|NC_000962.3|
> - in 'y': Chromosome
> Make sure to always combine/compare objects based on the same reference
> genome (use suppressWarnings() to suppress this warning).
> 4: In .Seqinfo.mergexy(x, y) :
> Each of the 2 combined objects has sequence levels not in the other:
> - in 'x': Chromosome
> - in 'y': gi|448814763|ref|NC_000962.3|
> Make sure to always combine/compare objects based on the same reference
> genome (use suppressWarnings() to suppress this warning).
> 5: In .Seqinfo.mergexy(x, y) :
> Each of the 2 combined objects has sequence levels not in the other:
> - in 'x': gi|448814763|ref|NC_000962.3|
> - in 'y': Chromosome
> Make sure to always combine/compare objects based on the same reference
> genome (use suppressWarnings() to suppress this warning).
> 6: In .Seqinfo.mergexy(x, y) :
> Each of the 2 combined objects has sequence levels not in the other:
> - in 'x': Chromosome
> - in 'y': gi|448814763|ref|NC_000962.3|
> Make sure to always combine/compare objects based on the same reference
> genome (use suppressWarnings() to suppress this warning).
>
> head(counts)
>
> SRR568038 SRR568039 SRR568040
> RVBD_0001 0 0 0
> RVBD_0002 0 0 0
> RVBD_0003 0 0 0
> RVBD_0004 0 0 0
> RVBD_0005 0 0 0
> RVBD_0006 0 0 0
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
> [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
> [5] LC_MONETARY=en_US.utf8 LC_MESSAGES=en_US.utf8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] Rsamtools_1.12.3 Biostrings_2.28.0 GenomicFeatures_1.12.3
> [4] AnnotationDbi_1.22.6 Biobase_2.20.1 GenomicRanges_1.12.4
> [7] IRanges_1.18.2 BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.16.0 bitops_1.0-5 BSgenome_1.28.0 DBI_0.2-6
>
> [5] RCurl_1.95-4.1 RSQLite_0.11.3 rtracklayer_1.20.4 stats4_3.0.1
>
> [9] tools_3.0.1 XML_3.96-1.1 zlibbioc_1.6.0
>
>
> Kind regards
>
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
More information about the Bioconductor
mailing list