[BioC] GAGE/Pathview RNA-Seq Workflows: reference genome issue

Luo Weijun luo_weijun at yahoo.com
Thu Mar 13 03:53:36 CET 2014


Ashesh,
I do not see this problem when I run the same code on the demo data. Based on the error message, your problem is indeed due to the discrepancy in the chrM seqlengths. That is why users need to stick to the same version of the reference genome as the gene annotation (TxDb.Hsapiens.UCSC.hg19.knownGene). Did you use the hg19 index built for bowtie2? I downloaded it from: ftp://ftp.ccb.jhu.edu/pub/data/bowtie2_indexes/hg19.zip
Try using that hg19 reference genome; that should fix the problem. Otherwise, try following the demo code with the demo data and see if you can work it out from there.
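
To see exactly which sequences disagree before re-aligning, the two sets of seqlengths can be compared directly. The following is only a minimal sketch in base R: compare_seqlengths() is a hypothetical helper, not a function from any Bioconductor package, and the chr1 entries are illustrative placeholders. Only the two chrM values come from the error message.

```r
# Hypothetical helper (not part of any Bioconductor package): list the
# sequences whose lengths differ between two named seqlength vectors.
compare_seqlengths <- function(x, y) {
  shared <- intersect(names(x), names(y))
  # A sequence with an unknown (NA) length on either side cannot conflict.
  known <- shared[!is.na(x[shared]) & !is.na(y[shared])]
  bad <- known[x[known] != y[known]]
  data.frame(seqname = bad,
             x = unname(x[bad]),
             y = unname(y[bad]),
             stringsAsFactors = FALSE)
}

# chrM lengths are taken from the error message; the chr1 entries are
# placeholders added only to show a sequence that agrees on both sides.
annotation_lengths <- c(chr1 = 249250621L, chrM = 16571L)
bam_lengths        <- c(chr1 = 249250621L, chrM = 12069L)

mismatch <- compare_seqlengths(annotation_lengths, bam_lengths)
print(mismatch)
```

In a real session the two vectors would presumably come from seqlengths(exByGn) and seqlengths(bamfls[[1]]), since the error message shows seqlengths() being called on both objects. Any sequence listed in the result points to a BAM file aligned against a different build than the annotation.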
Weijun

--------------------------------------------
On Wed, 3/12/14, Ashesh wrote:

 Dear Dr. Luo,
 I recently performed some RNA-seq experiments and want to use
 the Pathview software to look at changes in biological
 pathways.  While the instructions on how to use the
 software are easy to follow, I have been unable to solve the
 following initial error:

 > exByGn <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, "gene")
 > library(Rsamtools)
 Loading required package: Biostrings
 > fls <- list.files("Cancer_test/", pattern="bam$", full.names=T)
 > bamfls <- BamFileList(fls)
 > flag <- scanBamFlag(isNotPrimaryRead=FALSE, isProperPair=NA)
 > param <- ScanBamParam(flag=flag)
 > gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union", ignore.strand=TRUE, single.end=TRUE, param=param)
 Error in mergeNamedAtomicVectors(seqlengths(x), seqlengths(y), what = c("sequence",  : 
   sequence chrM has incompatible seqlengths:
   - in 'x': 16571
   - in 'y': 12069

 It seems that the BAM files created by TopHat (I also
 used hg19) and TxDb.Hsapiens.UCSC.hg19.knownGene have
 different seqlengths for chrM. Since I am a beginner, I
 have not been able to determine how best to resolve this
 error.  Should I change the BAM file header?  Is there a
 way to modify the TxDb.Hsapiens.UCSC.hg19.knownGene object?
 Is there a workaround?

 Any help is deeply appreciated.  
 Sincerely,
 Ashesh


