[Bioc-devel] summarizeOverlaps example error

Mark Robinson mark.robinson at imls.uzh.ch
Wed Aug 8 06:55:32 CEST 2012


Hi Valerie/Dan,

Looks good now, with GenomicRanges_1.9.42 and IRanges_1.15.29.  Thanks for this.

Mark




smmrzO> res <- summarizeOverlaps(exbygene, reads, "Union")

smmrzO> stopifnot(length(assays(res)$counts) == length(exbygene))
> res
class: SummarizedExperiment 
dim: 14869 1 
exptData(0):
assays(1): counts
rownames(14869): FBgn0000003 FBgn0000008 ... FBgn0261574 FBgn0261575
rowData values names(0):
colnames: NULL
colData names(1): metaData
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.7.1
 [2] GenomicFeatures_1.9.28                   
 [3] AnnotationDbi_1.19.28                    
 [4] edgeR_2.99.4                             
 [5] limma_3.13.16                            
 [6] DESeq_1.9.11                             
 [7] lattice_0.20-6                           
 [8] locfit_1.5-8                             
 [9] Biobase_2.17.6                           
[10] Rsamtools_1.9.25                         
[11] Biostrings_2.25.8                        
[12] GenomicRanges_1.9.42                     
[13] IRanges_1.15.29                          
[14] BiocGenerics_0.3.0                       

loaded via a namespace (and not attached):
 [1] annotate_1.35.3     biomaRt_2.13.2      bitops_1.0-4.1     
 [4] BSgenome_1.25.6     DBI_0.2-5           genefilter_1.39.0  
 [7] geneplotter_1.35.1  grid_2.15.1         RColorBrewer_1.0-5 
[10] RCurl_1.91-1        RSQLite_0.11.1      rtracklayer_1.17.15
[13] splines_2.15.1      stats4_2.15.1       survival_2.36-14   
[16] tools_2.15.1        XML_3.9-4           xtable_1.7-0       
[19] zlibbioc_1.3.0     


On 08.08.2012, at 04:06, Valerie Obenchain wrote:

> Hi Mark,
> 
> Try GenomicRanges 1.9.42 which requires IRanges 1.15.29. The example ran successfully for me with this combination. There was a problem was with newCompressedList() in IRanges which was fixed in r68298.
> 
> Valerie
> 
> svn log for IRanges :
> 
> ------------------------------------------------------------------------
> r68298 | hpages at fhcrc.org | 2012-08-07 18:43:15 -0700 (Tue, 07 Aug 2012) | 3 lines
> 
> Small tweak to newCompressedList0() internal constructor so it produces valid
> GRangesList objects.
> 
> 
> 
> 
> On 08/06/12 21:56, Mark Robinson wrote:
>> For what its worth, a colleague checked it out on a couple different systems (Thanks Keith!) and … I don't think my original using R-devel instead of R-2.15.1 is really the issue.
>> 
>> This works … no error.
>> 
>>> sessionInfo()
>> R Under development (unstable) (2012-06-11 r59557)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>> 
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=C LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> 
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>> 
>> other attached packages:
>> [1] TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.7.1
>> [2] GenomicFeatures_1.9.11
>> [3] AnnotationDbi_1.19.16
>> [4] BiocInstaller_1.5.12
>> [5] edgeR_2.7.25
>> [6] limma_3.13.8
>> [7] DESeq_1.9.7
>> [8] locfit_1.5-8
>> [9] Biobase_2.17.6
>> [10] Rsamtools_1.9.18
>> [11] Biostrings_2.25.6
>> [12] GenomicRanges_1.9.28
>> [13] IRanges_1.15.17
>> [14] BiocGenerics_0.3.0
>> 
>> loaded via a namespace (and not attached):
>> [1] annotate_1.35.3 biomaRt_2.13.1 bitops_1.0-4.1
>> [4] BSgenome_1.25.3 DBI_0.2-5 genefilter_1.39.0
>> [7] geneplotter_1.35.0 grid_2.16.0 lattice_0.20-6
>> [10] RColorBrewer_1.0-5 RCurl_1.91-1 RSQLite_0.11.1
>> [13] rtracklayer_1.17.11 splines_2.16.0 stats4_2.16.0
>> [16] survival_2.36-14 tools_2.16.0 XML_3.9-4
>> [19] xtable_1.7-0 zlibbioc_1.3.0
>> 
>> This throws the same error as mine …
>> 
>> 
>>> sessionInfo()
>> R Under development (unstable) (2012-05-08 r59331)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>> 
>> locale:
>> [1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252
>> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_Australia.1252
>> 
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>> 
>> other attached packages:
>> [1] TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.7.1
>> [2] GenomicFeatures_1.9.28
>> [3] AnnotationDbi_1.19.28
>> [4] edgeR_2.99.4
>> [5] limma_3.13.16
>> [6] DESeq_1.9.11
>> [7] lattice_0.20-6
>> [8] locfit_1.5-8
>> [9] Biobase_2.17.6
>> [10] Rsamtools_1.9.25
>> [11] Biostrings_2.25.8
>> [12] GenomicRanges_1.9.41
>> [13] IRanges_1.15.25
>> [14] BiocGenerics_0.3.0
>> [15] BiocInstaller_1.5.12
>> 
>> loaded via a namespace (and not attached):
>> [1] annotate_1.35.3 biomaRt_2.13.2 bitops_1.0-4.1
>> [4] BSgenome_1.25.6 DBI_0.2-5 genefilter_1.39.0
>> [7] geneplotter_1.35.1 grid_2.16.0 RColorBrewer_1.0-5
>> [10] RCurl_1.91-1.1 RSQLite_0.11.1 rtracklayer_1.17.15
>> [13] splines_2.16.0 stats4_2.16.0 survival_2.36-14
>> [16] tools_2.16.0 XML_3.9-4.1 xtable_1.7-0
>> [19] zlibbioc_1.3.0
>> So, something has happened between GenomicRanges_1.9.28 and GenomicRanges_1.9.41 (plus a multitude other changes).
>> 
>> Best,
>> Mark
>> 
>> On 06.08.2012, at 22:11, Dan Tenenbaum wrote:
>> 
>>> Hi Mark,
>>> 
>>> On Mon, Aug 6, 2012 at 1:01 PM, Mark Robinson<mark.robinson at imls.uzh.ch>  wrote:
>>>> Hi all,
>>>> 
>>>> My apologies in advance if I've done something pin-headed here, but can anyone else produce an error just by doing:
>>>> 
>>>> library(GenomicRanges)
>>>> example(summarizeOverlaps)
>>>> 
>>>> I think I have a set of updated (devel) packages:
>>>> 
>>>>> source("http://bioconductor.org/biocLite.R")
>>>> BiocInstaller version 1.5.12, ?biocLite for help
>>>>> old.packages(repos=biocinstallRepos())
>>>> NULL
>>>> 
>>>> My R devel is a bit old (2012-06-14 r59564) … maybe I should upgrade?
>>>> 
>>> The devel version of Bioconductor (2.11) is not intended for use with
>>> R-devel. It's meant to be used with R 2.15 (just as Bioconductor 2.10,
>>> the release version, is). It's a bit confusing. It's because R moved
>>> to a yearly release schedule but we are keeping our twice-yearly
>>> release schedule.
>>> 
>>> I don't know if that is the cause of your problems, but Bioconductor
>>> packages are not tested or built on R-devel, so who knows what could
>>> happen. Unless there is some feature in R-devel that you really want,
>>> I'd suggest trying R 2.15 and see if that fixes things.
>>> 
>>> More info here:
>>> http://bioconductor.org/developers/useDevel/
>>> 
>>> Dan
>>> 
>>> 
>>>> Error, traceback() and sessionInfo() below …
>>>> 
>>>> Help much appreciated.
>>>> 
>>>> Best,
>>>> Mark
>>>> 
>>>> 
>>>>> example(summarizeOverlaps)
>>>> smmrzO>  reads<- GappedAlignments(
>>>> smmrzO+     names = c("a","b","c","d","e","f","g"),
>>>> smmrzO+     seqnames = Rle(c(rep(c("chr1", "chr2"), 3), "chr1")),
>>>> smmrzO+     pos = as.integer(c(1400, 2700, 3400, 7100, 4000, 3100, 5200)),
>>>> smmrzO+     cigar = c("500M", "100M", "300M", "500M", "300M",
>>>> smmrzO+         "50M200N50M", "50M150N50M"),
>>>> smmrzO+     strand = strand(rep("+", 7)))
>>>> 
>>>> smmrzO>  ## ---------------------------------------------------------------------
>>>> smmrzO>  ## 'features' as GRanges
>>>> smmrzO>  ##
>>>> smmrzO>
>>>> smmrzO>  features<- GRanges(
>>>> smmrzO+     seqnames = c(rep("chr1", 7), rep("chr2", 4)), strand = "+",
>>>> smmrzO+     ranges = IRanges(c(1000, 3000, 3600, 4000, 4000, 5000, 5400, 2000,
>>>> smmrzO+         3000, 7000, 7500), width = c(500, 500, 300, 500, 900, 500, 500,
>>>> smmrzO+         900, 500, 600, 300)),
>>>> smmrzO+     group_id = c("A", "B", "C", "C", "D", "D", "E", "F", "G", "H", "H"))
>>>> 
>>>> smmrzO>  ## Each row is a feature the read can 'hit'.
>>>> smmrzO>  rowsAsFeatures<- data.frame(
>>>> smmrzO+     union = assays(summarizeOverlaps(features, reads))$counts,
>>>> smmrzO+     intStrict = assays(summarizeOverlaps(features, reads,
>>>> smmrzO+         mode="IntersectionStrict"))$counts,
>>>> smmrzO+     intNotEmpty = assays(summarizeOverlaps(features, reads,
>>>> smmrzO+         mode="IntersectionNotEmpty"))$counts,
>>>> smmrzO+     countOverlaps = countOverlaps(features, reads))
>>>> 
>>>> smmrzO>  ## Results from countOverlaps() are included to highlight how
>>>> smmrzO>  ## summarizeOverlaps() counts a read only once. For these 7
>>>> smmrzO>  ## reads, 'Union' is the most conservative counting mode, followed
>>>> smmrzO>  ## by 'Intersectionstrict' and then 'IntersectionNotEmpty'.
>>>> smmrzO>
>>>> smmrzO>  colSums(rowsAsFeatures)
>>>>        union     intStrict   intNotEmpty countOverlaps
>>>>            3             4             5            11
>>>> 
>>>> smmrzO>  ## ---------------------------------------------------------------------
>>>> smmrzO>  ## 'features' as GRangesList
>>>> smmrzO>  ##
>>>> smmrzO>
>>>> smmrzO>  ## Each highest list-level is a feature the read can 'hit'.
>>>> smmrzO>  lst<- split(features, values(features)[["group_id"]])
>>>> 
>>>> smmrzO>  listAsFeatures<- data.frame(
>>>> smmrzO+     union = assays(summarizeOverlaps(lst, reads))$counts,
>>>> smmrzO+     intStrict = assays(summarizeOverlaps(lst, reads,
>>>> smmrzO+         mode="IntersectionStrict"))$counts,
>>>> smmrzO+     intNotEmpty = assays(summarizeOverlaps(lst, reads,
>>>> smmrzO+         mode="IntersectionNotEmpty"))$counts,
>>>> smmrzO+     countOverlaps = countOverlaps(lst, reads))
>>>> 
>>>> smmrzO>  ## ---------------------------------------------------------------------
>>>> smmrzO>  ## 'reads' as BamFileList
>>>> smmrzO>  ##
>>>> smmrzO>
>>>> smmrzO>  ## Count BAM files and prepare output for DESeq or edgeR analysis
>>>> smmrzO>  library(Rsamtools)
>>>> 
>>>> smmrzO>  library(DESeq)
>>>> 
>>>> smmrzO>  library(edgeR)
>>>> 
>>>> smmrzO>  fls<- list.files(system.file("extdata",package="GenomicRanges"),
>>>> smmrzO+     recursive=TRUE, pattern="*bam$", full=TRUE)
>>>> 
>>>> smmrzO>  names(fls)<- basename(fls)
>>>> 
>>>> smmrzO>  bfl<- BamFileList(fls, index=character())
>>>> 
>>>> smmrzO>  features<- GRanges(
>>>> smmrzO+     seqnames = c(rep("chr2L", 4), rep("chr2R", 5), rep("chr3L", 2)),
>>>> smmrzO+     ranges = IRanges(c(1000, 3000, 4000, 7000, 2000, 3000, 3600, 4000,
>>>> smmrzO+         7500, 5000, 5400), width=c(rep(500, 3), 600, 900, 500, 300, 900,
>>>> smmrzO+         300, 500, 500)), "-",
>>>> smmrzO+     group_id=c(rep("A", 4), rep("B", 5), rep("C", 2)))
>>>> 
>>>> smmrzO>  solap<- summarizeOverlaps(features, bfl)
>>>> 
>>>> smmrzO>  deseq<- newCountDataSet(assays(solap)$counts, rownames(colData(solap)))
>>>> 
>>>> smmrzO>  edger<- DGEList(assays(solap)$counts, group=rownames(colData(solap)))
>>>> Calculating library sizes from column totals.
>>>> 
>>>> smmrzO>  ## ---------------------------------------------------------------------
>>>> smmrzO>  ## paired-end reads
>>>> smmrzO>  ##
>>>> smmrzO>
>>>> smmrzO>  library("TxDb.Dmelanogaster.UCSC.dm3.ensGene")
>>>> 
>>>> smmrzO>  exbygene<- exonsBy(TxDb.Dmelanogaster.UCSC.dm3.ensGene, "gene")
>>>> 
>>>> smmrzO>  fl<- system.file("extdata", "untreated3_chr4.bam",
>>>> smmrzO+     package="pasillaBamSubset")
>>>> 
>>>> smmrzO>  ## Paired-end reads are stored in a GappedAlignmentPairs object
>>>> smmrzO>  reads<- readGappedAlignmentPairs(fl)
>>>> 
>>>> smmrzO>  res<- summarizeOverlaps(exbygene, reads, "Union")
>>>> Warning messages:
>>>> 1: In is.na(unique(f)) :
>>>>  is.na() applied to non-(list or vector) of type 'S4'
>>>> 2: In IRanges:::newCompressedList("GRangesList", x, splitFactor = f,  :
>>>>  data length is not a multiple of split variable
>>>> Error in countOverlaps(reads, features, ignore.strand = ignore.strand) :
>>>>  error in evaluating the argument 'query' in selecting a method for function 'countOverlaps': Error in unique.default(x) : unique() applies only to vectors
>>>>> traceback()
>>>> 10: countOverlaps(reads, features, ignore.strand = ignore.strand)
>>>> 9: mode(reads, features, ignore.strand)
>>>> 8: .dispatchOverlaps(grglist(reads), features, mode, ignore.strand)
>>>> 7: summarizeOverlaps(exbygene, reads, "Union")
>>>> 6: summarizeOverlaps(exbygene, reads, "Union") at Rex84b25fb6afa1#102
>>>> 5: eval(expr, envir, enclos)
>>>> 4: eval(ei, envir)
>>>> 3: withVisible(eval(ei, envir))
>>>> 2: source(tf, local, echo = echo, prompt.echo = paste0(prompt.prefix,
>>>>       getOption("prompt")), continue.echo = paste0(prompt.prefix,
>>>>       getOption("continue")), verbose = verbose, max.deparse.length = Inf,
>>>>       encoding = "UTF-8", skip.echo = skips, keep.source = TRUE)
>>>> 1: example(summarizeOverlaps)
>>>> 
>>>> 
>>>>> sessionInfo()
>>>> R Under development (unstable) (2012-06-14 r59564)
>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>> 
>>>> locale:
>>>> [1] en_CA.UTF-8
>>>> 
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>> 
>>>> other attached packages:
>>>> [1] BiocInstaller_1.5.12
>>>> [2] TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.7.1
>>>> [3] GenomicFeatures_1.9.28
>>>> [4] AnnotationDbi_1.19.28
>>>> [5] edgeR_2.99.4
>>>> [6] limma_3.13.16
>>>> [7] DESeq_1.9.11
>>>> [8] lattice_0.20-6
>>>> [9] locfit_1.5-8
>>>> [10] Biobase_2.17.6
>>>> [11] Rsamtools_1.9.25
>>>> [12] Biostrings_2.25.8
>>>> [13] GenomicRanges_1.9.41
>>>> [14] IRanges_1.15.25
>>>> [15] BiocGenerics_0.3.0
>>>> 
>>>> loaded via a namespace (and not attached):
>>>> [1] annotate_1.35.3     biomaRt_2.13.2      bitops_1.0-4.1
>>>> [4] BSgenome_1.25.6     DBI_0.2-5           genefilter_1.39.0
>>>> [7] geneplotter_1.35.1  grid_2.16.0         RColorBrewer_1.0-5
>>>> [10] RCurl_1.91-1        RSQLite_0.11.1      rtracklayer_1.17.15
>>>> [13] splines_2.16.0      stats4_2.16.0       survival_2.36-14
>>>> [16] tools_2.16.0        XML_3.9-4           xtable_1.7-0
>>>> [19] zlibbioc_1.3.0
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ----------
>>>> Prof. Dr. Mark Robinson
>>>> Bioinformatics
>>>> Institute of Molecular Life Sciences
>>>> University of Zurich
>>>> Winterthurerstrasse 190
>>>> 8057 Zurich
>>>> Switzerland
>>>> 
>>>> v: +41 44 635 4848
>>>> f: +41 44 635 6898
>>>> e: mark.robinson at imls.uzh.ch
>>>> o: Y11-J-16
>>>> w: http://tiny.cc/mrobin
>>>> 
>>>> ----------
>>>> http://www.fgcz.ch/Bioconductor2012
>>>> http://www.eccb12.org/t5
>>>> 
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 



More information about the Bioc-devel mailing list