[BioC] error in locateVariants for a GRanges object
Valerie Obenchain
vobencha at fhcrc.org
Sun Mar 11 21:02:05 CET 2012
Hi Francesco,
Looks like you've hit a bug. Is your GRanges too large to attach or make
available for testing?
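If the full object is too big, a small subset that still triggers the error would do. Here is a minimal sketch of one way to package the positions for attaching, using only base R. The `variants` data frame is a hypothetical stand-in for your CEUstats.variants (only its `chromosome` and `position` columns are assumed), and the file name is arbitrary:

```r
# Hypothetical stand-in for CEUstats.variants: one row per SNP,
# with the two columns used to build the GRanges.
variants <- data.frame(
  chromosome = c("1", "1", "1"),
  position   = c(1177919L, 1234763L, 1246257L),
  stringsAsFactors = FALSE
)

# Write a compressed, single-object file that is easy to attach.
out <- file.path(tempdir(), "variants_subset.rds")
saveRDS(variants, file = out, compress = "xz")

# Size in bytes, to judge whether the file can go on the list.
print(file.size(out))

# Round-trip check: the restored object should match the original.
restored <- readRDS(out)
stopifnot(identical(restored, variants))
```

Calling readRDS() on the attached file gives back the same data frame, from which the GRanges can be rebuilt with the GRanges() call quoted below.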
Valerie
On 03/11/12 12:29, Lescai, Francesco wrote:
> but... I now get a different error with a different dataset.
>
> k1.ranges = GRanges(
> seqnames=paste("chr",CEUstats.variants$chromosome,sep=""),
> IRanges(start=CEUstats.variants$position,
> width=1)
> )
>
>> head(k1.ranges)
> GRanges with 6 ranges and 0 elementMetadata cols:
> seqnames ranges strand
> <Rle> <IRanges> <Rle>
> [1] chr1 [1177919, 1177919] *
> [2] chr1 [1234763, 1234763] *
> [3] chr1 [1246257, 1246257] *
> [4] chr1 [1564953, 1564953] *
> [5] chr1 [1887112, 1887112] *
> [6] chr1 [1900107, 1900107] *
> ---
> seqlengths:
> chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 ... chr22 chr3 chr4 chr5 chr6 chr7 chr8 chr9
> NA NA NA NA NA NA NA NA NA ... NA NA NA NA NA NA NA NA
>
>> k1.locations = locateVariants(k1.ranges, txdb19)
> Error in DataFrame(queryID = which(intergenic), location = location, txID = NA_integer_, :
> different row counts implied by arguments
>
> sessionInfo is the same as below.
> thanks very much,
>
> Francesco
>
>
> On 11 Mar 2012, at 18:53, Lescai, Francesco wrote:
>
> I adopted Steve's suggestion and went the "hard" way, re-compiling the packages in my session from source.
> Now it seems to work :-))
> So I have no idea where the problem was, but at least it is solved!
> VariantAnnotation is now .63 instead of .61; that might have changed a few things, together with the other packages.
>
> this is my session in case it might be useful for the developers.
>
> thanks very much for your help!!
> Francesco
>
>
> sessionInfo()
> R Under development (unstable) (2012-01-20 r58146)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.4 GenomicFeatures_1.7.30
> [3] AnnotationDbi_1.17.27 Biobase_2.15.4
> [5] VariantAnnotation_1.1.63 Rsamtools_1.7.40
> [7] Biostrings_2.23.6 GenomicRanges_1.7.34
> [9] IRanges_1.13.30 BiocGenerics_0.1.14
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.23.4 DBI_0.2-5 Matrix_1.0-4 RCurl_1.91-1 RSQLite_0.11.1
> [6] XML_3.9-4 biomaRt_2.11.1 bitops_1.0-4.1 ggplot2_0.8.9 grid_2.15.0
> [11] lattice_0.20-0 plyr_1.7.1 rtracklayer_1.15.7 snpStats_1.5.5 splines_2.15.0
> [16] stats4_2.15.0 survival_2.36-12 tools_2.15.0 zlibbioc_1.1.1
>
>
> On 11 Mar 2012, at 18:34, Martin Morgan wrote:
>
> On 03/11/2012 10:39 AM, Lescai, Francesco wrote:
> Hi, this is the traceback output.
>
> traceback()
> 12: stop(gettextf("invalid names for slots of class %s: %s", dQuote(Class),
> paste(snames[is.na(which)], collapse = ", ")), domain = NA)
> 11: initialize(value, ...)
> 10: initialize(value, ...)
> 9: new("RangesMatching", matchMatrix = matchMatrix, DIM = DIM)
> 8: .local(query, subject, maxgap, minoverlap, type, select, ...)
> 7: findOverlaps(query, unlistSubject, maxgap = maxgap, type = type,
> select = "all", ignore.strand = ignore.strand)
> 6: findOverlaps(query, unlistSubject, maxgap = maxgap, type = type,
> select = "all", ignore.strand = ignore.strand)
> 5: .local(query, subject, maxgap, minoverlap, type, select, ...)
> 4: findOverlaps(queryAdj, cdsByTx, type = "within")
> 3: findOverlaps(queryAdj, cdsByTx, type = "within")
>
> 'cdsByTx' isn't used in this context in VariantAnnotation 1.1.61, which has two lines like
>
> cdsCO<- countOverlaps(query, cache[["cdsByTx"]], type="within")
> txFO<- findOverlaps(query, cache[["tx"]], type="within")
>
> that might be the current implementation. This line
>
> cdsFO<- findOverlaps(queryAdj, cdsByTx, type="within")
>
> _is_ in VariantAnnotation 1.0.5; I think you are getting the wrong version of VariantAnnotation, but this is not consistent with your sessionInfo().
>
> Martin
>
>
> 2: locateVariants(my.ranges, txdb19)
> 1: locateVariants(my.ranges, txdb19)
>
> I tried to install the package, but it seems it still picks up the old version.
>
> biocLite("TxDb.Hsapiens.UCSC.hg19.knownGene")
> BioC_mirror: http://bioconductor.org
> Using R version 2.15, BiocInstaller version 1.3.7.
> Installing package(s) 'TxDb.Hsapiens.UCSC.hg19.knownGene'
> Installing package(s) into ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
> (as ‘lib’ is unspecified)
> Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/leopard/contrib/2.15
> trying URL 'http://bioconductor.org/packages/2.10/data/annotation/bin/macosx/leopard/contrib/2.15/TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.2.tgz'
> Content type 'application/x-gzip' length 16722765 bytes (15.9 Mb)
> opened URL
> ==================================================
> downloaded 15.9 Mb
>
> I also checked the website for the 2.10 release, and the Mac version of the package still seems to be 2.6.2.
>
> Is there any other package I can try to install manually?
> It seems I now cannot access the BioC developer wiki.
>
> thanks
> Francesco
>
>
>
>
>
> On 10 Mar 2012, at 19:01, Martin Morgan wrote:
>
> On 03/10/2012 10:14 AM, Lescai, Francesco wrote:
> Thanks Martin,
> done, but I still get the same error.
>
> I can't spot the problem; maybe someone else will chime in.
>
> (a) TxDb.Hsapiens... is still out-of-date; maybe it isn't checked by biocLite()
>
> (b) the error
>
> my.locations = locateVariants(my.ranges, txdb19)
> Error in initialize(value, ...) :
> invalid names for slots of class “RangesMatching”: matchMatrix, DIM
>
> definitely looks like an 'old package' issue -- the RangesMatching class was replaced by the 'Hits' class during this release cycle. It might help to call
>
> traceback()
>
> after the error, and to confirm that you are accessing only functions defined in the loaded packages by starting your R session with
>
> R --vanilla
>
> Obviously, the sessionInfo() needs to reflect the session the command fails in, not, e.g., the R GUI in one instance and the terminal in the other.
>
> Martin
>
>
> My new sessionInfo is
>
> sessionInfo()
> R Under development (unstable) (2012-01-20 r58146)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] BiocInstaller_1.3.7 TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.2
> [3] GenomicFeatures_1.7.30 VariantAnnotation_1.1.61
> [5] Rsamtools_1.7.38 Biostrings_2.23.6
> [7] AnnotationDbi_1.17.27 Biobase_2.15.4
> [9] GenomicRanges_1.7.33 IRanges_1.13.28
> [11] BiocGenerics_0.1.12 biomaRt_2.11.1
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.23.4 DBI_0.2-5 Matrix_1.0-4 RCurl_1.91-1 RSQLite_0.11.1
> [6] XML_3.9-4 bitops_1.0-4.1 ggplot2_0.8.9 grid_2.15.0 lattice_0.20-0
> [11] plyr_1.7.1 rtracklayer_1.15.7 snpStats_1.5.5 splines_2.15.0 survival_2.36-12
> [16] tools_2.15.0 zlibbioc_1.1.1
>
>
> On 10 Mar 2012, at 17:50, Martin Morgan wrote:
>
> On 03/10/2012 09:39 AM, Lescai, Francesco wrote:
> Hi there,
> maybe I'm just making a silly mistake somewhere, but I get an error when trying to locate the variants from a GRanges object.
> I have a file with SNP positions, therefore I build the GRanges this way
>
> my.ranges = GRanges(
> seqnames=paste("chr", my.snp.unique$chromosome, sep=""),
> IRanges(start= my.snp.unique$position,
> width=1))
>
> head(my.ranges)
> GRanges with 6 ranges and 0 elementMetadata values:
> seqnames ranges strand
> <Rle> <IRanges> <Rle>
> [1] chr1 [ 1323144, 1323144] *
> [2] chr1 [ 3544236, 3544236] *
> [3] chr1 [ 6252966, 6252966] *
> [4] chr1 [ 7861154, 7861154] *
> [5] chr1 [10425118, 10425118] *
> [6] chr1 [10502308, 10502308] *
> ---
> seqlengths:
> chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 ... chr4 chr5 chr6 chr7 chr8 chr9 chrX chrY
> NA NA NA NA NA NA NA NA NA ... NA NA NA NA NA NA NA NA
>
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> txdb19<- TxDb.Hsapiens.UCSC.hg19.knownGene
> #
> my.locations = locateVariants(my.ranges, txdb19)
> Error in initialize(value, ...) :
> invalid names for slots of class “RangesMatching”: matchMatrix, DIM
>
> What am I doing wrong?
>
> Your devel packages are out of date, so I'd start with
>
> source("http://bioconductor.org/biocLite.R")
> biocLite(character())
>
> Martin
>
>
> thanks,
> Francesco
>
>
> sessionInfo()
> R Under development (unstable) (2012-01-20 r58146)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.6.2 GenomicFeatures_1.6.7
> [3] VariantAnnotation_1.1.33 Rsamtools_1.7.27
> [5] Biostrings_2.23.6 AnnotationDbi_1.17.15
> [7] Biobase_2.15.3 GenomicRanges_1.7.16
> [9] IRanges_1.13.22 BiocGenerics_0.1.4
> [11] biomaRt_2.11.1
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.23.2 DBI_0.2-5 Matrix_1.0-3 RCurl_1.9-5 RSQLite_0.11.1
> [6] XML_3.8-0 bitops_1.0-4.1 ggplot2_0.8.9 grid_2.15.0 lattice_0.20-0
> [11] plyr_1.7.1 rtracklayer_1.15.7 snpStats_1.5.3 splines_2.15.0 survival_2.36-10
> [16] tools_2.15.0 zlibbioc_1.1.1
>
> ---------------------------------------------------------------------------------
> Francesco Lescai, PhD, EDBT
> Senior Research Associate in Genome Analysis
> University College London
> Faculty of Population Health Sciences
> Dept. Genes, Development & Disease
> ICH - Molecular Medicine Unit, GOSgene team
> 30 Guilford Street
> WC1N 1EH London UK
>
> email: f.lescai at ucl.ac.uk
> phone: +44.(0)207.905.2274
> [ext: 2274]
> --------------------------------------------------------------------------------
>
>
>
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
>
>