[BioC] Error in R code for GOstats Vignette section "Using Shortest Paths"
Wolfgang Huber
huber at ebi.ac.uk
Tue May 30 17:53:22 CEST 2006
Hi Carleton,
it appears that you are using old and probably outdated versions of the
software and the vignette. Please update to the latest release of
Bioconductor (1.8), or the current development version. It will make
more sense for this list to discuss current versions.
Also, please include the output of
sessionInfo()
in your posting, because version numbers, in particular those of the
data / annotation packages, are important for answering your question.
Best wishes
Wolfgang.
-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax: +44 1223 494486
Http: www.ebi.ac.uk/huber
-------------------------------------
Carleton Garrett wrote:
> Hi
>
> I'm currently running R version 2.2.0 under Windows XP with 2 Gb RAM.
>
> I'm working through the GOstats vignette using the GOstats.Rnw file to
> obtain the R code (for a description of the package, see the end of
> this e-mail).
>
> The first objective of this section of the vignette is to extract all of
> the probe sets in the hgu95av2 chip that are associated with
> transcription factor GO identifiers using:
>
> TF2 <- get("GO:0003700", hgu95av2GO2ALLPROBES)
>
> FYI- length(TF2) = 834
>
> The next step gets the locus links (Entrez Gene IDs) associated with
> these probe sets thus:
>
> LLs <- getLL(TF2, "hgu95av2")
>
> FYI - length(LLs) = 834
>
> The third step gets a vector of probe sets that have been selected that
> show some level of expression and some variation in expression across
> samples. The data is contained within the exprSet = esetSub.
>
> gN = geneNames(esetSub)
>
> FYI - length(gN) = 2391
>
> So far so good. The next objective is to get probe sets that are common
> to both TF2 and gN and uses the following code:
>
> hv <- match(gN, TF2, 0)
>
> hv contains 159 non zero terms
> > length(hv[!(hv ==0 )])
> [1] 159
>
> HOWEVER, THESE NON ZERO TERMS ARE THE INDEX VALUES THAT LOCATE THE PROBE
> SETS in TF2 - NOT in gN!!! The next part of the code is where the error
> occurs and this error is propagated in the subsequent code for this section.
>
> oTF2 <- gN[hv]
>
> As one would expect from the above, the length(oTF2) does equal 159.
> However, VERY FEW of these probe sets in oTF2 belong to the vector of
> probes selected on the basis of an association with GO:0003700 - that
> is - very few of them (only 14) are actually part of TF2. Thus:
>
> > length(oTF2[oTF2 %in% TF2])
> [1] 14
>
> whereas all values of oTF2 should be in TF2.
>
> If one revises the above code thus:
>
> hvcorr <- hv[!(hv==0)]
>
> oTF2corr<- TF2[hvcorr]
>
> One again gets length(oTF2corr) = 159 but now the probe sets are in both
> TF2 and gN:
>
> > length(TF2[TF2 %in% oTF2corr])
> [1] 159
>
> > length(gN[gN %in% oTF2corr])
> [1] 159
>
> Thus, all subsequent calculations in this section of the vignette that
> depend on oTF2 are in error.
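
The indexing behaviour described above can be reproduced on a toy example. This is only a minimal sketch: the probe IDs are made up for illustration, and the short vectors TF2 and gN below are hypothetical stand-ins for the real ones from the vignette. It relies only on base R's match() and on the fact that zeros are dropped when used as subscripts.

```r
# Hypothetical stand-ins for the vectors in the message (made-up probe IDs).
TF2 <- c("p1", "p2", "p3", "p4")        # probes annotated to the GO term
gN  <- c("p9", "p3", "p8", "p1", "p7")  # probes that passed filtering

# For each element of gN, match() returns its position in TF2 (0 if absent),
# so hv has the length of gN but its values are positions in TF2.
hv <- match(gN, TF2, 0)
print(hv)          # 0 3 0 1 0

# Using TF2 positions to subscript gN selects unrelated elements.
wrong <- gN[hv]
print(wrong)       # "p8" "p9"

# Two ways to get the probes common to both vectors:
byMask  <- gN[hv > 0]   # logical mask over gN
byIndex <- TF2[hv]      # positions applied to the vector they refer to
print(byMask)      # "p3" "p1"
print(identical(byMask, byIndex))  # TRUE
```

On these toy vectors, gN[hv > 0] and TF2[hv] agree, whereas gN[hv] returns probes that are not in TF2 at all. Whether the current vignette still contains the gN[hv] line is not something this sketch can tell; that needs to be checked against the updated release.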
>
> This problem has probably been raised before and I am just now
> rediscovering it. If so, I would appreciate your pointing me to the
> thread or location of the correction.
>
> Thanks
>
> Carl Garrett
>
>
>
>
> ======================================================================
> Description
> Package: GOstats
> Title: Tools for manipulating GO and microarrays.
> Version: 1.4.0
> Date: 20 Jan 2005
> Author: R. Gentleman
> Description: A set of tools for interacting with GO and microarray
> data. A variety of basic manipulation tools for graphs,
> hypothesis testing and other simple calculations.
> biocViews: Statistics, Annotation, GO, MultipleComparisons
> Depends: graph, GO, annotate, RBGL, xtable, Biobase, genefilter,
> multtest
> Suggests: hgu95av2 (>= 1.6.0)
> Maintainer: R. Gentleman <rgentlem at fhcrc.org>
> License: GPL2.0
> Packaged: Wed Oct 12 21:34:06 2005; biocbuild
> Built: R 2.2.0; ; 2005-10-12 21:34:10; windows