[BioC] GOLOCUSID and GOALLLOCUSID disagree with AmiGO
Dick Beyer
dbeyer at u.washington.edu
Sat Jan 29 02:12:14 CET 2005
Hi Robert,
Thanks for showing me how to get the LL the easy way.
When I submit the LL list to S.O.U.R.C.E., I see that the IDs come from all species (judging by the UGCluster), whereas I want only mouse.
Is there an easy way to filter by species? If not, would it be possible to build something like GOLOCUSIDMUSMUSCULUS?
My goal is to feed a set of LLs to GOstats, get a ranked list of GOIDs, and pick the most significant ones. From each of those GOIDs I would then generate a list of LLs, restricted to a particular species, and go back to the microarray results to see what the genes in that last list are doing in my experiment.
Please let me know if you think there is a better way.
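
A minimal sketch of the filtering step described above, in base R only. It assumes a species-specific set of LocusLink IDs is already available from some other source. The names `go_ll`, `mouse_ll`, and `filter_by_species` are hypothetical and not part of the GO package. For illustration, `go_ll` holds a subset of the GO:0000158 values quoted in this thread, and `mouse_ll` reuses the eight IDs that AmiGO lists for that term.

```r
# Subset of the LocusLink IDs reported for GO:0000158 in this thread,
# named by evidence code. 19053 appears twice (IEA and ISS).
go_ll <- c(IDA = "24673", IDA = "5520", IEA = "116663", IEA = "19053",
           ISS = "19045", ISS = "19046", ISS = "19052", ISS = "19053")

# ASSUMPTION: LocusLink IDs known to be mouse, obtained elsewhere.
# Here: the eight IDs AmiGO lists for GO:0000158, used only as an example.
mouse_ll <- c("19045", "19046", "19052", "19053",
              "19055", "63953", "319520", "67857")

# Hypothetical helper: drop the evidence-code names, remove IDs that are
# repeated under several evidence codes, and keep only the IDs that
# belong to the given species set.
filter_by_species <- function(ll, species_ll) {
  ll <- unique(unname(ll))
  ll[ll %in% species_ll]
}

mouse_hits <- filter_by_species(go_ll, mouse_ll)
print(mouse_hits)
# [1] "19053" "19045" "19046" "19052"
```

The same helper could be applied to each of the top-ranked GOIDs in turn, for example with `lapply`, before matching the resulting IDs back to the microarray annotation.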
Thanks very much,
Dick
*******************************************************************************
Richard P. Beyer, Ph.D. University of Washington
Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100
Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
*******************************************************************************
On Fri, 28 Jan 2005, Robert Gentleman wrote:
> As another data point, why not just do the following (rather than the rather
> peculiar set of operations that you did)?
>> GOALLLOCUSID$"GO:0000158"
> IDA IDA IEA IEA IEA IEA IEA IEA IEA IEA IEA
> 24673 5520 116663 19053 24666 24668 24669 24672 24673 24674 24675
> IEA IEA IMP ISS ISS ISS ISS ISS ISS ISS ISS
> 25594 65179 45959 117281 19045 19046 19052 19053 19055 28227 319520
> ISS ISS ISS ISS NAS NR NR TAS TAS
> 45959 47877 63953 67857 45959 5518 5519 19052 24672
>> gg=GOALLLOCUSID$"GO:0000158"
>
> Note that some of the LocusLink IDs are duplicated. Why, you might ask? Because
> they are annotated there for two different reasons (there are two evidence
> codes).
>
>> sum(duplicated(gg))
> [1] 6
>
> A quick check suggests that all of the ones you have listed are there, as are
> some others (and you can verify whether they are right at NCBI if you want).
>
> For example we have 5518 and AmiGO doesn't; my read of LocusLink says that they
> agree with us. It is reasonably simple to verify most of this, if that is what
> you want to do.
>
> Robert
>
> On Jan 28, 2005, at 1:33 PM, Dick Beyer wrote:
>
>> I am having some trouble understanding the correct usage of GOLOCUSID and
>> GOALLLOCUSID. I can't get the list of LocusLink identifiers output for a
>> particular GOID to agree with AmiGO. Also, for this particular GOID,
>> GO:0000158, the values returned by GOLOCUSID and GOALLLOCUSID are the same,
>> which seems wrong. I am using the latest development version of GO.
>>
>> Then again, perhaps I am not approaching this correctly as I have not used
>> these functions before.
>>
>> AmiGO shows 8 genes for GO:0000158, and both GOLOCUSID and GOALLLOCUSID show
>> 33.
>>
>> Would someone please look at the following code example and tell me what I
>> am doing wrong?
>>
>>> require("GO") || stop("GO unavailable")
>>> myGOALLLOCUSID <- as.list(GOALLLOCUSID)
>>> allGOALLLOCUSID <- names(myGOALLLOCUSID)
>>> allGOALLLOCUSID <- sub("GO:","",allGOALLLOCUSID)
>>> myGOLOCUSID <- as.list(GOLOCUSID)
>>> allGOLOCUSID <- names(myGOLOCUSID)
>>> allGOLOCUSID <- sub("GO:","",allGOLOCUSID)
>>> which(allGOLOCUSID == "0000158")
>> [1] 3370
>>> myGOLOCUSID[3370]
>> $"GO:0000158"
>> IDA IDA IEA IEA IEA IEA IEA IEA IEA IEA IEA IEA IEA
>> 24673 5520 116663 19053 24666 24668 24669 24672 24673 24674 24675 25594 65179
>> IMP ISS ISS ISS ISS ISS ISS ISS ISS ISS ISS ISS ISS
>> 45959 117281 19045 19046 19052 19053 19055 28227 319520 39337 45959 47877 63953
>> ISS NAS NR NR TAS TAS TAS
>> 67857 45959 5518 5519 19052 24672 5516
>>
>>> which(allGOALLLOCUSID == "0000158")
>> [1] 2856
>>> myGOALLLOCUSID[2856]
>> $"GO:0000158"
>> IDA IDA IEA IEA IEA IEA IEA IEA IEA IEA IEA IEA IEA
>> 24673 5520 116663 19053 24666 24668 24669 24672 24673 24674 24675 25594 65179
>> IMP ISS ISS ISS ISS ISS ISS ISS ISS ISS ISS ISS ISS
>> 45959 117281 19045 19046 19052 19053 19055 28227 319520 39337 45959 47877 63953
>> ISS NAS NR NR TAS TAS TAS
>> 67857 45959 5518 5519 19052 24672 5516
>>
>>
>>
>> AmiGO tells me GO:0000158 has the genes:
>> 19045 Ppp1ca
>> 19046 Ppp1cb
>> 19052 Ppp2ca
>> 19053 Ppp2cb
>> 19055 Ppp3ca
>> 63953 Dusp10
>> 319520 Dusp4
>> 67857 Ppp6c
>>
>> base 2.0.1 datasets 2.0.1 utils 2.0.1 grDevices 2.0.1 graphics 2.0.1 stats
>> 2.0.1 methods 2.0.1 tools 2.0.1 Biobase 1.5.0 reposTools 1.5.1 affy 1.5.8
>> matchprobes 1.0.12 gcrma 1.1.1 qvalue 1.1 siggenes 1.2.11 limma 1.8.6 GO
>> 1.6.8 xtable 1.2-4
>>
>> Thanks very much for any help or suggestions,
>> Dick
> Robert Gentleman                         phone: (206) 667-7700
> Head, Program in Computational Biology   fax: (206) 667-1319
> Division of Public Health Sciences       office: M2-B865
> Fred Hutchinson Cancer Research Center
> email: rgentlem at fhcrc.org