[BioC] HuGene annotation and htmls
cstrato
cstrato at aon.at
Fri Apr 10 21:22:09 CEST 2009
Dear Mayte
Everything is fine with your code; there is nothing to worry about.
If you look at the column "gene_assignment" of
"HuGene-1_0-st-v1.na28.hg18.transcript.csv" you will see many NAs, and
getSYMBOL() returns NA for those IDs, e.g.:
> getSYMBOL("7896740", "hugene10st")
7896740
"OR4F17"
> getSYMBOL("7896746", "hugene10st")
7896746
NA
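If you want to tally the unannotated IDs and drop them before building a report, something along these lines should work. This is only a minimal sketch in base R: the vector `symbols` is a toy stand-in for the result of getSYMBOL(keys, "hugene10st"), filled with the two IDs shown above, and the variable names are made up for illustration.

```r
# Toy stand-in for getSYMBOL(keys, "hugene10st"): a named character
# vector with NA where the ID has no gene assignment.
# (Assumption: only the two IDs shown above; a real run covers all keys.)
symbols <- c("7896740" = "OR4F17", "7896746" = NA)

# Tally annotated vs. unannotated IDs.
has_symbol  <- !is.na(symbols)
n_annotated <- sum(has_symbol)
n_missing   <- sum(!has_symbol)

# Keep only the IDs that carry a gene symbol.
annotated      <- symbols[has_symbol]
annotated_keys <- names(annotated)
```

With the real data you would replace the toy vector by the getSYMBOL() call from your own code.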
Best regards
Christian
Mayte Suarez-Farinas wrote:
> You are right, James!
> With the keys James sent, the hugene10st package works just fine,
> so it looks like the "error" comes from my use of xps.
>
> here is my code:
>
> library(xps)
>
> ### define directories:
> # directory containing Affymetrix library files
> libdir <- "/Users/Mayte/Rlibrary/AffyDB/libraryfiles"
> anndir <- "/Users/Mayte/Rlibrary/AffyDB/Annotation"
> scmdir <- "/Users/Mayte/Rlibrary/AffyDB/ROOTSchemes"
>
> scheme.hugene10stv1r4.na28 <- import.exon.scheme(
>     "Scheme_HuGene10stv1r4_na28", filedir = scmdir,
>     layoutfile = paste(libdir, "HuGene-1_0-st-v1.r4.clf", sep = "/"),
>     schemefile = paste(libdir, "HuGene-1_0-st-v1.r4.pgf", sep = "/"),
>     probeset   = paste(anndir, "HuGene-1_0-st-v1.na28.hg18.probeset.csv", sep = "/"),
>     transcript = paste(anndir, "HuGene-1_0-st-v1.na28.hg18.transcript.csv", sep = "/"))
>
> scheme.hugene10stv1r4 <- root.scheme(
>     paste(scmdir, "Scheme_HuGene10stv1r4_na28.root", sep = "/"))
> G1ST_data <- import.data(scheme.hugene10stv1r4, "Pamela_G1ST_dataxps",
>     celdir = getwd(), celfiles = as.character(PD[1:8, 'CELfile']),
>     verbose = FALSE)
> G1ST_rma_xps <- rma(G1ST_data, "Pamela_G1ST_rma_t",
>     background = "antigenomic", option = "transcript",
>     exonlevel = "core+affx", normalize = TRUE)
>
> The "featureNames" of the data (or keys) can be taken as:
>
> keys<-as.character(exprs(G1ST_rma_xps)$UnitName)
>
> but about a third of them do not have a symbol:
>
> sum(!is.na(getSYMBOL(keys, "hugene10st")))
> [1] 19899
> sum(is.na(getSYMBOL(keys, "hugene10st")))
> [1] 9027
>
> Is this OK, or is there a mistake in my code?
>
> Thanks in advance for everybody's help,
> and sorry for bothering you so many times!
>
> Mayte
>
> On Apr 10, 2009, at 10:55 AM, James W. MacDonald wrote:
>
>
>> I wonder if this is a problem with how the package was built. The
>> numbers that Marc supplied are the Exon Probeset IDs, but the Lkeys
>> of the hugene10st.db package seem to be what Affy calls the
>> Transcript Cluster ID.
>>
>>
>>> keys <- c("7903188","7903203")
>>> getSYMBOL(keys, "hugene10st")
>>>
>> 7903188 7903203
>> "PTBP2" "SNX7"
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> Mayte Suarez-Farinas wrote:
>>
>>> I meant that the usual functions from annotate do not work.
>>> When I ran your code, I get:
>>> library("annotate")
>>> > library("hugene10st.db")
>>> > keys = c("7903193","7903204")
>>> >
>>> > getSYMBOL(keys, "hugene10st")
>>> 7903193 7903204
>>> NA NA
>>> >
>>> > lookUp(keys, "hugene10st" , "CHR")
>>> $`7903193`
>>> [1] NA
>>> $`7903204`
>>> [1] NA
>>> > lookUp(keys, "hugene10st" , "ENTREZID")
>>> $`7903193`
>>> [1] NA
>>> $`7903204`
>>> [1] NA
>>> sessionInfo()
>>> R version 2.8.1 (2008-12-22)
>>> i386-apple-darwin8.11.1
>>> locale:
>>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>> attached base packages:
>>> [1] splines tools stats graphics grDevices utils
>>> datasets methods base
>>> other attached packages:
>>> [1] hugene10st.db_1.0.2 statmod_1.3.8
>>> beadarray_1.10.0 sma_0.5.15 hwriter_1.0
>>> [6] affycoretools_1.14.1 annaffy_1.14.0
>>> KEGG.db_2.2.5 biomaRt_1.16.0 GOstats_2.8.0
>>> [11] Category_2.8.4 RBGL_1.18.0
>>> GO.db_2.2.5 RSQLite_0.7-1 DBI_0.2-4
>>> [16] graph_1.20.0 limma_2.16.4
>>> affyQCReport_1.20.0 geneplotter_1.20.0 annotate_1.20.1
>>> [21] AnnotationDbi_1.5.18 lattice_0.17-17
>>> RColorBrewer_1.0-2 affyPLM_1.18.1 preprocessCore_1.4.0
>>> [26] xtable_1.5-4 simpleaffy_2.18.0
>>> gcrma_2.14.1 matchprobes_1.14.1 genefilter_1.22.0
>>> [31] survival_2.34-1 affy_1.20.2 Biobase_2.2.2
>>> loaded via a namespace (and not attached):
>>> [1] GSEABase_1.4.0 KernSmooth_2.22-22 RCurl_0.94-1
>>> XML_2.1-0 affyio_1.10.1
>>> [6] cluster_1.11.11 grid_2.8.1 xps_1.2.8
>>> On Apr 9, 2009, at 5:26 PM, Marc Carlson wrote:
>>>
>>>> Hi Mayte,
>>>>
>>>> I can't tell from your post what you tried to do, or even what exactly
>>>> you need to know. Please give us the code you were trying to use, along
>>>> with an example that didn't behave the way you expected it to and the
>>>> results of calling sessionInfo() after you did that. You can find
>>>> other helpful tips in the posting guide:
>>>>
>>>> http://www.bioconductor.org/docs/postingGuide.html
>>>>
>>>> What little I can discern from your post I will try to answer. To use
>>>> getSYMBOL() or lookUp(), you first need to make sure that you have
>>>> loaded the annotate package. Then you need to call it correctly. Here
>>>> is an example that I ran using the very latest version of the
>>>> hugene10st.db package.
>>>>
>>>> library("annotate")
>>>> library("hugene10st.db")
>>>> keys = c("7903193","7903204")
>>>>
>>>> getSYMBOL(keys, "hugene10st")
>>>>
>>>> lookUp(keys, "hugene10st" , "CHR")
>>>> lookUp(keys, "hugene10st" , "ENTREZID")
>>>>
>>>>
>>>> Hope this helps,
>>>>
>>>> Marc
>>>>
>>>> Mayte Suarez-Farinas wrote:
>>>>
>>>>> I am learning to work with the HuGene ST1 chips.
>>>>> I was able to use xps to read and preprocess the files,
>>>>> and then I converted the result to the ExpressionSet class to use limma
>>>>> for modelling.
>>>>> I am stuck at the next step: the annotation.
>>>>> I load library("hugene10st.db"), but the usual functions
>>>>> to create HTML annotation do not seem to work on this chip.
>>>>> I also tried to get each component using getSYMBOL and lookUp,
>>>>> with no success.
>>>>> What's the way to go?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Mayte
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>