[BioC] problem with rat database
Alberto Goldoni
alberto.goldoni1975 at gmail.com
Tue May 10 14:34:26 CEST 2011
@Davis
You are right! But I have tried to perform this kind of search:
library("rgug4130a.db")
x <- rgug4130aENSEMBL
mapped_genes <- mappedkeys(x)
xx <- as.list(x[mapped_genes])
or this approach:
x <- rgug4130aGENENAME
mapped_probes <- mappedkeys(x)
xx <- as.list(x[mapped_probes])
but the results are the same: for some genes the description is still "unknown function".
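One possible reason both attempts leave the table unchanged is that they list the package's mappings but never match them to the rows of the topTable result. Below is a minimal sketch of that matching step. It assumes (this is not confirmed in the thread) that rgug4130a.db is keyed by the Agilent probe name found in the ProbeName column, not by GeneName. The two-row `geni500` stand-in reuses the rows quoted further down in this thread, and the `Symbol` column is a hypothetical name.

```r
# Stand-in for the topTable() result: the two rows quoted in this thread.
geni500 <- data.frame(
  ProbeName   = c("A_43_P10328", "A_42_P552092"),
  GeneName    = c("CB606456", "203358_Rn"),
  Description = c("unknown function", "Rat c-fos mRNA."),
  stringsAsFactors = FALSE
)

# Rows whose Agilent description is uninformative.
unknown <- geni500$Description == "unknown function"
probes  <- unique(geni500$ProbeName[unknown])

# Hypothetical new column; it stays NA when no symbol is found.
geni500$Symbol <- NA_character_

# This part only runs if the annotation package is installed.
if (requireNamespace("rgug4130a.db", quietly = TRUE)) {
  library("rgug4130a.db")
  # Assumption: the package keys are Agilent probe names.
  sym <- mget(probes, rgug4130aSYMBOL, ifnotfound = NA)
  # Collapse each result to one string, keeping NA for unmapped probes.
  sym <- vapply(sym, function(s) {
    if (all(is.na(s))) NA_character_ else paste(s, collapse = ";")
  }, character(1))
  geni500$Symbol[unknown] <- sym[geni500$ProbeName[unknown]]
}
```

If the package has no mapping for a probe, `Symbol` simply remains NA for that row, which would be consistent with the annotation gap coming from the source data and not from the lookup itself.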
I would like to know whether there is a way to perform the
search using another database, or directly against the Rat Genome Database,
or using biomaRt, but I don't know how.
I have roughly 100 genes with an "unknown function" and it would
be very useful to have a script or function that performs the search
automatically instead of searching for the genes one by one.
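A rough sketch of how the location-based search could be scripted with biomaRt is given here. It assumes the genomic coordinates of each unknown sequence are already known (only the one region reported in this thread is used), that the coordinates match the genome assembly served by the Ensembl release being queried, and that the attribute names shown are valid for that release. The helper `genes_at` and the `regions` table are hypothetical names.

```r
# Hypothetical helper: list Ensembl rat genes overlapping one region.
genes_at <- function(chr, start, end, mart) {
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name", "description"),
    filters    = c("chromosome_name", "start", "end"),
    values     = list(chr, start, end),
    mart       = mart
  )
}

# The one region reported in this thread for CB606456.
regions <- data.frame(GeneName = "CB606456", chr = "7",
                      start = 16261621, end = 16262210,
                      stringsAsFactors = FALSE)

# Needs biomaRt and a network connection, so it is left commented out:
# mart <- biomaRt::useMart("ensembl", dataset = "rnorvegicus_gene_ensembl")
# hits <- lapply(seq_len(nrow(regions)), function(i)
#   genes_at(regions$chr[i], regions$start[i], regions$end[i], mart))
# names(hits) <- regions$GeneName
```

Looping over a table of regions in this way would replace the one-by-one manual lookup, but the results depend on the assembly and annotation of the Ensembl release in use.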
Best regards.
2011/5/10 Sean Davis <sdavis2 at mail.nih.gov>:
>
>
> On Tue, May 10, 2011 at 8:17 AM, Alberto Goldoni
> <alberto.goldoni1975 at gmail.com> wrote:
>>
>> @Vincent
>>
>> The chip used is the "rgug4130a", so I have to use the "rgug4130a.db"
>> database.
>>
>> To obtain the topTable result, this is my history:
>>
>> library(limma)
>> library(vsn)
>> targets <- readTargets("targets.txt")
>> RG <- read.maimages(targets$FileName, source="agilent")
>> MA <- normalizeBetweenArrays(RG, method="Aquantile")
>> contrast.matrix <-
>>   cbind("(hda+str)-(ref)"=c(1,0),"(ref+str)-(ref)"=c(0,1),"(hda+str)-(ref+str)"=c(1,-1))
>> rownames(contrast.matrix) <- colnames(design)
>> fit <- lmFit(MA, design)
>> fit2 <- contrasts.fit(fit, contrast.matrix)
>> fit2 <- eBayes(fit2)
>> geni500<-topTable(fit2,number=500,adjust="BH")
>>
>
> Hi, Alberto.
> The data in your topTable result are taken from the feature extraction
> result file. In other words, rgug4130a.db is not used in what you show
> above. You could add to your annotation using either rgug4130a.db or
> biomaRt, but you will need to perform these steps yourself. As to why some
> of your probes do not appear to have annotation, you would probably need to
> contact Agilent as they are the source of your current annotation.
> Hope that helps,
> Sean
>
>>
>> > sessionInfo()
>> R version 2.12.1 (2010-12-16)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
>> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_United Kingdom.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] AnnotationDbi_1.12.0 Biobase_2.10.0 limma_3.6.9
>>
>> loaded via a namespace (and not attached):
>> [1] DBI_0.2-5 RSQLite_0.9-4 tools_2.12.1
>>
>>
>>
>> 2011/5/10 Vincent Carey <stvjc at channing.harvard.edu>:
>> > 1) you did not provide sessionInfo(), which is critical for helping
>> > you to diagnose an issue that may pertain to software version --
>> > revisions to annotation packages can have all sorts of consequences
>> >
>> > 2) I am not sure rgug4130a.db has anything to do with this.
>> >
>> >> get("CB606456", revmap(rgug4130aSYMBOL))
>> > Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>> > value for "CB606456" not found
>> >
>> >
>> > and so on. Look at the featureData component of the object passed to
>> > lmFit -- the annotation may be in there. If this does not clarify
>> > things, please give a very explicit indication of how the
>> > topTable was generated, going back to the structure of the object
>> > passed to lmFit.
>> >
>> > On Tue, May 10, 2011 at 5:30 AM, Alberto Goldoni
>> > <alberto.goldoni1975 at gmail.com> wrote:
>> >> Dear All,
>> >> I'm analyzing Agilent microarrays with the "rgug4130a.db" database, and
>> >> using the function topTable(fit2,number=500,adjust="BH") I have
>> >> obtained 500 genes like these:
>> >>
>> >>       Row Col ProbeUID ControlType    ProbeName  GeneName SystematicName      Description
>> >> 16096  79  38    15309           0  A_43_P10328  CB606456       CB606456 unknown function
>> >> 8109   40 109     7609           0 A_42_P552092 203358_Rn      203358_Rn  Rat c-fos mRNA.
>> >>       X.hda.str...ref. X.ref.str...ref. X.hda.str...ref.str.     AveExpr           F     P.Value   adj.P.Val
>> >> 16096      3.988290607     -0.951656306          4.939946913 10.29735936 36.77263264 0.000212298 0.641094595
>> >> 8109       5.670956889      4.413365374          1.257591514 13.47699544 33.20342601 0.000292278 0.641094595
>> >>
>> >> But as you can see, for many genes, like the first one (CB606456), the
>> >> Description column says "unknown function".
>> >>
>> >> So I performed a very simple search.
>> >> 1) First, in Ensembl, using the GeneName "CB606456" with the "Locations
>> >> of DnaAlignFeature" view, I got the genomic location (strand): chr
>> >> 7:16261621-16262210
>> >> 2) Then, in the Rat Genome Database
>> >> (http://rgd.mcw.edu/tools/genes/genes_view.cgi?id=735058), I found
>> >> that there is one gene at this position:
>> >>
>> >> 735058 GENE Angptl4 angiopoietin-like 4 7 16261623
>> >> 16267852
>> >>
>> >> So the question is: why does the "rgug4130a.db" database in R give me
>> >> "unknown function", when the genomic location from Ensembl, looked up
>> >> in RGD, gives me the Angptl4 gene?
>> >>
>> >> And is there a function to make R perform this kind of search
>> >> automatically? (Of my 500 genes, 100 are "unknown function" genes, so
>> >> it would be useful to have a function that performs this search
>> >> automatically.)
>> >>
>> >>
>> >> Best regards to all, and to whoever answers me.
>> >>
>> >> --
>> >> -----------------------------------------------------
>> >> Dr. Alberto Goldoni
>> >> Parma, Italy
>> >>
>> >> _______________________________________________
>> >> Bioconductor mailing list
>> >> Bioconductor at r-project.org
>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >>
>> >
>>
>
>
--
-----------------------------------------------------
Dr. Alberto Goldoni
Parma, Italy