[BioC] problem about hgu133plus2 annotation

Benjamin Otto b.otto at uke.uni-hamburg.de
Fri Jul 23 13:15:24 CEST 2010


Hi Gina,

I agree with Jim: have a look at the packages he mentioned.

However, if for some reason you still want to use the csv file from Affy, have a look at the merge() function. That should solve your problem. By the way, which one has 54630 rows and which 54640? If you use the Affy csv file, you might want to check whether the missing probeset IDs are AFFX-xxxx probesets or transcripts you want to keep in your data set. You can ignore the AFFX probesets: they are internal controls that you won't need for differential expression analysis.
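
To make the merge() suggestion concrete, here is a minimal sketch on made-up toy data. The object names, the column names (Probe.Set.ID, Gene.Symbol) and the four probeset rows are assumptions for illustration only; with the real Affy csv file you would read the table with read.csv() first and use whatever column names it actually contains.

```r
# Toy stand-in for exprs(eset): probesets in rows, chips in columns.
e.out <- matrix(c(7.1, 8.3, 5.2, 6.4,
                  7.0, 8.5, 5.1, 6.6),
                nrow = 4,
                dimnames = list(c("1007_s_at", "1053_at", "117_at",
                                  "AFFX-BioB-5_at"),
                                c("chip1", "chip2")))

# Toy stand-in for the annotation csv: different row order,
# and the AFFX control probeset is absent.
annot <- data.frame(Probe.Set.ID = c("117_at", "1007_s_at", "1053_at"),
                    Gene.Symbol  = c("HSPA6", "DDR1", "RFC2"),
                    stringsAsFactors = FALSE)

# Join on the probeset ID instead of relying on row order.
# all.x = TRUE keeps probesets that have no annotation row (filled with NA).
expr.df <- data.frame(Probe.Set.ID = rownames(e.out), e.out,
                      stringsAsFactors = FALSE)
merged <- merge(expr.df, annot, by = "Probe.Set.ID", all.x = TRUE)

# Which probesets lack annotation, and are all of them AFFX controls?
missing.ids  <- setdiff(rownames(e.out), annot$Probe.Set.ID)
all.controls <- all(grepl("^AFFX", missing.ids))
```

If all.controls is FALSE, it is worth inspecting missing.ids by hand before dropping anything.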

cheers 

Benjamin

PS:

Another thing: if you prefer an interface over writing R functions for annotating gene lists that are not too big, have a look at BioMart on the Ensembl webpage. For my part, I think R code has the advantage that you can save it and use it as a kind of logbook, so you can always come back to it and check what you did and how.


On 22.07.2010 at 18:41, James W. MacDonald wrote:

> Hi Gina,
> 
> On 7/22/2010 5:11 AM, Gina Liao wrote:
>> 
>> Dear All,
>> I have 20 chips, and I used R to standardize the CEL files. Then I got the expression values for all chips. I also downloaded the annotation file in CSV format from NetAffx (HG-U133_Plus_2 Annotations, CSV format, Release 30 (22 MB, 11/15/09)).
>> Here's my code.
>> ########
>> test = justRMA()
>> eset.st = standardise(test)
>> exprs.st = exprs(eset.st)
>> e.out = exprs.st
>> dim(e.out) #* 54675 20
>> ########
>> However, I found out that the order of rownames(e.out) is a little different from the row names of hgu133plus2.csv. The order from 54630 to 54640 is not the same for these two rows.
>> They should be the same, right? Is "hgu133plus2cdf" the problem? How could I solve it?
> 
> I would recommend you use the annotation packages that are available from Bioconductor rather than downloading the annotation files from Affymetrix. The BioC annotation packages contain the same information, but are designed to be easily used from within R, and you will find the .csv files you can get from Affy are not as user-friendly.
> 
> You can get the annotation package using biocLite():
> 
> biocLite("hgu133plus2.db")
> 
> Note that there is no reason to expect that the order of annotation data will be the same as the order of expression data. Re-ordering things is exceedingly simple in R, so this point is irrelevant.
> 
> Using the annotation packages will take some reading on your part, but once you get the hang of things, I think you will like how they work. You might start with
> 
> library(hgu133plus2.db)
> ?hgu133plus2.db
> 
> as well as
> 
> openVignette() and choose the AnnotationDbi vignette.
> 
> If you are interested in annotating the set of interesting genes from your experiment, you will want to look at the annaffy package, which will allow you to output both HTML and text files with your results and annotations for each gene.
> 
> In addition, you might want to look at the affycoretools package, which helps automate some of the steps required to annotate results. This package is also integrated with limma, so you can go straight from your linear model fits to output in one function call.
> 
> Best,
> 
> Jim
> 
> 
> 
>> Thanks!!!!!
>> Best, Gina
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> 
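
Jim's remark in the quoted message that re-ordering is simple in R can be sketched with match(). This is only an illustration on toy data: the object names, the column names (Probe.Set.ID, Gene.Symbol) and the handful of rows are assumptions, not the contents of the real annotation file.

```r
# Toy expression matrix (stand-in for exprs(eset)).
e.out <- matrix(1:6, nrow = 3,
                dimnames = list(c("1007_s_at", "1053_at", "117_at"),
                                c("chip1", "chip2")))

# Toy annotation table in a different row order, with one extra row.
annot <- data.frame(Probe.Set.ID = c("117_at", "121_at", "1007_s_at", "1053_at"),
                    Gene.Symbol  = c("HSPA6", "PAX8", "DDR1", "RFC2"),
                    stringsAsFactors = FALSE)

# For each expression row, match() returns the position of its ID in annot
# (NA if the ID is not present there).
idx <- match(rownames(e.out), annot$Probe.Set.ID)
annot.ordered <- annot[idx, ]

# Confirm that annotation and expression data now line up row by row.
stopifnot(identical(annot.ordered$Probe.Set.ID, rownames(e.out)))
```

After this step, row i of annot.ordered describes row i of e.out, whatever order the original annotation table was in.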

___________________________________________
Benjamin Otto, PhD
University Medical Center Hamburg-Eppendorf
Institute For Clinical Chemistry / Central Laboratories
Campus Forschung N27
Martinistr. 52,
D-20246 Hamburg

Tel.: +49 40 7410 51908
Fax.: +49 40 7410 54971
___________________________________________




