[BioC] Exon array annotation with limma?

Michael Imbeault michael.imbeault at sympatico.ca
Wed Dec 23 06:35:58 CET 2009


Hello James,

The problem with biomaRt turned out to be on my ISP's side; I can access 
it just fine through my university proxy. Sadly, biomaRt doesn't handle 
the Affymetrix transcript cluster IDs either, just the exon-level ones. 
Fortunately, the onechannelGUI maintainer posted in the thread, and his 
approach looks promising. As I understand it, there's no standard way to 
annotate gene-level results on exon arrays as of now.
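
For the archives, here is a quick way to check whether feature names 
actually match the keys of an annotation package, which is the mismatch 
discussed below. This is only a minimal sketch in base R: the `ids` and 
`keys` vectors are made-up stand-ins for featureNames(eSet.gene) and the 
keys of the MBNI package, not real output.

```r
# Hypothetical probeset IDs as they come out of the normalization step
# (no suffix), standing in for featureNames(eSet.gene).
ids <- c("10000", "10001", "10002")

# Hypothetical annotation keys carrying the "_at" suffix,
# standing in for the keys of the annotation package.
keys <- c("10000_at", "10001_at", "10002_at", "100033423_at")

# paste() defaults to sep = " ", which gives "10000 _at" and still fails
# to match; sep = "" gives the intended "10000_at".
with_space <- paste(ids, "_at")
fixed      <- paste(ids, "_at", sep = "")

# Count how many IDs are found among the keys in each case.
matched_before <- sum(ids %in% keys)
matched_space  <- sum(with_space %in% keys)
matched_after  <- sum(fixed %in% keys)

# Going the other way: strip a trailing "_at" to recover the plain IDs,
# e.g. for building links that expect the unsuffixed form.
stripped <- sub("_at$", "", fixed)
```

This only tells you whether the IDs match; it says nothing about why the 
annotation cells stayed empty in my case even after appending _at.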

Happy holidays,
Michael

On 22/12/2009 9:09 AM, James W. MacDonald wrote:
> Hi Michael,
>
> To return to the limma2biomaRt issue, it is always possible that the 
> Biomart server was actually down when you made your last attempt.
>
> Does something simplistic like this not work for you (it just worked 
> for me)?
>
> library(biomaRt)
> mart <- useMart("ensembl","hsapiens_gene_ensembl")
> getBM("entrezgene","entrezgene", "1", mart)
>
> Best,
>
> Jim
>
>
>
>
> Michael Imbeault wrote:
>> Hello James,
>>
>> Little update: it turns out onechannelGUI has an embedded annotation 
>> database for the huex arrays in .rda format, so it's not going 
>> through limma2annaffy. It seems no .db solution exists so far 
>> (which is strange; it's the only Affymetrix array that has no .db 
>> file on the Bioconductor site), so I think I'll just go through 
>> exonmap and build the table manually.
>>
>> Happy holidays,
>> Michael
>>
>>
>> On 21/12/2009 3:54 PM, James W. MacDonald wrote:
>>> Hi Michael,
>>>
>>> Michael Imbeault wrote:
>>>> Thanks for the help James,
>>>>
>>>> I did:
>>>>
>>>> featureNames(eSet.gene) <- paste(featureNames(eSet.gene), "_at" , 
>>>> sep="")
>>>>
>>>> (note the  sep="", without it the probes were like "1000 _at"). 
>>>> Sadly, the end result is the same, except that as a side effect, 
>>>> the first column probe links don't work anymore (because of the 
>>>> added _at, they don't link to the right probe on the Affy site).
>>>>
>>>> I verified that the probes in eSet.gene contain _at after the 
>>>> operation. I built my eset with:
>>>>
>>>> eSet.gene <- new("ExpressionSet", exprs = rma.gene, phenoData = 
>>>> phenoData)
>>>>
>>>> Should I add annotation="huex10stv2" or "huex10stv2hsentrezg" or 
>>>> something similar? Do I need the cdf file in addition to the .db one?
>>>
>>> No. I think there is just a mismatch problem here. As you mentioned 
>>> below, onechannelGUI is able to create a table with annotation.
>>>
>>> All limma2annaffy is doing is passing the probeset IDs on to 
>>> annaffy. All the matching and link building are done there, but if 
>>> the IDs don't match to anything in the annotation package then 
>>> annaffy will just create an empty cell in the table.
>>>
>>> If you take the first 10 or so featureNames (with the _at appended) 
>>> and do e.g.,
>>>
>>> mget(<thefeaturenames>, huex10stv2hsentrezgUNIGENE)
>>>
>>> do you get anything returned?
>>>
>>> Best,
>>>
>>> Jim
>>>>
>>>> Thanks,
>>>> Michael
>>>>
>>>> On 21/12/2009 1:01 PM, James W. MacDonald wrote:
>>>>> Hi Michael,
>>>>>
>>>>> Michael Imbeault wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I'm analyzing human exon arrays, using Affymetrix Power Tools 
>>>>>> for normalization (with 'core' probes) and limma to find 
>>>>>> significantly modulated genes (all at the gene level, of course).
>>>>>>
>>>>>> The limma2annaffy function produces tables, but with all 
>>>>>> annotation table cells empty. I'm doing:
>>>>>>
>>>>>> limma2annaffy(eSet.gene, fit2, design, cont.matrix, lib = 
>>>>>> "huex10stv2hsentrezg.db", interactive=FALSE, pfilt=0.05, fldfilt=0.8)
>>>>>>
>>>>>> where huex10stv2hsentrezg.db is from: 
>>>>>> http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/12.1.0/entrezg.asp 
>>>>>>
>>>>>>
>>>>>> Is it the right file to use?
>>>>>
>>>>> Most likely. However, the MBNI folks have an unfortunate habit of 
>>>>> adding _at to the end of all their probesets, regardless of the 
>>>>> source. So for instance, if I look at the probesets in this 
>>>>> package, I get something like this:
>>>>>
>>>>> > head(Lkeys(huex10stv2hsentrezgGENENAME))
>>>>> [1] "10000_at"     "10001_at"     "10002_at"
>>>>> [4] "100033423_at" "100033424_at" "100033425_at"
>>>>>
>>>>> And I am betting if you do something like 
>>>>> head(featureNames(eSet.gene)), you won't have any of those nasty 
>>>>> _at extensions.
>>>>>
>>>>> A simple albeit kludgy fix would be for you to first do
>>>>>
>>>>> featureNames(eSet.gene) <- paste(featureNames(eSet.gene), "_at")
>>>>>
>>>>> and then run limma2annaffy().
>>>>>>
>>>>>> Using onechannelGUI produces the same tables but with annotations, 
>>>>>> so I know there's a way to do it.
>>>>>
>>>>> I am betting that the onechannelGUI folks know about the extra _at 
>>>>> extensions and are silently stripping them. I could hypothetically 
>>>>> do the same, but I rebel against the idea that I should have to 
>>>>> put code in my package to protect people from infelicities in 
>>>>> other people's packages.
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Jim
>>>>>
>>>>>
>>>>>>
>>>>>> To complicate things further, limma2biomaRt, which is another 
>>>>>> option, fails with:
>>>>>>
>>>>>>     "Request to BioMart web service failed. Verify if you are still
>>>>>>     connected to the internet.  Alternatively the BioMart web
>>>>>>     service is temporarily down."
>>>>>>
>>>>>> which from the mailing list seems to be an RCurl problem. I tried 
>>>>>> both the latest and an older (0.92) version of it; using 
>>>>>> --internet2 doesn't solve this either, and as far as I know I'm 
>>>>>> not using a proxy to connect to the net. I'm on Windows 7.
>>>>>>
>>>>>> Any help would be appreciated,
>>>>>> Michael
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives: 
>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>
>>>
>


