[BioC] hyperGTest html report
James W. MacDonald
jmacdon at med.umich.edu
Thu Jan 10 15:42:17 CET 2008
Sebastien Gerega wrote:
> Thanks for that!
> I can now almost get what I want.....
> Here is the code I use:
>
> hgOver = hyperGTest(params)
> report = summary(hgOver, htmlLinks=TRUE)
> cats = sigCategories(hgOver)
> reportGenes = vector()
>
> for(i in 1:length(cats)){
> reportGenes = append(reportGenes, geneIdsByCategory(hgOver, cats[i]))
> }
>
> This gives me reportGenes as a list something like this:
>
> $`04650`
> [1] 10451 4277 5296 5880 6464 8743 8795 8797
>
> $`04670`
> [1] 10451 1365 5296 5829 5880 6387 6494 87 9564
>
> $`00150`
> [1] 3291 51451 6715
>
> $`04080`
> [1] 154 2150 4886 4923 7433
>
> $`04360`
> [1] 10512 1969 2043 56920 57522 57556 5880 6387
>
> I would then like to run the following code:
>
> report <- data.frame(report, reportGenes)
> xtab <- xtable(report, caption="A Caption")
> print(xtab, type="html", file="Afile.html", caption.placement="top",
> sanitize.text.function=function(x) x, include.rownames=FALSE)
>
> But I get the following error:
> Error in data.frame("04650" = c(10451L, 4277L, 5296L, 5880L, 6464L,
> 8743L, :
> arguments imply differing number of rows: 8, 9, 3, 5, 7
This is the part where I said you have to wrap the Entrez Gene IDs in
<P>EGID</P>. data.frame() needs one element per row, so collapsing each
category's IDs into a single string lets you (a) have a vector of the
correct length, and (b) create a table that will be readable.
Something like this should suffice:
rg.out <- sapply(reportGenes, function(x)
paste("<P>", paste(x, collapse="</P><P>"), "</P>", sep=""))
then use rg.out in lieu of reportGenes when making the data.frame.
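
To make that concrete, here is a minimal base-R sketch using a few of the IDs from your output. The 'report' table and its column names (KEGGID, Count) are stand-ins I made up for illustration; in practice 'report' is whatever summary() returned. Indexing rg.out by the ID column is my own precaution, so that each string lands on the matching row even if the two are ordered differently.

```r
# Toy stand-in for the list built from geneIdsByCategory(); the IDs are
# copied from the output quoted above.
reportGenes <- list(
  "04650" = c(10451L, 4277L, 5296L, 5880L, 6464L, 8743L, 8795L, 8797L),
  "04670" = c(10451L, 1365L, 5296L, 5829L, 5880L, 6387L, 6494L, 87L, 9564L),
  "00150" = c(3291L, 51451L, 6715L)
)

# Collapse each list element into one string: <P>id1</P><P>id2</P>...
rg.out <- sapply(reportGenes, function(x)
  paste("<P>", paste(x, collapse = "</P><P>"), "</P>", sep = ""))

# Hypothetical stand-in for the summary() table: one row per category.
report <- data.frame(KEGGID = names(reportGenes),
                     Count = sapply(reportGenes, length),
                     stringsAsFactors = FALSE)

# rg.out now has exactly one entry per row, so data.frame() accepts it.
# Index by name so each string is matched to the right category.
report <- data.frame(report,
                     EntrezGeneIDs = unname(rg.out[report$KEGGID]),
                     stringsAsFactors = FALSE)
```

The resulting 'report' can go through xtable() and print() as in your code; sanitize.text.function=function(x) x is what keeps the <P> tags from being escaped.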
Best,
Jim
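
P.S. On the question of a faster way than the for() loop: appending to a list one category at a time is rarely needed in R. Below is a small sketch in base R, where 'toy' and lookupGenes() are hypothetical stand-ins for the hyperGTest result and geneIdsByCategory(), so that it runs without any Bioconductor packages. I believe geneIdsByCategory() also accepts a vector of category IDs, but please check ?geneIdsByCategory for your version of Category before relying on that.

```r
# Hypothetical stand-ins: 'toy' plays the role of the hyperGTest result,
# lookupGenes() the role of geneIdsByCategory().
toy <- list("04650" = c(10451L, 4277L), "00150" = c(3291L, 51451L, 6715L))
lookupGenes <- function(ids) toy[ids]
cats <- names(toy)

# Original pattern: grow the list one category at a time.
# seq_along() is safer than 1:length() when 'cats' might be empty.
byLoop <- vector()
for (i in seq_along(cats)) {
  byLoop <- append(byLoop, lookupGenes(cats[i]))
}

# Same result from a single call with the whole vector of IDs.
byVector <- lookupGenes(cats)
```

Both approaches give the same named list; the second simply avoids reallocating the list on every iteration.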
>
> How should I deal with this list so that I can add it to the data.frame?
> And are there any faster ways to do what I have done in this code?
> I am still getting used to R.
> thanks heaps,
> Sebastien
>
> James W. MacDonald wrote:
>> Hi Sebastien,
>>
>> Maybe not directly, but note that htmlReport() is simply using xtable
>> to create the HTML page using the output from summary(). So you could
>> just create the table and then add a column of Entrez Gene IDs and
>> then output the result.
>>
>> Say your GOHyperGResult object is called 'hyp':
>>
>> out <- summary(hyp, htmlLinks=TRUE, categorySize=10)
>>
>> Note that the categorySize argument isn't necessary, but does protect
>> you from choosing arguably spurious results (like a GO term with 3
>> genes in the universe and 1 that was significant).
>>
>> Now you are going to have to create a character vector with one
>> element per GO term, holding all the Entrez Gene IDs for that term.
>> For this to work in HTML, you will also need to wrap each ID in
>> <P>EntrezGeneID</P> tags, so you will need to either cat() or paste()
>> things together. Once you have that, just add it to the data.frame
>> created above:
>>
>> out <- data.frame(out, entregeneidvector)
>> xtab <- xtable(out, caption="A Caption", digits=rep(c(3,0), c(4,8)))
>> print(xtab, type="html", file="A file name.html",
>> caption.placement="top", sanitize.text.function=function(x) x,
>> include.rownames=FALSE)
>>
>> HOWEVER, that might not really be what you want, as it will obviously
>> be a bit of work, and could get really messy if there are dozens of
>> Entrez Gene IDs for a particular GO term. An alternative is to output
>> individual HTML tables for each GO term of interest that list out the
>> probesets that contributed to the significance of that term. For that
>> you might want to look at hyperGoutput() in the affycoretools package.
>>
>> Best,
>>
>> Jim
>>
>>
>> Sebastien Gerega wrote:
>>> Hi,
>>> is there any way to get additional information into the hyperGTest
>>> html report?
>>> Specifically, I would like to include the Entrez IDs for the genes
>>> contributing to
>>> each overrepresented GO term.
>>> thanks,
>>> Sebastien
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623