[BioC] GO graph plotting with some in color

Wed Nov 28 05:29:41 CET 2007

Thank you again for the discussion; I am grateful.

Cost?
True, Ingenuity is about $8,000 per year and, as far as I can see, does not
make graphs. I examined GeneSifter, GeneSpring, GoMiner and Gene_xyz and
returned to Bioconductor. And it is true that the others would all love to
sell something, most certainly.

Still, 12 days for one graph is expensive.
R and Bioconductor are not easy for the beginner to use.

For example, I did
> rows2  <-  read.delim(file="339namesColorsRowsBBtab.txt", sep="\t")
and the process turned ":" into "." in the column names, which of course
wiped out the GO terms, which read like GO:1111111. It took perhaps an hour
to learn to add check.names=FALSE, but so it goes for the novice.
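
A minimal sketch of what happened, using a small temporary file in place of
the real one. The GO IDs and colors below are made up for illustration:

```r
# Toy stand-in for 339namesColorsRowsBBtab.txt: GO IDs in the header row,
# colors in the first data row. The IDs and colors are invented.
tmp <- tempfile(fileext = ".txt")
writeLines(c("GO:0000001\tGO:0000002\tGO:0000003",
             "red\tblue\tred"), tmp)

# Default (check.names = TRUE): the header is passed through make.names(),
# which replaces characters such as ":" with "."
bad <- read.delim(tmp, sep = "\t")
names(bad)

# With check.names = FALSE the header is kept exactly as written
good <- read.delim(tmp, sep = "\t", check.names = FALSE)
names(good)
```

Comparing names(bad) with names(good) shows the difference directly.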

Question 1.......................
Read from one file?
OK, with
> rows2  <-  read.delim(file="339namesColorsRowsBBtab.txt", sep="\t",
check.names = FALSE)
> colors  <-  as.matrix(rows2)[1,]
I can now read from one file. It is still two commands, but it is more fun.

I found the as.matrix trick in an old email from Brian Ripley from 2004,
which can still be found with Google.
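
A sketch of why the as.matrix step gives a named vector, together with an
alternative layout with one term per line. The file contents and the column
names "term" and "color" are invented for illustration, not taken from the
real file:

```r
# Wide layout, as in the file above: GO IDs as the header, colors in row 1.
wide <- tempfile(fileext = ".txt")
writeLines(c("GO:0000001\tGO:0000002", "red\tblue"), wide)
rows2 <- read.delim(wide, sep = "\t", check.names = FALSE)

# as.matrix() turns the data frame into a character matrix; taking row 1
# gives a character vector whose names are the column names (the GO IDs).
colors <- as.matrix(rows2)[1, ]

# Long layout (hypothetical): one line per term, two named columns.
long <- tempfile(fileext = ".txt")
writeLines(c("term\tcolor", "GO:0000001\tred", "GO:0000002\tblue"), long)
tab <- read.delim(long, sep = "\t", stringsAsFactors = FALSE)
colors2 <- setNames(tab$color, tab$term)

# Both routes should give the same names and the same values
all(names(colors) == names(colors2))
all(colors == colors2)
```

Either vector can be assigned to nAttrs$fillcolor and nAttrs$color.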

So question 1 is done, thank you

Question 2................................
As for details, I copied the code exactly from R. That is exactly what I
did, no more, no less. I am trying to comply with the request for details,
believe me.

Or perhaps you mean experimental details.  We have Duroc pigs, which make
thick scar, and Yorkshire pigs, which make thin scar.  We would like to know
why Durocs make thick scar and perhaps how to alter this.  So we wounded the
pigs and biopsied them at 1, 2, 3, 12 and 20 weeks, amplified the RNA and
hybridized it to Affy porcine chips.  Then, with Bioconductor packages, we
did mixed linear regression and chose to go further with those probes with
an appropriate p-value.  Then we cut those probes for which the biological
replicates did not match.  Then we cut those probes for which the fold
change was <1.4.  Then we cut those probes for which the Affy present/absent
calls were illogical.  Then we cut those probes that do not match human
hypertrophic scar, since we do not wish to study a pig gene that makes a
curly tail.  We are left with 1,019 probes of interest.  Some of these are
duplicates, so we are down to 953 genes that satisfied all the criteria.
Enrichment is likely to come from DAVID, but a graph seemed cool, so we did
makeGoGraph and, after study, found many GO terms not relevant and cut them,
so I am now down to 333 "relevant" terms.  And I continue with the graphing
process.

Read two papers??? I did, and as you might anticipate, they are, shall we
say, "over my head".  But I think I am not trying to determine over- or
under-representation at this point.  I just want to make a graph showing
where the 333 are located in the "universe" of GOBPPARENTS, with incoming
and outgoing edges.  But then, maybe that is simply a dumb thing to do.

But if not, then question 2, why in the plot do none of the 333 have
incoming edges? Is it possible that none of the 333 is a parent to another
of the 333?

I think I have answered this.  If I pick a known ancestor and put it in the
"interesting" file, it appears colored.  So it would appear that none of the
333 has an ancestor within the 333.  Interesting.  So this is a graph merged
from 333 graphs.  This would then suggest that any node, colored or not,
with lots of incoming edges becomes interesting.
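
The same check can also be made without plotting. Below is a sketch on a
made-up five-term hierarchy. The "parents" list and the term names are
invented stand-ins for GOBPPARENTS and the real GO IDs, and the helper
"ancestors" is hypothetical, written only for this example:

```r
# Made-up child -> parents map standing in for GOBPPARENTS.
parents <- list(
  root = character(0),
  A    = "root",
  B    = "root",
  C    = c("A", "B"),
  D    = "C"
)

# All ancestors of one term, found by walking up the parent links.
ancestors <- function(term, parents) {
  seen <- character(0)
  todo <- parents[[term]]
  while (length(todo) > 0) {
    cur  <- todo[1]
    todo <- todo[-1]
    if (!(cur %in% seen)) {
      seen <- c(seen, cur)
      todo <- c(todo, parents[[cur]])
    }
  }
  seen
}

# For each "interesting" term, list the other interesting terms
# that are among its ancestors.
interesting <- c("A", "D")
hits <- lapply(interesting,
               function(term) intersect(ancestors(term, parents), interesting))
names(hits) <- interesting
hits
```

If every entry of the result is empty, no interesting term has an ancestor
among the interesting terms, which is what the plot of the 333 suggests.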

Question 3.................
So statistics and enrichment, I think, are not our issue at the moment.

So question 3 remains.  If we obtain a graph as mentioned, with incoming
and outgoing edges, would it tell anyone anything?  Or is enrichment the
only deal with GO?

Thanks again for discussing.  I am grateful and appreciative.

Thanks again

Loren Engrav
Univ Wash
Seattle

> From: Robert Gentleman <rgentlem at fhcrc.org>
> Date: Mon, 26 Nov 2007 17:55:06 -0800
> To: Loren Engrav <engrav at u.washington.edu>
> Cc: <bioconductor at stat.math.ethz.ch>
> Subject: Re: [BioC] GO graph plotting with some in color
> 
> 
> 
> Loren Engrav wrote:
>> Cool, thank you
>> 
>> After 11 days 
>> I can read in the data and have the graph
>> That is a rather expensive graph I would say :)
> 
>   It depends a lot on what your options were, I suspect as compared to
> something like Ingenuity it was a remarkably cheap graph, but you might
> investigate that option.  There are lots of commercial vendors that
> would love to sell you something.
> 
>> 
>>> nAttrs = list()
>>> colors  <-  scan(file="339colorsRowBB.txt", what="character")
>> Read 339 items
>>> names(colors)  <-  scan(file="339namesRowBB.txt", what="character")
>> Read 339 items
>>> nAttrs$fillcolor  <- colors
>>> nAttrs$color  <-  colors
>> 
>>> bpCutLeaves <- scan(file="339namesRowBB.txt", what = "character")
>> Read 339 items
>>> bpCutLeavestree <- GOGraph(bpCutLeaves, GOBPPARENTS)
>>> postscript ("bpCutLeavestree.ps", width=100, height = 100, paper="special");
>> plot (bpCutLeavestree, nodeAttrs=nAttrs); dev.off()
>> postscript 
>>          2 
>> 
>> 
>> Graph is up at <http://homepage.mac.com/engrav/Menu9.html> and hit arrow on
>> FileSharing
>> 
>> Now I have 3 questions please
>> 
>> 1) the code above reads in two files, one for colors and one for names; that
>> seems rather grade school, is there not a more efficient way, like with one
>> file
> 
>    Yes, put them all in one comma (or otherwise separated) file and read
> that in, with read.table, or something like that, and then manipulate in R.
> 
>> 
>> 2) in the graph the colored and therefore "significant" nodes have no
>> incoming edges from other colored and "significant" nodes; is it possible
>> that none of the 339 "significant" nodes have a parent within the 339?
> 
>    It depends on what you did, and you do seem to really insist on not
> giving details, so I am not sure what to say. If you used the
> conditional test then the point of conditioning is to remove the
> parent-child artifact.  For complete details you need to read either our
> paper or Adrian Alexa's, references given in another thread today.
> 
> 
>> 
>> 3) and finally, What do I have?  We are getting enrichment from DAVID and
>> Amigo as they are lots more user friendly than Bioconductor. But they fail
>> at plotting so I tried Bioconductor.  The resultant graph shows red and blue
>> sprinkled here and there but so what?
> 
>    Computing something is no substitute for understanding it. It is not
> easy to help here, as one would need to know what your scientific
> objectives are and how using GO, or other functional information might
> help to achieve those objectives.  These are tools that can be used for
> many purposes, and do require some non-trivial interpretation. UW has a
> good Biostatistics Dept, consulting someone there is probably your best bet.
> 
>> 
>> I found a manuscript with an almost identical graph with credits to
>> Bioconductor.  See Journal of Molecular Endocrinology (2006) 37, 301-316.  I
>> checked their conclusions from the graph and found them rather skimpy.
>> 
>> What is the conclusion from the graph? Or is it just a pretty graph?  But it
>> was fun, by the way.
>> 
>> Again thank you for helping.  I am grateful.
> 
>   you're welcome
> 
>> 
>> Loren Engrav
>> Univ Wash
>> Seattle, WA USA 
>> 
>>