[BioC] GO graph plotting with some in color

Robert Gentleman rgentlem at fhcrc.org
Tue Nov 27 02:55:06 CET 2007



Loren Engrav wrote:
> Cool, thank you
> 
> After 11 days 
> I can read in the data and have the graph
> That is a rather expensive graph I would say :)

  It depends a lot on what your options were. Compared to something 
like Ingenuity, I suspect it was a remarkably cheap graph, but you might 
investigate that option.  There are lots of commercial vendors that 
would love to sell you something.

> 
>> nAttrs = list()
>> colors  <-  scan(file="339colorsRowBB.txt", what="character")
> Read 339 items
>> names(colors)  <-  scan(file="339namesRowBB.txt", what="character")
> Read 339 items
>> nAttrs$fillcolor  <- colors
>> nAttrs$color  <-  colors
> 
>> bpCutLeaves <- scan(file="339namesRowBB.txt", what = "character")
> Read 339 items
>> bpCutLeavestree <- GOGraph(bpCutLeaves, GOBPPARENTS)
>> postscript ("bpCutLeavestree.ps", width=100, height = 100, paper="special");
> plot (bpCutLeavestree, nodeAttrs=nAttrs); dev.off()
> postscript 
>          2 
> 
> 
> Graph is up at <http://homepage.mac.com/engrav/Menu9.html> and hit arrow on
> FileSharing
> 
> Now I have 3 questions please
> 
> 1) the code above reads in two files, one for colors and one for names; that
> seems rather grade school, is there not a more efficient way, like with one
> file

   Yes, put them all in one comma-separated (or otherwise delimited) 
file, read that in with read.table or something like that, and then 
manipulate it in R.
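
   As a minimal sketch of the one-file approach, assuming a headerless 
file with one "GO id,color" pair per line (the layout, the column names 
id and color, and the example ids are assumptions for illustration, not 
taken from the actual data):

```r
# Hypothetical one-file layout: one "GO id,color" pair per line, no header.
# The text= argument stands in for a real file on disk.
csv_lines <- c("GO:0000002,red",
               "GO:0000003,blue",
               "GO:0000012,red")
tab <- read.table(text = csv_lines, sep = ",",
                  col.names = c("id", "color"),
                  colClasses = "character")

# Build the named character vector that nodeAttrs needs:
# the values are the colors, the names are the node (GO) ids.
colors <- setNames(tab$color, tab$id)

nAttrs <- list()
nAttrs$fillcolor <- colors
nAttrs$color     <- colors
```

   With a real file, text = csv_lines would be replaced by the file 
name, and nAttrs passed to plot() as in the code quoted above.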

> 
> 2) in the graph the colored and therefore "significant" nodes have no
> incoming edges from other colored and "significant" nodes; is it possible
> that none of the 339 "significant" nodes have a parent within the 339?

   It depends on what you did, and you do seem to really insist on not 
giving details, so I am not sure what to say. If you used the 
conditional test, then the point of conditioning is to remove the 
parent-child artifact, so it would not be surprising to find few or no 
significant parents of significant nodes.  For complete details you need 
to read either our paper or Adrian Alexa's, references given in another 
thread today.
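
   Whether any significant node has a significant parent can also be 
checked directly rather than read off the plot. The following is only a 
sketch on a toy parent map with made-up ids; with real data the parent 
list would presumably come from something like mget(significant, 
GOBPPARENTS), which is an assumption and has not been run here:

```r
# Toy parent map (hypothetical ids): names are child terms,
# values are their direct parents.
parents <- list(
  "GO:A" = c("GO:B", "GO:C"),
  "GO:B" = "GO:D",
  "GO:C" = "GO:D",
  "GO:D" = character(0)
)

# Hypothetical set of "significant" (colored) terms.
significant <- c("GO:A", "GO:C")

# For each significant term, keep only the parents that are also significant.
sig_parents <- lapply(parents[significant], intersect, significant)

# Terms that have at least one significant direct parent.
has_sig_parent <- names(sig_parents)[lengths(sig_parents) > 0]
has_sig_parent
```

   An empty result on the real 339 terms would confirm that none of them 
has a direct parent within the set.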


> 
> 3) and finally, What do I have?  We are getting enrichment from DAVID and
> Amigo as they are lots more user friendly than Bioconductor. But they fail
> at plotting so I tried Bioconductor.  The resultant graph shows red and blue
> sprinkled here and there but so what?

   Computing something is no substitute for understanding it. It is not 
easy to help here, as one would need to know what your scientific 
objectives are and how using GO, or other functional information might 
help to achieve those objectives.  These are tools that can be used for 
many purposes, and do require some non-trivial interpretation. UW has a 
good Biostatistics Dept; consulting someone there is probably your best bet.

> 
> I found a manuscript with an almost identical graph with credits to
> Bioconductor.  See Journal of Molecular Endocrinology (2006) 37, 301-316.  I
> checked their conclusions from the graph and found them rather skimpy.
> 
> What is the conclusion from the graph? Or is it just a pretty graph?  But it
> was fun, by the way.
> 
> Again thank you for helping.  I am grateful.

  You're welcome.

> 
> Loren Engrav
> Univ Wash
> Seattle, WA USA  
> 
> 
>> From: Robert Gentleman <rgentlem at fhcrc.org>
>> Date: Mon, 26 Nov 2007 11:33:58 -0800
>> To: Loren Engrav <engrav at u.washington.edu>
>> Cc: <bioconductor at stat.math.ethz.ch>
>> Subject: Re: [BioC] GO graph plotting with some in color
>>
>>
>>
>> Loren Engrav wrote:
>>> Followup to this issue
>>>
>>> I can move the color data from Excel to BBedit
>>> Then edit till I have the string like GO:0004198="red", GO:0004199="blue"
>>> Then copy paste into the c function and the error below does not occur and I
>>> obtain red and blue nodes, which is cool
>>    If you read the documentation for the function you are trying to use,
>> it will explain the format of the arguments.  In this case you need a
>> named vector.  You can also evaluate the examples and see what was used
>> to create them.
>>
>>    If you then look at what you get when you read in the data from a
>> file, as you have described, you will see it is not a named vector, and
>> you will need to process it to get the form needed to pass on to the
>> plotting routines.
>>
>>    Perhaps the easiest way is to replace the = in your file with a comma
>> or some other delimiter (or use the sep argument in the
>> read.table/read.delim family of functions - lots of documentation here
>> too, ?read.delim) and read in the data as a data.frame. Then use that to
>> create the named vector.
>>
>>
>>> This may be good enough. But can I set up a text file in some form and then
>>> read or scan or something the data in?
>>>
>>> I have tried scan and read.delim and googled, etc but have failed
>>>
>>> Thank you
>>> =========================================
>>>
>>>> From: Loren Engrav <engrav at u.washington.edu>
>>>> Date: Fri, 23 Nov 2007 13:31:32 -0800
>>>> To: <bioconductor at stat.math.ethz.ch>
>>>> Conversation: GO graph plotting with some in color
>>>> Subject: [BioC] GO graph plotting with some in color
>>>>
>>>>
>>>> So am doing GO graphing and I have done
>>>>
>>>>> bpCutLeaves <- scan(file="343afterDupesNotCut.txt", what = "character")
>>>> Read 343 items
>>>>> bpCutLeavestree <- GOGraph(bpCutLeaves, GOBPPARENTS)
>>>>> postscript ("bpCutLeavestreeTest.ps", width=100, height = 100,
>>>> paper="special"); plot (bpCutLeavestree); dev.off()
>>>> quartz 
>>>>      2 
>>>>> bpCutLeavestree
>>>> A graphNEL graph with directed edges
>>>> Number of Nodes = 1456
>>>> Number of Edges = 2418
>>>>
>>>> And the postscript file is very nice
>>>> But I need the 343 leaves colored red (over expression) and blue (under
>>>> expression)
>>>>
>>>> So from Rgraphviz documentation I do
>>>>  
>>>>> nAttrs  <-  list()
>>>>> nAttrs$color <- scan(file="343forColorBB.txt", what="character")
>>>> Read 343 items
>>>>> nAttrs$fillcolor <- scan(file="343forColorBB.txt", what="character")
>>>> Read 343 items
>>>>
>>>> Where in 343ForColorBB.txt I set the 343 to red or blue, it is a text file
>>>> with 343 items like GO:0000002="red"
>>>>
>>>> Then I do
>>>>> postscript ("bpCutLeavestreeTest.ps", width=100, height = 100,
>>>> paper="special"); plot (bpCutLeavestree, nodeAttrs = nAttrs); dev.off()
>>>>
>>>> But it returns
>>>>
>>>> Error in buildNodeList(graph, nodeAttrs, subGList, attrs$node) :
>>>>   the character vector must have names
>>>>
>>>> I am stuck...
>>>> What have I left out
>>>>
>>>> Thank you
>>>>
>>>> Loren Engrav
>>>> Univ Wash
>>>> Seattle
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>> -- 
>> Robert Gentleman, PhD
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M2-B876
>> PO Box 19024
>> Seattle, Washington 98109-1024
>> 206-667-7700
>> rgentlem at fhcrc.org
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


