[BioC] KEGGGraph: some complexed proteins are orphans in graphNEL
    Paul Shannon 
    pshannon at systemsbiology.org
       
    Fri May  1 00:23:03 CEST 2009
    
    
  
We have been using the admirable KEGGgraph package to obtain pathways  
in graphNEL form.  It is very useful.
mTor is the signalling pathway we are working with: http://www.genome.jp/dbget-bin/get_pathway?org_name=hsa&mapno=04150
We find that proteins which appear only as members of a complex are  
orphans in the graphNEL.
For instance, "hsa:7248" (TSC1) forms a complex with "hsa:7249" (TSC2).
TSC2 is well connected, but its complex partner TSC1 is an orphan.
There are a number of ways to handle this, some quite sophisticated,  
some not.  One could define a node for the complex, create edges to  
that node, and then specify (with a 'complex membership' edge) that  
TSC1 and TSC2 both belong.
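
A minimal sketch of the complex-as-node idea, built by hand with the graph
package rather than taken from KEGGgraph output.  The node name
"complex:TSC1_TSC2" and the edge attribute "edgeType" are our own inventions
for illustration, and "hsa:6009" is simply the node that TSC2 points to in
the parsed pathway.

```r
library(graph)  # Bioconductor package that provides graphNEL

tsc1   <- "hsa:7248"
tsc2   <- "hsa:7249"
target <- "hsa:6009"           # the node TSC2 is connected to in g.mTor
cplx   <- "complex:TSC1_TSC2"  # hypothetical name for the complex node

g <- new("graphNEL", nodes = c(tsc1, tsc2, cplx, target),
         edgemode = "directed")

# Edge attribute (our own name) separating membership from interaction edges.
edgeDataDefaults(g, attr = "edgeType") <- "interaction"

# Membership edges: each protein points at the complex it belongs to.
g <- addEdge(tsc1, cplx, g)
g <- addEdge(tsc2, cplx, g)
edgeData(g, from = tsc1, to = cplx, attr = "edgeType") <- "complex membership"
edgeData(g, from = tsc2, to = cplx, attr = "edgeType") <- "complex membership"

# The interaction now hangs off the complex, not off TSC2 alone.
g <- addEdge(cplx, target, g)

edges(g)[[tsc1]]  # TSC1 is no longer an orphan
```

With this layout TSC1 reaches the downstream node through the complex, and
the membership edges can be filtered out by their "edgeType" when only
interactions are wanted.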
mTor presents a good (though challenging) use case: there are two  
differently-acting complexes which include mTor and GBL.  The third  
member differs between the two complexes, as do the interactions  
in which each complex participates.   This seems to argue for 'complex'  
being a node type.
One simple improvement, which solves some of the 'orphan complex node'  
problem, could be this workaround:  all members of each complex  
participate in all the interactions which belong to the complex.
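
The workaround might look roughly like the sketch below.  It assumes the
complex membership is supplied by hand as a character vector; the function
name shareComplexEdges is hypothetical and is not part of KEGGgraph or
graph.  The toy graph contains only the three nodes discussed here.

```r
library(graph)  # Bioconductor package that provides graphNEL

# Give every member of a complex the edges held by any other member.
shareComplexEdges <- function(g, members) {
  adj <- edges(g)
  # Nodes reached from any member, and nodes pointing at any member.
  targets  <- setdiff(unique(unlist(adj[members], use.names = FALSE)), members)
  isSource <- vapply(adj, function(to) any(members %in% to), logical(1))
  sources  <- setdiff(names(adj)[isSource], members)
  for (m in members) {
    # Outgoing edges this member lacks.
    newTo <- setdiff(targets, adj[[m]])
    if (length(newTo) > 0)
      g <- addEdge(rep(m, length(newTo)), newTo, g)
    # Incoming edges this member lacks.
    pointsAtM <- vapply(adj[sources], function(to) m %in% to, logical(1))
    newFrom   <- sources[!pointsAtM]
    if (length(newFrom) > 0)
      g <- addEdge(newFrom, rep(m, length(newFrom)), g)
  }
  g
}

# Toy graph: TSC2 has one outgoing edge, TSC1 has none.
g <- new("graphNEL", nodes = c("hsa:7248", "hsa:7249", "hsa:6009"),
         edgemode = "directed")
g  <- addEdge("hsa:7249", "hsa:6009", g)
g2 <- shareComplexEdges(g, c("hsa:7248", "hsa:7249"))
edges(g2)[["hsa:7248"]]
```

This loses the distinction between the two mTor complexes described above,
which is why it only solves some of the problem.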
Here is some incomplete (but suggestive) evidence of the orphan status  
of TSC1.  A more detailed search reveals that TSC1 is not found in the  
target nodes of any of the edges of g.mTor.
library (KEGGgraph)
f <- '~/s/data/public/kegg/hsa04150.xml'
g.mTor <- parseKGML2Graph (f)
tsc1 <- 'hsa:7248'
tsc2 <- 'hsa:7249'
tsc1 %in% nodes (g.mTor)  #  TRUE
tsc2 %in% nodes (g.mTor)  #  TRUE
tsc2 %in% names (edges (g.mTor)) # TRUE
tsc1 %in% names (edges (g.mTor)) # TRUE
edges (g.mTor)[[tsc1]]   # character(0): TSC1 has no outgoing edges
edges (g.mTor)[[tsc2]]   # "hsa:6009"
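
The more detailed search mentioned above can be sketched as follows.  The
helper name findOrphans is hypothetical; it takes a named adjacency list in
the shape that edges () returns, and the toy list below only mirrors the
three nodes shown above, not the full pathway.

```r
# Return nodes with no outgoing edges that are also never an edge target.
findOrphans <- function(adj) {
  hasOut  <- vapply(adj, function(to) length(to) > 0, logical(1))
  targets <- unique(unlist(adj, use.names = FALSE))
  names(adj)[!hasOut & !(names(adj) %in% targets)]
}

# Toy adjacency list standing in for edges (g.mTor).
adj <- list("hsa:7248" = character(0),
            "hsa:7249" = "hsa:6009",
            "hsa:6009" = character(0))
findOrphans(adj)
```

On the real pathway the call would be findOrphans (edges (g.mTor)).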
Thanks,
  - Paul
sessionInfo ()
R version 2.9.0 (2009-04-17)
i386-apple-darwin8.11.1
locale:
en_US/en_US/en_US/C/en_US/en_US
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
 [1] RBGL_1.20.0         gaggle_1.12.0       rJava_0.6-2         org.Hs.eg.db_2.2.6  RUnit_0.4.22        KEGG.db_2.2.5       RSQLite_0.7-1
 [8] DBI_0.2-4           AnnotationDbi_1.6.0 Biobase_2.4.0       KEGGgraph_1.0.0     graph_1.22.0        XML_2.3-0
loaded via a namespace (and not attached):
[1] cluster_1.11.13 tools_2.9.0
    
    