[BioC] biomaRt -> XML::addNode masks graph [was 'slow insertions']
Paul Shannon
pshannon at systemsbiology.org
Tue Sep 11 23:58:16 CEST 2007
Hi Seth,
If you are digging around in the innards of the graph package, I have
another suggestion, an imperfect one, to offer.
I sometimes use biomaRt and graph in the same project. biomaRt
requires XML, which, like graph, has a function called 'addNode':
library (biomaRt)
Loading required package: XML
Attaching package: 'XML'
The following object(s) are masked from package:graph :
addNode
I can work around this just fine by calling graph::addNode, but maybe
an alias could be adopted as well, and then favored over the long
term -- 'add.node' or some such thing.
Or maybe this isn't worth bothering with.
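
For the record, the workaround looks roughly like the sketch below. It
assumes the Bioconductor graph package and the XML package are both
installed; the node names "a" and "b" are made up for illustration.

```r
# Assumption: 'graph' (Bioconductor) and 'XML' are installed.
library(graph)
library(XML)   # attached after graph, so XML's addNode masks graph's

g <- new("graphNEL", edgemode = "undirected")

# Qualify the call with the package namespace so the graph version is
# used no matter which package was attached last.
g <- graph::addNode("a", g)
g <- graph::addNode("b", g)

# addEdge is not masked by XML, so it needs no qualification.
g <- addEdge("a", "b", g)

nodes(g)
```

The qualified call does not depend on the order in which the packages
are attached, which is why I prefer it to reordering the library() calls.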
- Paul
> Hi Paul,
>
> Paul Shannon <pshannon at systemsbiology.org> writes:
>> It took nearly 24 hours (!) to create a 16k node graph using two
>> different techniques:
>>
>> g = fromGXL (file ('someFile.gxl'))
>>
>> and
>>
>> g = new ('graphNEL', edgemode='undirected')
>> edgeDataDefaults (g, attr='edgeType') = 'edge'
>> edgeDataDefaults (g, attr='source') = 'unknown'
>>
>> ...
>>
>> for (r in 1:max) {
>> ...
>> g = addNode (a, g)
>> g = addNode (b, g)
>> g = addEdge (a, b, g)
>> edgeData (g, a, b, 'source') = source
>> edgeData (g, a, b, 'edgeType') = method
>> }
>>
>> The 16k nodes and their edges are from a suitably parsed version of
>> all of the reactions reported by KEGG.
>>
>> Is this user error, user misconception, ... or maybe an inefficiency
>> that future versions of the graph package could improve upon?
>
> It looks like fromGXL is doing something quite similar to the for loop
> you describe above. As you have demonstrated, this is not the most
> efficient way to construct graph objects. The immediate reason is
> that each call to addNode, addEdge, and edgeData creates a new copy of
> the _entire_ graph.
>
> We will look into this and see if we can provide some relief for
> fromGXL. Thanks for the report.
>
> + seth
>
> --
> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
> BioC: http://bioconductor.org/
> Blog: http://userprimary.net/user/
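
Given the explanation quoted above (every addNode, addEdge, and edgeData
call copies the whole graph), one thing worth trying is to make far
fewer calls. The following is only a sketch, not a measured fix: it
assumes the parsed reactions are already in a data frame, and the
`edges` table, its column names, and its values are hypothetical
stand-ins. It relies on the graphNEL constructor taking a `nodes`
vector and on addEdge and the edgeData replacement accepting vectors.

```r
# Assumption: the Bioconductor 'graph' package is installed.
library(graph)

# Hypothetical edge table standing in for the parsed KEGG reactions.
edges <- data.frame(
  a      = c("n1", "n1", "n2"),
  b      = c("n2", "n3", "n3"),
  source = c("KEGG", "KEGG", "KEGG"),
  method = c("reaction", "binding", "reaction"),
  stringsAsFactors = FALSE
)

# Create every node in one constructor call instead of one addNode per row.
g <- new("graphNEL",
         nodes    = unique(c(edges$a, edges$b)),
         edgemode = "undirected")
edgeDataDefaults(g, attr = "edgeType") <- "edge"
edgeDataDefaults(g, attr = "source")   <- "unknown"

# addEdge takes vectors of endpoints, so all edges go in with one call.
g <- addEdge(edges$a, edges$b, g)

# The edgeData replacement also takes vectors: one call per attribute.
edgeData(g, from = edges$a, to = edges$b, attr = "source")   <- edges$source
edgeData(g, from = edges$a, to = edges$b, attr = "edgeType") <- edges$method
```

This makes a handful of whole-graph copies rather than several per
edge. Whether that is enough for a 16k node graph is something I have
not timed.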