[BioC] bug in graph::addNode() w/ graph::edgeData()

Robert Castelo robert.castelo at upf.edu
Fri Nov 8 16:03:02 CET 2013


hi Paul,

thanks for the suggestion. i had no experience with unit testing; it's
indeed a great way to uncover unwanted side effects of putative
bugfixes. i should incorporate this in my own packages.

i ran the unit tests with the fix and 2 tests failed; the relevant output is this:

FAILURE in test_graphBAM_addNode1: Error in checkEquals(target, current) : Names: 4 string mismatches
FAILURE in testInEdges: Error in checkEquals("not a node: 'not-a-node'", conditionMessage(ans)) :

i also ran the unit tests with the original version 1.40.0 as a control,
and they gave one of these two failures as well:

FAILURE in testInEdges: Error in checkEquals("not a node: 'not-a-node'", conditionMessage(ans)) :

so i guess only the first one points to a real problem with the fix i 
proposed below.
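
as a rough illustration of why the sort might matter, here is a base R sketch that compares where the existing nodes end up under each variant. it rests on an assumption i have not verified in the package source: that edge attributes are looked up by node position. the variable names are made up for the example and are not taken from the graph package:

```r
## node names of the example graph, in their original order
old_nodes <- c("a", "b", "d", "e")

## variant 1: what addNode() currently does (with sort)
sorted_nodes <- sort(unique(c(old_nodes, "c")))

## variant 2: the proposed change (no sort)
appended_nodes <- unique(c(old_nodes, "c"))

## new positions of the original nodes under each variant
pos_sorted <- match(old_nodes, sorted_nodes)
pos_appended <- match(old_nodes, appended_nodes)
print(pos_sorted)    # [1] 1 2 4 5
print(pos_appended)  # [1] 1 2 3 4
```

with sort(), "d" and "e" move from positions 3 and 4 to positions 4 and 5, so anything still indexed by the old positions would be misaligned; without it, the old positions are preserved and "c" is simply appended.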

i look forward to the final fix :)

cheers,
robert.

On 11/07/2013 11:45 PM, Paul Shannon wrote:
> Hi Robert,
>
> Thanks for the graphBAM bug report -- delivered, as always, in a truly helpful and reproducible manner, and with a proposed fix! I will look into this and see if I have anything to add. If you have a moment, and have not already given it a try, you could test out your fix by running the unit tests:
>
>> library(graph)
>> BiocGenerics:::testPackage("graph", pattern="_test.R")
>
> Thank you!
>
>   - Paul
>
> On Nov 7, 2013, at 7:04 AM, Robert Castelo wrote:
>
>> hi,
>>
>> i have found a problem when using the functions addNode() and edgeData() from the graph package, which i believe is a bug and it can be reproduced with the following minimal example:
>>
>> library(graph)
>>
>> ## build a graphBAM object with vertices a, b, d, e and edges a-d, b-e
>> df <- data.frame(from=c("a", "b"),
>>                  to=c("d", "e"),
>>                  weight=rep(1, 2))
>> g <- graphBAM(df)
>> nodes(g)
>> [1] "a" "b" "d" "e"
>>
>> ## add a numerical attribute to one of the edges
>> edgeDataDefaults(g, "x") <- NA_real_
>> edgeData(g, from="a", to="d", "x") <- 1
>> unlist(edgeData(g, attr="x"))
>> a|d b|e d|a e|b
>>   1  NA   1  NA
>>
>> ## add an extra node f to the graphBAM object and fetch edge attributes
>> gOK <- addNode("f", g)
>> nodes(gOK)
>> [1] "a" "b" "d" "e" "f"
>> unlist(edgeData(gOK, attr="x"))
>> a|d b|e d|a e|b
>>   1  NA   1  NA
>>
>> ## now comes the bug ...
>>
>> ## add an extra node c to the graphBAM object and fetch edge attributes
>> gBUG <- addNode("c", g)
>> nodes(gBUG)
>> [1] "a" "b" "c" "d" "e"
>> unlist(edgeData(gBUG, attr="x"))
>> Error in data.frame(ft, tmp, stringsAsFactors = FALSE) :
>>   arguments imply differing number of rows: 4, 6
>> traceback()
>> 7: stop(gettextf("arguments imply differing number of rows: %s",
>>        paste(unique(nrows), collapse = ", ")), domain = NA)
>> 6: data.frame(ft, tmp, stringsAsFactors = FALSE)
>> 5: .eAttrsFun(self, from = names(edges(self)), attr = attr)
>> 4: as.list(.eAttrsFun(self, from = names(edges(self)), attr = attr))
>> 3: edgeData(gBUG, attr = "x")
>> 2: edgeData(gBUG, attr = "x")
>> 1: unlist(edgeData(gBUG, attr = "x"))
>>
>> from this output, my guess is that the problem is related to the first line of code of the addNode() method in the R/methods-graphBAM.R file:
>>
>> setMethod("addNode",
>>         signature(node="character", object="graphBAM", edges="missing"),
>>         function(node, object) {
>>
>>             nds <- sort(unique(c(nodes(object), node)))
>> [...]
>>
>> because if i simply remove the call to the function sort, i.e., replacing this line by
>>
>>             nds <- unique(c(nodes(object), node))
>>
>> then the problem is solved. however, i don't know whether this has other consequences related to the design of the package. upfront, i do not see a reason why, internally to the package, vertices should be alphabetically ordered.
>>
>> best regards,
>> robert.
>> ps: sessionInfo()
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF8       LC_NUMERIC=C              LC_TIME=en_US.UTF8        LC_COLLATE=en_US.UTF8
>> [5] LC_MONETARY=en_US.UTF8    LC_MESSAGES=en_US.UTF8    LC_PAPER=en_US.UTF8       LC_NAME=C
>> [9] LC_ADDRESS=C              LC_TELEPHONE=C            LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] graph_1.40.0   vimcom_0.9-91  setwidth_1.0-3 colorout_1.0-1
>>
>> loaded via a namespace (and not attached):
>> [1] BiocGenerics_0.8.0 parallel_3.0.2     stats4_3.0.2       tools_3.0.2
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550


