[BioC] bug in graph::edgeData()

Robert Castelo robert.castelo at upf.edu
Mon Feb 18 16:54:38 CET 2013


hi Paul,

thanks for the workaround, it will help me in the meantime, but i'm 
definitely more interested in the graphBAM class for the compact 
representation that it offers. i look forward to your news on this.

cheers,
robert.
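
A minimal sketch of how one might check whether a given installation of graph is affected: it sets the attribute on a subset of edges, reads it back for all edges, and compares the graphBAM result with the graphAM one. It assumes the graph package is installed and uses only the calls that appear in this thread. The helper name readBack and its idx argument are made up for illustration and are not part of the package.

```r
library(graph)

df <- data.frame(from=c("a", "b", "c"),
                 to=c("b", "c", "d"),
                 weight=rep(1, 3), stringsAsFactors=FALSE)

## hypothetical helper (not part of graph): set attribute "a" to 1 on the
## edges selected by idx, then read it back for every edge in df
readBack <- function(g, df, idx) {
  edgeDataDefaults(g, attr="a") <- 0
  edgeData(g, from=df$from[idx], to=df$to[idx], attr="a") <- 1
  unlist(edgeData(g, from=df$from, to=df$to, attr="a"))
}

bam <- readBack(graphBAM(df), df, 1)                  # compact representation
am  <- readBack(as(graphBAM(df), "graphAM"), df, 1)   # workaround representation

## FALSE here suggests the two representations disagree on where
## the value was stored
all(unname(bam) == unname(am))
```

If the comparison returns FALSE, the graphAM conversion remains the safer route for that installation, at the cost of the larger memory footprint.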

On 02/16/2013 01:31 AM, Paul Shannon wrote:
> Hi Robert,
>
> Thanks for the bug report.  I reproduced the problem and made some progress -- but not enough -- in unraveling the faulty logic which causes it.  I offer here a workaround which might help you proceed with your work while I continue to work on the bug.
>
> The workaround depends upon converting your graphBAM to a graphAM, as demonstrated below.  If your graph is very large, this may not be practical.
>
> In the code below, I reproduce the error, convert the graphBAM to a graphAM, then get the right result.
>
> I will continue working on the bug.  Let us know if this workaround is helpful.
>
>   - Paul
>
>
>
> library(graph)
> df <- data.frame(from=c("a", "b", "c"),
>                  to=c("b", "c", "d"),
>                  weight=rep(1, 3), stringsAsFactors=FALSE)
> g.orig <- graphBAM(df)
> g <- g.orig
> edgeDataDefaults(g, attr="a") <- 0
> edgeData(g, from="a", to="b", attr="a") <- 1
> edgeData(g, attr="a", from="a")
>    # $`a|b`
>    # [1] 1
>
> edgeData(g, attr="a", from="a", to="b")  # bug
>    # $`a|b`
>    # [1] 0
>
>
> g <- as(g.orig, "graphAM")
> edgeDataDefaults(g, attr="a") <- 0
> edgeData(g, from="a", to="b", attr="a") <- 1
> edgeData(g, attr="a", from="a")
>    # $`a|b`
>    # [1] 1
>
> edgeData(g, attr="a", from="a", to="b")  # no bug
>    # $`a|b`
>    # [1] 1
>
>
> On Feb 15, 2013, at 2:49 AM, Robert Castelo wrote:
>
>> hi,
>>
>> the function edgeData() from the Bioconductor graph package seems to have a problem with the way it stores and retrieves edge attributes. after investigating the issue i think it has to do with setting and retrieving edge attributes for a subset of the edges, as opposed to doing it for all edges at once. here is a minimal example that reproduces the problem:
>>
>> library(graph)
>>
>> df <- data.frame(from=c("a", "b", "c"),
>>                  to=c("b", "c", "d"),
>>                  weight=rep(1, 3), stringsAsFactors=FALSE)
>> g <- graphBAM(df)
>> ## this builds the undirected graph a-b-c-d
>>
>> ## set a new edge attribute "a" with 0 by default
>> edgeDataDefaults(g, attr="a") <- 0
>>
>> ## set the "a" attribute of all edges to 1
>> edgeData(g, from=df$from, to=df$to, attr="a") <- 1
>>
>> ## show the value of the "a" attribute for all edges,
>> ## everything works as expected
>> edgeData(g, from=df$from, to=df$to, attr="a")
>> $`a|b`
>> [1] 1
>>
>> $`b|c`
>> [1] 1
>>
>> $`c|d`
>> [1] 1
>>
>> ## now repeat the operation but setting the "a" attribute
>> ## only for the first edge a-b
>>
>> g <- graphBAM(df)
>>
>> edgeDataDefaults(g, attr="a") <- 0
>>
>> edgeData(g, from=df$from[1], to=df$to[1], attr="a") <- 1
>>
>> edgeData(g, from=df$from, to=df$to, attr="a")
>> $`a|b`
>> [1] 0
>>
>> $`b|c`
>> [1] 1
>>
>> $`c|d`
>> [1] 0
>>
>>
>> as you see, the value 1 is not set for the first edge "a|b" but for the second, "b|c". if i repeat the operation, setting the edge attribute "a" for the last two edges, it also goes wrong:
>>
>> g <- graphBAM(df)
>>
>> edgeDataDefaults(g, attr="a") <- 0
>>
>> edgeData(g, from=df$from[2:3], to=df$to[2:3], attr="a") <- 1
>>
>> edgeData(g, from=df$from, to=df$to, attr="a")
>> $`a|b`
>> [1] 1
>>
>> $`b|c`
>> [1] 0
>>
>> $`c|d`
>> [1] 1
>>
>> here the attribute is set on edges "a|b" and "c|d", while it should have been set on "b|c" instead of "a|b".
>>
>> i put my sessionInfo() below, which corresponds to the release version of the package, but i can also reproduce it in the devel version.
>>
>> thanks!
>> robert.
>>
>> R version 2.15.1 (2012-06-22)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=C                 LC_NAME=C
>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] graph_1.36.2   vimcom_0.9-7   setwidth_1.0-3 colorout_0.9-9
>>
>> loaded via a namespace (and not attached):
>> [1] BiocGenerics_0.4.0 stats4_2.15.1      tools_2.15.1
>>
>
>

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550


