[Bioc-devel] Bug in les:::cdfDuplicates

Ryan C. Thompson rct at thompsonclan.org
Mon Mar 24 19:14:30 CET 2014


Hello,

I have found a bug in the cdfDuplicates function in the les package. 
The GSRI package uses this function indirectly, and I hit the error 
while using GSRI. The error appears to occur because both rle and table 
are used to deduplicate a (sorted) vector, and the two functions 
apparently do not use the same definition of equality for floating-point 
values. As a result they return vectors of different lengths, and 
rep.int, which requires vectors of the same length, raises an error 
when they are passed to it. Replacing rle(pvalSort)$length with 
table(pvalSort) seems to solve the problem. I have saved my test case 
as an RDS file that you can download and use to reproduce the bug:

https://www.dropbox.com/s/k7k1m3s28aa4ajb/GSRI-les-cdfDuplicates-error-case.RDS

This RDS file contains the full argument list that I pass to the "gsri" 
function to reproduce the error. Just download it, then execute the 
following R code:

library(GSRI)
do.call(gsri, readRDS("GSRI-les-cdfDuplicates-error-case.RDS"))

With the suggested change applied, this test case runs without error. 
The expression data is my own, and the gene set is MSigDB ID 
"AAAYRNCTG_UNKNOWN", with the gene IDs converted to my organism 
(cynomolgus monkey, whose genes are annotated with orthologous Ensembl 
Peptide IDs from human & rhesus).
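
To illustrate the rle/table mismatch described above, here is a minimal 
sketch. It assumes the difference comes from table() grouping numeric 
values by their character representation (through factor), while rle() 
compares values exactly. The vector x is a made-up example, not the data 
from the RDS file, and this is not the actual cdfDuplicates code:

```r
# Hypothetical example vector (not taken from the RDS file): two doubles
# that print the same but differ in their last bits.
x <- sort(c(0.3, 0.1 + 0.2))

# rle() compares neighbouring elements exactly, so it sees two runs.
runs <- rle(x)$lengths

# table() goes through factor(), which groups values by their character
# representation, so both values end up in a single cell.
tab <- table(x)

# The two "deduplications" disagree on the number of distinct values.
c(rle = length(runs), table = length(tab))

# Taking both the values and the counts from the same call keeps them
# aligned, so rep.int() can rebuild the original vector.
r <- rle(x)
expanded <- rep.int(r$values, r$lengths)
identical(expanded, x)
```

Whichever function is used, the point is to take the distinct values and 
their counts from the same source, so their lengths always agree.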

-Ryan


