[Bioc-devel] Bug in frmaTools

Tue Jul 7 19:08:31 CEST 2015

Actually, the code looks like it should be adding these names, so I need to
go back through my code and get back to you on that issue.
On Jul 7, 2015 9:58 AM, "Ryan C. Thompson" <rct at thompsonclan.org> wrote:

> I've also encoundered another problem, this time I believe related to a
> change the the frma package which requires a corresponding change in the
> frmaTools package. Apparently frma has added a check for equality of probe
> names in various places, and frmaTools does not store these names, which
> means that the current version of frma will reject any vectors generated by
> the current version of frmaTools with the following error:
>
> > eset <- frma(affy[,1:3] , input.vecs=myFrmaToolsVecs)
> Error in frmaAffyBatch(object, background, normalize, summarize,
> input.vecs,  :
>   Mismatch between pmindex(object) and names of input.vecs and unable to
> create unique mapping.
> > traceback()
> 3: stop("Mismatch between pmindex(object) and names of input.vecs and
> unable to create unique mapping.")
> 2: frmaAffyBatch(object, background, normalize, summarize, input.vecs,
>        output.param, verbose)
> 1: frma(affy[, 1:3], input.vecs = myFrmaToolsVecs)
>
> > sessionInfo()
> R version 3.2.0 (2015-04-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
> Running under: Ubuntu 15.04
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] grDevices datasets  parallel  graphics  stats4    stats utils
> [8] methods   base
>
> other attached packages:
>  [1] frmaTools_1.20.0
>  [2] hgu133afrmavecs_1.5.0
>  [3] hthgu133pluspmcdf_2.16.0
>  [4] DSalomon.PAX.hthgu133pluspmfrmavecs_0.1
>  [5] frma_1.20.0
>  [6] affy_1.46.1
>  [7] Biobase_2.28.0
>  [8] openxlsx_2.5.10
>  [9] foreach_1.4.2
> [10] plyr_1.8.3
> [11] stringr_1.0.0
> [12] IRanges_2.2.4
> [13] ggplot2_1.0.1
> [14] S4Vectors_0.6.0
> [15] BiocGenerics_0.14.0
> [16] BiocInstaller_1.18.3
>
> loaded via a namespace (and not attached):
>  [1] Rcpp_0.11.6           compiler_3.2.0 GenomeInfoDb_1.4.1
>  [4] XVector_0.8.0         iterators_1.0.7 tools_3.2.0
>  [7] zlibbioc_1.14.0       digest_0.6.8 bit_1.1-12
> [10] RSQLite_1.0.0         preprocessCore_1.30.0 gtable_0.1.2
> [13] ff_2.2-13             DBI_0.3.1 proto_0.3-10
> [16] affxparser_1.40.0     Biostrings_2.36.1 grid_3.2.0
> [19] AnnotationDbi_1.30.1  oligo_1.32.0 reshape2_1.4.1
> [22] magrittr_1.5          splines_3.2.0 scales_0.2.5
> [25] codetools_0.2-11      oligoClasses_1.30.0 MASS_7.3-41
> [28] GenomicRanges_1.20.5  colorspace_1.2-6 stringi_0.5-2
> [31] munsell_0.4.2         affyio_1.36.0
>
> On 07/07/2015 06:52 AM, Matthew McCall wrote:
>
>> Ryan,
>>
>> Thanks for pointing these out. I'll look into them soon, but I imagine
>> your assessment is correct.
>>
>> Best,
>> Matt
>>
>>
>>
>> On Mon, Jul 6, 2015 at 5:42 PM, Ryan C. Thompson <rct at thompsonclan.org
>> <mailto:rct at thompsonclan.org>> wrote:
>>
>>     I also discovered another apparent bug later in the same function.
>>     The second to last line of makeVectorsAffyBatch is
>>
>>         vers <- ifelse(!is.null(cdfname),
>>     as.character(packageVersion(cdfname)), "")
>>
>>     If cdfname is NULL, this line will throw an error because the
>>     second argument to ifelse will have length zero and "ifelse" does
>>     NOT do lazy evaluation.
>>
>>
>>     On 07/06/2015 12:21 PM, Ryan C. Thompson wrote:
>>
>>         Hello,
>>
>>         I just encountered a bug in frmaTools that makes it impossible
>>         to use on certain array platforms. The following lines in
>>         makeVectorsAffyBatch fail on an AffyBatch object on the
>>         hthgu133pluspm platform:
>>
>>             pms <- pm(object)
>>             pns <- probeNames(object)
>>             pmi <- unlist(pmindex(object))
>>             if (!identical(as.character(pmi), rownames(pms)))
>>                 stop("Mismatch between pmindex and rownames of pms")
>>
>>         I isolated the problem to five probes:
>>
>>         > i <- which(as.character(pmi) != rownames(pms))
>>         > pmi[i]
>>          1564498_PM_at9 205398_PM_s_at8 217695_PM_x_at7
>>         223446_PM_s_at7 237802_PM_at3
>>                   3e+05           5e+05           2e+05 1e+05 4e+05
>>         > rownames(pms)[i]
>>         [1] "300000" "500000" "200000" "100000" "400000"
>>         > as.character(pmi)[i]
>>         [1] "3e+05" "5e+05" "2e+05" "1e+05" "4e+05"
>>
>>         As you can see, the problem is that as.character will happily
>>         use scientific notation when it feels like it, which then
>>         fails a test for string equality. I believe the solution is to
>>         replace that test with:
>>
>>         all(sprintf("%i", pmi) == rownames(pms))
>>
>>         -Ryan
>>
>>
>>     _______________________________________________
>>     Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org> mailing
>>     list
>>     https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>>
>>
>> --
>> Matthew N McCall, PhD
>> 6 Bridgewood Dr.
>> Fairport, NY 14450
>> Cell: 202-222-5880
>>
>>
>

	[[alternative HTML version deleted]]