[BioC] interactions of range(flowFrame) with transform, densityplot and splom

Mroz, Edmund A. EMROZ at PARTNERS.ORG
Wed Dec 8 16:26:58 CET 2010


In using the flowCore and flowViz packages, I came up against two problems,
apparently related to how transform() resets ranges, and to how densityplot()
and splom() get their axis limits from range().

I am wondering whether I am doing something wrong, whether this behavior is
intended, or whether these are inadvertent glitches. I haven't seen them
mentioned in the archives.

They can be seen with a subset (Time,SSC-H,FL2-A) of a flowFrame from the GvHD
data.

> library(flowViz)
> data(GvHD)
> f1 <- GvHD[[1]][, c(8, 2, 6)]

Problem 1: transform() and range(), with densityplot()
Create two derived columns with transform():

> f1 <- transform(f1, sumCol = `SSC-H` + `FL2-A`, diffCol = `SSC-H` - `FL2-A`)

splom() shows a wide range for all of the data, including the derived columns:

> splom(f1)

but densityplot() gives a trivial plot for diffCol, not representative of the
data. Compare:

> densityplot(~.,f1)
> hist(exprs(f1)[,"diffCol"])

This might arise from densityplot() taking limits from the range of the
flowFrame, which gave the following after transform():

> range(f1)
    Time SSC-H FL2-A sumCol diffCol
min    0     0     0      0       0
max 1023  1023  1023   2046       0

The values for range(f1) in the added columns seem to have been obtained by
arithmetic on the ranges of the source columns, rather than obtained from ranges
after transformation.
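
To illustrate what I think is happening (this is only my reading of the numbers
above, not of the flowCore source), element-wise arithmetic on the declared
ranges reproduces the reported values. The variable names below are just for
illustration:

```r
# Declared ranges of the two source columns, as reported by range(f1)
r_ssc <- c(min = 0, max = 1023)   # SSC-H
r_fl2 <- c(min = 0, max = 1023)   # FL2-A

# Applying the transformation element-wise to (min, max)
sum_range  <- r_ssc + r_fl2   # min 0, max 2046 (matches sumCol)
diff_range <- r_ssc - r_fl2   # min 0, max 0    (matches diffCol)

# Bounds that a difference of two such columns could actually reach
diff_bounds <- c(min = r_ssc[["min"]] - r_fl2[["max"]],
                 max = r_ssc[["max"]] - r_fl2[["min"]])   # min -1023, max 1023
```

For a sum the element-wise result happens to be a sensible bound, but for a
difference it collapses to a zero-width interval.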

I understand that is the desired result for single-column data transformations,
but this poses a problem for on-the-fly plotting of transformations involving
multiple columns of flowFrames and flowSets. (I originally found it when taking
the difference of 2 log-transformed columns, to get the log of their ratio.)
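
A possible workaround, sketched here on the assumption that range() and the
plotting methods read the minRange and maxRange columns of the parameter
annotation, is to overwrite those entries for the derived columns with the
observed data range. The helper names (derived, idx, obs, pd) are mine, and I
have only tried this kind of thing with the package versions listed below:

```r
library(flowViz)   # also attaches flowCore
data(GvHD)

f1 <- GvHD[[1]][, c(8, 2, 6)]
f1 <- transform(f1, sumCol = `SSC-H` + `FL2-A`, diffCol = `SSC-H` - `FL2-A`)

# Positions of the derived columns (parameter rows follow column order)
derived <- c("sumCol", "diffCol")
idx <- match(derived, colnames(exprs(f1)))

# Observed min (row 1) and max (row 2) of each derived column
obs <- apply(exprs(f1)[, idx, drop = FALSE], 2, range)

# Overwrite the stored ranges with the observed ones
pd <- pData(parameters(f1))
pd[idx, "minRange"] <- obs[1, ]
pd[idx, "maxRange"] <- obs[2, ]
pData(parameters(f1)) <- pd

range(f1)             # diffCol should no longer be reported as 0..0
densityplot(~ ., f1)
```

This loses the instrument-range meaning of those entries for the derived
columns, so it is a stopgap for plotting rather than a real fix.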

-------------------

Problem 2: splom() and plotting range
I found this problem when my data had Time as the first data column, with a wide
time range. To illustrate, simply change the "maxRange" for Time in the above
example:

> pData(parameters(f1))[1, "maxRange"] <- 10000
> range(f1)
     Time SSC-H FL2-A sumCol diffCol
min     0     0     0      0       0
max 10000  1023  1023   2046       0

Now, with:

> splom(f1)

the limits on the SSC-H axes (and for density calculation) seem to have been
taken from the first data column (Time), not from the SSC-H column, constricting
the output and leading to error messages from KernSmooth::bkde2D().

My initial look at the code for splom(flowFrame) suggests that scales for splom
are obtained from all data columns, even when exclude.time=TRUE, although I
might be misinterpreting the code.

If my interpretation is correct, this poses a problem for splom() if Time isn't
the last column and exclude.time=TRUE.
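
One way to check this interpretation would be to repeat the example with Time
moved to the last column. This is only a sketch of the test I have in mind;
f2 is a fresh subset without the derived columns, and I am not claiming that
the reordering is a fix:

```r
library(flowViz)
data(GvHD)

# Same three parameters as before, but with Time (column 8) last
f2 <- GvHD[[1]][, c(2, 6, 8)]

# Widen the stored Time range, as in the example above
pData(parameters(f2))[3, "maxRange"] <- 10000
range(f2)

# If scales are matched to columns by position, the SSC-H and FL2-A
# panels should now keep their own limits
splom(f2)
```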

My sessionInfo() follows. Thanks for your help.

Ed Mroz
Surgical Oncology Research
MGH Cancer Center
Boston, MA

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] flowViz_1.14.0     lattice_0.19-13    flowCore_1.16.0
[4] rrcov_1.1-00       pcaPP_1.8-3        mvtnorm_0.9-92
[7] robustbase_0.5-0-1 Biobase_2.10.0

loaded via a namespace (and not attached):
[1] feature_1.2.5       graph_1.28.0        grid_2.12.0
[4] KernSmooth_2.23-4   ks_1.7.4            latticeExtra_0.6-14
[7] MASS_7.3-8          RColorBrewer_1.0-2  stats4_2.12.0
[10] tools_2.12.0


