[R] approxfun-problems (yleft and yright ignored)

Samuel Wuest wuests at tcd.ie
Wed Aug 25 16:20:09 CEST 2010


Dear all,

I have run into a problem when running some code implemented in the
Bioconductor panp package (applied to my own expression data), whereby gene
expression values of known true-negative probesets (x) are interpolated onto
present/absent p-values (y) between 0 and 1 using the approxfun() function
from the stats package. With R version 2.8 everything worked fine;
however, after updating to R 2.11.1, I got unexpected output (explained
below).

Please correct me here, but as far as I understand, the yleft and yright
arguments set the values returned when the input x-values (to which the
function returned by approxfun is applied) fall outside range(x). So if
I run approxfun with yleft=1 and yright=0 and with y-values between 0 and 1,
then I should never get any values higher than 1. However, this is not the
case, as this code example illustrates:

> ### define the x-values used to construct the approxfun; basically these
> ### are 2000 expression values ranging from ~3 to 7:
> xNeg <- NegExprs[, 1]
> xNeg <- sort(xNeg, decreasing = TRUE)
>
> ### generate 2000 y-values between 0 and 1:
> yNeg <- seq(0, 1, 1/(length(xNeg) - 1))
> ### define yleft and yright as well as the rule to clarify what should
> ### happen if input x-values lie outside range(x):
> interp <- approxfun(xNeg, yNeg, yleft = 1, yright = 0, rule=2)
Warning message:
In approxfun(xNeg, yNeg, yleft = 1, yright = 0, rule = 2) :
  collapsing to unique 'x' values
> ### apply the approxfun to expression data that range from ~2.9 to 13.9
> ### and can therefore lie outside range(xNeg):
>  PV <- sapply(AllExprs[, 1], interp)
> range(PV)
[1]    0.000 6208.932
> summary(PV)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
0.000e+00 0.000e+00 2.774e-03 1.299e+00 3.164e-01 6.209e+03

So the resulting PV object contains values ranging from 0 to about 6209. The
maximum is far above yleft = 1 and nowhere near the extreme y-values that
were used to set up the interp function. This seems wrong to me; it looks
as if yleft and yright are simply ignored.
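
To make the expectation concrete, here is a minimal sketch on made-up data.
The vectors below are synthetic stand-ins, not the NegExprs/AllExprs objects
from the transcript, and ties = mean is passed explicitly (it is the
documented default) so that the tie handling is stated in the call. The
function returned by approxfun is vectorised, so it is called directly
rather than through sapply.

```r
## Synthetic stand-ins (assumption: NOT the real NegExprs / AllExprs data)
set.seed(1)
## Rounding to 2 decimals guarantees tied x-values, as in the real data
xNeg <- sort(round(runif(2000, min = 3, max = 7), 2), decreasing = TRUE)
yNeg <- seq(0, 1, length.out = length(xNeg))

## Ties are averaged; values outside range(xNeg) map to yleft / yright
interp <- approxfun(xNeg, yNeg, yleft = 1, yright = 0, ties = mean)

## New x-values, two of which lie outside range(xNeg)
newx <- c(2.9, 3.5, 5, 6.5, 13.9)
PV <- interp(newx)

## Expected: every value inside [0, 1]; 2.9 gives yleft, 13.9 gives yright
print(range(PV))
print(PV[c(1, 5)])
```

If the same pattern run on the real data returns values above 1, that would
point at the data or the R build rather than at the way yleft and yright
are specified.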

I have attached a few histograms that visualize the distributions of
the objects xNeg, yNeg, AllExprs[,1] (== input x-values) and PV (the
output), so that it is easier to make sense of the data structures.

Does anyone have an explanation for this or can tell me how to fix the
problem?
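
A possible workaround, sketched under the unconfirmed assumption that the
"collapsing to unique 'x' values" step is involved: average the tied
y-values by hand so that approxfun only ever sees unique x-values, and clamp
the result to [0, 1] as a defensive guard. The helper name safe_interp is
hypothetical; it is not part of stats or panp.

```r
## Hypothetical helper (not part of stats or panp)
safe_interp <- function(x, y, yleft = 1, yright = 0) {
  ok <- is.finite(x) & is.finite(y)          # drop NA / NaN / Inf pairs
  x <- x[ok]
  y <- y[ok]
  ux <- sort(unique(x))                      # unique x, ascending
  uy <- as.vector(tapply(y, match(x, ux), mean))  # mean y per unique x
  f <- approxfun(ux, uy, yleft = yleft, yright = yright, ties = mean)
  ## Defensive clamp: p-values must stay inside [0, 1]
  function(v) pmin(pmax(f(v), 0), 1)
}

## Tiny made-up example with ties at x = 5 and x = 3
f <- safe_interp(c(7, 5, 5, 4, 3, 3), seq(0, 1, length.out = 6))
print(f(c(2, 3, 4, 5, 7, 10)))
```

The clamp only hides out-of-range values; it does not explain them, so it
is worth comparing the clamped and unclamped results before relying on it.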

Thanks a million for any help, best, Sam

> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_IE.UTF-8/en_IE.UTF-8/C/C/en_IE.UTF-8/en_IE.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] panp_1.18.0   affy_1.26.1   Biobase_2.8.0

loaded via a namespace (and not attached):
[1] affyio_1.16.0         preprocessCore_1.10.0


-- 
-----------------------------------------------------
Samuel Wuest
Smurfit Institute of Genetics
Trinity College Dublin
Dublin 2, Ireland
Phone: +353-1-896 2444
Web: http://www.tcd.ie/Genetics/wellmer-2/index.html
Email: wuests at tcd.ie
------------------------------------------------------

