[R] approxfun-problems (yleft and yright ignored)
Duncan Murdoch
murdoch.duncan at gmail.com
Sat Sep 11 17:08:56 CEST 2010
On 11/09/2010 10:53 AM, Martin Maechler wrote:
>>>>>> "MM" == Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>> on Sat, 11 Sep 2010 16:04:37 +0200 writes:
>
>>>>>> "SW" == Samuel Wuest <wuests at tcd.ie>
>>>>>> on Thu, 26 Aug 2010 14:34:26 +0100 writes:
>
> SW> Hi Greg,
> SW> thanks for the suggestion:
>
> SW> I have attached some small dataset that can be used to reproduce the
> SW> odd behavior of the approxfun-function.
>
> SW> If it gets stripped off my email, it can also be downloaded at:
> SW> http://bioinf.gen.tcd.ie/approx.data.Rdata
>
> SW> Strangely, the problem seems specific to the data structure in my
> SW> expression set, when I use simulated data, everything worked fine.
>
> SW> Here is some code that I run and resulted in the strange output that I
> SW> have described in my initial post:
>
> >>> ### load the data: a list called approx.data
> >>> load(file="approx.data.Rdata")
> >>> ### contains the slots "x", "y", "input"
> >>> names(approx.data)
> SW> [1] "x" "y" "input"
> >>> ### with y ranging between 0 and 1
> >>> range(approx.data$y)
> SW> [1] 0 1
> >>> ### compare ranges of x and input-x values (the latter is a small subset of 500 data points):
> >>> range(approx.data$x)
> SW> [1] 3.098444 7.268812
> >>> range(approx.data$input)
> SW> [1] 3.329408 13.026700
> >>>
> >>>
> >>> ### generate the interpolation function (warning message benign)
> >>> interp <- approxfun(approx.data$x, approx.data$y, yleft=1, yright=0, rule=2)
> SW> Warning message:
> SW> In approxfun(approx.data$x, approx.data$y, yleft = 1, yright = 0, :
> SW> collapsing to unique 'x' values
> >>>
> >>> ### apply to input-values
> >>> y.out <- sapply(approx.data$input, interp)
> >>>
> >>> ### still I find output values >1, even though yleft=1:
> >>> range(y.out)
> SW> [1] 0.000000 7.207233
>
>
> MM> I get completely different (and correct) results,
> MM> by the way the *same* you have in the bug report you've
> MM> submitted
> MM> (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14377)
> MM> and which does *not* show any bug:
>
> >> range(y.out)
> MM> [1] 0.0000000 0.9816907
>
> MM> Of course, I do believe that you've seen the above problems,
> MM> -- on 64-bit Mac ? as you report in sessionInfo() ? --
> MM> but I cannot reproduce them.
>
> MM> And also, you seem yourself to be able to get different results
> MM> for the same data... what are the circumstances?
>
> I now see that you *did* mention the fact that you
> see *different* results when you *RE*run the same code
> on this data.
> The subject (" ...yleft and yright ignored") is misleading in
> any case. These are not at all ignored...,
> but indeed (as Duncan Murdoch has noted on the bug website),
> there *is* a bug,
> so you are principally right in reporting -- thank you!
> ....
> and I could also confirm that -- as you mentioned in your first
> post -- this bug does not appear when using R 2.8.1 (at least on
> my platform).

I see this too -- and it appears to be because as.factor() finds a
different number of levels in R-devel than it did in 2.8.1. In 2.8.1,
the level names are not unique, but in R-devel, they are, so there are
fewer of them.

In 2.8.1, I see

sort(levels(as.factor(approx.data$x)))[285:287]
[1] "3.97124402436476" "3.97124402436476" "3.97129741844245"

(notice the first two are identical), whereas in R-devel, we get

[1] "3.97124402436476" "3.97129741844245" "3.97408959567848"

from the same expression.
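
To illustrate the mechanism, here is a minimal sketch (not the approxfun() source). as.factor() builds its level names via as.character(), which represents doubles to 15 significant digits, so two doubles that differ only beyond that can end up with the same label. The vector x below is made up for illustration; it is not taken from approx.data.

```r
# Hypothetical values: the first two differ only past the 15th significant digit.
x <- c(3.97124402436476, 3.97124402436476 + 1e-15, 3.97129741844245)

n_unique <- length(unique(x))       # number of distinct doubles
labels   <- as.character(x)         # character labels, as used for factor levels
n_levels <- nlevels(as.factor(x))   # number of distinct levels after conversion

print(labels)
cat("unique doubles:", n_unique, " factor levels:", n_levels, "\n")
```

If the two counts differ, any code that pairs unique(x) with a per-level summary such as tapply(y, x, mean) can end up with x and y vectors of different lengths, which is one plausible way for the interpolation to go wrong.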
Duncan Murdoch