[Rd] bug (PR#13570)
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Mar 6 02:09:57 CET 2009
On 05/03/2009 9:42 AM, Ryan Hafen wrote:
> On Mar 5, 2009, at 7:59 AM, Prof Brian Ripley wrote:
>
>> On Thu, 5 Mar 2009, Peter Dalgaard wrote:
>>
>>> Prof Brian Ripley wrote:
>>>> Unfortunately the example is random, so not really reproducible
>>>> (and I see nothing wrong on my Mac). However, Linux valgrind on
>>>> R-devel is showing a problem:
>>>>
>>>> ==3973== Conditional jump or move depends on uninitialised value(s)
>>>> ==3973==    at 0xD76017B: ehg141_ (loessf.f:532)
>>>> ==3973==    by 0xD761600: lowesa_ (loessf.f:769)
>>>> ==3973==    by 0xD736E47: loess_raw (loessc.c:117)
>>>>
>>>> (The uninitialized value is in someone else's code and I suspect
>>>> it was either never intended to work or never tested.) No
>>>> essential change has been made to the loess code for many years.
>>>>
>>>> I would not have read the documentation to say that degree = 0 was a
>>>> reasonable value. It is not to my mind 'a polynomial surface', and
>>>> loess() is described as a 'local regression' for degree 1 or 2 in
>>>> the reference. So unless anyone wants to bury their heads in that
>>>> code I think a perfectly adequate fix would be to disallow
>>>> degree = 0. (I vaguely recall debating allowing it in the code
>>>> ca 10 years ago.)
>>> The code itself has
>>>
>>>     if (!match(degree, 0:2, 0))
>>>         stop("'degree' must be 0, 1 or 2")
>>>
>>> though. "Local fitting of a constant" essentially becomes kernel
>>> smoothing, right?
>> I do know the R code allows it: the question is whether it is worth
>> the effort of finding the problem(s) in the underlying c/dloess
>> code, whose manual (and our reference) is entirely about 1 or 2. I
>> am concerned that there may be other things lurking in the degree=0
>> case if it was never tested (in the netlib version: I am sure it was
>> only minimally tested through my R interface).
>>
>> I checked the original documentation on netlib and that says
>>
>>   29  DIM  dimension of local regression
>>              1             constant
>>              d+1           linear (default)
>>              (d+2)(d+1)/2  quadratic
>>            Modified by ehg127 if cdeg<tdeg.
>>
>> which seems to confirm that degree = 0 was intended to be allowed,
>> and what I dimly recall from ca 1998 is debating whether the R code
>> should allow that or not.
>>
>> If left to me I would say I did not wish to continue to support
>> degree = 0.
>
> True. There are plenty of reasons why one wouldn't want to use
> degree=0 anyway. And I'm sure there are plenty of other simple ways
> to achieve the same effect.
>
> I ran into the problem because some code I'm planning on distributing
> as part of a paper submission "blends" partway down to degree 0
> smoothing at the endpoints to reduce the variance. The only bad
> effect of disallowing degree 0 is for anyone with code depending on
> it, although there are probably few that use it and better to disallow
> than to give an incorrect computation. I got around the problem by
> installing a modified loess by one of Cleveland's former students: https://centauri.stat.purdue.edu:98/loess/
> (but don't want to require others who use my code to do so as well).
>
> What is very strange to me is that it has been working fine in
> previous R versions (tested on 2.7.1 and 2.6.1) and nothing has
> changed in the loess source, and yet it is having problems on 2.8.1.
> Would this suggest that the problem is not in the netlib code?
>
> Also strange that it reportedly works on Linux but not on Mac or
> Windows. On the mac, the effect was much smaller. With windows, it
> was predicting values like 2e215 whereas on the mac, you would almost
> believe the results were legitimate if you didn't think about the fact
> that a weighted moving average involving half the data shouldn't
> oscillate so much.
I think it's pretty clear that it's using an uninitialized value. On
other systems (and previous versions) we've just been lucky, and those
locations held values like 0.0 that didn't matter.
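
A rough way to tell a plausible fit from garbage: with degree=0 each
fitted value is a weighted average of the y values with nonnegative
weights, so it has to lie inside range(y). The sketch below is only a
stand-in for illustration, not the netlib algorithm. local_constant() is
a made-up name, the tricube weights and the nearest-neighbour bandwidth
rule ceiling(span * n) are my assumptions, and it assumes distinct x
values. The loess() call at the end is there only to eyeball the
difference; no particular value is claimed for it.

```r
# Hypothetical helper: local-constant (degree 0) smoother.
# Assumptions: tricube weights, bandwidth = distance to the q-th nearest
# point with q = ceiling(span * n), and all x values distinct.
local_constant <- function(x, y, span = 0.5) {
  n <- length(x)
  q <- max(2, ceiling(span * n))
  vapply(x, function(x0) {
    d <- abs(x - x0)
    h <- sort(d)[q]                            # local bandwidth
    w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)   # tricube weights
    sum(w * y) / sum(w)                        # weighted average
  }, numeric(1))
}

set.seed(42)                # seeded, so the example is reproducible
x <- 1:100
y <- rnorm(100)

ref <- local_constant(x, y, span = 0.5)

# A weighted average with nonnegative weights cannot leave range(y).
in_range <- all(is.finite(ref)) && all(ref >= min(y) & ref <= max(y))
print(in_range)

# Compare with loess(); values like 2e215 would stand out immediately.
fit <- predict(loess(y ~ x, degree = 0, span = 0.5))
print(range(fit))
print(max(abs(fit - ref)))
```

The two smoothers are not expected to agree exactly (loess interpolates
by default and its bandwidth rule may differ), but a fit that wanders far
outside range(y) is a sign that something is wrong.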
> If the consensus is to keep degree=0, I'd be happy to help try to find
> the problem or provide a test case or something. Thanks for looking
> into this.
I'd say right now the consensus among R core members is that nobody
wants to support degree=0, but if you're volunteering, the consensus
could change.
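
For anyone who does volunteer, a seeded test case along the following
lines might be a starting point. It is only a heuristic sketch: it fits
the same data several times and checks that the runs agree, on the
grounds that an uninitialized read often shows up as run-to-run
differences (as reported earlier in this thread for Windows). fit_once()
is a made-up helper name, and identical runs do not prove the code is
clean; valgrind remains the real tool.

```r
# Hypothetical helper: one degree-0 loess fit, returning fitted values.
fit_once <- function(x, y) {
  predict(loess(y ~ x, degree = 0, span = 0.5))
}

set.seed(1)                 # fixed seed: same data on every machine
x <- 1:100
y <- rnorm(100)

# Fit the same data five times; one column per run.
fits <- replicate(5, fit_once(x, y))

# All runs should be finite and equal to the first run.
stable <- all(is.finite(fits)) &&
  isTRUE(all(fits == fits[, 1]))
print(stable)
```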
Duncan Murdoch
>
> Ryan
>
>
>
>>>
>>>> On Thu, 5 Mar 2009, Uwe Ligges wrote:
>>>>
>>>>> Berwin A Turlach wrote:
>>>>>> G'day Peter,
>>>>>>
>>>>>> On Thu, 05 Mar 2009 09:09:27 +0100
>>>>>> Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:
>>>>>>
>>>>>>> rhafen at stat.purdue.edu wrote:
>>>>>>>> <<insert bug report here>>
>>>>>>>>
>>>>>>>> This is a CRITICAL bug!!! I have verified it in R 2.8.1 for mac
>>>>>>>> and for windows. The problem is with loess degree=0 smoothing.
>>>>>>>> For example, try the following:
>>>>>>>>
>>>>>>>> x <- 1:100
>>>>>>>> y <- rnorm(100)
>>>>>>>> plot(x, y)
>>>>>>>> lines(predict(loess(y ~ x, degree=0, span=0.5)))
>>>>>>>>
>>>>>>>> This is obviously wrong.
>>>>>>> Obvious? How? I don't see anything particularly odd (on Linux).
>>>>>> Neither did I on linux; but the OP mentioned mac and windows. On
>>>>>> windows, on running that code, the lines() command added a lot of
>>>>>> vertical lines; most spanning the complete window but some only
>>>>>> part. Executing the code a second time (or in steps) gave sensible
>>>>>> results. My guess would be that some memory is not correctly
>>>>>> allocated or initialised. Or is it something like an object with
>>>>>> storage mode "integer" being passed to a double? But then, why
>>>>>> doesn't it show on linux?
>>>>>>
>>>>>> Happy bug hunting. If my guess is correct, then I have no idea
>>>>>> how to track down such things under windows.....
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Berwin
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>> Please can you folks try under R-devel (to be R-2.9.0 in a couple
>>>>> of weeks) and report if you still see it. I do not under R-devel
>>>>> (but do under R-release), so my guess is that something called by
>>>>> loess() has been fixed in the meantime.
>>>>>
>>>>> Moreover it is not the plot stuff that was wrong under R-2.8.1
>>>>> (release) but the loess computations.
>>>>>
>>>>> Uwe Ligges
>>>>>
>>>>>
>>>
>>> --
>>>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>>>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>>>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
>>> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
>>>
>>>
>> --
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>