[Rd] bug (PR#13570)

Duncan Murdoch murdoch at stats.uwo.ca
Fri Mar 6 02:09:57 CET 2009


On 05/03/2009 9:42 AM, Ryan Hafen wrote:
> On Mar 5, 2009, at 7:59 AM, Prof Brian Ripley wrote:
> 
>> On Thu, 5 Mar 2009, Peter Dalgaard wrote:
>>
>>> Prof Brian Ripley wrote:
>>>> Unfortunately the example is random, so not really reproducible (and I
>>>> see nothing wrong on my Mac). However, Linux valgrind on R-devel is
>>>> showing a problem:
>>>>
>>>> ==3973== Conditional jump or move depends on uninitialised value(s)
>>>> ==3973==    at 0xD76017B: ehg141_ (loessf.f:532)
>>>> ==3973==    by 0xD761600: lowesa_ (loessf.f:769)
>>>> ==3973==    by 0xD736E47: loess_raw (loessc.c:117)
>>>>
>>>> (The uninitialized value is in someone else's code and I suspect it was
>>>> either never intended to work or never tested.)  No essential change has
>>>> been made to the loess code for many years.
>>>>
>>>> I would not have read the documentation to say that degree = 0 was a
>>>> reasonable value. It is not to my mind 'a polynomial surface', and
>>>> loess() is described as a 'local regression' for degree 1 or 2 in the
>>>> reference.  So unless anyone wants to bury their heads in that code I
>>>> think a perfectly adequate fix would be to disallow degree = 0.
>>>> (I vaguely recall debating allowing it in the code ca 10 years ago.)
>>> The code itself has
>>>
>>>   if (!match(degree, 0:2, 0))
>>>       stop("'degree' must be 0, 1 or 2")
>>>
>>> though. "Local fitting of a constant" essentially becomes kernel
>>> smoothing, right?
>> I do know the R code allows it: the question is whether it is worth
>> the effort of finding the problem(s) in the underlying c/dloess
>> code, whose manual (and our reference) is entirely about 1 or 2.  I
>> am concerned that there may be other things lurking in the degree=0
>> case if it was never tested (in the netlib version: I am sure it was
>> only minimally tested through my R interface).
>>
>> I checked the original documentation on netlib and that says
>>
>> 29      DIM     dimension of local regression
>>                1               constant
>>                d+1             linear   (default)
>>                (d+2)(d+1)/2    quadratic
>>                Modified by ehg127 if cdeg<tdeg.
>>
>> which seems to confirm that degree = 0 was intended to be allowed,  
>> and what I dimly recall from ca 1998 is debating whether the R code  
>> should allow that or not.
>>
>> If left to me I would say I did not wish to continue to support  
>> degree = 0.
> 
> True.  There are plenty of reasons why one wouldn't want to use  
> degree=0 anyway.  And I'm sure there are plenty of other simple ways  
> to achieve the same effect.
> 
> I ran into the problem because some code I'm planning on distributing  
> as part of a paper submission "blends" partway down to degree 0  
> smoothing at the endpoints to reduce the variance.  The only bad  
> effect of disallowing degree 0 is for anyone with code depending on  
> it, although there are probably few that use it and better to disallow  
> than to give an incorrect computation.  I got around the problem by
> installing a modified loess by one of Cleveland's former students:
> https://centauri.stat.purdue.edu:98/loess/ (but don't want to require
> others who use my code to do so as well).
> 
> What is very strange to me is that it has been working fine in
> previous R versions (tested on 2.7.1 and 2.6.1) and nothing has
> changed in the loess source, yet it is having problems on 2.8.1.
> Would this suggest that the problem is not in the netlib code?
> 
> Also strange that it reportedly works on Linux but not on Mac or  
> Windows.  On the mac, the effect was much smaller. With windows, it  
> was predicting values like 2e215 whereas on the mac, you would almost  
> believe the results were legitimate if you didn't think about the fact  
> that a weighted moving average involving half the data shouldn't  
> oscillate so much.

I think it's pretty clear that it's using an uninitialized value.  On 
other systems (and previous versions) we've just been lucky, and those 
locations held values like 0.0 that didn't matter.

> If the consensus is to keep degree=0, I'd be happy to help try to find  
> the problem or provide a test case or something.  Thanks for looking  
> into this.

I'd say right now the consensus among R core members is that nobody 
wants to support degree=0, but if you're volunteering, the consensus 
could change.
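
For anyone who needs degree-0 behaviour in the meantime, here is a minimal
sketch in plain R of a local-constant (kernel) smoother, in the spirit of the
"local fitting of a constant" remark quoted above. It is not the netlib/loess
algorithm: the name local_constant_smooth, the neighbour-count rule
floor(n * span), and the direct (non-interpolated) evaluation at every x are
assumptions made for illustration, and it assumes distinct x values.

```r
# Hypothetical helper, not part of R or of the loess sources:
# tricube-weighted local mean over the nearest floor(n * span) points.
local_constant_smooth <- function(x, y, span = 0.5) {
  stopifnot(length(x) == length(y), span > 0, span <= 1)
  n <- length(x)
  q <- max(2L, floor(n * span))   # neighbours per fit (assumed rule)
  vapply(x, function(x0) {
    d <- abs(x - x0)
    h <- sort(d)[q]               # bandwidth: distance to the q-th nearest point
    w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)   # tricube kernel weights
    sum(w * y) / sum(w)           # weighted mean, i.e. a degree-0 local fit
  }, numeric(1))
}

set.seed(1)                       # fixed seed so the example is reproducible
x <- 1:100
y <- rnorm(100)
fit <- local_constant_smooth(x, y, span = 0.5)

# A weighted mean cannot leave the range of the data.
stopifnot(all(fit >= min(y) & fit <= max(y)))
```

Since every fitted value is a weighted mean of the y values, the range check
at the end is a cheap sanity test that could also be applied to the output of
predict(loess(y ~ x, degree=0, span=0.5)) when hunting for the wild values
reported on Windows.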

Duncan Murdoch

> 
> Ryan
> 
> 
> 
>>>
>>>> On Thu, 5 Mar 2009, Uwe Ligges wrote:
>>>>
>>>>> Berwin A Turlach wrote:
>>>>>> G'day Peter,
>>>>>>
>>>>>> On Thu, 05 Mar 2009 09:09:27 +0100
>>>>>> Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:
>>>>>>
>>>>>>> rhafen at stat.purdue.edu wrote:
>>>>>>>> <<insert bug report here>>
>>>>>>>>
>>>>>>>> This is a CRITICAL bug!!!  I have verified it in R 2.8.1 for mac
>>>>>>>> and for windows.  The problem is with loess degree=0 smoothing.
>>>>>>>> For example, try the following:
>>>>>>>>
>>>>>>>> x <- 1:100
>>>>>>>> y <- rnorm(100)
>>>>>>>> plot(x, y)
>>>>>>>> lines(predict(loess(y ~ x, degree=0, span=0.5)))
>>>>>>>>
>>>>>>>> This is obviously wrong.
>>>>>>> Obvious? How? I don't see anything particularly odd (on Linux).
>>>>>> Neither did I on linux; but the OP mentioned mac and windows. On
>>>>>> windows, on running that code, the lines() command added a lot of
>>>>>> vertical lines; most spanning the complete window but some only part.
>>>>>> Executing the code a second time (or in steps) gave sensible
>>>>>> results. My guess would be that some memory is not correctly
>>>>>> allocated or initialised.  Or is it something like an object with
>>>>>> storage mode "integer" being passed to a double?  But then, why
>>>>>> doesn't it show on linux?
>>>>>>
>>>>>> Happy bug hunting.  If my guess is correct, then I have no idea how to
>>>>>> track down such things under windows.....
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>    Berwin
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>> Please can you folks try under R-devel (to be R-2.9.0 in a couple of
>>>>> weeks) and report if you still see it. I do not under R-devel (but do
>>>>> under R-release), so my guess is that something called by loess() has
>>>>> been fixed in the meantime.
>>>>>
>>>>> Moreover it is not the plot stuff that was wrong under R-2.8.1
>>>>> (release) but the loess computations.
>>>>>
>>>>> Uwe Ligges
>>>>>
>>>>>
>>>
>>> --
>>>  O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>>> c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>>> (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
>>> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
>>>
>>>
>> -- 
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 


