[Rd] loess returns different standard errors for identical models (PR#7956)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Jun 18 10:57:27 CEST 2005
I've seen many similar reports from valgrind, but they went away when the
code was compiled without optimization: it seems the optimizer often
fetches one element past the end of an array when trying to keep the
pipelines full.
I'd start by re-running the valgrind tests without optimization.
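For the re-run, a small harness along the following lines may help. It is only a sketch: my.test() from the report is not reproduced in this thread, so the data, the model formula and the helper name repeat_fits() are stand-ins I have made up for illustration. It fits the same loess model repeatedly and collects the s component of each fit, which should not vary between identical fits.

```r
# Hypothetical harness: my.test() from the report is not shown in this
# thread, so the data and the model below are assumptions.
set.seed(1)                          # fixed seed, so runs are repeatable
n <- 100
d <- data.frame(x = runif(n))
d$y <- sin(2 * pi * d$x) + rnorm(n, sd = 0.3)

repeat_fits <- function(reps = 50, collect = FALSE) {
  vapply(seq_len(reps), function(i) {
    if (collect) gc()                # the report says gc() hides the problem
    loess(y ~ x, data = d)$s         # residual standard error of this fit
  }, numeric(1))
}

s_vals <- repeat_fits()
# Identical models should give a single distinct value of s.
print(unique(s_vals))
```

Calling repeat_fits(collect = TRUE) gives the comparison with a garbage collection before each fit, as described in the report.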
On Sat, 18 Jun 2005, Peter Dalgaard wrote:
> btyner at stat.purdue.edu writes:
>
>> Full_Name: Benjamin Tyner
>> Version: 2.1.0, 4/18/2005
>> OS: i686-redhat-linux-gnu
>> Submission from: (NULL) (4.64.8.220)
>>
>>
>> # Just run my.test() below in a newly opened R session. Once too many models
>> have been fit (~20 on my system), the computed standard error jumps to a
>> different value. This is (superficially) due to a different residual sum of
>> squares, not a different one.delta. No other aspect of the fit is affected, just
>> the computed value of s (I've run extensive testing with all.equal() to make
>> sure). Issuing a garbage collection before doing a loess fit appears to "solve"
>> the problem, which makes me think this is not a problem in loessc.c or loessf.f.
>> My point is that a few loess fits in one session should not cause the
>> standard error computation to go awry with no warning.
>
> Right. Valgrind has this to say:
>
>> my.test()
> ==22986== Use of uninitialised value of size 8
> ==22986== at 0x1C97051B: lowesb_ (loessf.f:1542)
> ==22986== by 0x1C95B399: loess_raw (loessc.c:98)
> ==22986== by 0x809C9AE: do_dotCode (dotcode.c:1709)
> ==22986== by 0x80B368F: Rf_eval (eval.c:405)
> [1] "s = 0.857141235910414"
> [1] "s = 0.857141235910414"
>
> and that certainly fits the pattern.
>
> Unfortunately this seems to be in the call to ehg131() in this passage
>
>       end if
>       setlf=(iv(27).ne.iv(25))
>       call ehg131(xx,yy,ww,trl,diagl,iv(20),iv(29),iv(3),iv(2),iv(5),
>      + iv(17),iv(4),iv(6),iv(14),iv(19),wv(1),iv(iv(7)),iv(iv(8)),
>      + iv(iv(9)),iv(iv(10)),iv(iv(22)),iv(iv(27)),wv(iv(11)),
>      + iv(iv(23)),wv(iv(13)),wv(iv(12)),wv(iv(15)),wv(iv(16)),
>      + wv(iv(18)),ifloor(iv(3)*wv(2)),wv(3),wv(iv(26)),wv(iv(24)),
>      + wv(4),iv(30),iv(33),iv(32),iv(41),iv(iv(25)),wv(iv(34)),
>      + setlf)
>       if(iv(14).lt.iv(6)+DBLE(iv(4))/2.D0)then
>          call ehg183('k-d tree limited by memory; nvmax=',
>      + iv(14),1,1)
>
> (line numbers in optimized code are somewhat unreliable), so there are
> quite a few items to check. Dumping out the iv and wv arrays at that
> point is probably a good start if you want to chip in with a bit
> of debugging. Do yourself a favour and use set.seed() with a value
> that gives you a minimal repeat count when you start R in a clean state.
>
> --
> O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595