[Rd] Using response variable in interaction as explanatory variable in glm crashes R
Scott Kostyshak
skostyshak at ufl.edu
Tue Oct 10 19:24:56 CEST 2017
On Mon, Oct 09, 2017 at 03:52:43PM +0000, Martin Maechler wrote:
> >>>>> Jan van der Laan <rhelp at eoos.dds.nl>
> >>>>> on Fri, 6 Oct 2017 12:13:39 +0200 writes:
>
> > It is actually model.matrix that crashes, not glm. Same
> > crash occurs with e.g. lm.
>
> > model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
>
> > also crashes R.
>
> Yes, segmentation fault.
>
> It only happens when these are *logical* variables, not, e.g., when
> transformed to integer.
>
> The C code in src/library/stats/src/model.c tries to eliminate
> occurrences of the LHS of the formula from the RHS when building
> the model matrix and it does work fine in the integer case.
>
> Part of the culprit code may be this (from line 717),
> with the isLogical(.) which in our case, shifts the pointer by
> 1 in the call to firstfactor() :
>
>     int adj = isLogical(var_i)?1:0;
>     // avoid overflow of jstart * nn PR#15578
>     firstfactor(&rx[jstart * nn], n, jnext - jstart,
>                 REAL(contrast), nrows(contrast),
>                 ncols(contrast), INTEGER(var_i)+adj);
>
> then in firstfactor(), we see the segfault (when running R with
> '-d gdb') :
>
> > model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007fffeafa76b5 in firstfactor (ncx=0, v=0x5c3b37c, ncc=1, nrc=2, c=0x5c90008,
> nrx=8, x=0x5cbf150) at ../../../../../R/src/library/stats/src/model.c:252
> 252 else xj[i] = cj[v[i]-1];
> Missing separate debuginfos, .................
> (gdb) list
> 247     for (int j = 0; j < ncc; j++) {
> 248         xj = &x[j * (R_xlen_t)nrx];
> 249         cj = &c[j * (R_xlen_t)nrc];
> 250         for (int i = 0; i < nrx; i++)
> 251             if(v[i] == NA_INTEGER) xj[i] = NA_REAL;
> 252             else xj[i] = cj[v[i]-1];
> 253     }
> 254 }
> 255
>
> and indeed in the debugger, i=7 and v[i] is "outside", v[]
> being of length 7, hence indexed 0:6.
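
To make the off-by-one in the quoted analysis concrete, here is a minimal
sketch in C. It is not R's code: the helper name first_out_of_bounds is
hypothetical, and it assumes that the underlying logical vector holds 8
elements, so that advancing the pointer by adj = 1 leaves only 7 valid
elements (indices 0:6) while the loop still runs nrx = 8 times.

```c
#include <stddef.h>

/* Hypothetical helper, not part of R's sources.
 * Models the inner loop of firstfactor(), which reads v[i] for
 * i = 0 .. nrx-1, where v points adj elements into a buffer of
 * v_len elements. Returns the first i whose read would fall outside
 * the buffer, or -1 if every read stays in bounds. */
long first_out_of_bounds(size_t v_len, size_t adj, size_t nrx)
{
    for (size_t i = 0; i < nrx; i++) {
        /* v[i] after the shift is element adj + i of the buffer */
        if (adj + i >= v_len)
            return (long)i;
    }
    return -1;
}
```

Under those assumptions, first_out_of_bounds(8, 1, 8) gives 7, which matches
the i=7 seen in the debugger, while the unshifted case
first_out_of_bounds(8, 0, 8) gives -1.
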

Dear Martin,

I just wanted to thank you for providing details on your approach to
debugging. Often I see bug fixes and I wonder "how the heck did they
figure that out?" so I am very excited when I see details like these on
the process (and not just the end result), so that I can learn.

Best,
Scott

--
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/