[R-pkg-devel] Determine subset from glm object
Duncan Murdoch
murdoch@dunc@n @ending from gm@il@com
Sun Jul 8 19:10:54 CEST 2018
On 08/07/2018 11:48 AM, Charles Geyer wrote:
> I need to find out from an object returned by R function glm with argument
> x = TRUE
> what the subsetting was. It appears that if gout is that object, then
>
> as.integer(rownames(gout$x))
>
> is a subset vector equivalent to the one actually used.
You don't want the "as.integer". If the dataframe had rownames to start
with, the x component of the fit will have row labels consisting of
those labels, so as.integer may fail. Even if it doesn't, the rownames
aren't necessarily sequential integers. You can index the dataframe by
the character versions of the default numbers, so simply
rownames(gout$x) should always work.
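A minimal sketch of that point, using a made-up data frame (the row labels "a" to "f" and the subset condition are assumptions for illustration only). It shows the character row labels being used directly as an index, in a case where as.integer() would produce NA:

```r
# Hypothetical data frame with non-numeric row labels
df <- data.frame(x = 1:6, y = c(0, 1, 0, 1, 1, 0),
                 row.names = c("a", "b", "c", "d", "e", "f"))

gout <- glm(y ~ x, data = df, subset = x > 2, x = TRUE)

# Row labels of the model matrix are the labels of the rows that were used
used <- rownames(gout$x)

# Character indexing recovers the rows that went into the fit;
# as.integer(used) would give NA (with a warning) for these labels
df[used, ]
```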
More generally, I'm not sure your question is well posed. What do you
mean by "the subsetting"? If you have something like
    df <- data.frame(letters, x = 1:26, y = rbinom(26, 1, 0.5))
    df1 <- subset(df, letters > "b" & letters < "y")
    gout <- glm(y ~ x, data = df1, subset = letters < "q", x = TRUE)
the rownames(gout$x) are going to be the row labels of df (numbers stored
as character strings), not positions in df1, because df1 inherits a
subset of df's row labels.
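To make the distinction concrete, here is a sketch based on the example above. The set.seed() call and stringsAsFactors = FALSE are additions of mine (the latter so that the string comparisons behave the same on R versions before 4.0). It compares indexing df1 by label with indexing it by the integer value of the label:

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(letters, x = 1:26, y = rbinom(26, 1, 0.5),
                 stringsAsFactors = FALSE)
df1 <- subset(df, letters > "b" & letters < "y")
gout <- glm(y ~ x, data = df1, subset = letters < "q", x = TRUE)

labs <- rownames(gout$x)

# The labels refer to rows of the original df ...
by_label_df  <- df[labs, "x"]
# ... and also work as character labels on df1 ...
by_label_df1 <- df1[labs, "x"]
# ... but as integers they are positions in df1, which are different rows
by_position  <- df1[as.integer(labs), "x"]

# Positions within df1 of the rows actually used
pos_in_df1 <- match(labs, rownames(df1))
```

Here by_label_df and by_label_df1 agree with gout$x[, "x"], while by_position does not, which is why the answer to "what was the subset" depends on which data frame is meant.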
> I do also have the call to glm (as a call object) so can determine the
> actual subset argument, but this seems to be not so useful because I don't
> know the length of the original variables before subsetting.
You should be able to evaluate the subset expression in the environment
of the formula, i.e.
    eval(gout$call$subset, envir = environment(gout$formula))
This may give incorrect results if the variables used in subsetting
aren't in the dataframe and have changed since glm() was called.
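A rough sketch of this approach, under some assumptions of mine: the data frame, the variable name cutoff, and the subset condition are invented, and the eval() call is a variation that passes the data stored in the fit (gout$data) as envir with the formula's environment as enclos, so that columns of the data frame are found first. It also shows how the result can go wrong when a variable outside the data frame changes after the fit:

```r
# Hypothetical data and a subset that depends on a variable outside the data frame
df <- data.frame(x = 1:10, y = c(0, 1, 0, 1, 1, 0, 1, 0, 0, 1))
cutoff <- 4
gout <- glm(y ~ x, data = df, subset = x > cutoff, x = TRUE)

# Re-evaluate the stored subset expression: look in the data first,
# then in the environment of the formula
keep <- eval(gout$call$subset, envir = gout$data,
             enclos = environment(gout$formula))

length(keep)   # one logical value per row of the original data
which(keep)    # positions of the rows that were used

# If cutoff changes after the fit, re-evaluation no longer matches the fit
cutoff <- 8
keep_stale <- eval(gout$call$subset, envir = gout$data,
                   enclos = environment(gout$formula))
```

Note that the logical vector has the length of the data before subsetting, which addresses the concern about not knowing the original length, as long as the data argument is still available in the fitted object.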
> So now my questions. Is this idea above (using rownames) OK even though I
> cannot find where (if anywhere) it is documented? Is there a better way?
> One more guaranteed to be correct in the future?
>
I would trust evaluating the subset more than grabbing row labels from
gout$x, but I don't know for sure that it is more robust.
Duncan Murdoch