[R] Fwd: Potential Issue with lm.influence

Jim Lemon drj|m|emon @end|ng |rom gm@||@com
Wed Apr 3 00:36:08 CEST 2019


Hi Eric,
When I run your code (using the MASS library) I find that
rstudent(fit2) also returns NaN for observation 7. Perhaps the
problem is occurring there and not in the "influence" function.

Jim
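
For reference, here is a minimal sketch of the usual leave-one-out shortcut
for sigma, written out by hand. The function name loo_sigma_shortcut is
hypothetical. For an ordinary linear model the shortcut is exact and agrees
with influence()$sigma. My understanding, which is an assumption and has not
been checked against the R sources, is that for a GLM the same shortcut is
applied to the deviance residuals and the hat values of the final weighted
fit. In that case the quantity under the square root can go negative for a
high-leverage case, which would show up as NaN.

```r
# Leave-one-out sigma shortcut, written out by hand (hypothetical helper).
# e: residuals, h: hat values, p: number of estimated coefficients.
loo_sigma_shortcut <- function(e, h, p) {
  n <- length(e)
  # Residual sum of squares with case i's contribution removed.
  rss_minus_i <- sum(e^2) - e^2 / (1 - h)
  # A negative value has no real square root, so report NaN for that case.
  ifelse(rss_minus_i < 0, NaN, sqrt(pmax(rss_minus_i, 0) / (n - p - 1)))
}

# For an ordinary linear model the shortcut is exact.
fit_lm <- lm(dist ~ speed, data = cars)
shortcut <- loo_sigma_shortcut(residuals(fit_lm), hatvalues(fit_lm), p = 2)
all.equal(unname(shortcut), unname(influence(fit_lm)$sigma))

# Toy numbers (made up for illustration, not taken from the moon data):
# the third case combines high leverage with a large residual.
toy <- loo_sigma_shortcut(e = c(1, 1, 3), h = c(0.05, 0.05, 0.9), p = 1)
is.nan(toy)
```

If the assumption about GLMs holds, rstudent() and influence() would share
the same NaN, since the studentized residual is scaled by that sigma.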

On Wed, Apr 3, 2019 at 9:12 AM Eric Bridgeford <ericwb95 using gmail.com> wrote:
>
> I agree the influence documentation suggests NaNs may result. However,
> these values can be computed manually and are, indeed, finite (i.e., by
> fitting n models for n points, each with one point held out, to obtain n
> leave-one-out influence measures), so I don't see why the function SHOULD
> return NaN. Given that it does return NaN, I would suggest either:
>
> a) providing an alternative, possibly slower, method that returns the
>    correct results in the event that lm.influence does not return a good
>    approximation (i.e., a function argument such as type="approx" that
>    uses the approximation strategy currently employed, and an alternative
>    such as type="direct" that computes the values manually), or
>
> b) a heuristic that explains why NaNs might result from one's particular
>    inputs and what can be done to fix it (if the approximation strategy
>    is the source of the problem), or what it is about the data that
>    causes NaNs.
>
> Hence I was looking to start a discussion around the specific strategy
> employed to compute the elements.
>
> Below is the code:
> moon_data <- structure(list(
>   Name = structure(c(8L, 13L, 2L, 7L, 1L, 5L, 11L, 12L, 9L, 10L, 4L, 6L, 3L),
>     .Label = c("Ceres ", "Earth", "Eris ", "Haumea ", "Jupiter ",
>                "Makemake ", "Mars ", "Mercury ", "Neptune ", "Pluto ",
>                "Saturn ", "Uranus ", "Venus "), class = "factor"),
>   Distance = c(0.39, 0.72, 1, 1.52, 2.75, 5.2, 9.54, 19.22,
>                30.06, 39.5, 43.35, 45.8, 67.7),
>   Diameter = c(0.382, 0.949, 1, 0.532, 0.08, 11.209, 9.449, 4.007, 3.883,
>                0.18, 0.15, 0.12, 0.19),
>   Mass = c(0.06, 0.82, 1, 0.11, 2e-04, 317.8, 95.2, 14.6, 17.2, 0.0022,
>            7e-04, 7e-04, 0.0025),
>   Moons = c(0L, 0L, 1L, 2L, 0L, 64L, 62L, 27L, 13L, 4L, 2L, 0L, 1L),
>   Volume = c(0.0291869497930152, 0.447504348276571, 0.523598775598299,
>              0.0788376225681443, 0.000268082573106329, 737.393372232996,
>              441.729261571372, 33.6865588825666, 30.6549628355953,
>              0.00305362805928928, 0.00176714586764426, 0.00090477868423386,
>              0.00359136400182873)),
>   row.names = c(NA, -13L), class = "data.frame")
>
> library(MASS)  # provides glm.nb
> fit <- glm.nb(Moons ~ Volume, data = moon_data)
> rstudent(fit)
>
> fit2 <- update(fit, subset = Name != "Jupiter ")
> rstudent(fit2)
>
> influence(fit2)$sigma
>
> #        1        2        3        4        5        7
> # 1.077945 1.077813 1.165025 1.181685 1.077954      NaN
> #        8        9       10       11       12       13
> # 1.044454 1.152110 1.187586 1.181696 1.077954 1.165147
>
> Sincerely,
> Eric
>
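
The "direct" computation described in the quoted message could be sketched
roughly as follows. The helper loo_sigma_direct is hypothetical and not part
of R or MASS. It refits the model once per observation and takes sigma to be
the square root of the summed squared deviance residuals over the residual
degrees of freedom. That definition is an assumption about what sigma should
mean for a GLM. For an ordinary linear model it reproduces influence()$sigma.

```r
# Hypothetical helper: leave-one-out sigma by brute-force refitting.
# formula, data: model specification; fitter: a fitting function such as lm.
loo_sigma_direct <- function(formula, data, fitter = lm, ...) {
  vapply(seq_len(nrow(data)), function(i) {
    # Refit with observation i held out.
    fit_i <- fitter(formula, data = data[-i, , drop = FALSE], ...)
    # Assumed definition of sigma: root of residual deviance over residual df.
    sqrt(sum(residuals(fit_i, type = "deviance")^2) / df.residual(fit_i))
  }, numeric(1))
}

# Check on an ordinary linear model, where the result is known in closed form.
direct <- loo_sigma_direct(dist ~ speed, cars)
all.equal(direct, unname(influence(lm(dist ~ speed, data = cars))$sigma))

# Possible use with the model from this thread (not run here; assumes MASS is
# installed and moon_data is defined as in the quoted message):
# no_jupiter <- subset(moon_data, Name != "Jupiter ")
# loo_sigma_direct(Moons ~ Volume, no_jupiter, fitter = MASS::glm.nb)
```

This costs n refits instead of one, so it only makes sense as a check on
small data sets like this one.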


