Hi Eric,

When I run your code (using the MASS library) I find that rstudent(fit2) also returns NaN in the seventh position. Perhaps the problem is occurring there and not in the "influence" function.
Jim

On Wed, Apr 3, 2019 at 9:12 AM Eric Bridgeford <ericw...@gmail.com> wrote:
>
> I agree that the influence documentation suggests NaNs may result. However,
> these values can be computed manually and are, in fact, finite (i.e., by
> training n models for n points, each with one point held out, to obtain n
> leave-one-out influence measures), so I don't see why the function SHOULD
> return NaN. Given that it is returning NaN, I think there should be either:
>
> a) an alternative method that may be slower but returns the correct results
>    in the event that lm.influence does not return a good approximation
>    (e.g., a function argument type="approx" that uses the approximation
>    strategy employed currently, and an alternative type="direct" or
>    something like that which computes the values manually), or
>
> b) a heuristic to suggest why NaNs might result from one's particular
>    inputs and what can be done to fix it (if the approximation strategy is
>    the source of the problem), or what the issue is with the data that
>    will cause NaNs.
>
> Hence I was looking to start a discussion around the specific strategy
> employed to compute the elements.
>
> Below is the code:
>
> library(MASS)  # glm.nb() is provided by MASS
>
> moon_data <- structure(list(
>   Name = structure(c(8L, 13L, 2L, 7L, 1L, 5L, 11L, 12L, 9L, 10L, 4L, 6L, 3L),
>     .Label = c("Ceres ", "Earth", "Eris ", "Haumea ", "Jupiter ",
>                "Makemake ", "Mars ", "Mercury ", "Neptune ", "Pluto ",
>                "Saturn ", "Uranus ", "Venus "),
>     class = "factor"),
>   Distance = c(0.39, 0.72, 1, 1.52, 2.75, 5.2, 9.54, 19.22, 30.06, 39.5,
>                43.35, 45.8, 67.7),
>   Diameter = c(0.382, 0.949, 1, 0.532, 0.08, 11.209, 9.449, 4.007, 3.883,
>                0.18, 0.15, 0.12, 0.19),
>   Mass = c(0.06, 0.82, 1, 0.11, 2e-04, 317.8, 95.2, 14.6, 17.2, 0.0022,
>            7e-04, 7e-04, 0.0025),
>   Moons = c(0L, 0L, 1L, 2L, 0L, 64L, 62L, 27L, 13L, 4L, 2L, 0L, 1L),
>   Volume = c(0.0291869497930152, 0.447504348276571, 0.523598775598299,
>              0.0788376225681443, 0.000268082573106329, 737.393372232996,
>              441.729261571372, 33.6865588825666, 30.6549628355953,
>              0.00305362805928928, 0.00176714586764426,
>              0.00090477868423386, 0.00359136400182873)),
>   row.names = c(NA, -13L), class = "data.frame")
>
> # Fit on all 13 bodies
> fit <- glm.nb(Moons ~ Volume, data = moon_data)
> rstudent(fit)
>
> # Refit without Jupiter (note the trailing space in the factor label)
> fit2 <- update(fit, subset = Name != "Jupiter ")
> rstudent(fit2)
>
> influence(fit2)$sigma
>
> #        1        2        3        4        5        7        8        9
> # 1.077945 1.077813 1.165025 1.181685 1.077954      NaN 1.044454 1.152110
> #       10       11       12       13
> # 1.187586 1.181696 1.077954 1.165147
>
> Sincerely,
> Eric
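
For anyone following the thread, the "direct" leave-one-out computation Eric describes could be sketched roughly as below. This is only an illustration under assumptions: loo_sigma_one is a hypothetical helper (not part of MASS or stats), and sqrt(deviance / residual df) of each refit is an assumed stand-in for "sigma", which is not necessarily the quantity that lm.influence approximates. The data are the Moons and Volume columns from the message above with the Jupiter row dropped, as in fit2.

```r
library(MASS)  # provides glm.nb()

# Moons and Volume from moon_data in the quoted message, with the Jupiter
# row (row 6) already removed so the rows match those used by fit2.
d <- data.frame(
  Moons  = c(0, 0, 1, 2, 0, 62, 27, 13, 4, 2, 0, 1),
  Volume = c(0.0291869497930152, 0.447504348276571, 0.523598775598299,
             0.0788376225681443, 0.000268082573106329, 441.729261571372,
             33.6865588825666, 30.6549628355953, 0.00305362805928928,
             0.00176714586764426, 0.00090477868423386, 0.00359136400182873),
  row.names = c(1:5, 7:13)
)

# Hypothetical helper: refit the model without row i and return
# sqrt(deviance / residual df) for that refit.
# ASSUMPTION: this is one plausible "direct" analogue of sigma, chosen for
# illustration; it is not claimed to be what lm.influence computes.
loo_sigma_one <- function(i, data) {
  tryCatch({
    # Refits on 11 points may emit convergence warnings; silence them here.
    fit_i <- suppressWarnings(glm.nb(Moons ~ Volume, data = data[-i, ]))
    sqrt(deviance(fit_i) / df.residual(fit_i))
  }, error = function(e) NA_real_)  # NA if a particular refit fails outright
}

# One full refit per held-out row: n model fits for n points.
loo_sigma <- vapply(seq_len(nrow(d)), loo_sigma_one, numeric(1), data = d)
names(loo_sigma) <- rownames(d)
loo_sigma
```

Each element costs a full glm.nb fit, so this is much slower than a one-step approximation. Comparing the entry named "7" with the NaN reported by influence(fit2)$sigma would show whether the direct value is finite for that point.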