[R-sig-Geo] Problem with moran.test function
Roger Bivand
Roger.Bivand at nhh.no
Thu Jun 17 08:50:28 CEST 2010
On Thu, 17 Jun 2010, Michael Haenlein wrote:
> Roger,
>
> thanks very much for your reply!
>
> I tried to update spdep and downloaded the new version. Installation worked
> fine and I'm now working with spdep, version 0.5-11, 2010-05-31.
>
> The problem is, however, that I now can no longer use the read.dat2listw
> function, which I use to obtain network information from an external file.
> When I run
> Network <- read.dat2listw("C:/Network.txt")
> I get the error message:
> Error in `[.data.frame`(sn, , 3) : undefined columns selected
>
> I therefore had to move back to spdep, version 0.4-56, 2009-12-14.
No, it is definitely better to find out what is wrong with Network.txt, as
the change made in April to sn2listw() - called by read.dat2listw() was to
trap defective input objects. Please look at traceback() after the error.
Do debug() on read.dat2listw, and summary() on wmat and sn. Are there
locale issues in reading the text file, perhaps (decimal symbol?)?
This would feed downstream into the obviously wrong lagged values seen
below. I'd be interested in access to the input file to strengthen
defences against unusual weights, or weights seen by the system as
unusual. I think that an errant final column is becoming a factor, then
converted to numeric (with large n and many unique weights, their integer
indices will become large). read.dat2listw() needs to check that there are
3 columns, and that the first two are integer, and the third is numeric, I
think. But we need to see why reading the file is failing.
Roger
>
> I also used moran.test under debug() as you suggested but if I understood
> the output correctly, the error message only comes up at the end.
> I have attached the full debug report below.
>
> lag.listw(Network,x) seems to work fine although the values are very, very
> small:
>
> y<-lag.listw(Network,x)
> summary(y)
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 1.497e-310 1.200e-300 1.698e-300 1.094e-226 2.414e-300 1.994e-223
>
> This is a bit surprising to me as the values of x are several orders of
> magnitude larger:
>
> summary(x)
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 1082 1543 2119 2250 2886 3965
>
> Thanks,
>
> Michael
>
>
>
>
>
>
>> debug(moran.test)
>> moran.test(x,Network)
> debugging in: moran.test(x, Network)
> debug: {
> alternative <- match.arg(alternative, c("greater", "less",
> "two.sided"))
> if (!inherits(listw, "listw"))
> stop(paste(deparse(substitute(listw)), "is not a listw object"))
> if (!is.numeric(x))
> stop(paste(deparse(substitute(x)), "is not a numeric vector"))
> if (is.null(spChk))
> spChk <- get.spChkOption()
> if (spChk && !chkIDs(x, listw))
> stop("Check of data and weights ID integrity failed")
> xname <- deparse(substitute(x))
> wname <- deparse(substitute(listw))
> NAOK <- deparse(substitute(na.action)) == "na.pass"
> x <- na.action(x)
> na.act <- attr(x, "na.action")
> if (!is.null(na.act)) {
> subset <- !(1:length(listw$neighbours) %in% na.act)
> listw <- subset(listw, subset, zero.policy = zero.policy)
> }
> n <- length(listw$neighbours)
> if (n != length(x))
> stop("objects of different length")
> wc <- spweights.constants(listw, zero.policy = zero.policy,
> adjust.n = adjust.n)
> S02 <- wc$S0 * wc$S0
> res <- moran(x, listw, wc$n, wc$S0, zero.policy = zero.policy,
> NAOK = NAOK)
> I <- res$I
> K <- res$K
> if (rank)
> K <- (3 * (3 * wc$n^2 - 7))/(5 * (wc$n^2 - 1))
> EI <- (-1)/wc$n1
> if (randomisation) {
> VI <- wc$n * (wc$S1 * (wc$nn - 3 * wc$n + 3) - wc$n *
> wc$S2 + 3 * S02)
> tmp <- K * (wc$S1 * (wc$nn - wc$n) - 2 * wc$n * wc$S2 +
> 6 * S02)
> VI <- (VI - tmp)/(wc$n1 * wc$n2 * wc$n3 * S02)
> VI <- VI - EI^2
> }
> else {
> VI <- (wc$nn * wc$S1 - wc$n * wc$S2 + 3 * S02)/(S02 *
> (wc$nn - 1))
> VI <- VI - EI^2
> }
> ZI <- (I - EI)/sqrt(VI)
> statistic <- ZI
> names(statistic) <- "Moran I statistic standard deviate"
> if (alternative == "two.sided")
> PrI <- 2 * pnorm(abs(ZI), lower.tail = FALSE)
> else if (alternative == "greater")
> PrI <- pnorm(ZI, lower.tail = FALSE)
> else PrI <- pnorm(ZI)
> if (!is.finite(PrI) || PrI < 0 || PrI > 1)
> warning("Out-of-range p-value: reconsider test arguments")
> vec <- c(I, EI, VI)
> names(vec) <- c("Moran I statistic", "Expectation", "Variance")
> method <- paste("Moran's I test under", ifelse(randomisation,
> "randomisation", "normality"))
> data.name <- paste(xname, ifelse(rank, "using rank correction",
> ""), "\nweights:", wname, ifelse(is.null(na.act), "",
> paste("\nomitted:", paste(na.act, collapse = ", "))),
> "\n")
> res <- list(statistic = statistic, p.value = PrI, estimate = vec,
> alternative = alternative, method = method, data.name = data.name)
> if (!is.null(na.act))
> attr(res, "na.action") <- na.act
> class(res) <- "htest"
> res
> }
> Browse[1]>
> debug: alternative <- match.arg(alternative, c("greater", "less",
> "two.sided"))
> Browse[1]>
> debug: if (!inherits(listw, "listw"))
> stop(paste(deparse(substitute(listw)), "is not a listw object"))
> Browse[1]>
> debug: if (!is.numeric(x)) stop(paste(deparse(substitute(x)), "is not a
> numeric vector"))
> Browse[1]>
> debug: if (is.null(spChk)) spChk <- get.spChkOption()
> Browse[1]>
> debug: if (spChk && !chkIDs(x, listw)) stop("Check of data and weights ID
> integrity failed")
> Browse[1]>
> debug: xname <- deparse(substitute(x))
> Browse[1]>
> debug: wname <- deparse(substitute(listw))
> Browse[1]>
> debug: NAOK <- deparse(substitute(na.action)) == "na.pass"
> Browse[1]>
> debug: x <- na.action(x)
> Browse[1]>
> debug: na.act <- attr(x, "na.action")
> Browse[1]>
> debug: if (!is.null(na.act)) {
> subset <- !(1:length(listw$neighbours) %in% na.act)
> listw <- subset(listw, subset, zero.policy = zero.policy)
> }
> Browse[1]>
> debug: n <- length(listw$neighbours)
> Browse[1]>
> debug: if (n != length(x)) stop("objects of different length")
> Browse[1]>
> debug: wc <- spweights.constants(listw, zero.policy = zero.policy, adjust.n
> = adjust.n)
> Browse[1]>
> debug: S02 <- wc$S0 * wc$S0
> Browse[1]>
> debug: res <- moran(x, listw, wc$n, wc$S0, zero.policy = zero.policy, NAOK =
> NAOK)
> Browse[1]>
> debug: I <- res$I
> Browse[1]>
> debug: K <- res$K
> Browse[1]>
> debug: if (rank) K <- (3 * (3 * wc$n^2 - 7))/(5 * (wc$n^2 - 1))
> Browse[1]>
> debug: EI <- (-1)/wc$n1
> Browse[1]>
> debug: if (randomisation) {
> VI <- wc$n * (wc$S1 * (wc$nn - 3 * wc$n + 3) - wc$n * wc$S2 +
> 3 * S02)
> tmp <- K * (wc$S1 * (wc$nn - wc$n) - 2 * wc$n * wc$S2 + 6 *
> S02)
> VI <- (VI - tmp)/(wc$n1 * wc$n2 * wc$n3 * S02)
> VI <- VI - EI^2
> } else {
> VI <- (wc$nn * wc$S1 - wc$n * wc$S2 + 3 * S02)/(S02 * (wc$nn -
> 1))
> VI <- VI - EI^2
> }
> Browse[1]>
> debug: VI <- wc$n * (wc$S1 * (wc$nn - 3 * wc$n + 3) - wc$n * wc$S2 + 3 *
> S02)
> Browse[1]>
> debug: tmp <- K * (wc$S1 * (wc$nn - wc$n) - 2 * wc$n * wc$S2 + 6 * S02)
> Browse[1]>
> debug: VI <- (VI - tmp)/(wc$n1 * wc$n2 * wc$n3 * S02)
> Browse[1]>
> debug: VI <- VI - EI^2
> Browse[1]>
> debug: ZI <- (I - EI)/sqrt(VI)
> Browse[1]>
> debug: statistic <- ZI
> Browse[1]>
> debug: names(statistic) <- "Moran I statistic standard deviate"
> Browse[1]>
> debug: if (alternative == "two.sided") PrI <- 2 * pnorm(abs(ZI), lower.tail
> = FALSE) else if (alternative ==
> "greater") PrI <- pnorm(ZI, lower.tail = FALSE) else PrI <- pnorm(ZI)
> Browse[1]>
> debug: if (!is.finite(PrI) || PrI < 0 || PrI > 1) warning("Out-of-range
> p-value: reconsider test arguments")
> Browse[1]>
> debug: vec <- c(I, EI, VI)
> Browse[1]>
> debug: names(vec) <- c("Moran I statistic", "Expectation", "Variance")
> Browse[1]>
> debug: method <- paste("Moran's I test under",
> ifelse(randomisation, "randomisation", "normality"))
> Browse[1]>
> debug: data.name <- paste(xname, ifelse(rank, "using rank correction",
> ""), "\nweights:", wname, ifelse(is.null(na.act), "",
> paste("\nomitted:",
> paste(na.act, collapse = ", "))), "\n")
> Browse[1]>
> debug: res <- list(statistic = statistic, p.value = PrI, estimate = vec,
> alternative = alternative, method = method, data.name = data.name)
> Browse[1]>
> debug: if (!is.null(na.act)) attr(res, "na.action") <- na.act
> Browse[1]>
> debug: class(res) <- "htest"
> Browse[1]>
> debug: res
> Browse[1]>
> exiting from: moran.test(x, Network)
>
> Moran's I test under randomisation
>
> data: x
> weights: Network
>
> Moran I statistic standard deviate = NA, p-value = NA
> alternative hypothesis: greater
> sample estimates:
> Moran I statistic Expectation Variance
> -Inf -0.0002926544 NA
>
> Warning message:
> In moran.test(x, Network) : Out-of-range p-value: reconsider test arguments
>>
>
>
>
>
> -----Original Message-----
> From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:
> r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of Roger Bivand
> Sent: Wednesday, June 16, 2010 15:08
> To: Michael Haenlein
> Cc: r-sig-geo at stat.math.ethz.ch
> Subject: Re: [R-sig-Geo] Problem with moran.test function
>
> Well, Moran's I is -Inf, and the analytical variance is NA, so something is
> not right. The problem could lie in x, Network, or the lag of x (when x and
> Network are OK but their combination is unhappy). Can you run moran.test()
> under debug() and check which values lead the value of I to go to -Inf? Is
> this what is going on:
>
> data(columbus)
> set.seed(1)
> x <- log(rpois(n=49, 2))
> x
> moran.test(x, nb2listw(col.gal.nb))
>
> where the current spdep release fails reporting:
>
> Error in lag.listw(listw, z, zero.policy = zero.policy, NAOK =
> NAOK): Variable contains non-finite values
>
> which was a fix introduced four weeks ago, changed to a test on |Inf| from a
> test on NA:
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/src/lagw.c?root=spdep&r1=244&r2=282
>
> If you update spdep, you'll pick up the improvement (made thanks to a bug
> report by Matias Mayor Fernandez), and if this is the case, the problem is
> in the x.
>
> Hope this helps,
>
> Roger
> --
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; fax +47 55 95 95 43
> e-mail: Roger.Bivand at nhh.no
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
More information about the R-sig-Geo
mailing list