[R-sig-ME] Another MCMCglmm question - fixing correlations

Sun Dec 5 17:16:07 CET 2010

Hi Jarrod,

Thanks very much for the suggestions, which did the trick. (And standardising the row weights to 1 seemed to prevent any trouble with any values of rho.)

MCMCglmm does appear to fit spatial simultaneous autoregressive lag models, and about as well as the specialised functions for such models in the "spdep" package. I've tried a few values of rho, and MCMCglmm recovers them well (the estimates from both it and "lagsarlm"  are slightly downward biased--see below for simulations code).

My real/ulterior interest here is actually estimating multilevel models where diffusion takes place across higher-level units, and some preliminary investigations suggest that MCMCglmm can fit such a model--unlike any other R function I've tried--but I'll spare the list the gory details of this work-in-progress, unless somebody wants to know.

Cheers,
Malcolm

library(MCMCglmm)
library(spdep)
N <- 50 # set sample size
rho <- 0.6 # set autocorrelation coefficient
sims <- 100
prior1 = list(R=list(V=1, nu=0.002))
MSE <- res <- matrix(NA, nrow=sims, ncol=6)
for (i in 1:sims) {
	W <- matrix(rbinom(N^2, 1, 0.25), N, N)*upper.tri(matrix(1, N, N))
	W <- W + t(W) # symmetrical connectivity matrix
	W <- W/apply(W, 1, sum) # standardise rows
	listw <- mat2listw(W) # convert contiguities to listw format
	L <- diag(N) - rho*W
	X1 <- runif(N, min=-5, max=5)
	Xbe <- X1+rnorm(N)
	y <- solve(L, Xbe) # generate lagged y
	dat <- data.frame(X1=X1, y=y, W=W)
	autoreg <- lagsarlm(y ~ X1, listw=listw, data=dat) # fit autoregressive model
	MC2 <- MCMCglmm(y ~ X1+sir(~W, ~units), data=dat, prior=prior1, verbose=F)
	res[i,] <- c(coefficients(autoreg), mean(MC2$Lambda), apply(MC2$Sol, 2, mean))
	MSE[i,] <- c((coefficients(autoreg)-c(rho, 0, 1))^2, (mean(MC2$Lambda)-rho)^2, (apply(MC2$Sol, 2, mean) - c(0, 1))^2)
	}

On 4 Dec 2010, at 21:11, Jarrod Hadfield wrote:

> Hi Malcolm,
> 
> I meant replace the sir function in the R directory and rebuild + reinstall MCMCglmm. However, I later realised that for your problem there is a simpler way:
> 
> N <- 50 # set sample size
> W <- matrix(rbinom(N^2, 1, 0.25), N, N)*upper.tri(matrix(1, N, N))
> W <- W + t(W) # symmetrical connectivity matrix
> rho <- 0.2 # set autocorrelation coefficient
> L <- diag(N) - rho*W  # my L is your M^{-1}
> X1 <- runif(N, min=-5, max=5)
> Xbe <- X1+rnorm(N)
> y <- solve(L, Xbe) # generate lagged y
> dat <- data.frame(X1=X1, y=y, W=W)
> 
> prior1 = list(R=list(V=1, nu=0.002))
> MC2 <- MCMCglmm(y ~ X1+sir(~W, ~units), data=dat, prior=prior1, verbose=F)
> 
> This works because ~units sets up an identity matrix and W%*%t(I) = W.
> 
> However, I could not get sensible results from MCMCglmm with rho=0.3 and wasn't sure why (rho=0.03 for example gives sense). The problem is associated with a change in the sign of the determinant of L (or M) (i.e the Jacobian), which is perhaps not that surprising since if W=I then rho=0.3 and rho = 1.7 should be indiscriminable even with a fixed residual variance:
> 
> y = Xbe/(1-0.3) = -Xbe/(1-1.7)
> 
> I would have to think about this harder than I have to offer a solution (this is why the sir models are undocumented!), but perhaps you have some insight?
> 
> Also, I don't think MC3 results are meaningless, the estimate for the sir parameter overlaps zero as expected.
> 
> Cheers,
> 
> Jarrod
> 
> 
> 
> Quoting Malcolm Fairbrother <m.fairbrother at bristol.ac.uk>:
> 
>> Hi Jarrod,
>> 
>> Thanks very much for the suggestion. However, in trying what you said, I'm not getting very far (see code below). I redefine the function "sir", and then re-install the MCMCglmm package. But the MCMCglmm function still seems to use the old "sir" function, not the new one, with the result that when I try calling sir in the middle of a call to MCMCglmm, I get an error message telling me that my call to sir is problematic ("Error in sir(W) : formula not passed to formula1 in sir"). Can you please clarify what I'm doing wrong?
>> 
>> Thanks again for any assistance.
>> - Malcolm
>> 
>> 
>> sir <- function(W) {W}
>> install.packages("MCMCglmm")
>> library(MCMCglmm)
>> 
>> N <- 50 # set sample size
>> W <- matrix(rbinom(N^2, 1, 0.25), N, N)*upper.tri(matrix(1, N, N))
>> W <- W + t(W) # symmetrical connectivity matrix
>> rho <- 0.2 # set autocorrelation coefficient
>> M <- solve(diag(N) - rho*W)
>> X1 <- runif(N, min=-5, max=5)
>> Xbe <- X1+rnorm(N)
>> y <- M %*% Xbe # generate lagged y
>> dat <- data.frame(X1=X1, y=y)
>> 
>> prior1 = list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002)))
>> MC1 <- MCMCglmm(y ~ X1, data=dat, prior=prior1, verbose=F)
>> # works OK, compare with summary(lm(dat$y~dat$X1))
>> 
>> identical(sir(W), W) # TRUE
>> MC2 <- MCMCglmm(y ~ X1 + sir(W), data=dat, prior=prior1, verbose=F)
>> # doesn't work
>> 
>> fac2 <- fac1 < -factor(sample(letters, N, TRUE), levels=letters)
>> MC3 <- MCMCglmm(y ~ X1 + sir(~fac1, ~fac2), data=dat, prior=prior1, verbose=F)
>> # works--results are meaningless, but doesn't return any error
>> 
>> 
>> 
>> 
>> 
>> On 2 Dec 2010, at 11:20, Jarrod Hadfield wrote:
>> 
>>> Hi Malcolm,
>>> 
>>> The suggestion of using SIR models for time series was a throw away remark, but I can't see anything technically incorrect with it. For example the model y[t] = \lambda*y[t-1]+..... can be treated as a SIR model, although there are certainly better ways of fitting it. I'm not familiar with spatial models and connectivity matrices but if W[i,j] is a way of specifying y[i] = \lambda*y[j] + ... then it is possible to fit it as a SIR model. You could modify the sir code to do what you would like (and then reinstall MCMCglmm) , for example:
>>> 
>>> sir<-function(W){W}
>>> 
>>> and then place W in the data.frame.
>>> 
>>> MCMCglmm will associate design matrices with recursive/simultaneous structures if the design matrix is returned from a function called "sir".
>>> 
>>> Cheers,
>>> 
>>> Jarrod
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 1 Dec 2010, at 20:10, Malcolm Fairbrother wrote:
>>> 
>>>> Dear Jarrod,
>>>> 
>>>> I'm very interested in this SIR feature of MCMCglmm, which I hadn't been aware of until your e-mail earlier today. Do I understand correctly that this should allow for feedback effects between units, reflecting the fact that y(i) and y(not-i) affect each other? If so, this could be a useful way of fitting, for example, spatial autoregressive models, where some units may affect each other by virtue of being neighbours, and we would like to know the magnitude of these effects. (You mentioned time series, but some web searches haven't turned up much on MCMCglmm and time series. And time only flows in one direction, whereas units in space can have two-way effects.)
>>>> 
>>>> However, in looking at the documentation and code, I've been struggling a bit with the "sir(~XX, ~XX)" part. Is there a way to specify a "connectivity" matrix directly, rather than using the "sir" function? Looking at the MCMCglmm code, as I understand it, "sir" is both a way of generating a connectivity matrix AND a way of telling MCMCglmm to treat the sir(~XX, ~XX) differently than other covariates.
>>>> 
>>>> In the case of spatial autoregressive models, each element on the main diagonal of the connectivity matrix is 0, and rows are often standardised to sum to 1. Such a connectivity matrix might, for example, look like:
>>>> 
>>>> N <- 100
>>>> W <- matrix(rbinom(N^2, 1, 0.1), N, N)*upper.tri(matrix(1, N, N)) # chance of being neighbours is 0.1
>>>> W <- W + t(W) # matrix is symmetrical, with zeros on the main diagonal
>>>> 
>>>> But I'm having a hard time figuring out how to use "sir" to specify such a matrix in the middle of a call to MCMCglmm.
>>>> 
>>>> If you see what I mean, can you offer any suggestions? I might take a stab at modifying MCMCglmm to make this possible, but before I do that (and I'm not certain I'll have the know-how) I wanted to check whether you had any ideas.
>>>> 
>>>> Many thanks for any assistance or clarification you can provide.
>>>> - Malcolm
>>>> 
>>>> 
>>>> 
>>>>> Message: 5
>>>>> Date: Wed, 1 Dec 2010 10:28:43 +0000
>>>>> From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
>>>>> To: Szymek Drobniak <geralttee at gmail.com>
>>>>> Cc: r-sig-mixed-models at r-project.org
>>>>> Subject: Re: [R-sig-ME] Another MCMCglmm question - fixing
>>>>> 	correlations
>>>>> Message-ID: <0AF131D6-557C-4324-A65A-A175E41DBA8F at ed.ac.uk>
>>>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> There is a way, but its quite involved and you may find it easier to
>>>>> fit this type of model in something like ASReml if you can get a
>>>>> license. If not ......
>>>>> 
>>>>> I have implemented simultaneous-recursive (SIR) mixed models  in
>>>>> MCMCglmm, although currently they are not well tested or documented
>>>>> (See Chapter 9 of the CourseNotes) and cannot be used with non-
>>>>> Gaussian data or certain patterns of missing data. Nevertheless, in
>>>>> this example the existing code should work fine.  A simple SIR model
>>>>> has the form:
>>>>> 
>>>>> y_{i} = mu + \lambda*y_{j} + u_{i} + e_{i}
>>>>> 
>>>>> where mu is the intercept, u is a random effect (e.g. breeding value
>>>>> in your case) and e is a residual. The new part of the model is the
>>>>> structural parameter \lambda multiplied by y_{j}, the jth element of
>>>>> the response vector, which can be useful for modelling the effects of
>>>>> behavioural interactions or time series etc.  However, placing y's on
>>>>> the LHS and RHS complicates the likelihood, but MCMCglmm deals with
>>>>> this.
>>>>> 
>>>>> If we set i=j then the above equation can be rearranged:
>>>>> 
>>>>> y_{i}  - \lambda*y_{i} = mu + u_{i} + e_{i}
>>>>> y_{i}(1  - \lambda) = mu + u_{i} + e_{i}
>>>>> y_{i} = (mu + u_{i} + e_{i})/(1  - \lambda)
>>>>> 
>>>>> from this it can be seen that mu, u, e, and \lambda cannot be uniquely
>>>>> estimated without restrictions.  To illustrate we'll simulate some data:
>>>>> 
>>>>> fac<-gl(25,4)
>>>>> y<-rnorm(25)[fac]+rnorm(100, -1, sqrt(2))
>>>>> 
>>>>> # mu = -1, VAR(u)=1, VAR(e) = 2, \lambda = 0
>>>>> 
>>>>> dat<-data.frame(y=y, fac=fac)
>>>>> 
>>>>> prior1 = list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002)))
>>>>> 
>>>>> m1<-MCMCglmm(y~1, random=~fac,  data=dat, prior=prior1)
>>>>> 
>>>>> # fitting a non-SIR model gives estimates consistent with the true
>>>>> values, which you can verify through summary(m1). Now for the SIR
>>>>> model. Because we know all parameters cannot be uniquely estimated
>>>>> I've set the residual variance to one arbitrarily.
>>>>> 
>>>>> prior2 = list(R=list(V=1, fix=1), G=list(G1=list(V=1, nu=0.002)))
>>>>> 
>>>>> m2<-MCMCglmm(y~1+sir(~units, ~units), random=~fac,  data=dat,
>>>>> prior=prior2)   # sir(~units, ~units) sets i=j.
>>>>> 
>>>>> # The estimates do not look consistent with the real values.  However,
>>>>> we can rescale them by 1-\lambda and the estimates  are close to being
>>>>> identical up to Monte Carlo error (the prior will have slightly
>>>>> different effects)
>>>>> 
>>>>> HPDinterval(m1$VCV[,1])
>>>>> HPDinterval(m2$VCV[,1]/(1-m2$Lambda)^2)
>>>>> 
>>>>> HPDinterval(m1$VCV[,2])
>>>>> HPDinterval(m2$VCV[,2]/(1-m2$Lambda)^2)
>>>>> 
>>>>> HPDinterval(m1$Sol)
>>>>> HPDinterval(m2$Sol/(1-m2$Lambda))
>>>> 
>>>> 
>>>> 	[[alternative HTML version deleted]]
>>>> 
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>> 
>>> 
>>> 
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
>