[R-sig-ME] Another MCMCglmm question - fixing correlations

Sat Aug 25 17:55:27 CEST 2012

Hi Malcolm,

I'm aware of the problem - I moved to a sparseMatrix implementation  
which seems to be incompatible with with  
model.matrix/sparse.model.matrix. I will try and sort it out as soon  
as I have time.

Cheers,

Jarrod

Quoting Malcolm Fairbrother <m.fairbrother at bristol.ac.uk> on Tue, 21  
Aug 2012 09:05:01 +0100:

> Dear Jarrod,
>
> A while back we had a discussion on the list about spatial  
> simultaneous autoregressive lag models, using the "sir" function in  
> MCMCglmm. I just tried re-running the code in my message of 5 Dec  
> 2010, which was working at the time, and it now gets hung up on the  
> call to "sir", returning the error:
>
> Error in model.frame.default(object, data, xlev = xlev) :
>   invalid type (S4) for variable 'sir(~W, ~units)'
>
> I thought "sir(~W, ~diag(nrow(W)))" might help, but it doesn't.
>
> And I get the same error if I try the code from pp. 132-3 in the  
> MCMCglmm CourseNotes.
>
> Did you tweak something in MCMCglmm (I'm now running version 2.16)  
> which might have caused this problem? Any assistance appreciated.
>
> Cheers,
> Malcolm
>
>
>
>
> On 5 Dec 2010, at 16:16, Malcolm Fairbrother wrote:
>
>> Hi Jarrod,
>>
>> Thanks very much for the suggestions, which did the trick. (And  
>> standardising the row weights to 1 seemed to prevent any trouble  
>> with any values of rho.)
>>
>> MCMCglmm does appear to fit spatial simultaneous autoregressive lag  
>> models, and about as well as the specialised functions for such  
>> models in the "spdep" package. I've tried a few values of rho, and  
>> MCMCglmm recovers them well (the estimates from both it and  
>> "lagsarlm"  are slightly downward biased--see below for simulations  
>> code).
>>
>> My real/ulterior interest here is actually estimating multilevel  
>> models where diffusion takes place across higher-level units, and  
>> some preliminary investigations suggest that MCMCglmm can fit such  
>> a model--unlike any other R function I've tried--but I'll spare the  
>> list the gory details of this work-in-progress, unless somebody  
>> wants to know.
>>
>> Cheers,
>> Malcolm
>>
>>
>>
>> library(MCMCglmm)
>> library(spdep)
>> N <- 50 # set sample size
>> rho <- 0.6 # set autocorrelation coefficient
>> sims <- 100
>> prior1 = list(R=list(V=1, nu=0.002))
>> MSE <- res <- matrix(NA, nrow=sims, ncol=6)
>> for (i in 1:sims) {
>> 	W <- matrix(rbinom(N^2, 1, 0.25), N, N)*upper.tri(matrix(1, N, N))
>> 	W <- W + t(W) # symmetrical connectivity matrix
>> 	W <- W/apply(W, 1, sum) # standardise rows
>> 	listw <- mat2listw(W) # convert contiguities to listw format
>> 	L <- diag(N) - rho*W
>> 	X1 <- runif(N, min=-5, max=5)
>> 	Xbe <- X1+rnorm(N)
>> 	y <- solve(L, Xbe) # generate lagged y
>> 	dat <- data.frame(X1=X1, y=y, W=W)
>> 	autoreg <- lagsarlm(y ~ X1, listw=listw, data=dat) # fit  
>> autoregressive model
>> 	MC2 <- MCMCglmm(y ~ X1+sir(~W, ~units), data=dat, prior=prior1, verbose=F)
>> 	res[i,] <- c(coefficients(autoreg), mean(MC2$Lambda),  
>> apply(MC2$Sol, 2, mean))
>> 	MSE[i,] <- c((coefficients(autoreg)-c(rho, 0, 1))^2,  
>> (mean(MC2$Lambda)-rho)^2, (apply(MC2$Sol, 2, mean) - c(0, 1))^2)
>> 	}
>>
>>
>>
>> On 4 Dec 2010, at 21:11, Jarrod Hadfield wrote:
>>
>>> Hi Malcolm,
>>>
>>> I meant replace the sir function in the R directory and rebuild +  
>>> reinstall MCMCglmm. However, I later realised that for your  
>>> problem there is a simpler way:
>>>
>>> N <- 50 # set sample size
>>> W <- matrix(rbinom(N^2, 1, 0.25), N, N)*upper.tri(matrix(1, N, N))
>>> W <- W + t(W) # symmetrical connectivity matrix
>>> rho <- 0.2 # set autocorrelation coefficient
>>> L <- diag(N) - rho*W  # my L is your M^{-1}
>>> X1 <- runif(N, min=-5, max=5)
>>> Xbe <- X1+rnorm(N)
>>> y <- solve(L, Xbe) # generate lagged y
>>> dat <- data.frame(X1=X1, y=y, W=W)
>>>
>>> prior1 = list(R=list(V=1, nu=0.002))
>>> MC2 <- MCMCglmm(y ~ X1+sir(~W, ~units), data=dat, prior=prior1, verbose=F)
>>>
>>> This works because ~units sets up an identity matrix and W%*%t(I) = W.
>>>
>>> However, I could not get sensible results from MCMCglmm with  
>>> rho=0.3 and wasn't sure why (rho=0.03 for example gives sense).  
>>> The problem is associated with a change in the sign of the  
>>> determinant of L (or M) (i.e the Jacobian), which is perhaps not  
>>> that surprising since if W=I then rho=0.3 and rho = 1.7 should be  
>>> indiscriminable even with a fixed residual variance:
>>>
>>> y = Xbe/(1-0.3) = -Xbe/(1-1.7)
>>>
>>> I would have to think about this harder than I have to offer a  
>>> solution (this is why the sir models are undocumented!), but  
>>> perhaps you have some insight?
>>>
>>> Also, I don't think MC3 results are meaningless, the estimate for  
>>> the sir parameter overlaps zero as expected.
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>>
>>>
>>> Quoting Malcolm Fairbrother <m.fairbrother at bristol.ac.uk>:
>>>
>>>> Hi Jarrod,
>>>>
>>>> Thanks very much for the suggestion. However, in trying what you  
>>>> said, I'm not getting very far (see code below). I redefine the  
>>>> function "sir", and then re-install the MCMCglmm package. But the  
>>>> MCMCglmm function still seems to use the old "sir" function, not  
>>>> the new one, with the result that when I try calling sir in the  
>>>> middle of a call to MCMCglmm, I get an error message telling me  
>>>> that my call to sir is problematic ("Error in sir(W) : formula  
>>>> not passed to formula1 in sir"). Can you please clarify what I'm  
>>>> doing wrong?
>>>>
>>>> Thanks again for any assistance.
>>>> - Malcolm
>>>>
>>>>
>>>> sir <- function(W) {W}
>>>> install.packages("MCMCglmm")
>>>> library(MCMCglmm)
>>>>
>>>> N <- 50 # set sample size
>>>> W <- matrix(rbinom(N^2, 1, 0.25), N, N)*upper.tri(matrix(1, N, N))
>>>> W <- W + t(W) # symmetrical connectivity matrix
>>>> rho <- 0.2 # set autocorrelation coefficient
>>>> M <- solve(diag(N) - rho*W)
>>>> X1 <- runif(N, min=-5, max=5)
>>>> Xbe <- X1+rnorm(N)
>>>> y <- M %*% Xbe # generate lagged y
>>>> dat <- data.frame(X1=X1, y=y)
>>>>
>>>> prior1 = list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002)))
>>>> MC1 <- MCMCglmm(y ~ X1, data=dat, prior=prior1, verbose=F)
>>>> # works OK, compare with summary(lm(dat$y~dat$X1))
>>>>
>>>> identical(sir(W), W) # TRUE
>>>> MC2 <- MCMCglmm(y ~ X1 + sir(W), data=dat, prior=prior1, verbose=F)
>>>> # doesn't work
>>>>
>>>> fac2 <- fac1 < -factor(sample(letters, N, TRUE), levels=letters)
>>>> MC3 <- MCMCglmm(y ~ X1 + sir(~fac1, ~fac2), data=dat,  
>>>> prior=prior1, verbose=F)
>>>> # works--results are meaningless, but doesn't return any error
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 2 Dec 2010, at 11:20, Jarrod Hadfield wrote:
>>>>
>>>>> Hi Malcolm,
>>>>>
>>>>> The suggestion of using SIR models for time series was a throw  
>>>>> away remark, but I can't see anything technically incorrect with  
>>>>> it. For example the model y[t] = \lambda*y[t-1]+..... can be  
>>>>> treated as a SIR model, although there are certainly better ways  
>>>>> of fitting it. I'm not familiar with spatial models and  
>>>>> connectivity matrices but if W[i,j] is a way of specifying y[i]  
>>>>> = \lambda*y[j] + ... then it is possible to fit it as a SIR  
>>>>> model. You could modify the sir code to do what you would like  
>>>>> (and then reinstall MCMCglmm) , for example:
>>>>>
>>>>> sir<-function(W){W}
>>>>>
>>>>> and then place W in the data.frame.
>>>>>
>>>>> MCMCglmm will associate design matrices with  
>>>>> recursive/simultaneous structures if the design matrix is  
>>>>> returned from a function called "sir".
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Jarrod
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 1 Dec 2010, at 20:10, Malcolm Fairbrother wrote:
>>>>>
>>>>>> Dear Jarrod,
>>>>>>
>>>>>> I'm very interested in this SIR feature of MCMCglmm, which I  
>>>>>> hadn't been aware of until your e-mail earlier today. Do I  
>>>>>> understand correctly that this should allow for feedback  
>>>>>> effects between units, reflecting the fact that y(i) and  
>>>>>> y(not-i) affect each other? If so, this could be a useful way  
>>>>>> of fitting, for example, spatial autoregressive models, where  
>>>>>> some units may affect each other by virtue of being neighbours,  
>>>>>> and we would like to know the magnitude of these effects. (You  
>>>>>> mentioned time series, but some web searches haven't turned up  
>>>>>> much on MCMCglmm and time series. And time only flows in one  
>>>>>> direction, whereas units in space can have two-way effects.)
>>>>>>
>>>>>> However, in looking at the documentation and code, I've been  
>>>>>> struggling a bit with the "sir(~XX, ~XX)" part. Is there a way  
>>>>>> to specify a "connectivity" matrix directly, rather than using  
>>>>>> the "sir" function? Looking at the MCMCglmm code, as I  
>>>>>> understand it, "sir" is both a way of generating a connectivity  
>>>>>> matrix AND a way of telling MCMCglmm to treat the sir(~XX, ~XX)  
>>>>>> differently than other covariates.
>>>>>>
>>>>>> In the case of spatial autoregressive models, each element on  
>>>>>> the main diagonal of the connectivity matrix is 0, and rows are  
>>>>>> often standardised to sum to 1. Such a connectivity matrix  
>>>>>> might, for example, look like:
>>>>>>
>>>>>> N <- 100
>>>>>> W <- matrix(rbinom(N^2, 1, 0.1), N, N)*upper.tri(matrix(1, N,  
>>>>>> N)) # chance of being neighbours is 0.1
>>>>>> W <- W + t(W) # matrix is symmetrical, with zeros on the main diagonal
>>>>>>
>>>>>> But I'm having a hard time figuring out how to use "sir" to  
>>>>>> specify such a matrix in the middle of a call to MCMCglmm.
>>>>>>
>>>>>> If you see what I mean, can you offer any suggestions? I might  
>>>>>> take a stab at modifying MCMCglmm to make this possible, but  
>>>>>> before I do that (and I'm not certain I'll have the know-how) I  
>>>>>> wanted to check whether you had any ideas.
>>>>>>
>>>>>> Many thanks for any assistance or clarification you can provide.
>>>>>> - Malcolm
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Message: 5
>>>>>>> Date: Wed, 1 Dec 2010 10:28:43 +0000
>>>>>>> From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
>>>>>>> To: Szymek Drobniak <geralttee at gmail.com>
>>>>>>> Cc: r-sig-mixed-models at r-project.org
>>>>>>> Subject: Re: [R-sig-ME] Another MCMCglmm question - fixing
>>>>>>> 	correlations
>>>>>>> Message-ID: <0AF131D6-557C-4324-A65A-A175E41DBA8F at ed.ac.uk>
>>>>>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> There is a way, but its quite involved and you may find it easier to
>>>>>>> fit this type of model in something like ASReml if you can get a
>>>>>>> license. If not ......
>>>>>>>
>>>>>>> I have implemented simultaneous-recursive (SIR) mixed models  in
>>>>>>> MCMCglmm, although currently they are not well tested or documented
>>>>>>> (See Chapter 9 of the CourseNotes) and cannot be used with non-
>>>>>>> Gaussian data or certain patterns of missing data. Nevertheless, in
>>>>>>> this example the existing code should work fine.  A simple SIR model
>>>>>>> has the form:
>>>>>>>
>>>>>>> y_{i} = mu + \lambda*y_{j} + u_{i} + e_{i}
>>>>>>>
>>>>>>> where mu is the intercept, u is a random effect (e.g. breeding value
>>>>>>> in your case) and e is a residual. The new part of the model is the
>>>>>>> structural parameter \lambda multiplied by y_{j}, the jth element of
>>>>>>> the response vector, which can be useful for modelling the effects of
>>>>>>> behavioural interactions or time series etc.  However, placing y's on
>>>>>>> the LHS and RHS complicates the likelihood, but MCMCglmm deals with
>>>>>>> this.
>>>>>>>
>>>>>>> If we set i=j then the above equation can be rearranged:
>>>>>>>
>>>>>>> y_{i}  - \lambda*y_{i} = mu + u_{i} + e_{i}
>>>>>>> y_{i}(1  - \lambda) = mu + u_{i} + e_{i}
>>>>>>> y_{i} = (mu + u_{i} + e_{i})/(1  - \lambda)
>>>>>>>
>>>>>>> from this it can be seen that mu, u, e, and \lambda cannot be uniquely
>>>>>>> estimated without restrictions.  To illustrate we'll simulate  
>>>>>>> some data:
>>>>>>>
>>>>>>> fac<-gl(25,4)
>>>>>>> y<-rnorm(25)[fac]+rnorm(100, -1, sqrt(2))
>>>>>>>
>>>>>>> # mu = -1, VAR(u)=1, VAR(e) = 2, \lambda = 0
>>>>>>>
>>>>>>> dat<-data.frame(y=y, fac=fac)
>>>>>>>
>>>>>>> prior1 = list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002)))
>>>>>>>
>>>>>>> m1<-MCMCglmm(y~1, random=~fac,  data=dat, prior=prior1)
>>>>>>>
>>>>>>> # fitting a non-SIR model gives estimates consistent with the true
>>>>>>> values, which you can verify through summary(m1). Now for the SIR
>>>>>>> model. Because we know all parameters cannot be uniquely estimated
>>>>>>> I've set the residual variance to one arbitrarily.
>>>>>>>
>>>>>>> prior2 = list(R=list(V=1, fix=1), G=list(G1=list(V=1, nu=0.002)))
>>>>>>>
>>>>>>> m2<-MCMCglmm(y~1+sir(~units, ~units), random=~fac,  data=dat,
>>>>>>> prior=prior2)   # sir(~units, ~units) sets i=j.
>>>>>>>
>>>>>>> # The estimates do not look consistent with the real values.  However,
>>>>>>> we can rescale them by 1-\lambda and the estimates  are close to being
>>>>>>> identical up to Monte Carlo error (the prior will have slightly
>>>>>>> different effects)
>>>>>>>
>>>>>>> HPDinterval(m1$VCV[,1])
>>>>>>> HPDinterval(m2$VCV[,1]/(1-m2$Lambda)^2)
>>>>>>>
>>>>>>> HPDinterval(m1$VCV[,2])
>>>>>>> HPDinterval(m2$VCV[,2]/(1-m2$Lambda)^2)
>>>>>>>
>>>>>>> HPDinterval(m1$Sol)
>>>>>>> HPDinterval(m2$Sol/(1-m2$Lambda))
>>>>>>
>>>>>>
>>>>>> 	[[alternative HTML version deleted]]
>>>>>>
>>>>>> _______________________________________________
>>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> The University of Edinburgh is a charitable body, registered in
>>>>> Scotland, with registration number SC005336.
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>
>
>
>

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.