[R-sig-Geo] weighted spatial autoregression

Thu Aug 30 08:28:59 CEST 2007

On Wed, 29 Aug 2007, Sam Field wrote:

> Roger,
>
>> The reason seems
>> to be that W+G did the same as spautolm() (in SAS?) - find the spatial
>> autoregressive coefficient first (optimise in one dimension), then use GLS
>> to find the regression coefficients.
>
> Don't the weights play a role in the optimization to find lambda? 
> Certainly the location of lambda is influenced by the weights employed 
> by W+G. Wouldn't they also influence the location of rho in the spatial 
> lag model?

Yes, both in the sum of squared errors, and in the spautolm() 
implementation as a separate term in the log likelihood, rather than in a 
combined Jacobian. In the spatial lag case, the auxiliary regressions 
would be weighted, so the sum of squared errors term would be affected, 
but I'm unsure about the extra term in the log likelihood.
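
A minimal base-R sketch of the two-step idea for the weighted error (SAR) case follows. Everything in it is an assumption made for illustration, not code from spautolm(): the ring-shaped neighbour matrix W, the simulated data, the names conc_ll, wts, lambda_hat and beta_hat, and the assumed error variance sigma^2 / wts. It optimises the autoregressive coefficient in one dimension and then recovers the regression coefficients by weighted least squares on the spatially filtered data. In this parametrisation the weights enter through the weighted sum of squared errors and through a separate 0.5 * sum(log(wts)) term; the latter does not depend on lambda, so it shifts the log likelihood value but not the location of the optimum.

```r
# Sketch only: weighted SAR error model fitted in two steps, base R.
set.seed(1)
n <- 100

# Hypothetical neighbours: units on a ring, each linked to its two
# adjacent units, row-standardised (each row sums to 1).
W <- matrix(0, n, n)
for (i in 1:n) {
  W[i, ifelse(i == 1, n, i - 1)] <- 0.5
  W[i, ifelse(i == n, 1, i + 1)] <- 0.5
}
I_n <- diag(n)

# Simulated data; wts plays the role of within-unit sample sizes.
X <- cbind(1, rnorm(n))
wts <- sample(5:50, n, replace = TRUE)
u <- rnorm(n, sd = 1 / sqrt(wts))      # Var(u_i) = sigma^2 / wts_i, sigma = 1
e <- solve(I_n - 0.6 * W, u)           # (I - lambda W) e = u, lambda = 0.6
y <- drop(X %*% c(1, 2) + e)

# Concentrated log likelihood as a function of lambda alone.
conc_ll <- function(lambda) {
  A <- I_n - lambda * W
  fit <- lm.wfit(A %*% X, drop(A %*% y), w = wts)  # weighted auxiliary fit
  SSE <- sum(wts * fit$residuals^2)                # weights enter here ...
  s2 <- SSE / n
  ldet <- as.numeric(determinant(A, logarithm = TRUE)$modulus)
  ldet + 0.5 * sum(log(wts)) -                     # ... and as a separate term
    (n / 2) * log(2 * pi * s2) - SSE / (2 * s2)
}

# Step 1: one-dimensional search for lambda.
lambda_hat <- optimize(conc_ll, interval = c(-0.99, 0.99),
                       maximum = TRUE)$maximum

# Step 2: weighted least squares on the filtered data for beta.
A_hat <- I_n - lambda_hat * W
beta_hat <- lm.wfit(A_hat %*% X, drop(A_hat %*% y), w = wts)$coefficients
```

Passing the same wts to the auxiliary regressions of a lag model would change its sum of squared errors term in the same way; whether an extra likelihood term is also needed there is left open above.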

>
> A while ago I wrote some SAS code to fit a spatial lag model and 
> calculate the variance covariance matrix of the parameters, so I am 
> somewhat familiar with the two step procedure in that case. I have never 
> really messed with the spatial error model.
>
> I am actually pretty happy being confined to a weighted spatial error 
> model for the moment, since it would seem to me that spill over effects 
> from the X%*%beta can always be accommodated by including W%*%X%*%tau 
> terms in a spatial error model (though one must assume that the 
> influence of spatially lagged covariates stops at first order 
> neighbors?).  I wondered if including W%*%X%*%tau into a spatial error 
> model would lead to inconsistent parameter estimates since rho*W%*% X is 
> also in the model.  Some quick simulation suggested that the consistency 
> of the parameter estimates was not affected.

The error model is, as you say, doing:

(I - rho W) y = (I - rho W) X beta + u

y = rho W y + X beta - rho W X beta + u

so

y = rho W y + (X beta + W X tau) - rho W (X beta + W X tau) + u

might become unstable if any of the X's are highly autocorrelated, leading 
to aliasing (columns would drop out of the QR solution). However, in the 
lag IV fitting method, W X (and maybe W W X too) are used as instruments 
for W y, so it is worth exploring.
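
The aliasing concern can be illustrated with a small base-R check. This is a sketch on a hypothetical ring neighbour structure, not data from this thread; the names x_rough, x_smooth and rank_* are invented for the example. With row-standardised W, the lag of the intercept is the intercept itself, and a covariate that varies smoothly over the ring is (here exactly) proportional to its own lag, so the QR decomposition reports a rank deficit.

```r
# Sketch only: when do X and W X become aliased?
set.seed(2)
n <- 50

# Hypothetical row-standardised ring neighbours.
W <- matrix(0, n, n)
for (i in 1:n) {
  W[i, ifelse(i == 1, n, i - 1)] <- 0.5
  W[i, ifelse(i == n, 1, i + 1)] <- 0.5
}

x_rough  <- rnorm(n)                    # no spatial pattern
x_smooth <- cos(2 * pi * (1:n) / n)     # maximally smooth on the ring

# Design matrices with an intercept, a covariate and its spatial lag.
X_rough  <- cbind(1, x_rough,  W %*% x_rough)
X_smooth <- cbind(1, x_smooth, W %*% x_smooth)

rank_rough  <- qr(X_rough)$rank         # all three columns are kept
rank_smooth <- qr(X_smooth)$rank        # the lagged column is aliased
rank_const  <- qr(cbind(1, W %*% rep(1, n)))$rank  # lagged intercept is aliased
```

Real covariates will rarely be exactly proportional to their lags, but the closer they come, the less stable the estimates of beta and tau are likely to be.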

>
> I am not sure that a true spatial lag process is theoretically 
> compelling in my case anyway, now that I think about it. In fact, the 
> idea that the spatial correlation among the residuals is due to a "pure" 
> contagion process (as represented by the pWy term) would seem pretty 
> rare in the case of most complex phenomena - which is why the 
> type="mixed" option in lagsarlm() is so useful!
>
> I wonder if
>
> Y = Xbeta + WXtau + pWe + u
>
> isn't a sensible alternative to the spatial lag model when a contagion 
> process is not theoretically plausible but where spill over effects of 
> neighboring covariates are.
>

If you explore this, it would be interesting to hear how you get on.

Best wishes,

Roger

>
> thanks again for your amazing support of the spdep package.
>
> cheers,
>
> Sam
>
> Quoting Roger Bivand <Roger.Bivand at nhh.no>:
>
>> Sam,
>>
>> On Wed, 29 Aug 2007, Sam Field wrote:
>>
>>> Roger,
>>>
>>> One possibility in this limited case might be to replicate the aggregate
>>> level cases based on their respective weights (since they are integers,
>>> i.e. within unit sample sizes), then run a spatial lag model.  This
>>> would be equivalent to recreating the individual level data from the
>>> aggregate data (excluding measures that vary within the aggregate
>>> units).  This would obviously inflate your sample size and one would
>>> have to correct for this somehow in the variance covariance matrix of
>>> the parameters estimates.
>>>
>>> You would have to do the same for your nb object as well of course.  I
>>> have looked into this by creating a list of neighbor ids from the
>>> original nb object, but nb2listw() requires an nb object not a list so I
>>> am stuck.
>>>
>>
>> You could fake it with nb2blocknb, but that was not written for this case,
>> but for the case when the individual level variables were observed, but
>> that there was no address or coordinates, just a postal code. Here the LHS
>> and RHS would be replicated, which doesn't seem desirable.
>>
>>> The other problem would be that you would end up with a potentially
>>> large data set. In my case, 13,000 - maybe more than spautolm() could
>>> handle?  Maybe this whole idea is flawed.
>>>
>>>
>>> Thanks again for your input! The results change quite a bit with the
>>> weighted SAR models.
>>
>> One interesting conclusion that I've reached is that while the spdep code
>> in spautolm() replicates Waller and Gotway for unweighted and weighted SAR
>> and CAR, S-Plus SpatialStats fails on the weighted CAR. The reason seems
>> to be that W+G did the same as spautolm() (in SAS?) - find the spatial
>> autoregressive coefficient first (optimise in one dimension), then use GLS
>> to find the regression coefficients. But S+ seems to try to optimise all
>> the coefficients at once, and gets bitten by the fact that
>> (I - \rho W) %*% diag(wts) in their case is not symmetric (W has to be
>> symmetric, and the wts have to "balance" - see Cressie etc.). Now I'm not
>> sure that S+ is right here. If not, then the lag model can be given
>> weights too, by simply passing them to the auxiliary regressions used to
>> set up the framework for optimisation. The analytical covariance matrix of
>> the coefficients remains a problem, though. We'd need to use some other
>> mechanism to get there for the eigen method, though the LR tests used for
>> sparse methods would be, I think, OK. I've also been playing with sampling
>> from a fitted model, to generate synthetic "standard errors", like
>> mcmcsamp() in lme4, but I don't know if it is sensible, or how well it
>> would scale to many observations.
>>
>> So I am thinking about how lagsarlm() could get weights, but it won't
>> happen too fast, maybe.
>>
>> Best wishes,
>>
>> Roger
>>
>>>
>>>
>>> Sam
>>>
>>>
>>>
>>> Roger Bivand wrote:
>>>> On Tue, 21 Aug 2007, Sam Field wrote:
>>>>
>>>>
>>>>> Thanks Roger!
>>>>>
>>>>> Sorry about omitting the subject line.  I have been working with
>>>>> errorsarlm() - did not know about spautolm().  Do you know if there is
>>>>> something analogous possible in the case of the spatial lag model,
>>>>>
>>>>> Y = pWY + XB + e ?
>>>>>
>>>>
>>>> I have not looked at it, but because it is a weird animal, I don't think
>>>> it will be too easy to provide a theoretical foundation for it. The
>>>> heteroskedasticity is in the error term, but the autoregressive part
>>>> isn't. I don't think there are any examples anywhere, either.
>>>>
>>>> It ought to be possible, though.
>>>>
>>>> Roger
>>>>
>>>>
>>>>> I was going to start looking into it.
>>>>>
>>>>> thanks!
>>>>>
>>>>>
>>>>> Sam
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Quoting Roger Bivand <Roger.Bivand at nhh.no>:
>>>>>
>>>>>
>>>>>> On Tue, 21 Aug 2007, Sam Field wrote:
>>>>>>
>>>>>>
>>>>>>> List,
>>>>>>>
>>>>>>> I am looking for ways of estimating spatial autoregression models that
>>>>>>> adjust for a known source of heteroskedasticity, and the Waller and
>>>>>>> Gotway (2004) text outlines how this can be done in the case of the SAR
>>>>>>> model.  If I work at it, I think I can implement this myself in R, but I
>>>>>>> wanted to see if anybody else had done it. It seems like a pretty
>>>>>>> straightforward generalization of the SAR model and would make a very
>>>>>>> helpful addition to the spatial regression tools in spdep - especially
>>>>>>> given the effects of heteroskedasticity on the consistency of the SAR
>>>>>>> parameters!
>>>>>>>
>>>>>> ?spautolm
>>>>>>
>>>>>> The examples reproduce the results in Waller & Gotway, perhaps apart from
>>>>>> a flattish function to optimise in the weighted CAR case. spautolm() now
>>>>>> provides weighted or unweighted SAR, CAR, and SMA. Sparse matrix methods
>>>>>> are available for SAR and CAR, SAR when spatial weights are symmetric or
>>>>>> similar to symmetric (CAR weights have to be symmetric).
>>>>>>
>>>>>> Roger
>>>>>>
>>>>>>
>>>>>>> Sam
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> Roger Bivand
>>>>>> Economic Geography Section, Department of Economics, Norwegian School of
>>>>>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>>>>>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>>>>>> e-mail: Roger.Bivand at nhh.no
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>> --
>> Roger Bivand
>> Economic Geography Section, Department of Economics, Norwegian School of
>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>> e-mail: Roger.Bivand at nhh.no
>>
>>
>
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no