[R-sig-Geo] error message when running errorsarlm

Thu May 22 18:24:48 CEST 2008

On May 22 2008, Roger Bivand wrote:

>>> Does that mean that you get a sensible lambda for your model now - the 
>>> line search leads somewhere other than a boundary of the interval?
>>
>> I apologize for being unclear. I actually upgraded R and updated 
>> packages, then ran errorsarlm with method="Matrix" and got the same 
>> error messages I'd had previously (i.e., the search led to the boundary 
>> of the interval). I then tried your other suggestion and used 
>> method="spam" and got a result with no error messages.
>
>But we do not know why the two are not the same (they should be), so I 
>would still not trust the outcome. I would be interested in off-list 
>access to the data being used - I think that there is some issue with the 
>scaling of the variable values. Do you see the same difference using 
>spautolm(), which is effectively the same as errorsarlm(), but with a 
>different internal structure?

I do see the same difference using spautolm() and get no error messages 
using it. I'll send you the data separately and would appreciate your 
opinion on them.
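
For readers following the thread, here is a minimal sketch of what the search over lambda is doing in a simultaneous autoregressive error model. It is NumPy only and is not the spdep code: the 15 x 15 rook grid, the simulated data, the helper names (rook_weights, concentrated_loglik) and the search interval (-0.95, 0.95) are all assumptions made for illustration, and a plain grid search stands in for the optimiser. It profiles out beta and sigma^2, evaluates the log-likelihood along lambda, and reports whether the maximum sits on the edge of the interval.

```python
import numpy as np

def rook_weights(nrow, ncol):
    """Row-standardised rook-contiguity weights for an nrow x ncol grid."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrow and 0 <= cc < ncol:
                    W[i, rr * ncol + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def concentrated_loglik(lam, y, X, W):
    """SAR error log-likelihood with beta and sigma^2 profiled out."""
    n = len(y)
    A = np.eye(n) - lam * W
    ys, Xs = A @ y, A @ X                      # spatially filtered y and X
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    resid = ys - Xs @ beta
    sse = resid @ resid
    _, logdet = np.linalg.slogdet(A)           # log|I - lambda W|
    return logdet - 0.5 * n * np.log(sse / n) - 0.5 * n * (1.0 + np.log(2.0 * np.pi))

# Simulated data (hypothetical): y = X beta + u, with u = (I - lambda W)^{-1} e
rng = np.random.default_rng(0)
nrow = ncol = 15
n = nrow * ncol
W = rook_weights(nrow, ncol)
lam_true = 0.6
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
u = np.linalg.solve(np.eye(n) - lam_true * W, rng.normal(size=n))
y = X @ beta_true + u

# Grid search over an assumed interval; a real fit would use a 1-D optimiser
grid = np.linspace(-0.95, 0.95, 191)
loglik = np.array([concentrated_loglik(l, y, X, W) for l in grid])
best = int(np.argmax(loglik))
lam_hat = grid[best]
at_boundary = best in (0, len(grid) - 1)
print(f"lambda_hat = {lam_hat:.2f}, at boundary: {at_boundary}")
```

In exact arithmetic the profiled likelihood does not depend on the units of the columns of X, since rescaling a column leaves the fitted residuals unchanged. So if two implementations disagree, or one ends on the boundary, numerical trouble from badly scaled variables is a plausible suspect, though this sketch cannot show that it is the cause here.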

>>> There are different traditions. Econometricians and some others in 
>>> social science try to trick the standard errors by "magic", while 
>>> epidemiologists (and crime people) typically use case weights - that is 
>>> model the heteroscedasticity directly. spautolm() can include such case 
>>> weights. I don't think that there is any substantive and reliable 
>>> theory for adjusting the SE, that is theory that doesn't appeal to 
>>> assumptions we already know don't hold. Sampling from the posterior 
>>> gives a handle on this, but is not simple, and doesn't really suit 10K 
>>> observations.
>>> 
>> Can you explain "magic" a little further? I'm running this for a 
>> professor who is a bit nervous about black box techniques and I'd like 
>> to be able to offer him a good explanation. I think he'll just have me 
>> calculate White's standard errors and ignore spatial autocorrelation if 
>> I can't be clearer.
>>
>
>If this is all your "professor" can manage, please replace/educate! The 
>model is fundamentally misspecified, and neither "magicing" the standard 
>errors, nor just fitting a simultaneous autoregressive error model will 
>let you make fair decisions on the "significance" or otherwise of the 
>right-hand side variables, which I suppose is the object of the exercise?
>
I agree here, but haven't been able to get much advice on this. I 
appreciate your input.
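
To make the "White's standard errors" option less of a black box, here is a minimal sketch of the basic (HC0) calculation next to the classical OLS standard errors. It is NumPy only; the function name ols_with_white_se and the simulated data are hypothetical and have nothing to do with the home sales data. Note what the formula does and does not do: it lets each observation have its own error variance, but the middle term uses only squared residuals, so the errors are still treated as uncorrelated across observations.

```python
import numpy as np

def ols_with_white_se(X, y):
    """OLS coefficients with classical and White (HC0) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                              # OLS residuals

    # Classical: one common error variance for every observation
    sigma2 = e @ e / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

    # White HC0 "sandwich": (X'X)^-1 [X' diag(e^2) X] (X'X)^-1
    meat = X.T @ (X * e[:, None] ** 2)
    cov_white = XtX_inv @ meat @ XtX_inv
    se_white = np.sqrt(np.diag(cov_white))
    return beta, se_classical, se_white

# Simulated heteroscedastic data (hypothetical): noise grows with x
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(0.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * 0.05 * x ** 2

beta, se_classical, se_white = ols_with_white_se(X, y)
print("coefficients:", beta)
print("classical SE:", se_classical)
print("White SE:    ", se_white)
```

The coefficient estimates are the same in both cases; only the standard errors change. Nothing in the calculation uses the locations of the observations or a spatial weights matrix.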

>(Looking at Johnston & DiNardo (1997), pp. 164-166, it looks as if White's 
>SE only help asymptotically (in Prof. Ripley's well-known remark, 
>asymptotics are a foreign country with spatial data), and not in finite 
>samples, and their performance is unknown if the residuals are 
>autocorrelated, which is the case here).

>The vast number of observations is no help either, because they certainly 
>introduce heterogeneity that has not been controlled for. Is this a grid 
>of global species occurrence data, by any chance? Which RHS variables are 
>covering for differences in environmental drivers? Or is there a better 
>reason for using many observations (instead of careful data collection) 
>than just their being available?
>
This is a hedonic regression with a goal of eliciting economic values for 
different percentages of tree cover on parcels and in the local 
neighborhood as capitalized in home sale prices. We're using all 2005 
residential sales from Ramsey and Dakota counties in Minnesota, USA as our 
observations. This gives us sales from most study area regions and for all 
months. I'll send you a description of the RHS variables with the dataset.

>More observations do not mean more information if meaningful differences 
>across the observations are not captured by included variables (with the 
>correct functional form). Have you tried GAM with flexible functional 
>forms on the RHS variables and s(x,y) on the (point) locations of the 
>observations?

I haven't tried this, but will look into it.  

>You are not alone in your plight, but if the inferences matter, then it's 
>better to be cautious, irrespective of the "professor".
>
Thanks very much for your help.

Regards,
Heather

--- 
Heather Sander
Ph.D. Candidate:  Conservation Biology
Office:  305 Ecology & 420 Blegen
Mail:  
University of Minnesota
Dept. of Geography
414 Social Science Bldg.
267 19th Ave. S.
Minneapolis, MN 55455
USA