[R-sig-ME] glmmTMB spatial model - modified distance calculation
Kim Colyvas
k|m@co|yv@@ @end|ng |rom newc@@t|e@edu@@u
Sun Jan 9 00:29:19 CET 2022
I am assisting a researcher investigating the number of species with deficient data (conservation assessors are unable to allocate a species to a threat status category) for many countries around the world and examining covariates that might explain some of the variation between countries. I have been fitting a negative binomial (nb) model using glmmTMB with a spatial term based on (longitude,latitude) coordinates.
The default distance function in R is dist (from the stats package) is satisfactory for most work where relatively smaller distances were involved. Our country data however covers the whole globe and the default cartesian distance calculation used by the dist function I think differs too much from the more correct great circle distance approach that takes spherical geometry into account.
See Fig 1 in the cloud folder at the link below for a comparison of the two methods. The upper half of the sideways V shape pattern in the graph is because the default cartesian calculation doesn't allow for points on the globe being more than 180 deg of longitude apart in which case the shorter of the two possible distances between two points should be used. The great circle approach does this as well as allowing for the curvature of the earth's surface.
I decided to write an alternative distance function that mimicked the built-in dist function hoping it would be used by glmmTMB and the spatial correlation function would be determined on the great circle distances. I am writing because I don't think this has worked. I would appreciate suggestions for how to get glmmTMB to use the alternative distances in determining the spatial correlation function.
At this link I have provided the essentials of my code to allow replication of my work.
https://1drv.ms/u/s!AhpuyzHadR1ag_tZI15yXnkuFJL9Yw?e=s6tAZq
It contains
- my code for the replacement distance function which draws on the great circle distance functions provided in the geosphere package (dist_gc.R)
- the code for calculating the distances by the two methods and comparing them and then fitting the nb models by both methods and how I evaluated the correlograms for the models by both methods which was part of what led me to think that glmmTMB was not using the alternative distance function (dist_repex.R)
- 3 figures - Fig 1 showing the comparison between methods, and Fig 2,3 the correlograms from the two models.
Re how I implemented the replacement of the built-in distance method with the great circle calculations.
The comparison between methods in Fig 1 was part of satisfying myself that the method I was using was replacing the default distance function with my own. Although this worked in this simple test I am wondering what more I need to do to get glmmTMB to pick up the new function.
> # Built in distance function in R - from the stats package
> d1=dist(lonlat1)
> # Replacement distance function to calculate great circle distances using the Cosine formula
> dist = function(x) {
+ dist_gc(x,columns_to_retain=10)
+ }
> # Confirm presence of 2 functions with name dist
> getAnywhere(dist)
2 differing objects matching 'dist' were found
in the following places
.GlobalEnv
package:stats
namespace:stats
Use [] to view one of them
> # Now try the replacement function as if it was the built-in function using the built-in function name
> # It takes precedence over the dist function in the stats module
> d7 = dist(lonlat1)
> # They don't agree in that there is no straight line
This is the plot in Figure 1
> plot(d1 ~ d7,main="Compare replacement distance function with built in",
+ xlab="Great circle",ylab="Cartesian")
I then fitted the same model twice as follows - first step - remove my new function
> rm(dist, envir=.GlobalEnv)
>
> # Confirm presence of only 1 function with name dist
> getAnywhere(dist)
A single object matching 'dist' was found
It was found in the following places
package:stats
namespace:stats
Fitted the model and as the only distance function present was the default this model should have used cartesian coordinates.
# Fit spatial model using the default distance function
nb2_cart <- glmmTMB(total ~ log_pop + I(log_pop^2) + exp(pos + 0 | group), data=species1, family = nbinom1())
> summary(nb2_cart)$coefficients$cond
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.15405063 0.515443090 8.059184 7.680550e-16
log_pop -0.24664756 0.072544883 -3.399930 6.740300e-04
I(log_pop^2) 0.01558456 0.002724543 5.720062 1.064851e-08
I then changed the distance calculation method
> # Redefine the distance function to calculate great circle distances
> dist = function(x) {
+ dist_gc(x,columns_to_retain=10)
+ }
>
> # Confirming that 2 dist functions are present again
> getAnywhere(dist)
2 differing objects matching 'dist' were found
in the following places
.GlobalEnv
package:stats
namespace:stats
Use [] to view one of them
> getAnywhere(dist)[1]
function(x) {
dist_gc(x,columns_to_retain=10)
}
I expected that as had occurred in my simple comparison of distance functions above that glmmTMB would use the new distance function in the modelling. The summary of the model coefficients however was exactly the same. I would have expected with a different spatial function that the estimates and SEs would differ. They didn't so I assumed that my replacement distance function was not being used by glmmTMB.
> summary(nb2_gc)$coefficients$cond
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.15405063 0.515443090 8.059184 7.680550e-16
log_pop -0.24664756 0.072544883 -3.399930 6.740300e-04
I(log_pop^2) 0.01558456 0.002724543 5.720062 1.064851e-08
I used some very helpful code at this link to plot the correlograms from the 2 models.
https://datascienceplus.com/spatial-regression-in-r-part-1-spamm-vs-glmmtmb/
Fig 2 was as expected and if my new distance function had worked Fig 3 should have been similar - a perfect curve with no points deviating from the smooth function, except the function might have had a different shape.
I checked the call to the distance function in the glmmTMB R code and there was no specific call like stats::dist, so I thought my replacement function should have worked.
# Lines 748-750 in glmmTMB source code show the distance function being used
# else if(ss[i] %in% c("exp", "gau", "mat")){
# coords <- parseNumLevels(reTrms$cnms[[i]])
# tmp$dist <- as.matrix( dist(coords) )
A couple of other questions.
a) When fitting the nb model the warning below was given.
An interpretation of this would be appreciated.
Warning message:
In (function (start, objective, gradient = NULL, hessian = NULL, :
NA/NaN function evaluation
b) Is there any way to obtain the parameters of the fitted spatial correlation function?
Thanks for reading this long email,
Mr Kim Colyvas
Casual research assistant specialising in statistical help
University of Newcastle
AUSTRALIA
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list