[R-sig-networks] Questions about interpreting ERGM goodness of fit

Thu Jul 28 00:11:59 CEST 2016

Hi Erick,

This indicates that the theory behind the model does not yet capture
what is creating your network topology. Most likely, the endogenous
network statistics you have chosen collectively produce networks that
do not look like the network you observe. It is less likely, but
possible, that you have omitted an exogenous variable whose inclusion
would solve the problem. I would think more carefully about which
theoretical mechanisms may be at work with regard to the endogenous
dependencies. In the absence of a good theory, GWESP or transitiveties
are candidate model terms that could potentially solve the issue.

The weird distribution with peaks at 3 and 7 may indicate that the
problem arose at the data collection stage, but I am not familiar with
your field or application. Perhaps you incurred a severe loss of
information when you binarized the distance matrix? In that case, you
may want to try the GERGM package for weighted ERGMs, or perhaps model
the original bipartite network before applying the distance measure
(if it was binary).

You may also want to look at the xergm package, which has a
replacement for statnet's gof function that adds a couple of
interesting extra features, like more auxiliary statistics for
comparison, ROC and PR curves for assessing predictive performance,
and out-of-sample GOF.
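
To make concrete what the edge-wise shared partners panel counts, here
is a minimal sketch in plain Python (not statnet code). For each edge it
counts the nodes adjacent to both endpoints, then tabulates how many
edges have 0, 1, 2, ... shared partners. The names esp_distribution,
clique_edges and toy_edges are made up for illustration, and the toy
graph (two disjoint cliques) is an assumption chosen only because it
produces peaks at 3 and 7. It is not a diagnosis of your data.

```python
from collections import Counter
from itertools import combinations


def esp_distribution(n_nodes, edges):
    """Tabulate edges of an undirected graph by number of shared partners.

    Returns a dict mapping k -> number of edges whose two endpoints
    have exactly k common neighbors.
    """
    neighbors = {v: set() for v in range(n_nodes)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)

    counts = Counter()
    seen = set()
    for i, j in edges:
        key = (min(i, j), max(i, j))
        if key in seen:  # skip duplicate listings of the same edge
            continue
        seen.add(key)
        counts[len(neighbors[i] & neighbors[j])] += 1
    return dict(counts)


def clique_edges(nodes):
    """All pairs among `nodes`, i.e. the edges of a complete subgraph."""
    return list(combinations(nodes, 2))


# Hypothetical toy graph: a 5-node clique plus a separate 9-node clique.
# Every edge in a clique of size m has m - 2 shared partners.
toy_edges = clique_edges(range(0, 5)) + clique_edges(range(5, 14))
print(esp_distribution(14, toy_edges))  # {3: 10, 7: 36}
```

In this toy case the small peak at 3 comes from the 10 edges of the
smaller clique and the large peak at 7 from the 36 edges of the larger
one. Tightly knit groups of different sizes are one way, among others,
to end up with a bimodal shared-partner distribution.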

Take care,

Philip

On 27/07/16 22:56, Erick LeBrun wrote:
> Hello all,
>
> I apologize in advance as I am fairly new to networks. I have two
> networks generated in R with sample sites as nodes representing
> microbial community relatedness between sites (as Bray-Curtis distance
> from an OTU counts table). I am interested in looking at two categorical
> groupings as factors in network design and have built a few ERGMs to
> explore this. Everything looks good in mcmc.diagnostics() and summary()
> and the anova.ergm() looks good as well.
>
> My problem occurs in testing goodness of fit (gof()). The degree graph
> is not too bad but the edge-wise shared partners graph for the data
> peaks far to the right of the simulated one (for example, the simulated
> peaks at ~3, while the actual data has a small peak at ~3 and then a
> huge one at ~7).
>
> I know that there are probably important factors missing from these
> models and I am wondering if that may be what is causing the issue(?) My
> main interest is in demonstrating importance in my categorical
> variables. Or is this indicative of some larger problem? I have looked
> through many resources explaining how to generate goodness of fit
> metrics but have not found anything to help me interpret this issue.
>
> Thank you in advance!
>
> Erick