[R-sig-networks] Questions about interpreting ERGM goodness of fit
Philip Leifeld
philip.leifeld at ipw.unibe.ch
Thu Jul 28 00:11:59 CEST 2016
Hi Erick,
This indicates a lack of theoretical understanding of what is creating
your network topology. Most likely, the endogenous network statistics
you have chosen collectively produce networks that do not look like the
network you observe. It is less likely, but possible, that an omitted
exogenous variable would solve the problem if included. I would think
more carefully about which theoretical mechanisms may be at work with
regard to the endogenous dependencies. In the absence of a good theory,
GWESP or transitiveties may be candidate model terms that can
potentially solve the issue. The odd distribution with peaks at 3 and 7
may indicate that the problem arose at the data collection stage, but I
am not familiar with your field or application. Perhaps you binarized
the distance matrix and incurred a severe loss of information? In that
case, you may want to try the GERGM package for weighted ERGMs, or
model the original bipartite network before applying the distance
measure (in case it was binary). You may also want to look at the xergm
package, which has a replacement for statnet's gof function that adds
several useful extras, such as more auxiliary statistics for
comparison, ROC and PR curves for assessing predictive performance,
and out-of-sample GOF.
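A minimal sketch of the GWESP suggestion, in case it helps. Everything
about the data here is an assumption: the dissimilarity matrix is
random toy data standing in for your Bray-Curtis matrix, "habitat" is a
made-up grouping variable, and the 0.35 cutoff and the decay of 0.5 are
arbitrary starting values rather than recommendations. The calls
themselves (network(), ergm(), gwesp(), nodematch(), gof()) are from
the network and ergm packages.

```r
# Sketch only: toy data stand in for the real Bray-Curtis matrix and groupings.
library(network)
library(ergm)

set.seed(1)
n <- 30

# Hypothetical site-level categorical variable (not from the thread)
habitat <- sample(c("A", "B", "C"), n, replace = TRUE)

# Toy symmetric dissimilarity matrix with values in [0, 1]
d <- matrix(runif(n * n), n, n)
d <- (d + t(d)) / 2
diag(d) <- 0

# Binarize: a tie exists if dissimilarity is below an arbitrary cutoff
adj <- (d < 0.35) * 1
diag(adj) <- 0

net <- network(adj, directed = FALSE)
net %v% "habitat" <- habitat

# Observed edgewise shared partner counts, to compare with the gof plot
summary(net ~ esp(0:10))

# Baseline model with exogenous terms only
fit0 <- ergm(net ~ edges + nodematch("habitat"))

# Same model plus GWESP with a fixed decay parameter
fit1 <- ergm(net ~ edges + nodematch("habitat") + gwesp(0.5, fixed = TRUE))

# Goodness of fit on edgewise shared partners and degree for both models
gof0 <- gof(fit0, GOF = ~ espartners + degree)
gof1 <- gof(fit1, GOF = ~ espartners + degree)
plot(gof1)
```

With real data, the comparison of interest is whether the observed
edgewise shared partner line falls inside the simulated intervals for
the second model but not the first. Re-running the binarization with a
few different cutoffs is also a cheap way to see how sensitive the
shared partner distribution is to that choice.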
Take care,
Philip
On 27/07/16 22:56, Erick LeBrun wrote:
> Hello all,
>
> I apologize in advance as I am fairly new to networks. I have two
> networks generated in R with sample sites as nodes representing
> microbial community relatedness between sites (as Bray-Curtis distance
> from an OTU counts table). I am interested in looking at two categorical
> groupings as factors in network design and have built a few ERGMs to
> explore this. Everything looks good in mcmc.diagnostics() and summary()
> and the anova.ergm() looks good as well.
>
> My problem occurs in testing goodness of fit (gof()). The degree graph
> is not too bad, but the edge-wise shared partners graph for the data
> peaks far to the right of the simulated one (for example, the simulated
> networks peak at ~3, while the actual data has a small peak at ~3 and
> then a huge one at ~7).
>
> I know that there are probably important factors missing from these
> models, and I am wondering whether that may be causing the issue. My
> main interest is in demonstrating the importance of my categorical
> variables. Or is this indicative of some larger problem? I have looked
> through many resources explaining how to generate goodness of fit
> metrics but have not found anything to help me interpret this issue.
>
> Thank you in advance!
>
> Erick