[R-sig-genetics] Hierfstat: g-statistics permutation

Daniel Schmidt d.schmidt at griffith.edu.au
Wed Nov 2 05:25:58 CET 2016

Using some of the permutation functions in hierfstat (test.g; test.between;
gstat.randtest), I've encountered the same issue raised in this recent post:

In my case (6 pops x 8 individuals; ~8000 loci), testing for significant
differentiation among pops with a small number of loci (e.g. 20) appears
normal, but as the number of loci increases beyond 100, the permuted
g-statistics mostly converge on a single value or on 0.

This appears to be related to missing data, as demonstrated below on a
small dataset.

#Simulate 100 loci
dat.sim <-

#perform permutation test with no missing data
g <- test.g(dat.sim[,-1], level = dat.sim$Pop, nperm = 100)

#are all 100 permuted g-statistics unique? yes

#add one missing genotype per locus. Total missing data = 1%
dat.sim[,-1] <- apply(dat.sim[,-1], 2, function(x){x[sample(1:48,1)] <- NA; x})

#how many permuted g-statistics are unique? generally fewer than 20.
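
For anyone wanting to repeat the uniqueness check, here is a minimal base-R
sketch of how the count could be made. It assumes the permuted statistics are
available as a plain numeric vector; `perm.stats` is a hypothetical name and
its values are made up so the snippet runs without hierfstat.

```r
# Hypothetical stand-in for a vector of 100 permuted g-statistics.
# The values are invented for illustration: 15 distinct values plus a
# pile-up of 85 zeros, mimicking the pattern described above.
set.seed(1)
perm.stats <- c(rchisq(15, df = 50), rep(0, 85))

# Round before comparing so floating-point noise does not inflate the count.
rounded <- round(perm.stats, 8)
n.unique <- length(unique(rounded))
n.unique

# Tabulate the most frequent values to spot convergence on a single value.
head(sort(table(rounded), decreasing = TRUE), 3)
```

A count far below the number of permutations, together with one value
dominating the table, is the symptom described in this post.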

Is there any way to use these functions to calculate p-values when missing
data is present? The functions work fine when I remove loci with missing
data, so it's not the end of the world.
Thanks for any advice.
Regards, Dan

Dr. Daniel J. Schmidt
Research Fellow, Australian Rivers Institute
Griffith University 170 Kessels Road, Nathan
Brisbane QLD 4111 Australia

d.schmidt at griffith.edu.au
Office: +61 7 37354165
