[Rd] Genetics package: Heterozygosity and PIC incorrect (PR#3639)
gerard.tromp at sanger.med.wayne.edu
gerard.tromp at sanger.med.wayne.edu
Mon Aug 4 01:03:00 MEST 2003
Full_Name: Gerard Tromp
Version: 1.7.0
OS: Windows 2000
Submission from: (NULL) (146.9.5.86)
In function summary.genotype:
Heterozygosity (H) and PIC are computed using af (allele frequency), however, af
is derived as a frequency count whereas the computation of H and PIC are in
terms of frequency as a proportion.
Using the example data provided (genetics.R) H and PIC return large negative
numbers whereas they are bounded 0,1.
Output from buggy version:
*************************************************************************
Marker: MBP2:C-104T (9q35:-104) Type: SNP
Number persons typed: 100 (100%)
Allele Frequency: (2 alleles)
Count Proportion
C 137 0.69
T 63 0.32
Genotype Frequency:
Count Proportion
C/C 59 0.59
C/T 19 0.19
T/T 22 0.22
Heterozygosity (Hu) = -22967 <------ Impossible
Poly. Inf. Content = -1.5e+08 <------ Impossible
***************************************************************************
Output from fixed version:
***************************************************************************
Marker: MBP2:C-104T (9q35:-104) Type: SNP
Number persons typed: 100 (100%)
Allele Frequency: (2 alleles)
Count Proportion
C 137 0.69
T 63 0.32
Genotype Frequency:
Count Proportion
C/C 59 0.59
C/T 19 0.19
T/T 22 0.22
Heterozygosity (Hu) = 0.44 <--- Correct
Poly. Inf. Content = 0.34 <--- Correct
***************************************************************************
Fix is detailed below.
***************** 2539,2554 c 2539,2565
********************************************
### from code submitted by David Duffy <davidD at qimr.edu.au>
#
n.typed<-sum(gf)
correction<-n.typed/max(1,n.typed-1)
# Gerard Tromp 20030803
# Heterozygosity and PIC reported are incorrect.
# Problem due to use of af which is a count rather than prop.table(af) a
# proportion (fraction) less than 1. Definition of af changed?
#
## Original
# ehet<-(1-sum(af*af))
# the following produces desired result
ehet<-(1-sum(prop.table(af)*prop.table(af)))
## Original
# matings<- (af %*% t(af))^2
# the following produces desired result
matings <- (prop.table(af) %*% t(prop.table(af)))^2
uninf.mating.freq <- sum(matings)-sum(diag(matings))
pic<- ehet - uninf.mating.freq
retval$Hu <- correction * ehet
retval$pic <- pic
retval$n.typed <- n.typed
retval$n.total <- length(object)
retval$nallele <- nallele(object)
#
###
More information about the R-devel
mailing list