[Rd] R package {genetics} function summary.genotype -- H and
(PR#3652)
ligges at statistik.uni-dortmund.de
ligges at statistik.uni-dortmund.de
Mon Aug 4 11:12:07 MEST 2003
gerard.tromp at sanger.med.wayne.edu wrote:
> Greetings!
>
> I found a bug in the function summary.genotype (details below). Essentially
> it appears that the original code from David Duffy used af, allele
> frequency, as a proportion but in the genetics package af is allele
> frequency as a count.
>
> I tried to submit a bug report together with the fix to CRAN (below). I am
> unsure of the success since the bug report does not appear in the list. I am
> therefore sending it by e-mail.
Please submit bug reports related to contributed packages to the
maintainer (in this case Gregory Warnes
<gregory_r_warnes at groton.pfizer.com>), not to r-bugs.
Uwe Ligges
> Sincerely,
>
> Gerard
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> Gerard Tromp, Ph.D. vox: 313-577-8773
> Center for Molecular Medicine and Genetics fax: 313-577-7736
> Wayne State Univ. Schl. of Medicine
> 540 E. Canfield, Detroit, MI 48201 gtromp at cmb.biosci.wayne.edu
> mailto:tromp at sanger.med.wayne.edu
> http://cmmg.biosci.wayne.edu
>
>
>
> In function summary.genotype:
> Heterozygosity (H) and PIC are computed using af (allele frequency),
> however, af is derived as a frequency count whereas the computation of H and
> PIC are in terms of frequency as a proportion.
>
> Using the example data provided (genetics.R) H and PIC return large negative
> numbers whereas they are bounded 0,1.
>
> Output from buggy version:
> *************************************************************************
> Marker: MBP2:C-104T (9q35:-104) Type: SNP
>
> Number persons typed: 100 (100%)
>
> Allele Frequency: (2 alleles)
> Count Proportion
> C 137 0.69
> T 63 0.32
>
>
> Genotype Frequency:
> Count Proportion
> C/C 59 0.59
> C/T 19 0.19
> T/T 22 0.22
>
> Heterozygosity (Hu) = -22967 <------ Impossible
> Poly. Inf. Content = -1.5e+08 <------ Impossible
> ***************************************************************************
>
> Output from fixed version:
> ***************************************************************************
> Marker: MBP2:C-104T (9q35:-104) Type: SNP
>
> Number persons typed: 100 (100%)
>
> Allele Frequency: (2 alleles)
> Count Proportion
> C 137 0.69
> T 63 0.32
>
>
> Genotype Frequency:
> Count Proportion
> C/C 59 0.59
> C/T 19 0.19
> T/T 22 0.22
>
> Heterozygosity (Hu) = 0.44 <--- Correct
> Poly. Inf. Content = 0.34 <--- Correct
> ***************************************************************************
>
>
>
>
> Fix is detailed below.
> ***************** 2539,2554 c 2539,2565
> ********************************************
> ### from code submitted by David Duffy <davidD at qimr.edu.au>
> #
> n.typed<-sum(gf)
> correction<-n.typed/max(1,n.typed-1)
> # Gerard Tromp 20030803
> # Heterozygosity and PIC reported are incorrect.
> # Problem due to use of af which is a count rather than prop.table(af) a
> # proportion (fraction) less than 1. Definition of af changed?
> #
> ## Original
> # ehet<-(1-sum(af*af))
> # the following produces desired result
> ehet<-(1-sum(prop.table(af)*prop.table(af)))
> ## Original
> # matings<- (af %*% t(af))^2
> # the following produces desired result
> matings <- (prop.table(af) %*% t(prop.table(af)))^2
> uninf.mating.freq <- sum(matings)-sum(diag(matings))
> pic<- ehet - uninf.mating.freq
>
> retval$Hu <- correction * ehet
> retval$pic <- pic
> retval$n.typed <- n.typed
> retval$n.total <- length(object)
> retval$nallele <- nallele(object)
> #
> ###
> ****************************************************************************
> ******
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
More information about the R-devel
mailing list