[BioC] VariantTools callVariants - binomial likelihood ratio
xiaoheyiyh at yahoo.com
Mon Nov 18 00:03:15 CET 2013
I try to use VariantTools package package and I am working on the vignette examples. I have some questions on the variant calling procedure. The vignette says: “The callVariants function uses a binomial likelihood ratio test for this purpose. The ratio is P(D|p = plower)=P(D|p = perror).”
To understand this statement, I located the two internal functions that carry out this step.
function (p0, p1, n = if (C == 1L) 1L else n, C = 1L)
num <- (1/n) * log(C) + log(1 - p0) - log(1 - p1)
denom <- log(p1) - log(p0) + log(1 - p0) - log(1 - p1)
function (p.lower = 0.2, p.error = 1/1000)
freq.cutoff <- lrtFreqCutoff(p.error, p.lower)
sample.freq <- altDepth(x)/totalDepth(x)
passed <- sample.freq >= freq.cutoff
passed[is.na(passed)] <- FALSE
However, I get even more confused. Here are my questions:
The binomial likelihood ratio should be p1^k*(1-p1)^(n-k)/(p0^k*(1-p0)^(n-k)), here n and k are total trial and number of hits. I don’t understand why it becomes num/denom in the lrtFreqCutoff function?
What’s n and C in lrtFreqCutoff? These arguments are not used when BinomialLRFilter calls lrtFreqCutoff, why?
In BinomialLRFilter, the criterion used is sample.freq >= freq.cutoff. But sample.freq <- altDepth(x)/totalDepth(x) is a frequency as defined, yet freq.cutoff seems to be the binomial likelihood ratio. How a frequency can be compared to likelihood ratio?
In BinomialLRFilter, p0=perror = 0.001 and p1=plower = 0.2, the default p0 and p1 values are assumed, why not used the actual sequencing error rate and lowest variant frequency?
Hope someone can explain these problems to me. Thank you!
More information about the Bioconductor