[Bioc-devel] Bugfixes for VariantAnnotation::write_vcf
Richard Pearson
rpearson at well.ox.ac.uk
Thu Mar 29 16:02:11 CEST 2012
Hi
I recognise that writeVcf is currently "under construction", but I've
rather come to rely on it, so I have made a few bug fixes to get it
working for me. Could the following (rather clumsy) patches, or some
other workaround, be included in time for the next release?
1) In .makeVcfMatrix, the case where ALT is a CompressedCharacterList is
not correctly handled. Adding the following gets around this:
    if (is(ALT, "CompressedCharacterList")) {
        ALT <- unlist(ALT)
    }
2) In .makeVcfMatrix, the line dat <- gsub("NA", ".", dat) replaces
every occurrence of "NA", so it also modifies names in the INFO field
that merely contain the string "NA" (e.g. "MYNAMES" gets changed to
"MY.MES"). Anchoring the pattern gets around this:
    dat <- gsub("^NA$", ".", dat)
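The difference between the two patterns can be checked in base R. This is
only a minimal sketch: the vector dat below is made-up illustration data,
not the actual matrix built by .makeVcfMatrix.

```r
# Hypothetical values standing in for entries of the VCF matrix.
dat <- c("MYNAMES", "NA", "0.5")

# Unanchored pattern: replaces "NA" wherever it occurs in a string.
unanchored <- gsub("NA", ".", dat)

# Anchored pattern: replaces only entries that are exactly "NA".
anchored <- gsub("^NA$", ".", dat)

print(unanchored)  # "MY.MES"  "."  "0.5"
print(anchored)    # "MYNAMES" "."  "0.5"
```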
3) In .makeVcfGeno, fields with multiple values (e.g. GL) get split into
separate fields. The following got me round this (note the use of
recursive=FALSE in unlist, the extra cleaning-up lines, and the change
from nsub to length(geno) in the lst line, which removes warnings from
split):
    subj <- lapply(seq_len(nsub),
        function(i) {
            dat <- unlist(lapply(geno, function(fld) fld[,i]),
                          use.names=FALSE, recursive=FALSE)
            mat <- matrix(dat, ncol=length(geno))
            mat <- gsub("^NA$", NA, mat)
            lst <- split(mat, rep(seq_len(nrec), length(geno)))
            rmna <- lapply(lst, na.omit)
            collapsedText <- .pasteCollapse(CharacterList(rmna),
                                            collapse=":")
            rmc <- gsub("c\\(([^\\)]*)\\)", "\\1", collapsedText)
            gsub(" ", "", rmc)
        })
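As a rough illustration of why recursive=FALSE and the clean-up lines
matter, here is a base R sketch. It assumes, purely for illustration, that
a multi-valued field such as GL arrives as a list holding one numeric
vector per record; the objects gt, gl and cols are invented and are not the
package's internal representation.

```r
# Hypothetical per-sample columns: GT has one value per record,
# GL has three values per record (assumed layout, for illustration).
gt <- c("0/1", "1/1")
gl <- list(c(-1.2, -0.3, -5), c(-4.1, -0.9, -0.1))
cols <- list(GT = gt, GL = gl)

# The default recursive unlist flattens every GL value into its own element.
flat <- unlist(cols, use.names = FALSE)
# recursive=FALSE removes one level only, so each GL vector stays together.
kept <- unlist(cols, use.names = FALSE, recursive = FALSE)

print(length(flat))  # 8: 2 GT values + 6 separate GL values
print(length(kept))  # 4: 2 GT values + 2 GL vectors

# One row per record, one column per field; the GL column still holds vectors.
mat <- matrix(kept, ncol = length(cols))

# Coercing the GL vectors to text yields strings such as "c(-1.2, -0.3, -5)",
# which is what the two clean-up gsub calls strip back down.
txt <- as.character(mat[, 2])
rmc <- gsub("c\\(([^\\)]*)\\)", "\\1", txt)
clean <- gsub(" ", "", rmc)
print(clean)
```

With the recursive default, the six GL values would each occupy their own
cell and the record/field layout of the matrix would no longer line up.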
Great package!
Thanks
Richard