[Bioc-devel] Bugfixes for VariantAnnotation::write_vcf

Richard Pearson rpearson at well.ox.ac.uk
Thu Mar 29 16:02:11 CEST 2012


I recognise that writeVcf is currently "under construction", but I've 
come to rely on it, so I have made a few bug fixes to get it 
working for me. Could the following (rather clumsy) patches, or some 
other workaround, be included in time for the next release?

1) In .makeVcfMatrix, the case where ALT is a CompressedCharacterList is 
not correctly handled. Adding the following gets around this:
     if(is(ALT, "CompressedCharacterList")) {
         ALT <- unlist(ALT)
     }
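
One caveat with unlist here, sketched in base R only. A plain list stands in for the CompressedCharacterList, and the allele values are made up. The sketch assumes ALT can hold more than one allele for a record, which this message does not state; if it can, unlist returns more elements than there are records, whereas pasting each element with "," (the separator VCF uses for multiple ALT alleles) keeps one string per record:

```r
# Hypothetical ALT values for three records; the second has two alleles.
# A plain list is used as a stand-in for a CompressedCharacterList.
alt <- list("A", c("C", "T"), "G")

# unlist() flattens everything: 4 values for 3 records.
flat <- unlist(alt)

# Collapsing each element keeps exactly one string per record.
per_record <- vapply(alt, paste, character(1), collapse = ",")
```

If every record has exactly one ALT allele, the two results are the same and the patch above is enough.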

2) In .makeVcfMatrix, the line dat <- gsub("NA", ".", dat) replaces 
every occurrence of "NA", so it also modifies names in the INFO field 
that contain the string "NA" (e.g. "MYNAMES" gets changed to "MY.MES"). 
Anchoring the pattern gets around this:
     dat <- gsub("^NA$", ".", dat)
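
A small base R illustration of the difference, using made-up values rather than real INFO data:

```r
# Hypothetical cell values: only the second one is a genuine missing marker.
x <- c("MYNAMES", "NA", "DNA=1")

# Unanchored pattern: every "NA" substring is replaced.
broad <- gsub("NA", ".", x)

# Anchored pattern: only cells that are exactly "NA" are replaced.
anchored <- gsub("^NA$", ".", x)

print(broad)
print(anchored)
```

With the anchored pattern, only the cell that consists of "NA" and nothing else is turned into ".".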

3) In .makeVcfGeno, fields with multiple values (e.g. GL) get split into 
separate fields. The following got me round this (note the use of 
recursive=FALSE in unlist, the extra clean-up lines, and the change from 
nsub to length(geno) in the lst line, which removes warnings from split):
     subj <- lapply(seq_len(nsub),
         function(i) {
             dat <- unlist(lapply(geno, function(fld) fld[,i]),
                           use.names=FALSE, recursive=FALSE)
             mat <- matrix(dat, ncol=length(geno))
             mat <- gsub("^NA$", NA, mat)
             lst <- split(mat, rep(seq_len(nrec), length(geno)))
             rmna <- lapply(lst, na.omit)
             collapsedText <- .pasteCollapse(CharacterList(rmna), 
             rmc <- gsub("c\\(([^\\)]*)\\)", "\\1", collapsedText)
             gsub(" ", "", rmc)
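
To show what the clean-up lines and the length(geno) change are doing, here is a base R sketch on toy data. The field names and values are made up, ":" is assumed as the field separator, and a plain paste stands in for .pasteCollapse, so this is not the package code:

```r
# (a) Clean-up: a multi-valued cell coerced to text looks like "c(...)".
gl_text <- as.character(list(c(-1.2, -0.5, -3)))   # "c(-1.2, -0.5, -3)"
rmc <- gsub("c\\(([^\\)]*)\\)", "\\1", gl_text)      # drop the c( ) wrapper
gl_clean <- gsub(" ", "", rmc)                       # drop the spaces

# (b) Row-wise split for one sample: 3 records (rows) x 2 fields (columns).
nrec <- 3
fields <- c("GT", "DP")                 # hypothetical field names
mat <- matrix(c("0/1", "NA", "1/1",     # GT column
                "12",  "NA", "NA"),     # DP column
              ncol = length(fields))
mat[mat == "NA"] <- NA                  # same idea as gsub("^NA$", NA, mat)

# A matrix is stored column by column, so repeating the row index once per
# column (not once per sample) labels each cell with its record number.
by_record <- split(mat, rep(seq_len(nrec), length(fields)))

# Drop missing cells and join what is left, one string per record.
joined <- unname(vapply(by_record,
                        function(x) paste(x[!is.na(x)], collapse = ":"),
                        character(1)))
```

The grouping vector has to be as long as the matrix has cells, i.e. nrec times the number of fields, which is why length(geno) rather than nsub is the right repeat count.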

Great package!


