[Bioc-devel] Warning when Reading Example VCF

Obenchain, Valerie Valerie.Obenchain at roswellpark.org
Tue Sep 27 19:07:24 CEST 2016


Hi Dario,

On 09/27/2016 01:00 AM, Dario Strbenac wrote:
> Good day,
>
> When importing a VCF file from VariantAnnotation's data directory into R, a warning is emitted.
>
> library(VariantAnnotation)
> aFile <- system.file("extdata", "hapmap_exome_chr22.vcf.gz", package = "VariantAnnotation")
> aSet <- readVcf(aFile, "hg19")
>
> Warning message:
> In .bcfHeaderAsSimpleList(header) :
>   duplicate keys in header will be forced to unique rownames

Header info is grouped by category (geno, info, meta) and put into
DataFrames. Within each grouping, row names are taken from different
parts of the header.  In this case the warning comes from having 2
'source' lines.

less hapmap_exome_chr22.vcf.gz

...
##source=CalculateGenotypePosteriors
##source=SelectVariants

These end up as element 'META' in the meta() list. As you can see, this
is a catch-all category that holds key-value pairs that don't meet
other criteria.

> meta(hdr)$META
DataFrame with 5 rows and 1 column
           Value
           <character>
fileformat VCFv4.1
GVCFBlock  minGQ=0(inclusive),maxGQ=1(exclusive)
reference  file:///projects/cidr/Amos/amos_cidr/Analysis_Pipeline_Files/human_g1k_v37_decoy.fasta
source     CalculateGenotypePosteriors
source.1   SelectVariants
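
The 'source.1' row name follows the pattern that base R's make.unique() produces for repeated names. The sketch below only illustrates that effect in base R; it is not VariantAnnotation's actual code, and `parse_meta` is a hypothetical helper written for this example. The header lines are the ones quoted earlier in this message.

```r
# Hypothetical helper (not part of VariantAnnotation): split simple
# '##key=value' header lines into a data.frame keyed by row name.
parse_meta <- function(lines) {
  body  <- sub("^##", "", lines)             # drop the leading '##'
  eq    <- regexpr("=", body, fixed = TRUE)  # position of the first '='
  key   <- substr(body, 1, eq - 1)
  value <- substr(body, eq + 1, nchar(body))
  if (anyDuplicated(key) > 0)
    warning("duplicate keys in header will be forced to unique rownames")
  # make.unique() appends .1, .2, ... to repeated keys
  data.frame(Value = value, row.names = make.unique(key),
             stringsAsFactors = FALSE)
}

hdr_lines <- c("##fileformat=VCFv4.1",
               "##source=CalculateGenotypePosteriors",
               "##source=SelectVariants")
meta_df <- suppressWarnings(parse_meta(hdr_lines))
rownames(meta_df)  # "fileformat" "source" "source.1"
```

Both 'source' values are kept; only the second row name changes.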


>
> Is there some problem with the format of one of the VCF files distributed with VariantAnnotation? I wouldn't expect any package data files to emit warnings to the end user.

It's uncommon (I think) for files to have multiple 'source' lines but
not incorrect. The warning was added as a heads up and was probably done
after the file was already in the package. I've added some documentation
to ?scanVcfHeader about this. If others feel strongly about this it
could be changed to a message or just removed.
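
In the meantime, anyone who would rather not see this particular warning could muffle only the warnings whose message matches it and let all others through. This is a sketch in base R: `muffle_matching` is a hypothetical helper, not something VariantAnnotation provides, and the readVcf() call appears only as a commented usage line because it needs the package installed.

```r
# Hypothetical helper: evaluate 'expr', silencing only those warnings
# whose message contains 'pattern' (matched as a fixed string).
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w), fixed = TRUE))
        invokeRestart("muffleWarning")
    }
  )
}

# Usage with the call from this thread (requires VariantAnnotation):
# aSet <- muffle_matching(readVcf(aFile, "hg19"), "duplicate keys in header")

# Self-contained check with a stand-in function that emits the same message:
noisy <- function() {
  warning("duplicate keys in header will be forced to unique rownames")
  "done"
}
result <- muffle_matching(noisy(), "duplicate keys in header")
```

Matching on the message text is brittle if the wording changes in a later release; wrapping the call in suppressWarnings() is simpler but hides every warning, not just this one.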

Valerie

>
> R version 3.3.1 (2016-06-21)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 15.10
> VariantAnnotation 1.18.7
>
> --------------------------------------
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>


