[Bioc-devel] VariantAnnotation: 'readVcf' returns incomplete 'CSQ' entry

Sean Davis sdavis2 at mail.nih.gov
Tue Feb 26 19:29:38 CET 2013


On Tue, Feb 26, 2013 at 1:23 PM, Julian Gehring <julian.gehring at embl.de> wrote:
> Hi Richard,
>
> It is true that this is not according to the spec.  However, the 'ensemblVEP'
> package has the 'parseCSQToGRanges' method, which extracts this kind of CSQ
> information from a 'VCF' object (as read in by 'readVcf'), and it fails on
> the prematurely truncated entries.

And Ensembl is not being "arbitrary" in this; the 'p.=' notation follows
the HGVS recommendation for silent changes:

http://www.hgvs.org/mutnomen/recs-prot.html#silent

Interesting how these standards work, isn't it....
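
To make the conflict concrete, here is a minimal Python sketch (not the
VariantAnnotation implementation; 'parse_info' and 'split_csq' are
hypothetical helper names). It assumes INFO entries are separated by ';'
and that each key is separated from its value by the first '='. The
'strict' mode shows one way a truncation at '(p.' could arise; it is not
a claim about what 'readVcf' does internally. The INFO string in the
usage lines is an abbreviated version of the record in this thread.

```python
def parse_info(info, strict=False):
    """Parse a VCF INFO string into a dict (flags map to True).

    strict=False: split each entry on the FIRST '=' only, so a value
                  such as '...(p.=)|' survives intact.
    strict=True:  split on EVERY '=' and keep only the second piece,
                  which cuts the value off at the next '='.
    """
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue  # skip empty entries, e.g. from a trailing ';'
        if "=" not in entry:
            fields[entry] = True  # flag without a value
            continue
        if strict:
            parts = entry.split("=")
            key, value = parts[0], parts[1]
        else:
            key, _, value = entry.partition("=")
        fields[key] = value
    return fields


def split_csq(value):
    """Split a CSQ value into one list of '|'-separated fields per annotation.

    Assumes multiple annotations are separated by ','.
    """
    return [record.split("|") for record in value.split(",")]


# Abbreviated INFO field based on the record quoted in this thread.
info = ("NS=1;CSQ=G|ENSG00000188976|ENST00000327044|Transcript|"
        "synonymous_variant|ENST00000327044.6:c.1084T>C(p.=)|")

tolerant = parse_info(info)["CSQ"]             # ends with '(p.=)|'
truncated = parse_info(info, strict=True)["CSQ"]  # ends with '(p.'
```

Splitting on the first '=' only keeps the parser usable for off-spec
files while leaving spec-conforming files unaffected, since those never
contain a second '=' inside an entry.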

> Perhaps having a VCF reader that tolerates this
> is useful.
>
> Best wishes
> Julian
>
>
> On 02/26/2013 07:17 PM, Richard Pearson wrote:
>>
>> Hi Julian
>>
>> I think your vcf file is off-spec. From the vcf spec at
>>
>> http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41
>>
>> "INFO additional information: (String, no white-space, semi-colons, or
>> equals-signs permitted;"
>>
>> So the equals sign in your INFO value (the 'p.=' at the end of the CSQ
>> entry) isn't allowed.
>>
>> HTH
>>
>> Richard
>>
>>
>> On 26/02/2013 17:50, Julian Gehring wrote:
>>>
>>> Hi,
>>>
>>> I tried to use the latest devel version of 'readVcf' to import a VCF
>>> file with information from the Ensembl VEP
>>> (http://www.ensembl.org/info/docs/variation/vep/index.html).
>>>
>>> For a VCF entry with CSQ information like
>>>
>>> ""
>>> 1    887899    .    A    G    .    .
>>>
>>> NS=1;CSQ=G|ENSG00000188976|ENST00000327044|Transcript|synonymous_variant|1134|1084|362|L|Ttg/Ctg||10/19||NOC2L|||||||YES||||ENSP00000317992||CCDS3.1|ENST00000327044.6:c.1084T>C|ENST00000327044.6:c.1084T>C(p.=)|
>>> AR:RR:DP:AAP:RAP    2:14:16:1:1
>>> ""
>>>
>>> the imported INFO field is truncated at the equals sign in '(p.=)',
>>> without any warning:
>>>
>>> ""
>>>
>>> G|ENSG00000188976|ENST00000327044|Transcript|synonymous_variant|1134|1084|362|L|Ttg/Ctg||10/19||NOC2L|||||||YES||||ENSP00000317992||CCDS3.1|ENST00000327044.6:c.1084T>C|ENST00000327044.6:c.1084T>C(p.
>>>
>>> ""
>>>
>>> Versions:
>>> R 2013-02-25 r62062
>>> VariantAnnotation_1.5.39
>>>
>>>
>>> Best wishes
>>> Julian
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>
>
