[BioC] How to combine VCF-class and data.frame with annotations

Martin Morgan mtmorgan at fhcrc.org
Wed Aug 29 18:20:06 CEST 2012


On 08/29/2012 07:34 AM, Stefan Dentro wrote:
> That works well!
>
> vcf.df = as(elementMetadata(vcf), "DataFrame")

For what it's worth, elementMetadata(vcf) is already a DataFrame, so there
is no need for the coercion in the line above.

Martin
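
A minimal sketch of the point above, not taken from the thread itself: a toy
GRanges stands in for the ranges of a VCF object, and `anno` is a made-up
annotation table (the gene names and column values are hypothetical). It
assumes the GenomicRanges package is installed, and uses mcols(), the
accessor now generally preferred over elementMetadata(). Only the plain
data.frame is coerced; the existing metadata is used as-is.

```r
# Sketch only: toy data standing in for a real VCF and annotation table.
library(GenomicRanges)  # attaches IRanges and S4Vectors (DataFrame, mcols)

# Three single-base variants with one existing metadata column (REF).
gr <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
              ranges   = IRanges(start = c(100, 250, 400), width = 1),
              REF      = c("A", "G", "T"))

# Hypothetical annotation, one row per variant, in the same order as gr.
anno <- data.frame(location         = c("exonic", "intronic", "intergenic"),
                   external_gene_id = c("GENE1", "GENE2", NA),
                   go_id            = c("GO:0008150", NA, NA),
                   stringsAsFactors = FALSE)

# The metadata is already a DataFrame, so it needs no coercion;
# the row counts must match before column-binding.
stopifnot(is(mcols(gr), "DataFrame"), nrow(anno) == length(gr))

# Coerce only the data.frame, then bind the columns and assign back.
mcols(gr) <- cbind(mcols(gr), as(anno, "DataFrame"))
```

With a real VCF object the same pattern would apply to its row ranges
(rowRanges(vcf) in current VariantAnnotation releases); that step is not
exercised here, and it relies on the annotation rows being in the same order
as the variants.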

> anno.df = as(anno, "DataFrame")
> elementMetadata(vcf) = cbind(vcf.df, anno.df)
>
> Cheers!
>
> Stefan
>
> On Wed, Aug 29, 2012 at 3:24 PM, Vincent Carey
> <stvjc at channing.harvard.edu> wrote:
>
>>
>>
>> On Wed, Aug 29, 2012 at 10:16 AM, Stefan Dentro <sdentro at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I'm trying to read in a VCF file containing mutation information from one
>>> sample, annotate it with Ensembl gene and GO information and then plot
>>> using ggbio. But I keep running into the problem of how to combine all
>>> the information in a single GRanges object.
>>>
>>> So I've got a VCF-class object and a data.frame containing, for each
>>> mutation, whether it is exonic, intronic or intergenic, a gene identifier
>>> (possibly NA) and a GO identifier (possibly NA). ggbio accepts
>>> GRanges-class objects, so I would like to merge the VCF-class object and
>>> the data.frame into one GRanges object containing all the information.
>>>
>>> I can think of multiple ways of doing this, but none really works or is
>>> satisfactory:
>>> 1) Read in the VCF and convert it into a GRanges object. cbind the
>>> elementMetadata with the data.frame and create a new GRanges object.
>>>
>>> Problem: elementMetadata cannot be merged with a data.frame:
>>> Error in FUN(X[[3L]], ...) :
>>>    conversion of list columns to a data.frame is not supported
>>>
>>>
>> try a DataFrame instance
>>
>>
>>
>>> 2) Directly annotate the VCF file through cbind, again:
>>> Error in FUN(X[[3L]], ...) :
>>>    conversion of list columns to a data.frame is not supported
>>>
>>> 3) Convert the VCF to GRanges and add each column in the data.frame
>>> separately:
>>> gr$external_gene_id=df$external_gene_id
>>>
>>> There must be a simpler way to do this.
>>>
>>> 4) Convert the VCF into a tab-delimited file using vcf-to-tab in vcftools
>>> and read it in as a data.frame. Merge both data.frames and create a
>>> GRanges object.
>>>
>>> I would think it is all possible within R, without converting the VCF file
>>> first. This one comes really close, though.
>>>
>>> It boils down to the following question: What is the proper way of doing
>>> this using the available R genomics packages?
>>>
>>> Best wishes,
>>> Stefan
>>>
>>>          [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


