[BioC] VariantAnnotation: fine define Locating variants in and around genes
Valerie Obenchain
vobencha at fhcrc.org
Thu Jan 31 23:24:07 CET 2013
On 01/31/2013 01:48 PM, Fabrice Tourre wrote:
> Valerie,
>
> Thank you for your reply.
>
> Is there a function in VariantAnnotation to know whether a snp is
> within transcription region but outside coding region? Or is it in
> first exon/intron?
Yes, the function is called locateVariants(). Use AllVariants() as the
'region' argument and subset the result on the UTR and intron regions.
From the example below,
myregions <- c("intron", "threeUTR", "fiveUTR")
loc_coding[loc_coding$LOCATION %in% myregions]
Valerie
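
The suggestion above can be sketched end to end as follows. This is a
minimal sketch, not the man page code verbatim: it assumes the
VariantAnnotation and TxDb.Hsapiens.UCSC.hg19.knownGene packages are
installed and uses the gl_chr1.vcf example file shipped with
VariantAnnotation. The names 'loc_all', 'noncoding' and
'noncoding_variants' are illustrative.

```r
# Sketch only: assumes both Bioconductor packages below are installed.
library(VariantAnnotation)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# Example VCF shipped with VariantAnnotation.
fl <- system.file("extdata", "gl_chr1.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# Make the VCF seqlevels match the "chr"-prefixed names used by the TxDb.
vcf_adj <- renameSeqlevels(vcf, paste0("chr", seqlevels(vcf)))

# Annotate each variant against every region type at once.
loc_all <- locateVariants(vcf_adj, txdb, AllVariants())

# Keep hits that fall inside a transcript but outside the coding sequence.
myregions <- c("intron", "threeUTR", "fiveUTR")
noncoding <- loc_all[loc_all$LOCATION %in% myregions]

# QUERYID is the row number in the original query, so it can be used
# to pull the matching variants back out of the VCF object.
noncoding_variants <- vcf_adj[unique(noncoding$QUERYID), ]
```

A variant that overlaps several transcripts appears once per transcript
in 'noncoding', which is why unique() is applied to QUERYID before
indexing back into the VCF.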
>
> On Thu, Jan 31, 2013 at 4:30 PM, Valerie Obenchain <vobencha at fhcrc.org> wrote:
>> Hi Fabrice,
>>
>> To identify snps (or any ranges) in introns only, use IntronVariants() as
>> the 'region' argument. The CodingVariants are the exon regions. If you want
>> all regions except coding, I would suggest using AllVariants().
>>
>> This output is from the man page example. The 'loc_coding' name is
>> misleading since AllVariants() was used as the 'region'. I have changed it
>> to 'loc_all' in the devel branch.
>>
>> > loc_coding <- locateVariants(vcf_adj, txdb, AllVariants())
>> > loc_coding
>> GRanges with 16 ranges and 7 metadata columns:
>>   seqnames          ranges strand |   LOCATION   QUERYID
>>      <Rle>       <IRanges>  <Rle> |   <factor> <integer>
>>       chr1 [ 13220, 13220]      * |     intron         1
>>       chr1 [ 13220, 13220]      * | spliceSite         1
>>       chr1 [ 13220, 13220]      * |     intron         1
>>       chr1 [ 13220, 13220]      * |     intron         1
>>       chr1 [ 13220, 13220]      * | spliceSite         1
>> ...
>> ...
>>
>> This example has variants in splice sites, introns, coding and intergenic
>> regions.
>>
>> > tbl <- table(loc_coding$LOCATION)
>> > tbl[tbl > 0]
>>
>> spliceSite     intron     coding intergenic
>>          2          7          2          5
>>
>> The result can be subset on LOCATION for the region of interest. The QUERYID
>> column maps back to the row number in the original 'query' argument to
>> locateVariants().
>>
>> > introns <- loc_coding[loc_coding$LOCATION == "intron", ]
>> > head(introns, 3)
>> GRanges with 3 ranges and 7 metadata columns:
>>   seqnames         ranges strand | LOCATION   QUERYID      TXID
>>      <Rle>      <IRanges>  <Rle> | <factor> <integer> <integer>
>>       chr1 [13220, 13220]      * |   intron         1         1
>>       chr1 [13220, 13220]      * |   intron         1         2
>>       chr1 [13220, 13220]      * |   intron         1         3
>>
>>
>> Valerie
>>
>>
>>
>> On 01/31/2013 12:34 PM, Fabrice Tourre wrote:
>>>
>>> Dear list,
>>>
>>> I am using VariantAnnotation to Locate variants in and around genes.
>>>
>>> In VariantAnnotation, the region is defined as: CodingVariants,
>>> IntronVariants, FiveUTRVariants, ThreeUTRVariants, IntergenicVariants,
>>> SpliceSiteVariants or PromoterVariants.
>>>
>>> Is it possible to know whether a snp is in an exon/intron within the
>>> transcription region but outside the coding region?
>>>
>>> Thanks.
>>>