[Bioc-devel] restrictToSNV for VCF
Hervé Pagès
hpages at fhcrc.org
Fri Mar 21 01:20:51 CET 2014
Hi,
On 03/19/2014 01:10 PM, Michael Lawrence wrote:
> You can apparently use 1D extraction for VCF, which is a little surprising;
> I learned it from restrictToSNV.
This is inherited from SummarizedExperiment:
> example(SummarizedExperiment)
> se1
class: SummarizedExperiment
dim: 200 6
exptData(0):
assays(1): counts
rownames: NULL
rowData metadata column names(0):
colnames(6): A B ... E F
colData names(1): Treatment
> se1[1:4]
class: SummarizedExperiment
dim: 4 6
exptData(0):
assays(1): counts
rownames: NULL
rowData metadata column names(0):
colnames(6): A B ... E F
colData names(1): Treatment
To me that means that a SummarizedExperiment has a length
(conceptually), and that this length is the number of rows.
It would actually help if a "length" method was defined:
> length(se1)
[1] 1
That would automatically fix many convenience [ wrappers like head(),
tail(), rev(), etc...
> head(se1)
class: SummarizedExperiment
dim: 1 6
exptData(0):
assays(1): counts
rownames: NULL
rowData metadata column names(0):
colnames(6): A B ... E F
colData names(1): Treatment
> rev(se1)
class: SummarizedExperiment
dim: 1 6
exptData(0):
assays(1): counts
rownames: NULL
rowData metadata column names(0):
colnames(6): A B ... E F
colData names(1): Treatment
Following that logic names(se1) also probably return colnames(se1).
H.
>
>
>
>
> On Wed, Mar 19, 2014 at 1:07 PM, Vincent Carey
> <stvjc at channing.harvard.edu>wrote:
>
>>
>>
>>
>> On Wed, Mar 19, 2014 at 4:00 PM, Michael Lawrence <
>> lawrence.michael at gene.com> wrote:
>>
>>> It would be nice to have functions like isSNV, isIndel, isDeletion, etc
>>> that at least provide precise definitions of the terminology. I've added
>>> these, but they're designed only for VRanges. Should work for ExpandedVCF.
>>>
>>> Also, it would be nice if restrictToSNV just assumed that alt(x) must be
>>> something with nchar() support (with special handling for any List), so
>>> that the 'character' vector of alt,VRanges would work immediately.
>>> Basically restrictToSNV should just be x[isSNV(x)]. Is there even a
>>> use-case for the restrictToSNV abstraction if we did that?
>>>
>>>
>> for VCF instance it would be x[isSNV(x),] and indeed I think that would be
>> sufficient. i like the idea of having this family of predicates for
>> variant classes to allow such selections
>>
>>
>>
>>> Michael
>>>
>>>
>>>
>>> On Tue, Mar 18, 2014 at 10:36 AM, Valerie Obenchain <vobencha at fhcrc.org>wrote:
>>>
>>>> Hi,
>>>>
>>>> I've added a restrictToSNV() function to VariantAnnotation (1.9.46). The
>>>> return value is a subset VCF object containing SNVs only. The function
>>>> operates on CollapsedVCF or ExapandedVCF and the alt(VCF) value must be
>>>> nucleotides (i.e., no structural variants).
>>>>
>>>> A variant is considered a SNV if the nucleotide sequences in both
>>>> ref(vcf) and alt(x) are of length 1. I have a question about how variants
>>>> with multiple 'ALT' values should be handled.
>>>>
>>>> Should we consider row 4 a SNV? One 'ALT' is length 1, the other is not.
>>>>
>>>> ALT <- DNAStringSetList("A", c("TT"), c("G", "A"), c("TT", "C"))
>>>> REF <- DNAStringSet(c("G", c("AA"), "T", "G"))
>>>>
>>>>> DataFrame(REF, ALT)
>>>>>>
>>>>> DataFrame with 4 rows and 2 columns
>>>>> REF ALT
>>>>> <DNAStringSet> <DNAStringSetList>
>>>>> 1 G A
>>>>> 2 AA TT
>>>>> 3 T G,A
>>>>> 4 G TT,C
>>>>>
>>>>
>>>>
>>>> Thanks.
>>>> Valerie
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>
>>>
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
More information about the Bioc-devel
mailing list