[BioC] SNP6 data to VCF
Sean Davis
sdavis2 at mail.nih.gov
Mon Feb 13 20:36:16 CET 2012
On Mon, Feb 13, 2012 at 2:28 PM, Vincent Carey
<stvjc at channing.harvard.edu> wrote:
>
>
> On Mon, Feb 13, 2012 at 2:13 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>>
>> Hi, all.
>>
>> I'm a little rusty on my oligo array software tools. I'm interested
>> in taking Affymetrix SNP6 data to VCF format. To do that, I am going
>> to need to:
>>
>> 1. Call SNPs
>> 2. Determine strand and reference allele for each SNP on the array
>> 3. Assign the correct alleles to each SNP for each sample
>
>
> For 2 and 3, pd.genomewidesnp.6 has the metadata:
>
>> con = pd.genomewidesnp.6@getdb()
>> dbListTables(con)
>  [1] "featureSet"        "featureSetCNV"     "fragmentLength"
>  [4] "fragmentLengthCNV" "pmfeature"         "pmfeatureCNV"
>  [7] "sequence"          "sequenceCNV"       "sqlite_stat1"
> [10] "table_info"
>
>> ss = dbGetQuery(con, "select * from featureSet limit 5")
>> ss
>   fsetid    man_fsetid affy_snp_id dbsnp_rs_id chrom physical_pos strand
> 1      1 SNP_A-2131660          NA   rs2887286     1      1156131      0
> 2      2 SNP_A-1967418          NA   rs1496555     1      2234251      0
> 3      3 SNP_A-1969580          NA  rs41477744     1      2329564      0
> 4      4 SNP_A-4263484          NA   rs3890745     1      2553624      0
> 5      5 SNP_A-1978185          NA  rs10492936     1      2936870      1
>   cytoband allele_a allele_b
> 1   p36.33        C        T
> 2   p36.33        A        G
> 3   p36.32        A        G
> 4   p36.32        C        T
> 5   p36.32        C        T
Told you I was rusty. Thanks, Vince.
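
For the archives, here is a minimal sketch of how the strand and allele
columns above might feed steps 2 and 3. It assumes that strand 1 means the
alleles are reported on the minus strand and strand 0 on the plus strand;
that coding is an assumption, not something confirmed in this thread, so
check it against the annotation before relying on it. The helper name
to_plus_strand is made up for illustration, and the five rows are copied by
hand from the query output rather than read from the database.

```r
# Rows copied by hand from the featureSet query output above.
ss <- data.frame(
  man_fsetid   = c("SNP_A-2131660", "SNP_A-1967418", "SNP_A-1969580",
                   "SNP_A-4263484", "SNP_A-1978185"),
  dbsnp_rs_id  = c("rs2887286", "rs1496555", "rs41477744",
                   "rs3890745", "rs10492936"),
  chrom        = "1",
  physical_pos = c(1156131L, 2234251L, 2329564L, 2553624L, 2936870L),
  strand       = c(0L, 0L, 0L, 0L, 1L),
  allele_a     = c("C", "A", "A", "C", "C"),
  allele_b     = c("T", "G", "G", "T", "T"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: complement an allele when the SNP is annotated on the
# minus strand. ASSUMPTION: strand == 1 means minus, strand == 0 means plus.
to_plus_strand <- function(allele, strand) {
  ifelse(strand == 1L, chartr("ACGT", "TGCA", allele), allele)
}

ss$plus_a <- to_plus_strand(ss$allele_a, ss$strand)
ss$plus_b <- to_plus_strand(ss$allele_b, ss$strand)

print(ss[, c("dbsnp_rs_id", "strand", "allele_a", "allele_b",
             "plus_a", "plus_b")])
```

Under that assumption only the fifth SNP changes (C/T becomes G/A). Deciding
which of the two plus-strand alleles is the reference still needs a lookup
against the genome build, which this sketch does not attempt.
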
Sean
>>
>> 4. Write out the VCF file with the correct genotypes (on the positive
>> strand, reference allele correctly specified)
>>
>> What is the best way to do steps 1-3? I'll deal with step 4 since I
>> don't think that has been implemented directly.
>>
>> Thanks,
>> Sean
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>