[BioC] Translating AB/BB/AA into a SNP with Illumina data
Stephanie M. Gogarten
sdmorris at u.washington.edu
Mon Jul 16 18:19:20 CEST 2012
Hi Lavinia,
The GWASTools package was designed to work with this type of data.
You can download annotation for Illumina arrays from their website:
https://icom.illumina.com/. They now require that you register with
their site to download files. Once you have logged in, click
"Downloads" in the menu on the left and then "Genotyping/LOH/CNV" in the
menu on the right, and look for the Human Omni1 Quad link. The file
that you want is called HumanOmni1-Quad_v1-0_H_csv.zip, and looks like this:
IlmnID,Name,IlmnStrand,SNP,AddressA_ID,AlleleA_ProbeSeq,AddressB_ID,AlleleB_ProbeSeq,GenomeBuild,Chr,MapInfo,Ploidy,Species,Source,SourceVersion,SourceStrand,SourceSeq,TopGenomicSeq,BeadSetID,Exp_Clusters,Intensity_Only,RefStrand
200006-0_T_R_1853021091,200006,TOP,[A/G],0060702346,AGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA,,,37.1,9,139926402,diploid,Homo sapiens,ILLUMINA,0,BOT,ACATGCCCCACTCAGCGCCACCCCCGTCCTCCCCTCCCAGGTTGCCTAGCTGTCCCCAGC[T/C]TGGGCCTCCCCGAGGGCCAGACACTCACCAGCATTATTCATCCACAGTCTCCCAGGATCA,TGATCCTGGGAGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA[A/G]GCTGGGGACAGCTAGGCAACCTGGGAGGGGAGGACGGGGGTGGCGCTGAGTGGGGCATGT,163,3,0,-
The "SNP" column gives the A/B allele designation for a particular
SNP (format [A/B]), and the "IlmnStrand" column tells you whether those
alleles are reported on the TOP or BOT strand. For SNP 200006 above, [A/G]
on TOP means that a call of AA is A/A, AB is A/G, and BB is G/G on the TOP
strand. (See here for a useful article on how
to convert between different strand designations:
http://www.sciencedirect.com/science/article/pii/S0168952512000704)
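
If you just want a quick lookup outside of any package, the translation described above can be sketched in a few lines of Python. This is only an illustration, not how GWASTools does it: the function names load_alleles and translate_call are made up for this example, the inline MANIFEST_CSV is a cut-down stand-in that keeps only the Name, IlmnStrand and SNP columns, and the real manifest file may have extra lines before the column header that you would need to skip first.

```python
import csv
import io
import re

# Cut-down stand-in for the manifest: only the three columns used here.
MANIFEST_CSV = """Name,IlmnStrand,SNP
200006,TOP,[A/G]
"""

def load_alleles(handle):
    """Map SNP name -> (allele A, allele B, Illumina strand)."""
    alleles = {}
    for row in csv.DictReader(handle):
        # The SNP column has the form [A/B], e.g. [A/G].
        match = re.fullmatch(r"\[([ACGT])/([ACGT])\]", row["SNP"])
        if match:
            alleles[row["Name"]] = (match.group(1), match.group(2), row["IlmnStrand"])
    return alleles

def translate_call(call, allele_a, allele_b):
    """Turn an AA/AB/BB call into nucleotides; return None for anything else."""
    if call not in ("AA", "AB", "BB"):
        return None
    lookup = {"A": allele_a, "B": allele_b}
    return "/".join(lookup[c] for c in call)

alleles = load_alleles(io.StringIO(MANIFEST_CSV))
allele_a, allele_b, strand = alleles["200006"]
genotype = translate_call("AB", allele_a, allele_b)
print(genotype, strand)  # A/G TOP
```

The result is still expressed on the Illumina strand given in the third element of the tuple, so any comparison with another data source needs the strand conversion discussed in the article linked above.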
Stephanie Gogarten
Research Scientist, Biostatistics
University of Washington
On 7/16/12 3:00 AM, bioconductor-request at r-project.org wrote:
> Message: 3
> Date: Mon, 16 Jul 2012 13:59:33 +1000
> From: "Lavinia Gordon"<lavinia.gordon at mcri.edu.au>
> To:<bioconductor at r-project.org>
> Subject: [BioC] Translating AB/BB/AA into a SNP with Illumina data
> Message-ID:<87223629775F2049917889888F597633FD720F at murmx.mcri.edu.au>
> Content-Type: text/plain; charset="us-ascii"
>
> Dear all,
>
> I am working with Illumina Human Omni1 Quad data. I only have access to
> processed data, e.g:
> ID_REF VALUE Score Theta R B Allele Freq Log R Ratio
> 200006 AB 0.8273118 0.4800678 2.651576 0.5337635 0.1516016
>
> I would like to know what the SNP is at this position and wondered if
> there are any components within the Bioconductor packages that can deal
> with this data, taking into account the TOP/BTM strand approach that
> Illumina uses. I have previously had great success with crlmm, but that
> was working from the raw IDAT files.
>
> With thanks for your time,
>
> Lavinia Gordon
> Senior Research Officer
> Quantitative Sciences Core, Bioinformatics
>
> Murdoch Childrens Research Institute
> The Royal Children's Hospital
> Flemington Road Parkville Victoria 3052 Australia
> T 03 8341 6221
> www.mcri.edu.au