[BioC] help with analysis of genotyping data from Illumina HumanOmni5-4v1_B chip

Abhishek Pratap abhishek.vit at gmail.com
Thu Jan 16 21:03:48 CET 2014

Thanks a lot, Stephanie, for your quick response. This was very
useful info. I will follow up with package-specific questions, if any.


On Tue, Jan 14, 2014 at 1:54 PM, Stephanie M. Gogarten
<sdmorris at u.washington.edu> wrote:
> Hi Abhi,
> 1. The GWASTools package was designed for QC of precalled array data. See
> the "Data Cleaning" vignette for a recommended workflow. You might also
> want to look at Laurie et al. 2010 in Genetic Epidemiology
> (doi:10.1002/gepi.20516), as the vignette implements the QC methods
> described therein.
> 2. I usually get the annotation file from Illumina (it would probably be
> called HumanOmni5-4v1_B.csv).  Your collaborators may have this file, or you
> could register with Illumina's website to download it.  It has rsID,
> chromosome, position, alleles, and probe sequences.
> 3. I don't know of a good way at the moment, but "export GWASTools objects
> as VCF" is going on my to-do list.  I recently used the un-slick way of
> PLINK file -> load in PLINK/SEQ -> export VCF.  You might also try creating
> a VariantAnnotation object from your data and using the writeVcf method.
> Stephanie
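
On point 3: to make the target concrete, here is a minimal Python sketch of
writing one multi-sample VCF from an in-memory genotype matrix. It is not
GWASTools, PLINK/SEQ, or VariantAnnotation code. The SNP records, sample
names, the "LowQual" filter label, and the write_vcf helper are all
hypothetical, and it assumes the array alleles have already been mapped to
REF/ALT, which is a separate step not shown here.

```python
import io

# Hypothetical inputs: IDs, positions, and alleles are made up for illustration.
snps = [
    # (chrom, pos, rsid, ref, alt)
    ("1", 1000, "rs0000001", "A", "G"),
    ("1", 2000, "rs0000002", "C", "T"),
]
samples = ["S1", "S2", "S3"]

# calls[i][j] = number of ALT alleles for SNP i in sample j; None = no-call.
calls = [
    [0, 1, 2],
    [None, 0, 1],
]

# Map ALT-allele counts to unphased VCF genotype strings.
GT = {0: "0/0", 1: "0/1", 2: "1/1", None: "./."}


def write_vcf(out, snps, samples, calls, low_quality=()):
    """Write a minimal VCF; SNPs whose ID is in low_quality get FILTER=LowQual."""
    out.write("##fileformat=VCFv4.1\n")
    out.write('##FILTER=<ID=LowQual,Description="Failed array QC (assumed label)">\n')
    out.write('##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n')
    header = ["#CHROM", "POS", "ID", "REF", "ALT",
              "QUAL", "FILTER", "INFO", "FORMAT"] + list(samples)
    out.write("\t".join(header) + "\n")
    for (chrom, pos, rsid, ref, alt), row in zip(snps, calls):
        flt = "LowQual" if rsid in low_quality else "PASS"
        fields = [chrom, str(pos), rsid, ref, alt, ".", flt, ".", "GT"]
        fields += [GT[g] for g in row]
        out.write("\t".join(fields) + "\n")


buf = io.StringIO()
write_vcf(buf, snps, samples, calls, low_quality={"rs0000002"})
vcf_text = buf.getvalue()
print(vcf_text, end="")
```

Here the tag sits in the per-SNP FILTER column. Tagging whole samples would
need a different convention, since VCF has no per-sample FILTER column.
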
> On 1/14/14 11:19 AM, Abhishek Pratap wrote:
>> Hi Guys
>> We have recently obtained precalled genotype data from our
>> collaborators, generated on the Illumina Human Omni5 array chip
>> (HumanOmni5-4v1_B). The genotypes have already been called using
>> Illumina's GenomeStudio.
>> Being new to array-based genotyping data (coming from the sequencing
>> arena), I would like to know the following.
>> 1. What QC can be done on these genotype data files (200 samples) to
>> ascertain their quality and filter out the low-quality calls?
>> 2. Does Bioconductor have a package for annotation of this chip
>> (HumanOmni5-4v1_B)? I was able to find "humanomni5quadv1bCrlmm", but I am
>> not sure whether it would give me the annotation on loci / SNPs.
>> 3. Is there an existing slick way to create VCF files from these 200
>> genotype files? Our goal is to summarize the information in a single VCF
>> across all the samples, tagging the low-quality ones.
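
For question 1, one common check on precalled data is the missing call rate
per sample (the same idea applies per SNP). Below is a minimal Python sketch
of that single check. It is not the GWASTools workflow; the sample names, the
toy genotypes, and the 0.3 threshold are illustrative assumptions only.

```python
# Genotypes are coded as allele counts (0, 1, 2); None marks a no-call.
# All data and the threshold below are illustrative, not recommendations.

def missing_call_rate(genotypes):
    """Return the fraction of no-calls in a list of genotype calls."""
    if not genotypes:
        return 0.0
    return sum(g is None for g in genotypes) / len(genotypes)


def flag_by_missing_rate(matrix, names, threshold):
    """Return the names whose row has a missing call rate above threshold."""
    return [name for name, row in zip(names, matrix)
            if missing_call_rate(row) > threshold]


calls_by_sample = {
    "S1": [0, 1, 2, 1],        # no missing calls
    "S2": [None, None, 2, 0],  # half of the calls missing
    "S3": [1, 1, None, 2],     # one of four calls missing
}
names = list(calls_by_sample)
matrix = [calls_by_sample[n] for n in names]

flagged = flag_by_missing_rate(matrix, names, threshold=0.3)
print(flagged)  # ['S2']
```

Transposing the matrix and passing SNP IDs as names gives the per-SNP
version of the same filter.
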
>> Many thanks!
>> -Abhi