[BioC] Fwd: GWASTools

Stephanie M. Gogarten sdmorris at u.washington.edu
Mon Jan 6 18:19:56 CET 2014

Hi Selina,

On 1/4/14 8:23 AM, Vattathil,Selina wrote:
> Hello Stephanie,
> I am trying to use GWASTools to calculate BAFs and LRRs from some
> Illumina intensity data (thousands of samples, 660W).  I have two
> questions, I'd appreciate any help you can offer!
> 1.  I'd prefer to use the BAFfromClusterMeans function, but I don't have
> a table of cluster means.  I know they are contained in the Illumina
> .egt file, is there a way to convert or output those as text from say
> GenomeStudio, or any other method?

I believe you can output cluster means from GenomeStudio.  However, if you're using GenomeStudio, I highly recommend calculating the BAF and LRR values in GenomeStudio, and exporting those columns along with your intensity data.

> 2.  I ran the example in the manual under BAFfromGenotypes and that
> worked fine.  Then I tried to run it using the Illumina example data by
> simply subbing in the relevant files/objects (code pasted below), and
> the resulting baf and lrr were all NA.  Do you have any insight on what
> might cause that?  I know I can expect all NAs at markers that don't
> meet the specified minimum genotype count.

You need to specify call.method="by.study" in the BAFfromGenotypes call.  The Illumina example samples are all controls that were run on different plates, so the default of "by.plate" gives you only one sample per plate, which cannot meet min.n.genotypes=2, and the result is all NAs. (In general, Illumina samples should be called by study and not by plate.)
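
To illustrate why one sample per plate leads to all NAs, here is a toy sketch in base R.  It is not the GWASTools implementation; it only assumes a simplified rule in which every genotype class in a calling group must be observed at least min.n.genotypes times.  The data and the helper enough_genotypes() are hypothetical.

```r
# Toy illustration only: this is NOT how GWASTools is implemented.
# Assumed rule: a calling group is usable only if each genotype class
# (AA, AB, BB) is seen at least `min.n` times within that group.

# Hypothetical data: 6 samples at one SNP, each run on its own plate.
genotype <- c("AA", "AB", "BB", "AA", "AB", "BB")
plate    <- c("P1", "P2", "P3", "P4", "P5", "P6")
min.n.genotypes <- 2

# Hypothetical helper: returns one logical per group, TRUE when every
# genotype class reaches the minimum count in that group.
enough_genotypes <- function(geno, group, min.n) {
  counts <- table(factor(group),
                  factor(geno, levels = c("AA", "AB", "BB")))
  apply(counts >= min.n, 1, all)
}

# Grouping by plate: each group holds a single sample.
by_plate <- enough_genotypes(genotype, plate, min.n.genotypes)

# Grouping by study: all samples form one group.
by_study <- enough_genotypes(genotype,
                             rep("study", length(genotype)),
                             min.n.genotypes)

print(by_plate)
print(by_study)
```

In this toy case no plate has enough samples, while the single study-wide group does, which mirrors what happens when the example data are processed with "by.plate" versus "by.study".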


> Thanks!
> Selina Vattathil
> Ph.D. candidate, Human and Molecular Genetics
> Graduate School of Biomedical Sciences
> University of Texas at Houston
> ## Illumina test
> library(GWASdata)
> data(illuminaSnpADF)
> data(illuminaScanADF)
> data(illumina_snp_annot)
> nsamp <- nrow(illuminaScanADF)
> xyfile <- system.file("extdata", "illumina_qxy.nc", package="GWASdata")
> xyNC <- NcdfIntensityReader(xyfile)
> xyData <- IntensityData(xyNC, snpAnnot=illuminaSnpADF,
> scanAnnot=illuminaScanADF)
> genofile <- system.file("extdata", "illumina_geno.nc", package="GWASdata")
> genoNC <- NcdfGenotypeReader(genofile)
> genoData <- GenotypeData(genoNC, snpAnnot=illuminaSnpADF,
> scanAnnot=illuminaScanADF)
> # create netCDF file to hold BAF/LRR data
> blfile <- tempfile()
> ncdfCreate(illumina_snp_annot, blfile,
> variables=c("BAlleleFreq","LogRRatio"), n.samples=nsamp)
> # calculate BAF and LRR
> BAFfromGenotypes(xyData, genoData, blfile, min.n.genotypes=2,
>                  )
> #call.method="by.plate", plate.name="plate")
> blNC <- NcdfIntensityReader(blfile)
> baf <- getBAlleleFreq(blNC)
> lrr <- getLogRRatio(blNC)
> close(xyData)
> close(genoData)
> close(blNC)
> file.remove(blfile)
> ## End
