[BioC] Problem obtaining QC statistics from 'simpleaffy'

Jenny Drnevich drnevich at uiuc.edu
Thu Sep 14 22:48:21 CEST 2006


See Crispin Miller's post to the list on August 2, 2006: 
https://stat.ethz.ch/pipermail/bioconductor/attachments/20060802/64c53b71/attachment.pl

Looking through the code of qc.affy (which is what qc() actually calls), it 
wouldn't be that hard to modify it to get most of what you want. It returns:

return(new("QCStats", scale.factors = sfs, target = target,
         percent.present = dpv, average.background = meanbg,
         minimum.background = minbg, maximum.background = maxbg,
         spikes = spike.vals, qc.probes = qc.probe.vals,
         bioBCalls = biobcalls))

You can get everything* except spikes, qc.probes and bioBCalls for any chip 
type. The one catch is that you'll have to supply the alpha1 and alpha2 
cutoffs yourself when dpv is created: mas5calls() defaults to a generic 0.04 
and 0.06 rather than chip-specific values.

*There doesn't appear to be any output on ratios in this function.
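
As a rough sketch of the chip-independent part: percent present and the scale 
factors are just column summaries. The matrices below are toy stand-ins that I 
made up so the snippet runs without any chip data. The commented-out lines 
show where the real inputs would come from with the affy package. The 2% 
trimmed mean is the MAS 5.0 scaling convention as I understand it, so check it 
against the qc.affy source before relying on it.

```r
# Toy stand-ins (made-up values, not real data). With affy loaded, the real
# inputs would come from something like:
#   calls <- exprs(mas5calls(soy.ab, alpha1 = 0.04, alpha2 = 0.06))
#   expr  <- exprs(mas5(soy.ab, normalize = FALSE))
calls <- matrix(c("P", "A", "P", "M",
                  "P", "P", "P", "A"),
                ncol = 2,
                dimnames = list(paste0("probeset", 1:4), c("chipA", "chipB")))
expr <- matrix(c( 50, 100, 150, 200,
                 100, 200, 300, 400),
               ncol = 2, dimnames = dimnames(calls))

# Percent present: share of probesets called "P" on each chip.
percent.present <- 100 * colMeans(calls == "P")

# Scale factor: target intensity divided by the 2% trimmed mean of each
# chip's unscaled signal (assumed MAS 5.0 convention).
target <- 100
scale.factors <- target / apply(expr, 2, mean, trim = 0.02)

print(percent.present)
print(scale.factors)
```

The background statistics are computed inside simpleaffy itself, so I have 
left them out here rather than guess at how to reproduce them.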

Since there will always be new chip types, perhaps qc() could skip the 
metrics that need chip-specific parameters when those parameters are missing, 
and give a warning that some metrics could not be computed, instead of 
stopping with an error? At the very least, having the documentation for qc() 
say that not all chip types are supported might help to reduce the number of 
postings on this issue. I might volunteer to hack the qc.affy code, but I'm 
not sure whether that would break other functions that use qc... I know 
there's a plot method in the affyQCReport package...
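
To make the "skip and warn" idea concrete, here is a minimal base-R sketch of 
the control flow only. Everything in it is hypothetical: qc.sketch, 
chip.params, the chip name "examplechipcdf" and the probe name are invented 
for illustration and are not simpleaffy's real function or lookup tables.

```r
# Hypothetical lookup of chip-specific settings (invented names and values).
chip.params <- list(
  examplechipcdf = list(spike.probes = "example-spike_at")
)

# Hypothetical stand-in for qc.affy: always returns the generic metrics and
# fills in the chip-specific ones only when parameters are available.
qc.sketch <- function(cdfname, params = chip.params) {
  out <- list(generic = "computed", spikes = NULL)
  p <- params[[cdfname]]
  if (is.null(p)) {
    # Unknown chip: warn and return the partial result instead of stopping.
    warning("No chip-specific parameters for '", cdfname,
            "'; skipping spikes, qc.probes and bioBCalls")
    return(out)
  }
  out$spikes <- p$spike.probes
  out
}

known   <- qc.sketch("examplechipcdf")              # full result, no warning
unknown <- suppressWarnings(qc.sketch("soybeancdf")) # partial result
```

Downstream code such as a plot method would then have to cope with the 
missing pieces, which is exactly the breakage I am worried about.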

Jenny




At 02:03 PM 9/14/2006, Alvord, Greg (DMS) [Contr] wrote:
>Dear List:
>
>
>
>We have a question about obtaining quality control statistics from
>library 'simpleaffy'.
>
>
>
>We are working on differential gene expression analyses for six files
>using the soybean genome (3 Hawaii/Resistant vs. 3 Taiwan/Susceptible).
>We successfully read in the .CEL files and obtained the following
>AffyBatch object, which contains 61,170 affyids.
>
>
>
> > soy.ab
>AffyBatch object
>size of arrays=1164x1164 features (63516 kb)
>cdf=Soybean (61170 affyids)
>number of samples=6
>number of genes=61170
>annotation=soybean
>
>
>
>As we are interested only in the 'Glycine max' subset which corresponds
>to the geneNames for the soybean genome, we followed procedures
>(generously supplied by Dr. Jenny Drnevich) and extracted those genes of
>interest.  This results in the following updated AffyBatch object of the
>same name, soy.ab, which contains only (the correct) 37,744 affyids.
>
>
>
> > soy.ab
>AffyBatch object
>size of arrays=1164x1164 features (63516 kb)
>cdf=Soybean (37744 affyids)
>number of samples=6
>number of genes=37744
>annotation=soybean
>
>
>
>We then attempted to load the library 'simpleaffy' and obtain an object
>named 'soy.ab.qc' with the following commands:
>
>
>
> > library('simpleaffy')
>Loading required package: genefilter
>Loading required package: survival
>Loading required package: splines
>
>Attaching package: 'simpleaffy'
>
>         The following object(s) are masked _by_ .GlobalEnv :
>
>          getBioC
>
>
>
>
>
>Then, in attempting to construct the soy.ab.qc object, we obtained:
>
>
>
> > soy.ab.qc <- qc(soy.ab)
>Background correcting
>Retrieving data from AffyBatch...done.
>Computing expression calls...
>......done.
>scaling to a TGT of 100 ...
>Scale factor for: hr.a5.24 0.410934556631521
>Scale factor for: hr.b5.24 0.236518276485241
>Scale factor for: hr.c5.24 0.265672825818162
>Scale factor for: ts.a6.24 1.96443561512830
>Scale factor for: ts.b6.24 0.320822770974345
>Scale factor for: ts.c6.24 0.334603048637387
>Error in qc.affy(unnormalised, ...) : I'm sorry, I do not know about
>chip type: soybeancdf
>
>
>
>We received the error message above, indicating that qc.affy does not
>recognize chip type soybeancdf.  Further queries show that the soy.ab.qc
>object does not exist at this point.
>
>
>
>Can anyone help us to successfully obtain the soy.ab.qc object such that
>avbg(...), sfs(...), percent.present(...), ratios(...), etc. may be
>used?
>
>
>
>Thank you very much in advance.
>
>
>
>             Greg
>
>
>
>
>
>
>
>W. Gregory Alvord, Ph.D.
>Director, Statistical Consulting Services
>Computer and Statistical Services
>National Cancer Institute at Frederick
>Post Office Box B
>Frederick, MD 21702-1201
>Phone: 301.846.5101
>Facsimile: 301.846.6196
>E-Mail gwa at css.ncifcrf.gov

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu


