[BioC] mas5call select genes
Naomi Altman
naomi at stat.psu.edu
Wed Feb 2 15:43:29 CET 2005
Hey Stephen,
Do you have a reference that documents this? We are getting a lot of flak
from a reviewer about our using the coefficient of variation instead of the
MAS5 calls. This same reviewer seems happy enough for us to use RMA
or gcRMA, but did not accept my argument that filtering based on the
coefficient of variation is the natural measure with these normalizations,
and is asking us to redo many tables because of this point.
For my part: an experimenter came to me with an 8 array experiment and was
in distress because 7 of the arrays had about 20% MAS5 absent calls, but
the 8th array had about 35% MAS5 absent calls. After RMA normalization,
all 8 arrays appeared pretty good, in the sense that the correlation
between array 8 and the others was only slightly lower than the correlation
between other pairs. I took this to indicate that there had been some
problem in the 8th array that had lowered the hybridization (or scanner
detection) rate, but that this was proportional to the rate on the other
arrays. It also made me further question the validity of the MAS5 calls.
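A minimal sketch of the comparison described above, assuming the RMA-normalized values are available as a numeric matrix with probesets in rows and arrays in columns. The data below are simulated, and the names `expr`, `cors` and `mean_cor` are illustrative only, not taken from the actual analysis.

```r
# Simulated log2 expression values: 1000 probesets x 8 arrays (toy data).
# In a real analysis the matrix would come from something like exprs(rma(raw.data)).
set.seed(1)
signal <- rnorm(1000, mean = 8, sd = 2)   # shared probeset-level signal
expr <- sapply(1:8, function(i) signal + rnorm(1000, sd = 0.3))
colnames(expr) <- paste0("array", 1:8)

# Make array 8 noisier, mimicking a weaker hybridization.
expr[, 8] <- signal + rnorm(1000, sd = 0.6)

# Pairwise Pearson correlations between arrays.
cors <- cor(expr)

# Mean correlation of each array with the other seven (diagonal excluded).
mean_cor <- (rowSums(cors) - 1) / (ncol(cors) - 1)
round(mean_cor, 3)
```

An array whose mean correlation is only slightly below the rest, as array 8 is here, would match the pattern described for the real experiment.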
Comments from the mailing list (and a published reference that I could show
the reviewers) would be very welcome.
--Naomi
p.s. Sadly, almost all the experiments I deal with have only 2
reps/treatment, which presents a problem for the coefficient of variation
idea, but I don't think plant biologists are going to have the funding for
huge Affy experiments in the near future.
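For concreteness, a rough sketch of a coefficient-of-variation filter of the kind discussed in this thread. The helper `row_cv`, the simulated data and the cutoff of 0.3 are all assumptions made for illustration. The thread does not say whether the CV should be taken on the log2 or the unlogged scale, so the unlogged scale used here is also an assumption.

```r
# Hypothetical helper: row-wise coefficient of variation (sd / mean).
row_cv <- function(x) {
  apply(x, 1, sd) / rowMeans(x)
}

set.seed(2)
# Simulated log2 values: 500 probesets, 6 arrays, 3 per treatment (toy data).
n <- 500
expr_log2 <- matrix(rnorm(n * 6, mean = 7, sd = 0.15), nrow = n,
                    dimnames = list(paste0("probe", 1:n), paste0("array", 1:6)))

# Let the first 50 probesets respond to treatment in arrays 4-6.
expr_log2[1:50, 4:6] <- expr_log2[1:50, 4:6] + 2

# CV computed on the unlogged scale (an assumption, see above).
cv <- row_cv(2^expr_log2)

# Keep probesets whose CV exceeds an arbitrary, illustrative cutoff.
cutoff <- 0.3
keep <- names(cv)[cv > cutoff]
length(keep)
```

With only 2 reps/treatment the per-probeset standard deviation rests on very few values, so any cutoff of this kind would be correspondingly unstable.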
At 08:09 AM 1/13/2005, Stephen Henderson wrote:
>On a slight tangent...
>
>Is there really any point in using the MAS5 calls in any case? Although they
>are a useful quality control for the whole experiment, singly they seem to
>have poor sensitivity/specificity. If you use a variance-based filter, i.e.
>filtering out all data below a given coefficient of variation, this should
>remove non-expressed data (and inherently uninteresting data, depending upon
>the experiment and design), as it should not vary by much more than the baseline.
>
>This is what I do but if anyone thinks this is wrong (or right) I'm
>interested to know??
>
>-----Original Message-----
>From: Claire Wilson
>To: Katleen De Preter
>Cc: bioconductor at stat.math.ethz.ch
>Sent: 1/13/05 8:49 AM
>Subject: RE: [BioC] mas5call select genes
>
>Hi Katleen,
>
>This is the function I use on a matrix of Affymetrix Absent/Present
>calls (rows are probesets, columns are chips)
>
>
>number.pres <- function(x) {
> # Count the "P" (present) calls in each row (probeset)
> return(rowSums(x == "P"))
>}
>
># Calculate present/absent calls
># They are stored in the exprs slot of PAcalls
>PAcalls <- mas5calls(raw.data)
>
># This returns a vector where each probeset is listed with the total
># number of present calls
>np <- number.pres(exprs(PAcalls))
>
># Get those probesets with at least 1 present call
>my.list <- names(np[np>0])
>
>The simpleaffy package also provides a function (pairwise.filter) to
>filter Affymetrix data, and one of the parameters allows you to specify a
>minimum number of chips on which a probeset must be called present
>
>Hope this helps
>
>Claire
>
> > -----Original Message-----
> > From: bioconductor-bounces at stat.math.ethz.ch
> > [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
> > Katleen De Preter
> > Sent: 13 January 2005 08:08
> > To: bioconductor at stat.math.ethz.ch
> > Subject: [BioC] mas5call select genes
> >
> > Dear Colleagues,
> > I would like to select genes based on the Affymetrix calls
> > (mas5call-function in affy package). For example, how can I
> > obtain the
> > list of genes/probeIds that have in at least 1 of 20 experiments a
> > present call?
> > Best regards,
> > Katleen De Preter
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >
>
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111