[BioC] Two populations on microarray

Naomi Altman naomi at stat.psu.edu
Mon Feb 13 02:54:07 CET 2012


Did you also remove all control spots?
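
For readers following the thread, here is a minimal base-R sketch of what "removing" control spots by weight could look like. It is an illustration under assumptions, not code from the posters: the `genes` table, its `Status` labels and the slide names are hypothetical. In limma the status would typically come from controlStatus() and the weights would live in the `weights` component of the data object.

```r
# Hypothetical probe annotation: one row per spot. The IDs and the
# "Status" labels are made up for illustration.
genes <- data.frame(
  ID     = c("host_1", "virus_1", "blank_1", "host_2", "buffer_1"),
  Status = c("gene", "gene", "blank", "gene", "buffer"),
  stringsAsFactors = FALSE
)

# Spot-quality weights for 2 arrays (rows = spots, columns = arrays).
weights <- matrix(1, nrow = nrow(genes), ncol = 2,
                  dimnames = list(genes$ID, c("slide1", "slide2")))

# Give every control spot weight 0 on every array.
is_control <- genes$Status != "gene"
weights[is_control, ] <- 0

print(weights)
```

Setting the weight to 0 keeps the spot in the data, so array layout and row order are unchanged.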

--Naomi


At 09:19 AM 2/7/2012, Ben Tupper wrote:
>Hi,
>
>On Jan 21, 2012, at 2:59 PM, Naomi Altman wrote:
>
> > I agree with Gordon.
> >
> > I doubt that the double cloud has anything to do with
> > differential expression.  There is something odd going on
> > technically.  The usual types of normalization are not going to fix
> > the problem.
>
>Thanks for the assistance - we took up the suggestions that Gordon 
>proposed.  We have successfully assigned weight = 0 to the 
>problematic points.  I encouraged us to use a brute force 
>identify-and-kill approach, but Joaquin's more nuanced inter-slide 
>comparison approach prevailed.  The MA plots look great now but the 
>subsequent between-array normalizations seem problematic, or at 
>least the diagnostic plotDensities() graphics point to continuing 
>issues.  This plot shows 4 diagnostic plots for one array ...
>
>http://dl.dropbox.com/u/8433654/slide-52-MA-diagnostics.png
>
>In the left column are shown the results of plotMA(MA,...) with 
>zero.weights set to TRUE/FALSE so that we can show/hide the weight = 0 spots.
>
>In the right column are shown the results of a slightly modified 
>plotDensities(MA,...) where I have added a zero.weights argument to 
>the original plotDensities() function.  The upper plot is identical 
>to the output from the original plotDensities() function, while the 
>lower plot simply removes the weight = 0 spots before computing the 
>density distribution.  Because the MA-to-RG transformation in the 
>original plotDensities() function doesn't take weights into account, 
>it becomes difficult to use the function with our data to visually 
>diagnose the effect of the normalization functions.
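
The weight-aware density idea described above can be sketched in base R. This is not the modified plotDensities() from the post, which is not shown in the thread. The function name `channel_densities` and its `keep_zero` switch are hypothetical, and the data are simulated. The back-transformation follows directly from the definitions M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2.

```r
# Back-transform M and A values to log2 channel intensities:
#   log2(R) = A + M/2,   log2(G) = A - M/2
# Spots with weight 0 (or NA) are dropped unless keep_zero = TRUE.
channel_densities <- function(M, A, w, keep_zero = FALSE) {
  keep <- is.finite(M) & is.finite(A)
  if (!keep_zero) keep <- keep & !is.na(w) & w > 0
  list(
    R = density(A[keep] + M[keep] / 2),  # red channel, log2 scale
    G = density(A[keep] - M[keep] / 2),  # green channel, log2 scale
    n = sum(keep)                        # number of spots used
  )
}

# Simulated single array: 200 spots, the first 50 shifted upward in M
# and flagged with weight 0.
set.seed(1)
A <- runif(200, 6, 14)
M <- rnorm(200, 0, 0.3)
w <- rep(1, 200)
M[1:50] <- M[1:50] + 4
w[1:50] <- 0

all_spots  <- channel_densities(M, A, w, keep_zero = TRUE)
good_spots <- channel_densities(M, A, w)
print(c(all = all_spots$n, kept = good_spots$n))
```

Overlaying `all_spots$G` and `good_spots$G` with plot() and lines() corresponds to comparing the upper-right and lower-right panels described in the post.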
>
>The upper right plot leads us to believe that we have some serious 
>issues.  But the lower right plot tells us that we are ok - 
>obviously we like the lower right one better!
>
>So, are we fooling ourselves by thinking the histogram at lower 
>right is enough to tell us that we are good to go on to the next 
>step?  If we are fooling ourselves, then what would you advise us to 
>do instead?
>
>Thanks so much!
>Ben Tupper
>
>
>
>
>
> >
> > --Naomi
> >
> >
> > At 12:03 AM 1/20/2012, Gordon K Smyth wrote:
> >> Dear Joaquin,
> >>
> >> What I had in mind was that you would make a vector z which
> >> takes values TRUE or FALSE depending on whether each probe on the
> >> array belongs to group 1 or group 2 according to your MA plot.  Then
> >>
> >>  imageplot(z,layout,low="white",high="blue")
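
One way to build such a vector z is sketched below on simulated data. The fixed cutoff on M is an assumption made for illustration only. The thread does not say how group membership was decided, and a real MA plot may need a rule that depends on A as well.

```r
# Simulated M and A values for one array (illustration only).
set.seed(2)
n <- 400
A <- runif(n, 6, 14)
M <- rnorm(n, 0, 0.3)
upper <- sample(n, 100)      # spots placed in the upper cloud
M[upper] <- M[upper] + 5

# Hypothetical rule: call a spot "group 2" when M exceeds a cutoff
# chosen by eye from the MA plot.
cutoff <- 2.5
z <- M > cutoff
print(table(z))

# With limma loaded and the array's printer layout in `layout`,
# the spot positions could then be drawn with the call quoted above:
#   imageplot(z, layout, low = "white", high = "blue")
```

z must be in the same spot order as the array data for the image to match the physical layout.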
> >>
> >> There is no way for you to normalize out this problem, and certainly not
> >> within the limited capabilities of GenePix software.
> >>
> >> Best wishes
> >> Gordon
> >>
> >> ---------------------------------------------
> >> Professor Gordon K Smyth,
> >> Bioinformatics Division,
> >> Walter and Eliza Hall Institute of Medical Research,
> >> 1G Royal Parade, Parkville, Vic 3052, Australia.
> >> smyth at wehi.edu.au
> >> http://www.wehi.edu.au
> >> http://www.statsci.org/smyth
> >>
> >>
> >> On Thu, 19 Jan 2012, Joaquin Martinez wrote:
> >>
> >>> Dear Naomi, Gordon and Ben,
> >>>
> >>>
> >>>
> >>> Thank you for your replies to Ben Tupper's (and my) question.
> >>>
> >>>
> >>>
> >>> We are using spotted oligonucleotide microarrays containing probes for both
> >>> host and virus genes. In our experiment we had cultures grown under high
> >>> and low phosphate conditions, inoculated with 2 different viruses
> >>> (separately) or kept virus-free, in triplicate. RNA purified from those
> >>> cultures at different time points was fluorescently labeled (with Cy-dyes)
> >>> and hybridized onto the microarray slides. You can see a flow chart of our
> >>> experimental design here:
> >>>
> >>> http://dl.dropbox.com/u/8433654/design-concept.pdf
> >>>
> >>>
> >>>
> >>> One slide contains 2 samples which had different experimental treatments.
> >>> Each sample was split into 3, labeled (dye swap) and hybridized onto 3
> >>> different microarray slides in combination with another sample to allow
> >>> technical replication.
> >>>
> >>>
> >>>
> >>> I quantified labeling efficiency prior to hybridizing the samples onto the
> >>> microarray slide; for both dyes I got between 30 and 60 dye molecules per
> >>> 1000 nt (which is the range indicated by the manufacturer for good
> >>> labeling). Also we produced FB plots for the green and the red channels;
> >>> both had similar z-range and saturation range, which we interpreted as a
> >>> proof of good labeling (?). See example:
> >>>
> >>> http://dl.dropbox.com/u/8433654/R-G-imageplot.png
> >>>
> >>>
> >>>
> >>> Both MA clusters that we observe contain a mixture of both host and virus
> >>> probes, ruling out that one complete set of probes failed. Naomi mentioned
> >>> that the nondifferentially expressing genes should cluster around M=0, so
> >>> does that mean that the top cluster corresponds to differentially expressed
> >>> genes?
> >>>
> >>>
> >>>
> >>> We used GenePix Pro to scan and analyze the microarrays. Could we use the
> >>> normalization function in the software (normalize the data in each image so
> >>> that the mean of the median of ratios of all features is equal to 1) as an
> >>> alternative to MA? Or would that simply hide the problem? And then do
> >>> normalization between arrays using the quantile method?
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Joaquin
> >>>
> >>>
> >>>
> >>>>> From: Naomi Altman <naomi at stat.psu.edu>
> >>>>> Date: January 18, 2012 9:56:45 AM EST
> >>>>> To: Gordon K Smyth <smyth at wehi.EDU.AU>, Ben Tupper <btupper at bigelow.org>
> >>>>> Cc: Bioconductor mailing list <bioconductor at r-project.org>
> >>>>> Subject: Re: [BioC] Two populations on microarray
> >>>>>
> >>>>> Dear Ben,
> >>>>> A typical MA plot has most of the points scattered around the line M=0.
> >>>>> Even if you have 2 populations of probes, the nondifferentially expressing
> >>>>> genes should be in that central ellipse.  (The lower cluster does look
> >>>>> somewhat like the typical MA plot for raw data.)  I suggest that you do
> >>>>> separate MA plots for each population of probes, to see if one set of
> >>>>> probes failed.  Or, as Gordon suggests, a population for which labelling
> >>>>> failed.
> >>>>>
> >>>>> --Naomi
> >>>>>
> >>>>>
> >>>>> At 05:48 PM 1/14/2012, Gordon K Smyth wrote:
> >>>>>> Dear Ben,
> >>>>>>
> >>>>>> Are you saying that you have deliberately designed two different
> >>>>>> populations of probes onto your arrays?
> >>>>>>
> >>>>>> Your MA-plot suggests that there is a substantial body of spots on the
> >>>>>> array for which the green channel has failed, hence the 45-degree line at
> >>>>>> the top of the plot.  These dots likely represent spots with a normal red
> >>>>>> channel value but close to zero for green.  Normally this would have a
> >>>>>> technical rather than biological cause.  An imageplot may help you identify
> >>>>>> where the offending spots are on your array.
> >>>>>>
> >>>>>> On the other hand, if you have deliberately spotted your arrays with
> >>>>>> two quite different populations of probes, then they probably need to be
> >>>>>> analysed as separate arrays.
> >>>>>>
> >>>>>> Best wishes
> >>>>>> Gordon
> >>>>>>
> >>>>>>> Date: Thu, 12 Jan 2012 14:28:36 -0500
> >>>>>>> From: Ben Tupper <btupper at bigelow.org>
> >>>>>>> To: bioconductor at r-project.org
> >>>>>>> Subject: [BioC] Two populations on microarray
> >>>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> By virtue of experiment design we have two populations to analyze on
> >>>>>>> each of a suite of Genepix microarrays.  You can see an example in an MA
> >>>>>>> plot here (generated using the excellent limma package):
> >>>>>>>
> >>>>>>>       http://dl.dropbox.com/u/8433654/BE%20T46h%20slide%2052.png
> >>>>>>>
> >>>>>>> We have been following the steps in the limma user guide, and Ben
> >>>>>>> Bolstad's helpful notes (http://tinyurl.com/7346mh9).  All of the examples we
> >>>>>>> see appear to have just one population to contend with, which gives us an
> >>>>>>> inkling that we are being naive about our analysis.  We suspect that we'll
> >>>>>>> have to separate the two populations before normalization and analysis.
> >>>>>>> Are there any guides available for managing two populations like this?
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>> Ben
> >>>>>>>
> >>>>
> >>
> >>
> >
> >
> >
>
>Ben Tupper
>Bigelow Laboratory for Ocean Sciences
>180 McKown Point Rd. P.O. Box 475
>West Boothbay Harbor, Maine   04575-0475
>http://www.bigelow.org


