[Bioc-devel] Can I analyze with Bioconductor a microarray experiment where the distribution of probe intensities follows a bimodal distribution?

Sun Jun 9 03:46:12 CEST 2013

Dear Miguel,

> From: "Miguel Moreno-Risueno" <miguelangel.moreno at upm.es>
> To: "'James W. MacDonald'" <jmacdon at uw.edu>
> Cc: bioc-devel at r-project.org
> Subject: Re: [Bioc-devel] Can I analyze with bioconductor a microarray
> 	experiment where the distribution of probes intensities follow a
>       bimodal distribution?
>
> Hi James
>
> Thank you very much for the answer. Yes, I understand that it is across
> experiments that a given probe should follow a normal distribution, but if
> this is true, shouldn't the population of those probes similarly follow a
> normal distribution in turn, or not necessarily?

No, this does not follow at all, nor is it needed.  Standard microarray
analysis methods (e.g. the limma package) make no assumptions at all about
the distribution of intensities over all probes.

You should however be checking whether you are using the best possible 
background correction and preprocessing methods.

> I am concerned that, because of this overall bimodal distribution of the
> population of probes within a given sample (which happens for every
> sample), the individual probes may follow a bimodal distribution themselves.

No, there is no such implication.  The within-probe (over samples for each
probe) and between-probe (over probes for each sample) distributions are
almost unrelated.
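
To make the distinction concrete, here is a minimal simulation sketch. It is
plain Python rather than Bioconductor code, and every number in it (probe
counts, the two modes at 6 and 12, the noise levels) is an assumption chosen
only for illustration. Each simulated sample is strongly bimodal over probes,
yet each probe varies across samples with ordinary normal noise.

```python
import random
import statistics

random.seed(1)

n_probes, n_samples = 2000, 6

# Hypothetical probe-level means: half the probes sit near a low mode and
# half near a high mode, so every sample is bimodal over probes.
probe_means = [
    random.gauss(6.0, 0.5) if i % 2 == 0 else random.gauss(12.0, 0.5)
    for i in range(n_probes)
]

# Each probe varies across samples with plain normal noise (an assumption).
data = [
    [m + random.gauss(0.0, 0.3) for _ in range(n_samples)]
    for m in probe_means
]

# Between-probe view: all probes in one sample.
sample0 = [row[0] for row in data]
near_low = sum(abs(x - 6.0) < 2.0 for x in sample0)
near_high = sum(abs(x - 12.0) < 2.0 for x in sample0)
in_gap = sum(8.0 <= x <= 10.0 for x in sample0)

# Within-probe view: residuals of each value about its own probe's mean.
residuals = [x - statistics.fmean(row) for row in data for x in row]
resid_sd = statistics.pstdev(residuals)
within_1sd = sum(abs(r) < resid_sd for r in residuals) / len(residuals)

print("probes near low mode, near high mode, in the gap:",
      near_low, near_high, in_gap)
print("fraction of within-probe residuals inside 1 SD:",
      round(within_1sd, 2))
```

The first line of output describes the shape of one sample over probes (two
well-separated clusters), while the second describes the across-sample
variation of individual probes, for which a normal distribution would put
about 68% of residuals within one standard deviation.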

> I noticed that for other microarray experiments on the same (Nimblegen)
> and other platforms (Affymetrix) the distribution of probes within every
> sample follows a normal or normal-like distribution.

No, they don't.  The popular RMA algorithm for Affymetrix data, for
example, assumes that the signal intensities over the probes on each array
approximately follow an exponential distribution, and that is not
normal-like.
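
As a rough illustration of why an exponential signal is far from normal-like,
the sketch below simulates observed intensities as normally distributed
background plus exponentially distributed signal, which is the kind of model
behind RMA-style background correction. It is plain Python, not the RMA
implementation, and the parameter values are assumptions picked only for the
example.

```python
import random
import statistics

random.seed(2)

n_probes = 20000

# Hypothetical parameters, chosen only for illustration.
bg_mean, bg_sd = 100.0, 10.0   # normal background noise
signal_mean = 200.0            # mean of the exponential signal

signal = [random.expovariate(1.0 / signal_mean) for _ in range(n_probes)]
observed = [s + random.gauss(bg_mean, bg_sd) for s in signal]


def skewness(xs):
    """Sample skewness: the mean cubed z-score (near 0 for normal data)."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / sd) ** 3 for x in xs])


obs_skew = skewness(observed)
obs_mean = statistics.fmean(observed)
obs_median = statistics.median(observed)

print("skewness of observed intensities:", round(obs_skew, 2))
print("mean vs median:", round(obs_mean, 1), round(obs_median, 1))
```

A normal distribution has skewness zero and equal mean and median; the
simulated intensities are strongly right-skewed, with the mean pulled well
above the median by the long upper tail.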

Best wishes
Gordon

BTW. This is not the right list for this sort of question.  You should be 
mailing to the main Bioconductor list.

> Thank you again,
>
> Miguel
>
> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at uw.edu]
> Sent: Friday, June 07, 2013 5:27 PM
> To: Miguel Moreno-Risueno
> Cc: bioc-devel at r-project.org
> Subject: Re: [Bioc-devel] Can I analyze with bioconductor a microarray
> experiment where the distribution of probes intensities follow a bimodal
> distribution?
>
> Hi Miguel,
>
> On 6/7/2013 5:11 AM, Miguel Moreno-Risueno wrote:
>>
>> Hello all,
>>
>> We have recently received a microarray experiment on the Nimblegen
>> platform where the intensity of the probe sets follows a bi-modal
>> distribution. We have been told by the facility that this is because
>> of the dynamic range of the Agilent scanner they use. We are concerned
>> about the statistical analysis with Bioconductor, as it is our
>> understanding that these statistical analyses were developed for normal
>> or normal-like distributions. We appreciate any information in this regard.
>
> If I understand your question correctly, you are noting that the overall
> distribution of probes within a sample has a bi-modal distribution. This
> doesn't really have anything to do with any statistical tests you might be
> computing, as you are not doing any statistics within a sample (e.g., one
> usually doesn't test to see if probe X is differentially expressed as
> compared to probe Z in sample Q).
>
> Instead, what you should be concerned with are the distributions of the
> individual probes across samples. With microarray data we usually don't have
> enough data to even begin to assess the across-sample, within probe
> distributions (e.g., if you have three replicates for two sample types, good
> luck trying to discern if those probes follow a normal distribution, or are
> even 'hump-shaped'). In addition, there are usually tens of thousands of
> probes on a given chip. I have never heard of anybody looking at each probe,
> trying to assess if it follows a reasonable distribution across samples. I
> suppose you could do it, but to what point?
>
> Instead we simply assume that the data follow a reasonable distribution and
> then do the test. This is one of the reasons that it is imperative to follow
> up promising leads with confirmatory testing, preferably with new samples.
>
> Best,
>
> Jim
>
>> Thank you in advance for your help,
>>
>> Miguel
>>
