On Fri, Sep 11, 2009 at 11:56 AM, Steve Lianoglou <
mailinglist.honeypot@gmail.com> wrote:

> Hi Sean,
>
> On Sep 11, 2009, at 11:44 AM, Sean Davis wrote:
>
>  On Fri, Sep 11, 2009 at 11:20 AM, Sean Davis <seandavi@gmail.com> wrote:
>>
>>
>>>
>>> On Fri, Sep 11, 2009 at 9:47 AM, Tefina Paloma <tefina.paloma@gmail.com
>>> >wrote:
>>>
>>>> To be able to fit the same model to all arrays, an additional
>>>> between-array normalization would be necessary to make all the arrays
>>>> really comparable, and I don't want to over-normalize the data either.
>>>>
>>>> Therefore I just thought of a sensible p-value adjustment.
>>>>
>>>>
>>> You can adjust the entire list of p-values from all lists, if you like, as
>>> an alternative.  However, assuming that the arrays are of the same
>>> technology, the probe-level variances should be similar, so you could also
>>> combine the normalized data.  I'm not sure what "model" you mean, as each
>>> test is done within a probe and, therefore, would not cross arrays.  But I
>>> may have misunderstood what you are trying to do.
>>>
>>>
>> I made a further assumption above, which I should probably make explicit.
>> While the array technology is important in determining the variance, the
>> biologic behavior of the probes on the array also contributes.
>>
>
> Sorry if this is too noob-ish of a question, but I'm curious about your
> choice of words. Could you explain this point a bit further? It sounds like
> you are referring to the actual probes that are synthesized onto the array,
> no?
>
> What biologic behavior do you expect these probes to have? Are you
> referring to them forming some secondary structure or something? If so, why
> would one expect some explicitly differing behavior between the same probes
> on different arrays (assuming no array impurities and that the arrays were
> run using the same protocol, or whatever)?
>
>
The classic example that I can think of is the hgu133a and b arrays. The
probes on the a array were "refseq-based" and so represented well-validated
genes, while the probes on the b array were generally ESTs and, being less
"qualified" as probesets, had quite different error characteristics from
those on the a array.  If the two are analyzed together with something like
limma or SAM, which does some sort of "variance pooling", the variances will
be inflated for one array of the set and decreased for the other.
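
To make the pooling effect concrete, here is a rough sketch in Python.  It is
not limma's or SAM's actual procedure: it simply shrinks each probe's sample
variance toward the mean variance of whatever set of probes is pooled, using
the usual weighted-average form (d0*s0^2 + d*s^2)/(d0 + d).  The prior degrees
of freedom d0, the two noise levels, and the function names are all
hypothetical choices made only for illustration.

```python
import random
import statistics


def moderated_variances(sample_vars, d, d0):
    """Shrink each probe's sample variance toward the pool's mean variance.

    Simplification: the prior variance s0^2 is taken to be the plain mean of
    the pooled sample variances, not an empirical Bayes estimate.
    """
    s0_sq = statistics.fmean(sample_vars)
    return [(d0 * s0_sq + d * s_sq) / (d0 + d) for s_sq in sample_vars]


def simulate_probe_variances(n_probes, true_sd, n_samples, rng):
    """Return one sample variance per probe from simulated Gaussian noise."""
    return [
        statistics.variance([rng.gauss(0.0, true_sd) for _ in range(n_samples)])
        for _ in range(n_probes)
    ]


rng = random.Random(0)
n_probes = 500
n_samples = 6
d = n_samples - 1   # residual degrees of freedom per probe
d0 = 4              # hypothetical prior degrees of freedom

# Hypothetical noise levels: "a"-like probes are quieter than "b"-like probes.
vars_a = simulate_probe_variances(n_probes, 0.2, n_samples, rng)
vars_b = simulate_probe_variances(n_probes, 0.6, n_samples, rng)

# Pool within each array separately ...
separate_a = moderated_variances(vars_a, d, d0)
separate_b = moderated_variances(vars_b, d, d0)

# ... versus pooling both arrays as if they were one set of probes.
pooled = moderated_variances(vars_a + vars_b, d, d0)
pooled_a, pooled_b = pooled[:n_probes], pooled[n_probes:]

print("a array, separate vs pooled:",
      statistics.fmean(separate_a), statistics.fmean(pooled_a))
print("b array, separate vs pooled:",
      statistics.fmean(separate_b), statistics.fmean(pooled_b))
```

With a shared prior, the quieter probes are pulled up toward the common value
and the noisier probes are pulled down, which is the inflation/decrease I
described above.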

I hope that helps.  I have done a particularly bad job of explaining myself
above; sorry about the confusion.

Sean
