[BioC] What to do with this data? Question on deconfouding and GO analysis
Wolfgang Huber
whuber at embl.de
Sat May 29 11:48:40 CEST 2010
Dear January,
some suggestions below.
On 28/05/10 16:02, January Weiner wrote:
> Hello,
>
> I've been asked to analyze data from the following experiment.
>
> Two types of cells were analyzed either separately (A, B) or in a
> mixture (AB). In each experiment, either the separated cell types or
> the mixture was subjected to a treatment. From each such experiment, a
> single Agilent two-color microarray was prepared, with untreated cells
> used as a control.
>
> Of course, proper significance analysis cannot be done, and I can only
> use the technical p-values generated by the Agilent software. Due to
> the nature of the experiment, it is unlikely that another data set can
> be generated in the foreseeable future. However, the results in general
> show the expected response to treatment and activation of a number of
> genes that are supposed to be activated; thus, the technical p-values
> still give a meaningful "general picture".
>
> By manually going through the data it is obvious that in many cases,
> the response in AB is a weighted average of the responses A and B. I
> tried to estimate these global weights in a very naive manner, by
> looking at the correlation between the fold change in experiment AB,
> and the fold change estimated from experiments A and B for different
> values of p, the proportion of cells of type A in the mixture AB.
>
> My first question is therefore -- is there a recommended solution
> within Bioconductor that I could apply in such a case?
I am not sure there is, or there needs to be. It seems that your most
basic model is
AB = pA + (1-p)B
where AB, A and B are the fold changes observed in samples AB, A and B
respectively. You can rearrange this to:
p = (AB-B) / (A-B)
Hence I would do a scatterplot of (A-B) on the x-axis versus (AB-B) on
the y-axis and see if you can reasonably fit a regression line through
the origin; under the model above, its slope is an estimate of p.
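
As a rough illustration of that fit, here is a minimal Python/NumPy sketch (not a Bioconductor function). It assumes the fold changes are available as three equal-length numeric vectors, ideally on the log scale; the function name and the simulated data are made up for the example. It estimates p as the least-squares slope through the origin and returns the per-gene deviations from the weighted-average model, which could serve as a fold-change-based measure of "not according to predictions".

```python
import numpy as np

def estimate_mixing_proportion(a, b, ab):
    """Fit (AB - B) = p * (A - B) by least squares through the origin.

    a, b, ab: per-gene fold changes for samples A, B and AB
    (hypothetical inputs; equal length, same gene order).
    Returns the estimated p and the per-gene residuals.
    """
    a, b, ab = (np.asarray(v, dtype=float) for v in (a, b, ab))
    x = a - b            # x-axis of the scatterplot
    y = ab - b           # y-axis of the scatterplot
    p = float(np.dot(x, y) / np.dot(x, x))
    # Residual per gene; algebraically equal to AB - (p*A + (1-p)*B)
    residuals = y - p * x
    return p, residuals

# Simulated data only, to show the call; not real measurements.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
ab = 0.3 * a + 0.7 * b + rng.normal(scale=0.05, size=500)

p_hat, resid = estimate_mixing_proportion(a, b, ab)
print("estimated p:", round(p_hat, 3))
```

Genes with large absolute residuals are the candidates for an interaction effect; with a single array per condition this is a ranking, not a test.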
>
> Furthermore, I'd like to look for an interaction effect -- to predict
> genes, GO terms or pathways that behave "not according to predictions"
> in the mixture AB. For this, I assume that the technical p-values are
> meaningful (because I do not have another choice),
Yes, you do: ignore the p-values, and work with the fold-changes.
> and run a GO / SPIA
> analysis on the three microarrays separately. Then, I manually look
> through the results to find enriched terms which are different for the
> AB experiment.
>
> I wonder whether there is a possibility to compare results of two
> GO-analyses. One could, for example, look for changes in rank
> positions of different GO terms (since the p-values in such a setup
> would probably not be very meaningful).
>
Have a look at the Category package, in particular its vignette, which
takes a slightly more abstracted view of gene set enrichments than "sets
of genes with low p-values" - i.e. you can look at enrichment of
arbitrarily constructed comparison statistics.
Also have a look at this paper, from your (and my) neighbours:
Nucleic Acids Res. 2010
GOing Bayesian: model-based gene set analysis of genome-scale data.
Bauer S, Gagneur J, Robinson PN.
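
To make the idea of enrichment on an arbitrary per-gene statistic concrete, here is a small Python/NumPy sketch. It does not use the Category package or reproduce its API; the function name, the simulated statistic and the gene set are all hypothetical. The statistic could be, for example, the deviation of AB from the weighted average of A and B. The sketch scores a gene set by its mean absolute statistic and compares that score with random gene sets of the same size.

```python
import numpy as np

def gene_set_score(stat, members, n_perm=2000, seed=0):
    """Score a gene set by the mean absolute per-gene statistic.

    stat:    per-gene statistic (any comparison statistic, not a p-value)
    members: integer indices of the genes in the set
    Returns the observed score and the fraction of random same-size
    gene sets scoring at least as high (with the usual +1 correction).
    """
    stat = np.asarray(stat, dtype=float)
    members = np.asarray(members, dtype=int)
    rng = np.random.default_rng(seed)
    observed = np.abs(stat[members]).mean()
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Draw a random gene set of the same size, without replacement
        idx = rng.choice(stat.size, size=members.size, replace=False)
        null[i] = np.abs(stat[idx]).mean()
    tail = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return float(observed), float(tail)

# Simulated statistic: 1000 genes, the first 20 shifted away from zero.
rng = np.random.default_rng(1)
stat = rng.normal(scale=0.1, size=1000)
stat[:20] += 2.0

score_hit, tail_hit = gene_set_score(stat, np.arange(20))
score_ctl, tail_ctl = gene_set_score(stat, np.arange(500, 520))
print("shifted set:", score_hit, tail_hit)
print("control set:", score_ctl, tail_ctl)
```

The tail fraction is best read as a way to rank gene sets against each other rather than as a significance level, since it rests on unreplicated arrays and treats genes as exchangeable.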
> Thanks in advance for any help, suggestions, material for further reading etc.,
>
> j.
>
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber