[BioC] normalisation assumptions (violation of)
J.delasHeras at ed.ac.uk
Tue Aug 8 23:06:21 CEST 2006
Quoting Henrik Bengtsson <hb at maths.lth.se>:
>
> In the bigger picture, given that you can identify those 20-30% DEs,
> how are you going interpret such a large list of genes?
>
> /H
The number of "useful" genes is considerably smaller. This is because my
experiment consists of 4 separate sub-experiments, all using a common
reference (untransfected cells, in this case). Three of the
subexperiments consist of the hybridisation of transfected cells vs.
untransfected. The transfection is of a construct expressing a fusion
protein: the first part contains a DNA-binding domain with certain
sequence specificity (that we expect to occur in many promoters), the
second is a strong transactivator. I'm hoping to detect the binding of
these protein domains by looking at what genes are upregulated,
especially those that are only expressed after transfection. There are
three subexperiments because they are slightly different proteins. The
fourth experiment is a control, one of the previous fusion proteins
with a couple of point mutations that we know to abolish strong
specific DNA binding. Transfection of this construct still results in
upregulation of many genes. What I do is analyse all the data together
(same common reference), and remove the DE genes (using an FDR of 0.05%
or 0.01% as cut-off) of the control experiment from the other three.
This substantially reduces the number of genes. From the remainder,
I then focus on those that have negligible expression in the
untransfected cells, and decent expression afterwards. I then contrast
this with what happened in the control experiment (despite not being
picked as DE in it). In the end I have tens of candidates, fewer than
100. It's not a crazy number, and I then proceed to verification by RT
etc., and the biology starts.
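As a rough illustration of the filtering described above, here is a minimal
Python sketch. It is not the code used for the analysis: the function name,
the gene and experiment labels, the toy expression values and the two
thresholds (max_ref, min_trans) are all hypothetical, and the DE gene sets
are assumed to have been computed beforehand at the chosen FDR cut-off.

```python
# Hypothetical sketch: names, thresholds and toy values are illustrative only.

def candidate_genes(de_genes, control, expr_ref, expr_trans,
                    max_ref=7.0, min_trans=9.0):
    """For each non-control experiment, keep DE genes that are not DE in the
    control, have negligible expression in the untransfected reference
    (<= max_ref) and decent expression after transfection (>= min_trans)."""
    control_de = de_genes[control]
    result = {}
    for name, genes in de_genes.items():
        if name == control:
            continue
        # Step 1: drop genes that are also DE in the control experiment.
        kept = genes - control_de
        # Step 2: keep genes that go from "off" to clearly "on".
        result[name] = sorted(
            g for g in kept
            if expr_ref[g] <= max_ref and expr_trans[name][g] >= min_trans
        )
    return result


# Toy input: sets of DE gene IDs per sub-experiment (assumed precomputed).
de_genes = {
    "fusionA": {"g1", "g2", "g3", "g4"},
    "fusionB": {"g2", "g5"},
    "control": {"g2", "g4"},
}
# Toy log2 intensities in the untransfected reference and after transfection.
expr_ref = {"g1": 6.1, "g2": 6.0, "g3": 10.2, "g4": 6.5, "g5": 6.8}
expr_trans = {
    "fusionA": {"g1": 11.0, "g2": 10.5, "g3": 12.0, "g4": 9.5, "g5": 6.9},
    "fusionB": {"g1": 6.2, "g2": 10.0, "g3": 10.1, "g4": 6.6, "g5": 9.4},
}

candidates = candidate_genes(de_genes, "control", expr_ref, expr_trans)
print(candidates)
```

The final step, contrasting each remaining candidate with its behaviour in
the control experiment, is a manual inspection and is not covered by this
sketch.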
When we started the experiment we were not sure what we would get. In
theory we could get thousands of genes. It all depends on how good our
control is. That's why I used a simple common reference design, as it
allows us to easily add another control if we find a better one.
I already analysed a set of data on a cell line, with RNA prepared by
somebody else. It worked pretty well, but the effect wasn't as great as
I am seeing here. The transfection efficiency may have something to do
with it. I checked all my transfections by Western blot and only used
the ones that gave me strong expression of the fusion protein; I
suspect the other person wasn't so picky.
Jose
--
Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK