[BioC] Experimental design for RNA-Seq
Tan, Yifang
Yifang.Tan at nrc-cnrc.gc.ca
Fri May 28 16:08:07 CEST 2010
Hello list:
I am posting my questions here again to get help with my experimental design, as I am new to this and have been struggling with my analysis. I could not find a matching example design in the LIMMA User's Guide.
The first part of my experiment is a loop design comparing gene expression at different developmental stages (10DAP, 22DAP and 35DAP; DAP = days after pollination) of the same Brassica line. The purpose is to see how expression changes across developmental stages within this single line. A pooled sample of the same line was used for each stage and treated as a biological replicate. I have dye swaps plus two technical replicates. The target file consists of three columns: file name, Cy3 and Cy5. Here is the target file:
FileName Cy3 Cy5
AT Oligo 02.11.02.176.gpr.fixed 10DAP 22DAP
AT Oligo 02.11.02.177.gpr.fixed 22DAP 10DAP
AT Oligo 02.11.02.178.gpr.fixed 22DAP 10DAP
AT Oligo 02.11.02.179.gpr.fixed 10DAP 22DAP
AT Oligo 02.11.02.180.gpr.fixed 22DAP 35DAP
AT Oligo 02.11.02.181.gpr.fixed 22DAP 35DAP
AT Oligo 02.11.02.182.gpr.fixed 35DAP 22DAP
AT Oligo 02.11.02.183.gpr.fixed 35DAP 22DAP
AT Oligo 02.11.02.184.gpr.fixed 10DAP 35DAP
AT Oligo 02.11.02.185.gpr.fixed 10DAP 35DAP
AT Oligo 02.11.02.186.gpr.fixed 35DAP 10DAP
AT Oligo 02.11.02.187.gpr.fixed 35DAP 10DAP
This experiment is very similar to the design in the LIMMA User's Guide section 7.4, except that I have technical replicates. Following the Guide, do I have to use one particular sample such as "10DAP" as the reference, or can any sample serve as the reference? My goal is to see which genes are differentially expressed among 10DAP, 22DAP and 35DAP. How do I get the following results: 1) which genes are consistently up- or down-regulated across the 3 stages? 2) which genes are up- or down-regulated at each developmental stage?
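For illustration only, the design matrix for this loop can be sketched by hand. The sketch below is in Python rather than R and assumes each slide's log-ratio is M = log2(Cy5/Cy3), so a sample gets +1 when it is in Cy5 and -1 when it is in Cy3, with the reference column dropped; this is the same convention limma's `modelMatrix(targets, ref="10DAP")` follows. The helper name `model_matrix` is made up for this example.

```python
import numpy as np

# (Cy3, Cy5) pairs copied from the 12-slide target file above, in order.
targets = [
    ("10DAP", "22DAP"), ("22DAP", "10DAP"), ("22DAP", "10DAP"), ("10DAP", "22DAP"),
    ("22DAP", "35DAP"), ("22DAP", "35DAP"), ("35DAP", "22DAP"), ("35DAP", "22DAP"),
    ("10DAP", "35DAP"), ("10DAP", "35DAP"), ("35DAP", "10DAP"), ("35DAP", "10DAP"),
]

def model_matrix(targets, ref):
    """Hypothetical helper: one column per non-reference sample,
    +1 if that sample is in Cy5, -1 if it is in Cy3."""
    levels = sorted({s for pair in targets for s in pair if s != ref})
    design = np.zeros((len(targets), len(levels)), dtype=int)
    for i, (cy3, cy5) in enumerate(targets):
        if cy5 != ref:
            design[i, levels.index(cy5)] += 1
        if cy3 != ref:
            design[i, levels.index(cy3)] -= 1
    return levels, design

# Columns are 22DAP-vs-10DAP and 35DAP-vs-10DAP.
levels, design = model_matrix(targets, ref="10DAP")
print(levels)
print(design)

# A different reference spans the same column space, so the fitted
# model is the same; only the parameterization changes.
_, design_alt = model_matrix(targets, ref="22DAP")
same_space = np.linalg.matrix_rank(np.hstack([design, design_alt])) == 2
print(same_space)
```

Under this parameterization the 35DAP-vs-22DAP comparison is the difference between the two columns' coefficients. The sketch does not model the technical replicates, which would need separate handling.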
The second part of my experiment is:
FileName DPA Cy3 Cy5
2009-07-10-atq3.7.3.145-15.gpr 15DPA WT MUTANT
2009-07-15-atq3.7.3.146-15.gpr 15DPA MUTANT WT
2009-07-15-atq3.7.3.147-15.gpr 15DPA MUTANT WT
2009-07-15-atq3.7.3.148-15.gpr 15DPA WT MUTANT
2009-07-15-atq3.7.3.149-15.gpr 15DPA WT MUTANT
2009-07-17-atq3.7.3.151-20.gpr 20DPA MUTANT WT
2009-07-17-atq3.7.3.152-20.gpr 20DPA WT MUTANT
2009-07-17-atq3.7.3.153-25.gpr 25DPA MUTANT WT
2009-07-17-atq3.7.3.154-25.gpr 25DPA WT MUTANT
2009-07-17-atq3.7.3.155-10.gpr 10DPA MUTANT WT
2009-07-17-atq3.7.3.156-10.gpr 10DPA WT MUTANT
2009-07-17-atq3.7.3.157-30.gpr 30DPA MUTANT WT
2009-07-17-atq3.7.3.158-30.gpr 30DPA WT MUTANT
2009-07-21-atq3.7.3-159-10.gpr 10DPA MUTANT WT
2009-07-21-atq3.7.3-160-20.gpr 20DPA MUTANT WT
2009-07-21-atq3.7.3-164-25.gpr 25DPA MUTANT WT
2009-07-21-atq3.7.3-256-30.gpr 30DPA MUTANT WT
2009-07-22-atq3.7.3-115-10.gpr 10DPA MUTANT WT
2009-07-22-atq3.7.3-116-20.gpr 20DPA MUTANT WT
2009-07-22-atq3.7.3-117-25.gpr 25DPA MUTANT WT
2009-07-22-atq3.7.3-118-30.gpr 30DPA MUTANT WT
2009-07-22-atq3.7.3-119-10.gpr 10DPA WT MUTANT
2009-07-22-atq3.7.3-120-20.gpr 20DPA WT MUTANT
2009-07-22-atq3.7.3-124-25.gpr 25DPA WT MUTANT
2009-07-22-atq3.7.3-125-30.gpr 30DPA WT MUTANT
The target file consists of four columns: file name, time, Cy3 and Cy5. This is a time course experiment. Again, I want to see how expression differs across the stages (10, 15, 20, 25 and 30DPA).
1) Can I split the analysis into five sub-groups by time point (say, 10, 15, 20, 25 and 30DPA separately) instead of analysing it as a whole?
2) If I split the 25 slides into 5 sub-experiments, my feeling is that the variance estimates and the normalization would differ between the sub-experiments. Is this correct?
3) How should I set up biolrep, given that I treated the pooled samples as biological replicates?
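As a sketch only, one way to keep all 25 slides in a single fit rather than five separate ones would be a design matrix with one column per time point, where each entry records the dye orientation of MUTANT versus WT. The Python below assumes the log-ratio is M = log2(Cy5/Cy3), so +1 means MUTANT is in Cy5 and -1 means MUTANT is in Cy3. It says nothing about how pooled or technical replicates should be modelled.

```python
import numpy as np

# time, Cy3, Cy5 for the 25 slides, in the order of the target file above.
TARGETS = """\
15DPA WT MUTANT
15DPA MUTANT WT
15DPA MUTANT WT
15DPA WT MUTANT
15DPA WT MUTANT
20DPA MUTANT WT
20DPA WT MUTANT
25DPA MUTANT WT
25DPA WT MUTANT
10DPA MUTANT WT
10DPA WT MUTANT
30DPA MUTANT WT
30DPA WT MUTANT
10DPA MUTANT WT
20DPA MUTANT WT
25DPA MUTANT WT
30DPA MUTANT WT
10DPA MUTANT WT
20DPA MUTANT WT
25DPA MUTANT WT
30DPA MUTANT WT
10DPA WT MUTANT
20DPA WT MUTANT
25DPA WT MUTANT
30DPA WT MUTANT
"""

rows = [line.split() for line in TARGETS.splitlines()]
times = sorted({t for t, _, _ in rows})  # 10DPA, 15DPA, ..., 30DPA

# One column per time point; each coefficient is the MUTANT-vs-WT
# log-ratio at that time.
design = np.zeros((len(rows), len(times)), dtype=int)
for i, (t, cy3, cy5) in enumerate(rows):
    design[i, times.index(t)] = 1 if cy5 == "MUTANT" else -1

slides_per_time = np.abs(design).sum(axis=0)  # slides at each time point
dye_balance = design.sum(axis=0)              # 0 would mean a perfectly balanced dye swap
print(times)
print(slides_per_time)
print(dye_balance)
```

With all slides in one design, a change in the MUTANT-vs-WT effect between two time points is a contrast between the corresponding columns, and the variance is estimated from all 25 slides together.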
I would appreciate it very much if you could give me some suggestions on these questions. Thanks a lot!
Yifang
________________________________________
From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Naomi Altman [naomi at stat.psu.edu]
Sent: Friday, May 28, 2010 9:17 AM
To: michael watson (IAH-C); bioconductor
Subject: Re: [BioC] Experimental design for RNA-Seq
At least from the stat theory point of view, the best design is equal
numbers of biological samples (the more the better) for each
condition and no technical reps.
So far, there is little indication that there are flowcell
effects. However, to be on the safe side, you should use the
blocking principle - as much as possible distribute the reps from the
different conditions across different flow cells (unless the whole
experiment fits on a single flow cell).
--Naomi
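The blocking principle described above can be sketched as a small allocation routine. This Python example is a minimal sketch under made-up assumptions: four biological replicates per condition, two flow cells, and hypothetical sample names. It deals each condition's replicates across the flow cells in turn, so that every flow cell carries the same number of samples from each condition.

```python
import random
from collections import Counter

def assign_flowcells(samples_by_condition, n_flowcells, seed=0):
    """Hypothetical helper: spread each condition's replicates evenly
    over the flow cells, in randomized order within each condition."""
    rng = random.Random(seed)
    assignment = {}
    for condition, samples in samples_by_condition.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)  # randomize which replicate goes to which flow cell
        for j, sample in enumerate(shuffled):
            assignment[sample] = j % n_flowcells
    return assignment

# Hypothetical experiment: knockout vs wild-type, 4 biological replicates each.
samples = {
    "knockout": ["KO1", "KO2", "KO3", "KO4"],
    "wildtype": ["WT1", "WT2", "WT3", "WT4"],
}
assignment = assign_flowcells(samples, n_flowcells=2)

# Count samples per (flow cell, condition prefix) to check the balance.
per_cell = Counter((cell, name[:2]) for name, cell in assignment.items())
print(assignment)
print(per_cell)
```

Because condition and flow cell are not confounded in such a layout, a flow cell effect, if one exists, can be separated from the condition effect in the analysis.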
At 04:02 AM 5/28/2010, michael watson (IAH-C) wrote:
>Dear List
>
>I'm about to design a simple experiment (knockout vs wild-type) and
>we plan to use RNA-Seq. We're interested in gene expression, for
>mRNA and microRNAs in particular, and calculating stats for
>differential expression.
>
>I'm aware of DESeq, DEGseq and edgeR. I wanted to ask those who
>have a lot of experience of this type of analysis if they have any
>advice for experimental design, in particular, the number of
>replicates they have used and why (I was planning on going for all
>biological replicates, no technical).
>
>Thanks
>Mick
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111