To help clarify further, here is a data frame of the design:
   subject   group times
1        1 Treated   0hr
2        2 Treated   0hr
3        3 Control   0hr
4        4 Treated   0hr
5        5 Control   0hr
6        6 Control   0hr
7        1 Treated   1hr
8        2 Treated   1hr
9        3 Control   1hr
...
17       5 Control   2hr
18       6 Control   2hr
My thought process has been as follows:
In the edgeR User's Guide there is the treatment combination example:
> targets
    Sample   Treat Time
1  Sample1 Placebo   0h
2  Sample2 Placebo   0h
3  Sample3 Placebo   1h
4  Sample4 Placebo   1h
5  Sample5 Placebo   2h
6  Sample6 Placebo   2h
7  Sample1    Drug   0h
8  Sample2    Drug   0h
9  Sample3    Drug   1h
10 Sample4    Drug   1h
11 Sample5    Drug   2h
12 Sample6    Drug   2h
which combines the two factors into a single group factor (e.g. Drug.1,
Placebo.1, Drug.2, etc.).
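
As a rough sketch of what that combination looks like in base R (the
data frame is retyped from the targets table above; the object names
`Group` and `design` are my own choice, and the level names come out as
Drug.0h, Placebo.1h, etc. because the time labels carry the "h"):

```r
# Targets table retyped from the listing above.
targets <- data.frame(
  Sample = rep(paste0("Sample", 1:6), times = 2),
  Treat  = rep(c("Placebo", "Drug"), each = 6),
  Time   = rep(rep(c("0h", "1h", "2h"), each = 2), times = 2)
)

# Merge treatment and time into one factor with six levels.
Group <- factor(paste(targets$Treat, targets$Time, sep = "."))

# One column per treatment/time combination, no intercept.
design <- model.matrix(~ 0 + Group)
colnames(design) <- levels(Group)

print(levels(Group))
print(colSums(design))  # number of samples in each combined group
```

Each column of this design then corresponds to one treatment/time cell,
so comparisons between cells would be set up as contrasts.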
This seems potentially appropriate, but it appears to assume
independence between samples, whereas my data consist of what you
could call 'true repeated measures' on the same subjects. That seems to
call for the paired samples and blocking examples, which proceed by
including the 'subject' as a factor as well, for example:
design <- model.matrix(~Subject+Treatment)
This leads me to guess that a combination of these techniques is
required: perhaps merging the times and group factors in my dataset
(see above) into a 'newgroup' factor (e.g. Control.0, Control.1,
Treated.0, etc.), then creating the model formula:
design <- model.matrix(~Subject+newgroup)
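
A minimal sketch of that proposal in base R, for checking purposes only.
The data frame is a reconstruction of my design: I am assuming the rows
elided by "..." above follow the same subject/group pattern at every
time point, and the names `design_df` and `newgroup` are just
illustrative. Because each subject belongs to only one group, it seems
worth comparing the rank of the resulting matrix with its number of
columns before fitting anything:

```r
# Assumed reconstruction: 6 subjects, each measured at 0hr, 1hr and 2hr.
# Subjects 1, 2, 4 are Treated; subjects 3, 5, 6 are Control.
design_df <- data.frame(
  subject = factor(rep(1:6, times = 3)),
  group   = factor(rep(c("Treated", "Treated", "Control",
                         "Treated", "Control", "Control"), times = 3)),
  times   = factor(rep(c("0hr", "1hr", "2hr"), each = 6))
)

# Merge group and time into a single factor (Control.0hr, Treated.1hr, ...).
design_df$newgroup <- factor(paste(design_df$group, design_df$times, sep = "."))

# Subject as a blocking factor plus the combined group factor.
design <- model.matrix(~ subject + newgroup, data = design_df)

# Compare the number of columns with the numerical rank of the matrix.
n_cols   <- ncol(design)
mat_rank <- qr(design)$rank
print(c(columns = n_cols, rank = mat_rank))
```

If the rank comes out smaller than the number of columns, some
coefficients are not estimable as written and the formula would need to
be reworked.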
Does this seem appropriate, or am I way off base and overthinking
this? Thanks for any suggestions.
Regards,
Charles
On Tue, Jun 25, 2013 at 11:11 PM, Gordon K Smyth wrote:
> Charles,
>
> Are there only 2 biological units in your experiment? (One for treatment
> and one for control?) Or do you have multiple biological units in each
> group? Surely it must be the latter but, if so, your model does not take
> this into account.
>
> What questions do you want to test?
>
> Best
> Gordon
>
>
>
> On Tue, 25 Jun 2013, Charles Determan Jr wrote:
>
>> Gordon,
>>
>> I apologize for not being more definitive with my description. Your
>> initial definition is my intention, consecutive measurements on the same
>> biological units. I will look over the comments in the link you provided.
>> Thank you for your insight, I appreciate any further thoughts you may
>> have.
>>
>> Regards,
>> Charles
>>
>>
>> On Tue, Jun 25, 2013 at 6:57 PM, Gordon K Smyth wrote:
>>
>>> Dear Charles,
>>>
>>> The term "repeated measures" describes a situation in which repeated
>>> measurements are made on the same biological unit. Hence the repeated
>>> measurements are correlated. It is not clear from the brief information
>>> you give whether this is the case, or whether the different time points
>>> derive from independent biological samples.
>>>
>>> The model you give might or might not be correct, depending on the
>>> experimental units and the hypotheses that you plan to test. For most
>>> experiments it is not the right approach, for reasons that I have pointed
>>> out elsewhere:
>>>
>>> https://www.stat.math.ethz.ch/pipermail/bioconductor/2013-June/053297.html
>>>
>>>
>>> Best wishes
>>> Gordon
>>>
>>>
>>>> Date: Mon, 24 Jun 2013 15:08:48 -0500
>>>> From: Charles Determan Jr
>>>> To: bioconductor@r-project.org
>>>> Subject: [BioC] Repeated Measures mRNA expression analysis
>>>>
>>>> Greetings,
>>>>
>>>> I need to analyze data collected from an RNA-seq experiment. This
>>>> consists of comparing two groups (control vs. treatment) and repeated
>>>> sampling (1 hour, 2 hours, 3 hours). If this were a univariate problem I
>>>> know I would use a 2-way rmANOVA analysis but this is RNA-seq and I have
>>>> thousands of variables. I am very familiar with multiple packages for RNA
>>>> differential expression analysis (e.g. DESeq2, edgeR, limma, etc.) but I
>>>> have been unable to figure out the most appropriate way to analyze
>>>> such data in this circumstance. The closest answer I can find within the
>>>> DESeq2 and edgeR manuals (limma is somewhat confusing to me) is to place the
>>>> main treatment of interest at the end of the design formula, for example:
>>>>
>>>> design(dds) <- formula(~ time + treatment)
>>>>
>>>> Is this what is considered the appropriate way to address repeated
>>>> measures in mRNA expression experiments? Any thoughts are appreciated.
>>>>
>>>> Regards,
>>>>
>>>>
