[BioC] Design matrix for time course analysis with maSigPro

Wolfgang Huber whuber at embl.de
Sun May 30 14:41:23 CEST 2010


Hi Matthias

as a comment on a slightly generic level, with time course data (as 
otherwise) I have often found it useful to explore the data using 
heatmaps, clustering, parallel coordinate plots (see e.g. Fig. 4 in PMID 
18615017) before embarking on formal testing.
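
As a rough illustration of what I mean by exploring first, here is a
minimal sketch (plain Python, not tied to any Bioconductor package) that
turns one probe's signals into a treated-versus-control log-ratio profile
over time and compares the profile shapes between two experiments. Such
profiles are the usual input to a heatmap or parallel coordinate plot.
The function names and all numbers are made up for illustration; they
are not taken from your data.

```python
import math
from statistics import mean, pstdev


def log_ratio_profile(treated, control):
    """log2(treated / control) at each time point for one probe."""
    return [math.log2(t / c) for t, c in zip(treated, control)]


def standardize(profile):
    """Centre and scale a profile so that only its shape remains."""
    m = mean(profile)
    s = pstdev(profile)
    if s == 0:
        # A flat profile has no shape to compare; map it to zeros.
        return [0.0] * len(profile)
    return [(x - m) / s for x in profile]


def correlation(a, b):
    """Pearson correlation, computed from the standardized profiles."""
    za, zb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)


# Hypothetical signals for one probe at 6h, 24h, 72h and 144h.
profile_a = log_ratio_profile([200, 400, 800, 1600], [200, 200, 200, 200])
profile_b = log_ratio_profile([100, 200, 400, 800], [100, 100, 100, 100])

# Both made-up profiles rise by one log2 unit per time point, so their
# shapes are identical and the correlation is 1 up to rounding.
similarity = correlation(profile_a, profile_b)
```

Probes whose profiles correlate well between expA and expB would be
candidates for "similar response"; looking at them in a plot before
testing often shows whether the formal model is asking a sensible
question.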

        Best wishes
	Wolfgang

On 27/05/10 12:04, Matthias Boeck wrote:
> Hello,
>
> I'm working on the analysis of time series data (MAS5) which consists of
> two experiments (expA, expB) on two cell lines (clA, clB) (but a similar
> if not the same behavior is expected and therefore they might be used as
> replicates). Each experiment consists of four measurements at different
> points in time (6h, 24h, 72h and 144h) and for each of these measurements
> a control exists too. The controls are cell cultures of the same cell
> line which are untreated but still can show some activity.
>
> At the moment I am trying to find the differences and especially the
> similarities between the two experiments in their reaction to the
> treatments and wanted to use maSigPro (if you have another suggestion I
> would be glad for any further advice).
> I already did some calculations with the package but I'm not sure if I
> got the design of the design matrix right and maybe you could be so kind
> to take a look at my matrix. Replicates are within the experiments and I
> used four dummy variables for the different experiments and cell lines:
>
>
>                         Time Replicate Control expB_clA expB_clB expA_clA expA_clB
> clA_6hr_expA_ctr          6         1       1        0        0        0        0
> clA_6hr_expA              6         2       0        0        0        1        0
> clA_24hr_expA_ctr        24         3       1        0        0        0        0
> clA_24hr_expA            24         4       0        0        0        1        0
> clA_day3_expA_ctr        72         5       1        0        0        0        0
> clA_day3_expA            72         6       0        0        0        1        0
> clA_day6_expA_ctrl      144         7       1        0        0        0        0
> clA_day6_expA           144         8       0        0        0        1        0
> clB_6hr_expA_ctr          6         1       1        0        0        0        0
> clB_6hr_expA              6         2       0        0        0        0        1
> clB_24hr_expA_ctr        24         3       1        0        0        0        0
> clB_24hr_expA            24         4       0        0        0        0        1
> clB_day3_expA_ctr        72         5       1        0        0        0        0
> clB_day3_expA            72         6       0        0        0        0        1
> clB_day6_expA_ctr       144         7       1        0        0        0        0
> clB_day6_expA           144         8       0        0        0        0        1
> clA_6hr_expB_ctr          6         9       1        0        0        0        0
> clA_6hr_expB              6        10       0        1        0        0        0
> clA_24hr_expB_ctr        24        11       1        0        0        0        0
> clA_24hr_expB            24        12       0        1        0        0        0
> clA_day3_expB_ctr        72        13       1        0        0        0        0
> clA_day3_expB            72        14       0        1        0        0        0
> clA_day6_expB_ctr       144        15       1        0        0        0        0
> clA_day6_expB           144        16       0        1        0        0        0
> clB_6hr_expB_ctr          6         9       1        0        0        0        0
> clB_6hr_expB              6        10       0        0        1        0        0
> clB_24hr_expB_ctr        24        11       1        0        0        0        0
> clB_24hr_expB            24        12       0        0        1        0        0
> clB_day3_expB_ctr        72        13       1        0        0        0        0
> clB_day3_expB            72        14       0        0        1        0        0
> clB_day6_expB_ctr       144        15       1        0        0        0        0
> clB_day6_expB           144        16       0        0        1        0        0
>
>
> By using this design I end up with about 1179 probes after the first
> regression step (p.vector() with q-value of 0.0001). I'm not sure if
> this is a realistic amount or if it is because of the design or the lack
> of further replicates (array quality checks have already been performed
> on the data). Would nonspecific filtering make sense before the
> analysis?
> I also considered changing the replicates column and grouped the
> controls according to the cell lines but this didn't seem to alter the
> results. Does the algorithm take the mean/median over all given controls
> without considering the replicate grouping? Or could this be a hint that
> the controls are quite similar and could also be combined? If the
> controls are grouped together in the replicates, is maSigPro taking the
> median over those for the calculation or is this just for the
> see.genes() visualization?
>
>
> I'm sorry for all these questions but I haven't worked before with the
> time series packages in R and I'm not sure if I use the methods
> correctly.
> I would be glad for any help!
>
>
> Best wishes,
> Matthias
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
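
For checking a layout like the design matrix quoted above, it can help
to generate it from the sample annotation rather than typing it by hand.
The following is only a sketch in plain Python under my reading of your
table: replicate numbers are shared between the two cell lines, all
untreated samples fall into a single Control group, and sample names are
normalized to the "_ctr" suffix. The function name build_design is
hypothetical and not part of maSigPro.

```python
# Time points as (label used in the sample names, hours).
TIMES = [("6hr", 6), ("24hr", 24), ("day3", 72), ("day6", 144)]
# Dummy-variable columns, in the order used in the quoted table.
GROUPS = ["Control", "expB_clA", "expB_clB", "expA_clA", "expA_clB"]


def build_design():
    """Return {sample name: row}, each row holding Time, Replicate and dummies."""
    design = {}
    for exp_idx, exp in enumerate(["expA", "expB"]):
        for cell_line in ["clA", "clB"]:
            for t_idx, (label, hours) in enumerate(TIMES):
                for is_control in (True, False):
                    suffix = "_ctr" if is_control else ""
                    name = f"{cell_line}_{label}_{exp}{suffix}"
                    # Replicate ids run 1..8 in expA and 9..16 in expB and
                    # do not depend on the cell line, as in the quoted table.
                    replicate = exp_idx * 8 + t_idx * 2 + (1 if is_control else 2)
                    group = "Control" if is_control else f"{exp}_{cell_line}"
                    row = {"Time": hours, "Replicate": replicate}
                    # Exactly one dummy variable is 1 per sample.
                    row.update({g: int(g == group) for g in GROUPS})
                    design[name] = row
    return design


design = build_design()
```

Writing the rule for the Replicate column down explicitly, as in the
sketch, makes it easy to try the alternative groupings you mention (for
example separate control replicates per cell line) by changing one line
and comparing the results.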

-- 


Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber


