[BioC] Cross-comparison of independent intensities from different experiments (genepix) (sorry I don't know how to describe the problem better)
Susanne Gerber
gerber.sj at googlemail.com
Fri Feb 3 16:27:32 CET 2012
Dear James,
thank you so much for the very fast and detailed response.
I will start answering your questions:
> I assume that by 'two independent time series' you mean that these experiments were conducted at different times, perhaps in different labs, etc?
The first experiment was performed half a year earlier but within the
same lab and by the same experimenter.
>The second problem is due to the fact that you never hybridized MU and WT samples on the same chip,
>which has introduced another untestable and unquantifiable 'chip' effect.
Well, this is actually the problem I am struggling with.
>You could hypothetically do a single channel analysis with these data,
> but any comparison between MU and WT would include both biological and technical variability,
> and you won't be able to say how much of either.
> Again, you can assume that the technical variability is small, but you won't really be able to say for sure if this assumption is true.
I think I have to do so, since these data are the only dataset I have.
I am not an experimenter and the lab where the data originally came
from can not perform these new experiments (the cells are not
available any more, project ran out last year, no money, no staff...).
That's what I meant by saying "it is also not possible to repeat the
experiment and produce a direct comparison."
Sorry for being so imprecise. :)
The maanova package is great and I have already used it; however, I
still do not know how to perform the single-channel analysis you
mentioned on my two-colour data.
What would be the best way (or is there already an existing package
for this) to treat the data and to extract the information ?
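
To make the single-channel idea concrete: two-colour data are usually stored as M (log-ratio) and A (average log-intensity) values, and the per-channel log2 intensities can be recovered as A + M/2 for Cy5 and A - M/2 for Cy3. The sketch below shows only that arithmetic in base R on made-up numbers; the helper name `ma_to_channels` and the toy matrices are hypothetical and not part of any package. The limma User's Guide has a chapter on separate-channel analysis of two-colour data (built around `intraspotCorrelation()` and `lmscFit()`), which is probably the more appropriate route for the real arrays.

```r
# Hypothetical helper: split M/A values into per-channel log2 intensities.
# Relies on M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2.
ma_to_channels <- function(M, A) {
  stopifnot(identical(dim(M), dim(A)))
  list(
    R = A + M / 2,  # log2 red channel (Cy5)
    G = A - M / 2   # log2 green channel (Cy3)
  )
}

# Toy data (made up): 3 genes x 2 arrays
M <- matrix(c(1, -2, 0, 0.5, 1.5, -1), nrow = 3)
A <- matrix(c(10, 8, 12, 9, 11, 7), nrow = 3)
ch <- ma_to_channels(M, A)

# Single-channel matrix: one column per array and channel
single <- cbind(ch$G, ch$R)
colnames(single) <- c("array1.Cy3", "array2.Cy3", "array1.Cy5", "array2.Cy5")
```

This only rearranges the numbers. It does not remove the array effect Jim describes, so any MU versus WT comparison built on such a matrix still mixes biological and technical variability.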
Thanks again so much for your help
Best regards,
Susanne
--
Dr. Susanne Gerber
Computational Time Series Analysis
Institute of Computational Science
University of Lugano
Via Giuseppe Buffi 13
6904 Lugano
http://www.ics.inf.usi.ch/people/dr-susanne-gerber.html
2012/2/3 James W. MacDonald <jmacdon at med.umich.edu>:
> Hi Susanne,
>
>
> On 2/3/2012 7:53 AM, Susanne Gerber [guest] wrote:
>>
>> Dear all,
>> could please anyone help me with the following problem:
>>
>> Experiments were performed using two color cDNA .gpr files (genepix).
>> We have an experimental setup with two independent time series (each
>> with 4 time points, referred to below as T1 - T4).
>>
>> In the first time series, wildtype (WT) cells were stressed at time point
>> zero with a certain drug and samples were taken at 4 time points afterwards.
>> These samples were compared with the unstressed WT.
>>
>> In the second time series mutant-cells (MU) were treated identically and
>> compared with the unstressed MU cell.
>>
>>
>> Here is the target file
>>
>>> targets
>>
>>    FileName     Cy3          Cy5
>> 1  13754122.gpr WT           WT_stress_T1
>> 2  13754112.gpr WT_stress_T1 WT
>> 3  14039687.gpr WT           WT_stress_T2
>> 4  13754123.gpr WT           WT_stress_T2
>> 5  13754109.gpr WT           WT_stress_T3
>> 6  14039055.gpr WT_stress_T3 WT
>> 7  14004643.gpr WT           WT_stress_T4
>> 8  14039058.gpr WT_stress_T4 WT
>> 9  14039688.gpr MU           MU_stress_T1
>> 10 13754114.gpr MU_stress_T1 MU
>> 11 14039061.gpr MU           MU_stress_T2
>> 12 14039059.gpr MU_stress_T2 MU
>> 13 13754124.gpr MU           MU_stress_T3
>> 14 13754115.gpr MU_stress_T3 MU
>> 15 14039057.gpr MU           MU_stress_T4
>> 16 14039056.gpr MU_stress_T4 MU
>>
>> I was working a lot with these data and we had some very interesting
>> results, however, I am not able to solve the following problem:
>>
>> How can I make a comparison between
>> a) MU and WT
>> b) MU_stressed and WT
>
>
> That's because this is an unsolvable problem with the data in hand. I assume
> that by 'two independent time series' you mean that these experiments were
> conducted at different times, perhaps in different labs, etc?
>
> There are two problems here. First, depending on what you mean by
> 'independent time series', a batch effect may have been introduced, which
> you will not be able to account for statistically. However, depending on the
> nature of the independence between these time series, you may be able to get
> away with assuming little or no batch effect. But you will have to make that
> assumption without really being able to test it.
>
> The second problem is due to the fact that you never hybridized MU and WT
> samples on the same chip, which has introduced another untestable and
> unquantifiable 'chip' effect. You could hypothetically do a single channel
> analysis with these data, but any comparison between MU and WT would include
> both biological and technical variability, and you won't be able to say how
> much of either. Again, you can assume that the technical variability is
> small, but you won't really be able to say for sure if this assumption is
> true.
>
> To a certain extent, both time series have to be independent, as MU and WT
> cells are different. So if 'independent time series' just means that the
> experimenter did the WT time series and then did the MU time series, that's
> a batch effect that people ignore all the time, and I don't see a need to
> repeat the experiment. But if the experimenter really wants to compare the
> MU and WT samples directly, they need to be hybridized to the same chips,
> preferably in one of these 'round-robin' type designs where you do things
> like
>
> MU1 vs WT1
> MU stressed1 vs WT2
> MU stressed2 vs WT stressed1
> MU2 vs WT stressed2
>
> which tends to reduce variability for comparisons. There may be something
> about these types of design in the limma user's guide. The maanova package
> was designed specifically for this type of analysis, so you might look at
> that package as well; I assume there is a vignette that may have helpful
> insights. You could also look at some of Katie Kerr's papers (do a google
> scholar search for kerr anova microarray).
>
> Best,
>
> Jim
>>
>>
>> I am not the experimenter and it is also not possible to repeat the
>> experiment and produce a direct comparison.
>>
>> However, I think - even if it is not the most elegant way - there should
>> be a way to make this comparison with the existing data.
>>
>> I was already thinking of simply copying and pasting the single-channel
>> intensities from the .gpr files into a new matrix, but I guess this would
>> cause a lot of problems with the normalization steps.
>> Perhaps the answer is very easy (in which case, sorry for bothering you), but
>> I swear I have read a lot of tutorials and I don't even know what
>> keywords to search for (on Google) for this problem.
>>
>> What I do right now (after preprocessing) is:
>> #
>> #
>> Average<- avedups(genes, ndups=2, spacing=1)
>> Average$A[ is.na(Average$A) ]<- 0.0
>> Average$M[ is.na(Average$M) ]<- 0.0
>> #
>> designWT<- modelMatrix(targets,ref="WT")
>> designWT<- designWT[1:8,1:4]
>> designWT
>> designMU<- modelMatrix(targets,ref="MU")
>> designMU<- designMU[9:16,6:9]
>> designMU
>>
>> AverageWT<- Average[,1:8]
>> AverageMU<- Average[,9:16]
>> #
>> fit_WT<- lmFit(AverageWT, designWT)
>> fit_WT<- eBayes(fit_WT)
>> topTable(fit_WT)
>> fit_MU<- lmFit(AverageMU, designMU)
>> fit_MU<- eBayes(fit_MU)
>> topTable(fit_MU)
>>
>> #
>> .... and further analysis and evaluation procedures
>> #
>>
>>
>> Please, what would be the best way to make the comparison
>>
>> a) MU_(T1-4) with WT as reference
>> and
>> b) MU_stressed (T1-4 )with WT as a reference ?
>>
>> Thanks a lot in advance for the help !
>> I would be so grateful if someone could give me an answer.
>>
>> Best regards,
>> Susanne
>>
>>
>>
>> -- output of sessionInfo():
>>
>>> sessionInfo()
>>
>> R version 2.13.2 (2011-09-30)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>
>> locale:
>> [1] C/en_US.UTF-8/C/C/C/C
>>
>> attached base packages:
>> [1] splines tcltk stats graphics grDevices utils datasets
>> methods base
>>
>> other attached packages:
>> [1] MASS_7.3-14 calibrate_1.7 Heatplus_1.22.0
>> XML_3.4-3 annaffy_1.24.0 KEGG.db_2.5.0
>> [7] goProfiles_1.14.0 GO.db_2.5.0 annotate_1.30.1
>> yeast2.db_2.5.0 org.Sc.sgd.db_2.5.0 RSQLite_0.10.0
>> [13] DBI_0.2-5 AnnotationDbi_1.14.1 statmod_1.4.14
>> vsn_3.20.0 arrayQuality_1.30.0 convert_1.28.0
>> [19] affy_1.30.0 marray_1.30.0 limma_3.8.3
>> maSigPro_1.24.1 DynDoc_1.30.0 widgetTools_1.30.0
>> [25] Biobase_2.12.2
>>
>> loaded via a namespace (and not attached):
>> [1] Mfuzz_2.10.0 RColorBrewer_1.0-5 affyio_1.20.0
>> grid_2.13.2 gridBase_0.4-4 hexbin_1.26.0
>> [7] lattice_0.19-33 preprocessCore_1.14.0 tkWidgets_1.30.0
>> tools_2.13.2 xtable_1.6-0
>> --
>> Sent via the guest posting facility at bioconductor.org.
>>
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
>
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should not be
> used for urgent or sensitive issues