[BioC] Data set for comparing statistical tests
Jorge Miró
jorgma86 at gmail.com
Sat Sep 1 00:46:39 CEST 2012
Hi James,
thank you. I checked and found what looks like the arrays as rows and
the genes on the arrays as columns:
                                         203508_at 204563_at 204513_s_at
12_13_02_U133A_Mer_Latin_Square_Expt1_R1     0.000     0.000       0.000
12_13_02_U133A_Mer_Latin_Square_Expt2_R1     0.125     0.125       0.125
12_13_02_U133A_Mer_Latin_Square_Expt3_R1     0.250     0.250       0.250
...
12_13_02_U133A_Mer_Latin_Square_Expt1_R2     0.000     0.000       0.000
12_13_02_U133A_Mer_Latin_Square_Expt2_R2     0.125     0.125       0.125
12_13_02_U133A_Mer_Latin_Square_Expt3_R2     0.250     0.250       0.250
...
12_13_02_U133A_Mer_Latin_Square_Expt1_R3     0.000     0.000       0.000
12_13_02_U133A_Mer_Latin_Square_Expt2_R3     0.125     0.125       0.125
12_13_02_U133A_Mer_Latin_Square_Expt3_R3     0.250     0.250       0.250
...
What do the numbers in the pData matrix mean? Are they the concentrations?
Is there any paper or lab description with a guide on how to
compare statistical tests using spike-in data? I really cannot
figure out how to proceed with the comparison. It seems that the
genes have the same concentrations across the three groups of arrays
(from 0.000 to 512.000), so I guess I should take only some arrays from
each group and compare the tests for differentially expressed genes, e.g.
four from group R1 (concentrations 0.000 to 0.500), four from group R2
(concentrations 4.000 to 32.000) and four from group R3
(concentrations 64.000 to 512.000).
Am I thinking right?
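
One possible way to set up such a comparison, as a rough sketch only. The sample names in the table above suggest that R1, R2 and R3 are replicates of the same experiment (Expt1 shows 0.000 in all three), so this sketch compares two experiments with three replicates each instead of sampling arrays by concentration range. It assumes that the affy, limma, genefilter and SpikeIn packages (plus the CDF package matching the chip) are installed, that the pData column names match the probeset IDs returned by rma(), and that a spiked-in probeset counts as truly changed when its nominal amount differs between the two experiments. The names `changed`, `truth` and `top_hits` and the cutoff of 50 are illustrative choices, not part of any package.

```r
library(affy)        # rma() and AffyBatch accessors
library(limma)       # lmFit(), eBayes()
library(genefilter)  # rowttests()
library(SpikeIn)     # SpikeIn133 data set

data(SpikeIn133)
pd <- pData(SpikeIn133)        # arrays x spiked-in probesets
sn <- sampleNames(SpikeIn133)

# Pick two experiments; the trailing "_R" keeps "Expt1" from also
# matching "Expt10" to "Expt14".
i1 <- grep("Expt1_R", sn, fixed = TRUE)
i2 <- grep("Expt2_R", sn, fixed = TRUE)

ab  <- SpikeIn133[, c(i1, i2)]
grp <- factor(rep(c("expt1", "expt2"), c(length(i1), length(i2))))

e <- exprs(rma(ab))            # log2 expression, probesets x arrays

# Ordinary equal-variance t-test, one per probeset
p_t <- rowttests(e, grp)$p.value

# limma moderated t-test, one per probeset
fit <- eBayes(lmFit(e, model.matrix(~ grp)))
p_limma <- fit$p.value[, 2]

# Assumed truth: spiked-in probesets whose nominal amount differs
# between the two chosen experiments
changed <- colnames(pd)[unlist(pd[i1[1], ]) != unlist(pd[i2[1], ])]
truth   <- rownames(e) %in% changed

# How many assumed-true probesets are among the n smallest p-values?
top_hits <- function(p, n = 50) sum(truth[order(p)[seq_len(n)]])
c(t.test = top_hits(p_t), limma = top_hits(p_limma))
```

Counting true spike-ins among the top-ranked probesets is only one of several possible criteria; an ROC-style curve over all cutoffs would be another.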
Also, I checked the size of the pData:
> dim(pdata)
[1] 42 42
Are there really only 42 genes in the SpikeIn133 dataset, or am I
missing something here?
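
A quick check that may help here, as a sketch. The phenoData summary elsewhere in this thread lists 42 sampleNames and 42 varLabels, which suggests that the 42 x 42 table has one row per array and one column per spiked-in probeset, so its dimensions need not equal the number of probesets on the chip. The snippet assumes the affy and SpikeIn packages are installed; featureNames() additionally needs the CDF package that matches the chip.

```r
library(affy)     # AffyBatch class and accessors
library(SpikeIn)  # SpikeIn133 data set

data(SpikeIn133)

pd <- pData(SpikeIn133)   # same table as pData(phenoData(SpikeIn133))
dim(pd)                   # rows = arrays, columns = spiked-in probesets
pd[1:3, 1:3]              # nominal amounts for the first few arrays and probesets

length(sampleNames(SpikeIn133))    # number of arrays
length(featureNames(SpikeIn133))   # number of probesets defined for the chip
```

Comparing the last value with ncol(pd) shows whether the arrays carry more probesets than the ones listed in pData.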
Best regards
Jorge
On Fri, Aug 31, 2012 at 9:02 PM, James W. MacDonald <jmacdon at uw.edu> wrote:
> Hi Jorge,
>
> pData(phenoData(SpikeIn133))
>
> Best,
>
> Jim
>
>
>
>
> On 8/31/2012 2:12 PM, Jorge Miró wrote:
>>
>> Hi again,
>>
>> I have been trying to work out how to proceed with the spike-in
>> data, but in vain.
>> Here are the commands I used:
>>
>>
>> ************ Code *************************
>> > library(SpikeIn)
>> > data(SpikeIn133)
>>
>> #Checked phenoData as suggested....
>> > phenoData(SpikeIn133)
>>
>> An object of class "AnnotatedDataFrame"
>> sampleNames: 12_13_02_U133A_Mer_Latin_Square_Expt1_R1
>> 12_13_02_U133A_Mer_Latin_Square_Expt2_R1 ...
>> 12_13_02_U133A_Mer_Latin_Square_Expt14_R3 (42 total)
>> varLabels: 203508_at 204563_at ... AFFX-ThrX-3_at (42 total)
>> varMetadata: labelDescription
>>
>> # ... but I could not see the concentrations for the samples. Is there
>> something else I should do? I tried pData too and could not
>> find any information about the sample concentrations.
>>
>> *************************** End of code ******************
>> I guess SpikeIn133 contains raw intensities, so I should apply
>> rma to it and then use e.g. limma to test for differential expression of
>> the genes. Am I right?
>>
>> I read the manual for SpikeIn but I can't see anything about the
>> concentrations for each sample in the data set.
>>
>> (http://www.bioconductor.org/packages/2.10/data/experiment/manuals/SpikeIn/man/SpikeIn.pdf)
>>
>>
>> Best regards
>> Jorge
>>
>> On Fri, Aug 31, 2012 at 12:01 PM, Benilton Carvalho
>> <beniltoncarvalho at gmail.com> wrote:
>>>
>>> check the SpikeIn package... in particular the phenoData slot for the
>>> datasets available. b
>>>
>>> On 31 August 2012 10:58, Jorge Miró<jorgma86 at gmail.com> wrote:
>>>>
>>>> Hi everybody,
>>>>
>>>> I need to compare Student's t-test and the test implemented in the
>>>> limma package. Does anybody have an idea of how I should do this?
>>>>
>>>> I guess I need a data set with already known differentially expressed
>>>> genes (maybe this can be done by specially designing the probesets on
>>>> the arrays used?) and then compare the results of a t-test and a limma
>>>> test with the expected differentially expressed genes. Where can I get
>>>> such a data set?
>>>>
>>>> Sorry if the question is a bit stupid but I'm new to microarray
>>>> analysis and statistics... By the way, should this kind of question
>>>> be posted here or should I use another forum?
>>>>
>>>>
>>>>
>>>> Best regards
>>>> Jorge
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>