[BioC] help with dataset
Saroj K Mohapatra
saroj at vt.edu
Wed May 27 15:17:39 CEST 2009
Hi Alberto:
In line with Vincent's second suggestion, you could read the phenoData
while reading the data (.CEL files). It would then propagate from the
AffyBatch to the ExpressionSet object. I have used the approach below
for some time and would like to bounce it off the list.
For example, there is a Targets.txt file:
Sample Celfile ES TYPE
SHR.PUFA5 SHR-PUFA5.CEL PUFA SHR
SHR.PUFA6 SHR-PUFA6.CEL PUFA SHR
SHR.st7 SHR-st7.CEL ST SHR
SHR.st8 SHR-st8.CEL ST SHR
WK.PUFA3 WK-PUFA3.CEL PUFA WK
WK.PUFA4 WK-PUFA4.CEL PUFA WK
WK.st1 WK-st1.CEL ST WK
WK.st2 WK-st2.CEL ST WK
Read in this information with readTargets() from the limma package
(it reads Targets.txt by default):
> library(limma)
> targets=readTargets()
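As a rough sketch of what this step expects: as far as I know, readTargets() looks for a tab-delimited file with a header row. The base-R lines below only illustrate that layout by writing the example table to a temporary file and reading it back with read.delim(). The temporary file and the object name targets_demo are made up for this illustration and are not part of the workflow above.

```r
# Illustration only: write the example targets table to a temporary
# tab-delimited file and read it back with base R.
targets_file <- tempfile(fileext = ".txt")
writeLines(c(
  "Sample\tCelfile\tES\tTYPE",
  "SHR.PUFA5\tSHR-PUFA5.CEL\tPUFA\tSHR",
  "SHR.PUFA6\tSHR-PUFA6.CEL\tPUFA\tSHR",
  "SHR.st7\tSHR-st7.CEL\tST\tSHR",
  "SHR.st8\tSHR-st8.CEL\tST\tSHR",
  "WK.PUFA3\tWK-PUFA3.CEL\tPUFA\tWK",
  "WK.PUFA4\tWK-PUFA4.CEL\tPUFA\tWK",
  "WK.st1\tWK-st1.CEL\tST\tWK",
  "WK.st2\tWK-st2.CEL\tST\tWK"
), targets_file)

# Keep the columns as character vectors rather than factors
targets_demo <- read.delim(targets_file, stringsAsFactors = FALSE)
str(targets_demo)
```

If the real Targets.txt is separated by spaces instead of tabs, the read step would need a different separator.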
Then create a phenoData object from this information.
The following creates a data.frame that is basically the same as the targets:
> myCovs = data.frame(targets)
> rownames(myCovs) = myCovs[,1]
Find the number of levels of each column in the targets.
> nlev = as.numeric(apply(myCovs, 2, function(x) nlevels(as.factor(x))))
Finally, create the metadata data.frame and the AnnotatedDataFrame.
> metadata = data.frame(labelDescription = paste(colnames(myCovs), ": ",
nlev, " level", ifelse(nlev==1,"","s"), sep=""),
row.names=colnames(myCovs))
> phenoData = new("AnnotatedDataFrame", data=myCovs, varMetadata=metadata)
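To see what the metadata step produces without needing any Bioconductor package, here is a minimal base-R sketch. It rebuilds the covariates from the example Targets.txt by hand (an assumption made only so the block is self-contained) and uses sapply() instead of apply() to count the levels per column; the label text follows the paste() call above.

```r
# Toy copy of the covariates from the example Targets.txt
myCovs <- data.frame(
  Sample  = c("SHR.PUFA5", "SHR.PUFA6", "SHR.st7", "SHR.st8",
              "WK.PUFA3", "WK.PUFA4", "WK.st1", "WK.st2"),
  Celfile = c("SHR-PUFA5.CEL", "SHR-PUFA6.CEL", "SHR-st7.CEL", "SHR-st8.CEL",
              "WK-PUFA3.CEL", "WK-PUFA4.CEL", "WK-st1.CEL", "WK-st2.CEL"),
  ES      = c("PUFA", "PUFA", "ST", "ST", "PUFA", "PUFA", "ST", "ST"),
  TYPE    = rep(c("SHR", "WK"), each = 4),
  stringsAsFactors = FALSE
)
rownames(myCovs) <- myCovs$Sample

# Number of distinct values in each column
nlev <- sapply(myCovs, function(x) nlevels(as.factor(x)))

# One description per column, e.g. "ES: 2 levels"
labels <- paste(colnames(myCovs), ": ", nlev, " level",
                ifelse(nlev == 1, "", "s"), sep = "")
metadata <- data.frame(labelDescription = labels,
                       row.names = colnames(myCovs))
print(metadata)
```

Each row of metadata describes one column of myCovs, which is the layout that the varMetadata argument of AnnotatedDataFrame expects.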
Use the phenoData as an argument to ReadAffy (from the affy package):
> dat=ReadAffy(sampleNames=myCovs$Sample, filenames=myCovs$Celfile,
phenoData=phenoData)
Then normalize.
> eset = rma(dat)
I hope that works.
Saroj
Alberto Goldoni wrote:
>> Hello everybody,
>> I have a small problem with my dataset. Currently, the output of
>> pData(eset.irq.50) looks like this:
>>
>>
>>> eset.irq.50
>>>
>> ExpressionSet (storageMode: lockedEnvironment)
>> assayData: 1227 features, 8 samples
>> element names: exprs
>> phenoData
>> sampleNames: SHR-PUFA5.CEL, SHR-PUFA6.CEL, ..., WK-st2.CEL (8 total)
>> varLabels and varMetadata description:
>> sample: arbitrary numbering
>> featureData
>> featureNames: 1367555_at, 1367556_s_at, ..., 1399089_at (1227 total)
>> fvarLabels and fvarMetadata description: none
>> experimentData: use 'experimentData(object)'
>> Annotation: rat2302
>>
>>
>>
>>> pData(eset.irq.50)
>>>
>> sample
>> SHR-PUFA5.CEL 1
>> SHR-PUFA6.CEL 2
>> SHR-st7.CEL 3
>> SHR-st8.CEL 4
>> WK-PUFA3.CEL 5
>> WK-PUFA4.CEL 6
>> WK-st1.CEL 7
>> WK-st2.CEL 8
>>
>> I would like to modify these data in order to obtain:
>>
>>
>>> pData(eset.irq.50)
>>>
>> ES TYPE
>> SHR-PUFA5.CEL PUFA SHR
>> SHR-PUFA6.CEL PUFA SHR
>> SHR-st7.CEL ST SHR
>> SHR-st8.CEL ST SHR
>> WK-PUFA3.CEL PUFA WK
>> WK-PUFA4.CEL PUFA WK
>> WK-st1.CEL ST WK
>> WK-st2.CEL ST WK
>>
>> so that I can perform a factDesign analysis.
>>
>> Can somebody help me?
>>
>>
>> BEST REGARDS.
>>
>>
>>
>> --
>> -----------------------------------------------------
>> Dr. Alberto Goldoni
>> Bologna, Italy
>> -----------------------------------------------------
>>
>>