[BioC] ExpressionSet class and problems with phenotype and metadata matrices
James W. MacDonald
jmacdon at med.umich.edu
Thu Feb 28 19:40:19 CET 2008
Hi Sean,
Sean MacEachern wrote:
> Hello,
>
> I'm new to R and Bioconductor. I am trying to analyse a simple microarray
> experiment examining two lines: Resistant (R) and susceptible (S) for
> differences in expression levels.
>
> The data I have contains a file with expression for 4 and 3 replicates from
> the R and S lines respectively. I'm trying to create an ExpressionSet object
> to initially complete some exploratory clustering on the data set and I have
> been following the vignette " An Introduction to Bioconductor¹s
> ExpressionSet Class" by Falcon etal.
>
> I have read in my data:
>
>> summary(AffyIn)
> lineA.1 lineB.3
> Min. : 2.0 Min. : 2.0
> 1st Qu.: 18.0 1st Qu.: 18.0
> Median : 38.0 Median : 42.0
> Mean : 139.0 Mean : 143.4
> 3rd Qu.: 96.0 3rd Qu.: 105.0
> Max. :6974.0 ...... Max. :7417.0
>
>> dim(AffyIn)
> [1] 38483 7
>
> Following the vignette I have read in a simple phenotype txt file containing
> seven rows which relate to the 7 lines with two phenotypes R and S
>
>> dim(AffyPheno)
> [1] 7 1
>
>> summary(AffyPheno)
> Pheno
> R:4
> S:3
>
>> all(rownames(AffyPheno) == colnames(AffyIn))
> [1] TRUE
>
>
> #However, it is after this that I start having some problems; as I am using
> my own data I have modified some of the exercises in the vignette.
>
>> AffyPheno[c(3,7),c("Pheno")]
> [1] R S
> Levels: R S
>
> # I was expecting something like the following to be returned:
> Pheno
> lineA.3 R
> LineB.7 S
You shouldn't expect that. You might want to peruse 'An Introduction to
R', which I believe covers this point. Because you selected a single
column, the result is dropped from a data.frame to a plain vector (here a
factor), so the row names are lost. You can override that by using
AffyPheno[c(3,7),c("Pheno"), drop=FALSE]
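A minimal sketch of the difference, using a made-up stand-in for AffyPheno
(the sample names below are assumptions, not your actual data):

```r
# Hypothetical stand-in for AffyPheno: 7 samples, one factor column.
AffyPheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = c(paste("lineA", 1:4, sep = "."), paste("lineB", 1:3, sep = "."))
)

# Default behaviour: a single selected column is dropped to a factor,
# so the row names disappear.
v <- AffyPheno[c(3, 7), "Pheno"]
is.data.frame(v)   # FALSE

# drop = FALSE keeps the data.frame structure, including the row names.
d <- AffyPheno[c(3, 7), "Pheno", drop = FALSE]
is.data.frame(d)   # TRUE
rownames(d)        # "lineA.3" "lineB.3"
```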
>
> #Also when I try the following command I get this error:
>> AffyPheno[AffyPheno$Pheno == "R"]
>
> Error in `[.data.frame`(AffyPheno, AffyPheno$Pheno == "R") :
> undefined columns selected
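The error is supposed to be helpful here. With a single index inside the
brackets, a data.frame treats that index as a column selector, so your
seven-element logical vector is asking for columns that don't exist. To
select rows you need the comma. The correct incantation looks like this: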
AffyPheno[AffyPheno$Pheno == "R", ]
if you want all columns. This again is something that 'An Introduction
to R' will help with.
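Here is a small sketch of both forms, again with a made-up stand-in for
AffyPheno (the sample names are assumptions). One caveat it illustrates:
because this data.frame has only one column, the row selection also drops
to a factor unless you add drop=FALSE.

```r
# Hypothetical stand-in for AffyPheno: 7 samples, one factor column.
AffyPheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = c(paste("lineA", 1:4, sep = "."), paste("lineB", 1:3, sep = "."))
)

# Single index: interpreted as a column selection, which fails here.
res <- try(AffyPheno[AffyPheno$Pheno == "R"], silent = TRUE)
inherits(res, "try-error")   # TRUE

# With the comma, the logical vector selects rows. drop = FALSE keeps the
# result a data.frame even though there is only one column.
resistant <- AffyPheno[AffyPheno$Pheno == "R", , drop = FALSE]
nrow(resistant)              # 4
```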
>
> #My R programming knowledge is basic at best so I assumed there was
> something wrong there and continued with the metadata and phenoData
>
>> metadata = data.frame(labelDescrition = c("Status"),rownames=c("Phenotype"))
>> metadata
> labelDescrition rownames
> 1 Status Phenotype
>
>> phenoData=new("AnnotatedDataFrame", data = AffyPheno, varMetadata = metadata)
>> phenoData
> An object of class "AnnotatedDataFrame"
> rowNames: line6.1, line6.2, ..., line7.4 (7 total)
> varLabels and varMetadata description:
> Pheno: NA
> additional varMetadata: rownames, labelDescription
>
>
> # As you can see no error was thrown, but I was expecting something in the
> varLabels and varMetadata descriptions...
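I'd have to check to be sure, but I believe what you want for your
metadata is to explain what the 'Pheno' column contains. The row names of
the metadata should match the column names of AffyPheno, and the
description goes in a column called labelDescription (note the spelling;
yours is missing a 'p'). Also, rownames= inside data.frame() just creates
a column called 'rownames'; the argument that sets the row names is
row.names. So something like
metadata = data.frame(labelDescription = c("Phenotype"), row.names = "Pheno")
is, IIRC, correct. I'm actually surprised you didn't get an error. Martin
Morgan may respond as well, and he knows better than, well, everybody
about the ExpressionSet class so he will know for sure.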
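A rough sketch of how the pieces fit together, assuming a made-up stand-in
for AffyPheno (the sample names and the description text are my
assumptions). The Biobase part only runs if the package is installed:

```r
# Hypothetical stand-in for AffyPheno: 7 samples, one factor column.
AffyPheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = c(paste("lineA", 1:4, sep = "."), paste("lineB", 1:3, sep = "."))
)

# One row of metadata per column of AffyPheno. row.names (not rownames)
# sets the row names; labelDescription holds the description.
metadata <- data.frame(
  labelDescription = "Phenotype (R = resistant, S = susceptible)",
  row.names = "Pheno"
)

# The metadata rows should line up with the phenotype columns.
stopifnot(identical(rownames(metadata), colnames(AffyPheno)))

# Build the AnnotatedDataFrame only when Biobase is available.
if (requireNamespace("Biobase", quietly = TRUE)) {
  suppressPackageStartupMessages(library(Biobase))
  phenoData <- new("AnnotatedDataFrame",
                   data = AffyPheno, varMetadata = metadata)
  print(varMetadata(phenoData))
}
```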
>
> So I thought it was best to check the list to see if anyone could point out
> any mistakes I've made before I continue.
>
> While I was here I was also wondering if anyone knew of anything in the
> annotation package like the hgu95av2 chip for annotating chicken affy data
> in the annotation library?
Um, what? Not sure what you want here. The hgu95av2 chip is designed for
analyzing human samples, so there is nothing in there for chickens. If
you have chicken affy data, then you might want to look at the chicken
annotation package, which _does_ annotate that chip.
Best,
Jim
>
> Thanks in advance,
>
> Sean MacEachern
>
> R version 2.6.0 (2007-10-03)
> i386-apple-darwin8.10.1
> Biobase_1.16.3
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623