[BioC] How to display or output expression data with non-unique feature labels?

Kocherginsky, Masha [BSD] - HSD mkocherg at health.bsd.uchicago.edu
Fri Jun 29 21:18:34 CEST 2007


Hello,

I'm learning (slowly) the way eSet and ExpressionSet classes work.  What
I want to do is probably quite simple, but I can't figure out the
following: how do I either display (onscreen) or write into a file the
expression levels of an ExpressionSet object AND include some or all
featureData?  For example, in the ExpressionSet generated below I have
g1-g5 replicated 4 times, and pos1-pos2 is replicated 10 times:

set.seed(31415)
exprs=data.frame(matrix(rnorm(100),nrow=20, ncol=5))
colnames(exprs)=letters[1:5]
pData=data.frame(matrix(rbinom(15,1,.5), nrow=5, byrow=T))
rownames(pData)=letters[1:5]
colnames(pData)=c("sex","agegrp","location")
phenoData=new("AnnotatedDataFrame", data=pData, 
  varMetadata=c("Gender","Age Group", "Location"))

fNames=rep(c("g1","g2","g3","g4","g5"),4)
fLoc=sort(rep(c("pos1","pos2"),10))
fData=data.frame(name=fNames, location=fLoc)
fDatalbl=data.frame(labelDescription=c("Name of Gene", "Location of

  Gene"), row.names=colnames(fData))
featureData=new("AnnotatedDataFrame", data=fData, varMetadata=fDatalbl)
data=new("ExpressionSet", exprs=exprs, phenoData=phenoData, 
  featureData=featureData)

> data
ExpressionSet (storageMode: lockedEnvironment)
assayData: 20 features, 5 samples 
  element names: exprs 
phenoData
  rowNames: a, b, ..., e (5 total)
  varLabels and varMetadata:
    sex: NA
    agegrp: NA
    location: NA
featureData
  rowNames: 1, 2, ..., 20 (20 total)
  varLabels and varMetadata:
    name: Name of Gene
    location: Location of       
      Gene
experimentData: use 'experimentData(object)'
Annotation character(0)


I'd like to get something like this using some accessor function (is
there one?):

  fNames fLoc        a        b        c        d        e
1     g1 pos1  1.64701  0.87221 -2.19476 -0.46615  1.01645
2     g2 pos1 -1.11199  1.05936  1.28965 -0.61097 -0.41773
3     g3 pos1 -1.54855  0.48339 -0.46200  1.17314 -0.09009
4     g4 pos1 -0.74838 -1.46186  0.35942 -0.87757 -1.33541
5     g5 pos1  1.52641 -0.39533  0.07781  1.26703  0.71262
6     g1 pos1 -1.13793  0.36251  0.45685  0.19561  1.16690

The data set I'm working with comes from a custom array with 3
replicates for each gene, and thus the "feature" names are not unique.
Am I setting up the ExpressionData object correctly? Is there a better
way than to incorporate non-unique gene names in exprs than to use gene
names via featureData? 

Thanks for any advice!
Masha
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Masha Kocherginsky, Ph.D.
Research Associate (Assistant Professor)
Department of Health Studies
University of Chicago
5841 S. Maryland Ave.
Chicago, IL  60637


This email is intended only for the use of the individual or...{{dropped}}



More information about the Bioconductor mailing list