[BioC] Pipeline for listing phenoData in exprSet

Martin Morgan mtmorgan at fhcrc.org
Tue Jun 19 16:30:24 CEST 2007


Sergii --

Are you trying to use phenotypic data to subset your expression set?
If so...

By way of a reproducible example, here's an expression set we all have
access to:

> library(Biobase)
> data(sample.ExpressionSet)
> obj <- sample.ExpressionSet

pData(obj) is a data frame.  It sounds like you want to select columns
by names held in a character vector. '$' treats what follows it as a
literal column name, so use '[[' instead to access columns of the data
frame by a name stored in a variable.

> df <- pData(obj)
> nm <- names(df)
> okNms <- nm[!nm %in% "type"]
> okVals <- df[[ okNms[2] ]] > 0.9
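To see why '[[' is needed when the column name lives in a variable, here is a minimal sketch in base R. The data frame, its column names and its values are made up for illustration; they only stand in for pData(obj).

```r
# Hypothetical data frame standing in for pData(obj)
df <- data.frame(sex = c("F", "M", "M"),
                 score = c(0.95, 0.40, 0.99))

nm <- names(df)
col <- nm[2]                 # the string "score", held in a variable

# '$' looks for a column literally called "col", which does not exist
viaDollar <- df$col

# '[[' evaluates its argument first, so it finds the "score" column
viaBrackets <- df[[col]]

is.null(viaDollar)           # TRUE
viaBrackets > 0.9            # logical vector, one entry per row
```

The same reasoning explains why pasting "eset" and a label together with sep="$" cannot work: the result is just a character string, not a reference to the column.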

Use '[' to subset the samples present in the expression set based on
their phenotypic values:

> obj1 <- obj[,okVals]
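Putting the two steps together, here is a rough sketch of looping over phenotype columns and subsetting the samples for each pair of values, which seems to be what the pipeline is after. The choice of "type" as the column to skip and the decision to ignore numeric columns are assumptions made for this example, not part of the original pipeline.

```r
library(Biobase)
data(sample.ExpressionSet)
obj <- sample.ExpressionSet

skip <- c("type")                         # columns to ignore (illustrative)
labels <- setdiff(names(pData(obj)), skip)

for (lbl in labels) {
    vals <- pData(obj)[[lbl]]             # column chosen by name in a variable
    if (is.numeric(vals)) next            # assumption: only compare categorical labels
    lv <- unique(as.character(vals))
    if (length(lv) < 2) next              # nothing to compare
    # every unordered pair of distinct values in this column
    for (p in combn(lv, 2, simplify = FALSE)) {
        keep <- as.character(vals) %in% p
        sub <- obj[, keep]                # samples carrying either value
        cat(lbl, ":", p[1], "vs", p[2], "->", ncol(sub), "samples\n")
    }
}
```

Each 'sub' is itself an ExpressionSet, so it could be handed on to whatever analysis follows.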

Use ';' to wink in text messages.

Is this (other than ;) helpful?

Martin

Sergii Ivakhno <si2 at sanger.ac.uk> writes:

> hello All,
> I was wondering if you could possibly give me some suggestions for the 
> following problem:
> I would like to build a pipeline which consecutively opens exprSet files 
> (imported from GEO) and extracts and evaluates the phenoData labels 
> (except the fields "sample" and "description").
> The program is below. The basic problem is that names(pData(eset)) returns
> a character vector, so after "phenonames <- names(pData(eset))" you cannot
> use, say, "eset$phenonames[2]" or "paste("eset", phenonames[2], sep="$")"
> (remember that I need the vector in the first place to remove the
> phenolabels "sample" and "description").
>
> Thanks a lot for advice!!
> Best,
> Sergii
>
>
> Wellcome Trust Genome Campus
> Hinxton, Cambridge, CB10 1SA, UK
>
>
>
>
> dfg <-c("sample", "description");
> files <-  dir(getwd(),".RData")
> for (k in 1:length(files)){
> load(files[k]);
> pdateset <- names(pData(eset));
> labels <- pdateset[-which(pdateset %in% dfg)];
> for (m in 1:length(labels)){
> teamp2 <- unique(paste("eset"    ,labels[m],sep="$"));
>
> teamp<-as.vector(teamp2);
> for (i in 1:length(teamp)){
> for (j in 2:length(teamp)){
> if (i != j){
> teamp1 <-  paste(teamp[i] ,teamp[j],sep="_")
> teamp1 <-  paste(teamp1 ,files[k],sep="_")
> temp <- ( as.character(eset$agent) == 
> teamp[i])|(as.character(eset$agent) == teamp[j]);
> tempeset <-eset[,temp];
> design <- model.matrix(~factor(tempeset$agent));
> fit <- lmFit(tempeset, design);
> ebayes <- eBayes(fit);
> sortebays <- sort.int(ebayes$t[,2], decreasing = TRUE, index.return = TRUE);
> sortebays1 <- ebayes[sortebays$ix,];
> save(sortebays1, file = paste(teamp1,c(".RData"),sep=""));
> rm (sortebays,ebayes,temp,teamp1,tempeset,design,sortebays1,pdatesetlabels);
> }      
> }
> }
> }
> }
>
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org


