[BioC] Pipeline for listing phenoData in exprSet
Martin Morgan
mtmorgan at fhcrc.org
Tue Jun 19 16:30:24 CEST 2007
Sergii --
Are you trying to use phenotypic data to subset your expression set?
If so...
By way of a reproducible example, here's an expression set we all have
access to:
> library(Biobase)
> data(sample.ExpressionSet)
> obj <- sample.ExpressionSet
pData(obj) is a data frame. It sounds like you want to use column
names that are stored in a character vector. In this case, use '[['
rather than '$' to access columns of the data frame, because '[['
evaluates its argument while '$' takes the name literally.
> df <- pData(obj)
> nm <- names(df)
> okNms <- nm[!nm %in% "type"]
> okVals <- df[[ okNms[2] ]] > 0.9
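The difference between '$' and '[[' can be seen in a small sketch. The data frame below is a made-up stand-in for pData(obj); its column names and values are invented for illustration only.

```r
# Hypothetical stand-in for pData(obj); columns and values are made up
df <- data.frame(sex = c("F", "M", "M"),
                 type = c("Control", "Case", "Case"),
                 score = c(0.75, 0.95, 0.98),
                 stringsAsFactors = FALSE)
nm <- names(df)
okNms <- nm[!nm %in% "type"]    # the names "sex" and "score"

# '$' takes its argument literally: this looks for a column named "okNms"
viaDollar <- df$okNms           # no such column, so NULL

# '[[' evaluates its argument, so the name stored in okNms[2] is used
viaBracket <- df[[ okNms[2] ]]  # the 'score' column
okVals <- viaBracket > 0.9      # logical vector, one entry per sample
```

The same reasoning is why paste("eset", name, sep="$") does not help: it only builds a character string, it does not look anything up.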
Use '[' to subset the samples present in the expression set based on
their phenotypic values:
> obj1 <- obj[,okVals]
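For the pipeline itself, one possible sketch is to loop over the label names and, for each label, over pairs of its values. This is only an outline under assumptions: 'expr' and 'pd' are made-up stand-ins for exprs(eset) and pData(eset), and the use of combn() to form the pairs is my choice, not something taken from your code.

```r
# Hypothetical stand-ins: 'pd' plays the role of pData(eset),
# 'expr' the role of exprs(eset); all values are made up
set.seed(1)
pd <- data.frame(sample = paste0("s", 1:6),
                 agent = rep(c("a", "b", "c"), each = 2),
                 dose = rep(c("lo", "hi"), times = 3),
                 stringsAsFactors = FALSE)
expr <- matrix(rnorm(4 * 6), nrow = 4,
               dimnames = list(NULL, pd$sample))

skip <- c("sample", "description")
labels <- setdiff(names(pd), skip)      # labels to evaluate

subsets <- list()
for (lab in labels) {
    vals <- as.character(pd[[lab]])     # '[[' with the name held in 'lab'
    lv <- unique(vals)
    if (length(lv) < 2) next            # nothing to compare
    prs <- combn(lv, 2)                 # each column is one pair of values
    for (p in seq_len(ncol(prs))) {
        keep <- vals %in% prs[, p]      # samples carrying either value
        key <- paste(lab, prs[1, p], prs[2, p], sep = "_")
        subsets[[key]] <- expr[, keep, drop = FALSE]
    }
}
names(subsets)
```

With a real expression set the subsetting step would be eset[, keep] instead of expr[, keep, drop = FALSE], and the model fitting would go inside the inner loop.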
Use ';' to wink in text messages.
Is this (other than ;) helpful?
Martin
Sergii Ivakhno <si2 at sanger.ac.uk> writes:
> Hello all,
> I was wondering if you could give me some suggestions for the
> following problem:
> I would like to build a pipeline which opens exprSet files
> (imported from GEO) one after another, and extracts and evaluates the
> phenoData labels (except the fields "sample" and "description").
> The program is below. The basic problem is that names(pData(eset))
> returns a character vector, so you cannot use, say,
> "phenonames <- names(pData(eset)); eset$phenonames[2]" or
> paste("eset", phenonames[2], sep="$") (remember, I need the vector in
> the first place to remove the phenolabels "sample" and "description").
>
> Thanks a lot for advice!!
> Best,
> Sergii
>
> Wellcome Trust Genome Campus
> Hinxton, Cambridge, CB10 1SA, UK
>
> dfg <-c("sample", "description");
> files <- dir(getwd(),".RData")
> for (k in 1:length(files)){
> load(files[k]);
> pdateset <- names(pData(eset));
> labels <- pdateset[-which(pdateset %in% dfg)];
> for (m in 1:length(labels)){
> teamp2 <- unique(paste("eset" ,labels[m],sep="$"));
>
> teamp<-as.vector(teamp2);
> for (i in 1:length(teamp)){
> for (j in 2:length(teamp)){
> if (i != j){
> teamp1 <- paste(teamp[i] ,teamp[j],sep="_")
> teamp1 <- paste(teamp1 ,files[k],sep="_")
> temp <- ( as.character(eset$agent) ==
> teamp[i])|(as.character(eset$agent) == teamp[j]);
> tempeset <-eset[,temp];
> design <- model.matrix(~factor(tempeset$agent));
> fit <- lmFit(tempeset, design);
> ebayes <- eBayes(fit);
> sortebays <- sort.int(ebayes$t[,2], decreasing = TRUE, index.return = TRUE);
> sortebays1 <- ebayes[sortebays$ix,];
> save(sortebays1, file = paste(teamp1,c(".RData"),sep=""));
> rm (sortebays,ebayes,temp,teamp1,tempeset,design,sortebays1,pdatesetlabels);
> }
> }
> }
> }
> }
--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org