[BioC] finding and averaging replicate gene records
Oosting, J. (PATH)
J.Oosting at lumc.nl
Wed Mar 16 09:33:38 CET 2005
I'm not entirely sure this will work in its current form. I've adapted it from a routine I use to do this with expression sets, so some typecasting or conversion to the proper classes may be needed. Your data is assumed to be in the dataf variable.
# Average the rows of ex that share a unigene id; drop=FALSE keeps a single row two-dimensional, so colMeans also handles it
mean.row<-function(rows) colMeans(ex[rows,,drop=FALSE],na.rm=TRUE)
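# Select vector of unigene ids that are in data and have correct (non-empty) mapping
# (a data frame column carries no names, so the rownames are attached explicitly)
geneIds<-as.character(dataf[,2])
names(geneIds)<-rownames(dataf)
geneIds<-geneIds[!is.na(geneIds) & geneIds!=""]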
# subset the expression values
ex<-dataf[,c(-1,-2)]
# make a list that contains combined rownames for each unigene id
newrows<-split(names(geneIds),geneIds)
# the t() is needed because sapply returns one column per unigene id, while we want one row per id
exn<-t(sapply(newrows,mean.row))
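# Put the unigene ids in the result; data.frame() keeps the expression values numeric, whereas cbind() would turn them into character
data.frame(unigene=names(newrows),exn) # exn also carries the ids as rownames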
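For what it's worth, here is a small self-contained sketch of the same split/sapply idea on made-up data, so the steps can be tried without the real data frame. The toy values, the column names (acc, unigene, cond1, cond2) and the result name avg are only assumptions for illustration, not anything from your data.

```r
# Toy data frame laid out as in the question: accession, unigene id, then conditions.
# All values and column names are made up for illustration.
dataf <- data.frame(
  acc     = c("A1", "A2", "A3", "A4", "A5"),
  unigene = c("Hs.1", "Hs.2", "Hs.1", "", "Hs.2"),
  cond1   = c(1, 10, 3, 7, NA),
  cond2   = c(2, 20, 4, 8, 40),
  stringsAsFactors = FALSE
)

# Unigene ids named by row, dropping rows without a mapping
geneIds <- as.character(dataf[, 2])
names(geneIds) <- rownames(dataf)
geneIds <- geneIds[!is.na(geneIds) & geneIds != ""]

# Expression values only
ex <- dataf[, c(-1, -2)]

# Row names grouped per unigene id
newrows <- split(names(geneIds), geneIds)

# Column means over each group of rows, ignoring missing values
exn <- t(sapply(newrows, function(rows) colMeans(ex[rows, , drop = FALSE], na.rm = TRUE)))

# One record per unigene id, expression values kept numeric
avg <- data.frame(unigene = rownames(exn), exn)
print(avg)
```

With these toy values the two Hs.1 rows average to 2 and 3, and the two Hs.2 rows to 10 and 30 (the NA in cond1 is skipped); the row without a unigene id is left out.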
Jan Oosting
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of zhihua li
> Sent: Wednesday, 16 March 2005 08:33
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] finding and averaging replicate gene records
>
>
> Hi netter!
>
> In most microarray slides a single gene will be represented by multiple
> items. Sometimes it's unforeseeable because they have different GenBank
> accession numbers and you will not find them until you get a unigene list
> for all your gene items.
>
> Now I have a data frame. The rows are gene records (accession number,
> unigene ID and expression values in different conditions); the 1st column
> is GenBank accession numbers, the 2nd column is unigene IDs, and from the
> 3rd column on are the different conditions. All the accession numbers are
> unique, but through unigene IDs I can find that some items, though with
> different accession numbers, are in fact sharing the same unigene ID. I
> would like to find the gene records containing replicate unigene IDs and
> merge them into one record by averaging the expression values in the
> same condition.
>
> Could anyone give me a clue about how to write the code? Or are there any
> contributed functions that can do this stuff?
>
> Thanks a lot!