[BioC] colnames and get means for the columns with the "same" names
Weiwei Shi
helprhelp at gmail.com
Mon Nov 6 23:40:16 CET 2006
hi,
I played around with these two functions (merge() and aggregate()) but
did not get what I wanted. So I wrote a function that uses a loop
instead, and it runs in a reasonable time:
> system.time(t3 <- iconix.convert(processed, 9, 7486, probes2llid.genego[,c(2,5)]))
[1] 12.356 4.494 16.836 0.000 0.000
> dim(t3)
[1] 129 4255
I am more interested in approaches other than "averaging". I will
look through the archive, since this is a very common problem in
microarray analysis.
I post my function here in case someone needs it in the future.
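iconix.convert <- function(orig, st=9, ed=7486, c.table){
  # orig: data frame; columns st..ed hold the probe-level values
  # c.table: two columns, probe ID (col 1) and target ID (col 2)
  t1 <- orig[, st:ed]
  # treat missing values as 0 (note: this pulls the averages toward 0)
  t1 <- sapply(t1, function(x){ x[is.na(x)] <- 0; x })
  x0 <- unique(c.table[,2])
  out <- matrix(0, dim(t1)[1], length(x0))
  j <- 1
  for (i in x0){
    # as.character() so a factor column indexes by name, not by level code
    avg.col <- as.character(c.table[c.table[,2]==i, 1])
    if (length(avg.col) > 1){ # has 1:multiple ids
      t2 <- rowMeans(t1[, avg.col])
    }
    else{
      t2 <- t1[, avg.col]
    }
    out[,j] <- t2
    j <- j + 1
  }
  out <- as.data.frame(out)
  colnames(out) <- x0
  # reattach the annotation columns before st and after ed
  out2 <- cbind(orig[, c(1:(st-1))], out, orig[,c((ed+1):dim(orig)[2])])
  colnames(out2)[dim(out2)[2]] <- "Group"
  out2
}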
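
For comparison, here is a minimal sketch of the merge() plus aggregate() route suggested in the quoted reply below. It is only an illustration on toy data, not a tested replacement for the function above: the data.frame name x1 and the conversion table come from my original question, while the sample values, the row names s1..s3 and the variable names conv, probes, merged, agg and x2 are made up for the example.

```r
# Toy data (hypothetical values): 3 samples in rows, 5 probe columns
x1 <- data.frame(
  AF106325_PROBE1  = c(1, 2, 3),
  NM_019386_PROBE1 = c(3, 4, 5),
  NM_012907_PROBE1 = c(10, 20, 30),
  AW917796_PROBE1  = c(7, 8, 9),
  L27651_PROBE1    = c(0, 1, 2)
)
rownames(x1) <- c("s1", "s2", "s3")

# Conversion table as shown in the original question
conv <- data.frame(
  Probe_ID   = c("AF106325_PROBE1", "NM_019386_PROBE1", "NM_012907_PROBE1",
                 "AW917796_PROBE1", "L27651_PROBE1"),
  HUMAN_LLID = c(7052, 7052, 339, 84196, 10864),
  stringsAsFactors = FALSE
)

# Put probes in rows so they can be merged with the conversion table
probes <- data.frame(Probe_ID = colnames(x1), t(as.matrix(x1)),
                     stringsAsFactors = FALSE)
merged <- merge(conv, probes, by = "Probe_ID")

# Average all probes that share a HUMAN_LLID, one column per sample
sample.cols <- setdiff(colnames(merged), c("Probe_ID", "HUMAN_LLID"))
agg <- aggregate(merged[, sample.cols, drop = FALSE],
                 by = list(HUMAN_LLID = merged$HUMAN_LLID), FUN = mean)

# Back to samples in rows, one column per HUMAN_LLID
x2 <- as.data.frame(t(as.matrix(agg[, sample.cols, drop = FALSE])))
colnames(x2) <- as.character(agg$HUMAN_LLID)
```

Here the two probes mapping to 7052 collapse into one column holding their row-wise mean, and the other probes are carried over unchanged under their HUMAN_LLID. Replacing FUN = mean with, say, median would be one way to try something other than plain averaging.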
On 11/6/06, Davis, Sean (NIH/NCI) [E] <sdavis2 at mail.nih.gov> wrote:
> Hi, Weiwei.
>
> You probably want to look at a combination of merge() to combine your data with your conversion table followed by aggregate(). Read up on the help for those two functions and that should do it, if I understand what you want to do. However, keep in mind that "averaging" the probesets representing the same gene may not represent the best solution. Also, if you search the archive a bit, I know this question has come up before.
>
> Sean
>
>
>
> -----Original Message-----
> From: Weiwei Shi [mailto:helprhelp at gmail.com]
> Sent: Mon 11/6/2006 4:53 PM
> To: r-help
> Cc: bioconductor
> Subject: [BioC] colnames and get means for the columns with the "same" names
>
> hi,
> I have a conversion table for colnames like this:
>           Probe_ID HUMAN_LLID
> 1  AF106325_PROBE1       7052
> 2 NM_019386_PROBE1       7052
> 3 NM_012907_PROBE1        339
> 4  AW917796_PROBE1      84196
> 5    L27651_PROBE1      10864
>
> The Probe_ID column lists the colnames of another data.frame, say x1.
> I need to convert these colnames to another ID system, HUMAN_LLID, by
> using the table. Columns of x1 that map to the same HUMAN_LLID need to
> be averaged. Is there a good way to do it?
>
> I also put this question in bioconductor since I believe it might be
> solved by some package.
>
> thanks.
>
> --
> Weiwei Shi, Ph.D
> Research Scientist
> GeneGO, Inc.
>
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III