[BioC] sort the difference and save to individual files problem
Adaikalavan Ramasamy
ramasamy at cancer.org.uk
Fri Jul 30 15:42:47 CEST 2004
Put everything in a matrix and then use the apply() family to find the
genes with the largest differences in each column. You need to add one
more line to my function, just before the return(results):
rownames(results) <- rownames(m)
so that your output will have row names. Then something like this would work.
pairwise.difference <- function(m){
  npairs  <- choose( ncol(m), 2 )
  results <- matrix( NA, ncol=npairs, nrow=nrow(m) )
  cnames  <- rep(NA, npairs)
  # Give the columns default names if the input has none
  if( is.null(colnames(m)) ) colnames(m) <- paste("col", 1:ncol(m), sep="")

  k <- 1
  for(i in 1:ncol(m)){
    for(j in 1:ncol(m)){
      if(j <= i) next    # skip self-comparisons and mirrored pairs
      results[ ,k] <- m[ ,i] - m[ ,j]
      cnames[k]    <- paste(colnames(m)[ c(i, j) ], collapse=".vs.")
      k <- k + 1
    }
  }

  colnames(results) <- cnames
  rownames(results) <- rownames(m)   # keep the gene names
  return(results)
}
# Example using a matrix with 5 genes (rows) and 4 columns
mat <- matrix( sample(1:20), nc=4 )
colnames(mat) <- LETTERS[1:4]
rownames(mat) <- paste( "g", 1:5, sep="")
mat
    A  B  C  D
g1 10 16  3 15
g2 18  5 12 19
g3  7  4  8 13
g4 14  2  6 11
g5 17  1 20  9
(out <- pairwise.difference(mat))
   A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
g1     -6      7     -5     13      1    -12
g2     13      6     -1     -7    -14     -7
g3      3     -1     -6     -4     -9     -5
g4     12      8      3     -4     -9     -5
g5     16     -3      8    -19     -8     11
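As an aside, the same table can probably be built without the nested loops. The sketch below is an alternative, not the function above: it uses combn() to list the column pairs and subtracts whole blocks of columns at once. The names idx and out2 are made up for illustration, and the matrix is typed in by hand so that it matches the example.

```r
# Same example matrix as above, entered by hand so the result is reproducible
mat <- matrix(c(10, 18,  7, 14, 17,
                16,  5,  4,  2,  1,
                 3, 12,  8,  6, 20,
                15, 19, 13, 11,  9),
              ncol = 4,
              dimnames = list(paste("g", 1:5, sep = ""), LETTERS[1:4]))

# All column pairs with i < j; each column of 'idx' holds one pair
idx <- combn(ncol(mat), 2)

# First member of each pair minus the second, for all pairs at once
out2 <- mat[, idx[1, ], drop = FALSE] - mat[, idx[2, ], drop = FALSE]
colnames(out2) <- paste(colnames(mat)[idx[1, ]],
                        colnames(mat)[idx[2, ]], sep = ".vs.")
out2
```

combn() returns the pairs in the same order as the two loops (1-2, 1-3, 1-4, 2-3, ...), so the columns should line up with the output shown above.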
# Now show the 3 genes with largest absolute value in each column
apply(abs(out), 2, function(x) names(x[order(-x)]) [ 1:3 ])
     A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
[1,] "g5"   "g4"   "g5"   "g5"   "g2"   "g1"
[2,] "g2"   "g1"   "g3"   "g1"   "g3"   "g5"
[3,] "g4"   "g2"   "g1"   "g2"   "g4"   "g2"
This says that g5 had the largest absolute difference between A and B,
followed by g2, and so on. If you want the whole list, remove the [ 1:3 ]
part from the code above.
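If you also want the signed differences next to the gene names, one comparison at a time, something along these lines might do. It is only a sketch: ranked.column() is a helper name invented here, the matrix out is retyped by hand from the output above, and the files go to tempdir() purely as a placeholder location.

```r
# The pairwise differences from the example, entered by hand
out <- matrix(c( -6,  13,  3, 12,  16,
                  7,   6, -1,  8,  -3,
                 -5,  -1, -6,  3,   8,
                 13,  -7, -4, -4, -19,
                  1, -14, -9, -9,  -8,
                -12,  -7, -5, -5,  11),
              ncol = 6,
              dimnames = list(paste("g", 1:5, sep = ""),
                              c("A.vs.B", "A.vs.C", "A.vs.D",
                                "B.vs.C", "B.vs.D", "C.vs.D")))

# Hypothetical helper: genes of one comparison, largest absolute difference first
ranked.column <- function(out, comparison){
  x   <- out[ , comparison]
  ord <- order(-abs(x))
  data.frame(gene = names(x)[ord], difference = unname(x[ord]),
             stringsAsFactors = FALSE)
}

ranked.column(out, "A.vs.B")

# Optional: one tab-delimited file per comparison (placeholder directory)
outdir <- tempdir()
for(cn in colnames(out)){
  write.table(ranked.column(out, cn),
              file = file.path(outdir, paste(cn, "txt", sep = ".")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}
```

For A.vs.B this lists g5, g2, g4, g1, g3 with differences 16, 13, 12, -6, 3, which agrees with the top three shown above.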
Viewing this output is easier than viewing 100 files and lets you see
the genes that are picked up most frequently.
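To count how often each gene is picked up, table() on the top-3 matrix is probably enough. A small sketch, with out retyped by hand from the example and top3 and freq as made-up names:

```r
# The pairwise differences from the example, entered by hand
out <- matrix(c( -6,  13,  3, 12,  16,
                  7,   6, -1,  8,  -3,
                 -5,  -1, -6,  3,   8,
                 13,  -7, -4, -4, -19,
                  1, -14, -9, -9,  -8,
                -12,  -7, -5, -5,  11),
              ncol = 6,
              dimnames = list(paste("g", 1:5, sep = ""),
                              c("A.vs.B", "A.vs.C", "A.vs.D",
                                "B.vs.C", "B.vs.D", "C.vs.D")))

# Top 3 genes per comparison, as in the apply() call above
top3 <- apply(abs(out), 2, function(x) names(x[order(-x)]) [ 1:3 ])

# Number of comparisons in which each gene makes the top 3
freq <- table(top3)
sort(freq, decreasing = TRUE)
```

With the example data, g2 appears in 5 of the 6 comparisons, g1 and g5 in 4 each, g4 in 3 and g3 in 2.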
On Fri, 2004-07-30 at 13:24, Dr_Gyorffy_Balazs wrote:
> Dear Adaikalavan,
>
> thank you for the help!
>
> However, this way I have one big table with all the data in
> it. The problem is that I also have the gene names (in the
> first column of the initial table), and I would like to
> have not only the difference but also the ranked difference
> with the gene names. So at the end I would know which gene
> had the biggest difference (or the smallest). I was
> thinking of saving to different files in order to keep the
> gene names.
>
> (You are right, I really don't need 100 columns. It seemed
> simpler to me to construct the function to get 100
> results instead of correcting for symmetrical and
> self-tests. :-))
>
> Balazs