[BioC] Extracting rows where any column has a non-zero value
Sean Davis
sdavis2 at mail.nih.gov
Fri Dec 5 14:04:18 CET 2008
On Fri, Dec 5, 2008 at 12:28 AM, Nathan S. Watson-Haigh
<nhaigh at sheffield.ac.uk> wrote:
> I have what I think should be really obvious, but seem to keep banging my
> head against a wall....ouch!
>
> I want generate a heatplot for a set of genes from a contrast used in my
> limma analysis. However, I want to limit the heatplot to only those arrays
> used in the contrast. So I thought I'd be able to use the contrast and
> design matrix to do just that. However, I'm having problems.....
>
>
> My design matrix:
>> design
> H.N.N L.N.N H.Y.N L.Y.N H.Y.Y L.Y.Y
> 1 1 0 0 0 0 0
> 2 1 0 0 0 0 0
> 3 1 0 0 0 0 0
> 4 0 1 0 0 0 0
> 5 0 1 0 0 0 0
> 6 0 1 0 0 0 0
> 7 0 0 1 0 0 0
> 8 0 0 1 0 0 0
> 9 0 0 1 0 0 0
> 10 0 0 0 1 0 0
> 11 0 0 0 1 0 0
> 12 0 0 0 1 0 0
> 13 0 0 0 0 1 0
> 14 0 0 0 0 1 0
> 15 0 0 0 0 1 0
> 16 0 0 0 0 0 1
> 17 0 0 0 0 0 1
> 18 0 0 0 0 0 1
> attr(,"assign")
> [1] 1 1 1 1 1 1
> attr(,"contrasts")
> attr(,"contrasts")$TG
> [1] "contr.treatment"
>
>
> My contrast matrix:
>> cont.matrix
> Contrasts
> Levels Helin_Acclimation Helin_Freezing Leitrim_Acclimation
> Leitrim_Freezing
> H.N.N -1 0 0
> 0
> L.N.N 0 0 -1
> 0
> H.Y.N 1 -1 0
> 0
> L.Y.N 0 0 1
> -1
> H.Y.Y 0 1 0
> 0
> L.Y.Y 0 0 0
> 1
> Contrasts
> Levels Acclimation Freezing Diff_Acclimation Diff_Freezing Population
> H.N.N -1 0 1 0 -1
> L.N.N -1 0 -1 0 1
> H.Y.N 1 -1 -1 1 -1
> L.Y.N 1 -1 1 -1 1
> H.Y.Y 0 1 0 -1 -1
> L.Y.Y 0 1 0 1 1
>
>
> So I want to do this in a loop for all the contrasts I made:
> for (mycoef in colnames(cont.matrix)) {
> pval.max <- 0.05
> maxheatplot.row <- 100
>
> # get stats for this coef
> TT <- topTable(fit2, adjust.method="BH", coef=mycoef, n=nrow(fit2))
>
> # Just use those with an adj.p-value of < pval.max
> TT <- TT[which(TT$adj.P.Val < pval.max),]
>
> # make a heatplot for for the selected genes and include only those arrays
> used in the contrast
> if (nrow(TT) > 1) {
> heatplot.rows <- if(nrow(TT) > maxheatplot.row) { maxheatplot.row } else {
> nrow(TT) }
>
> # index of those genes to include in the heatplot
> index <- as.numeric(row.names(TT[c(1:heatplot.rows),]))
>
> # Use the Contrast and design matrix to identify the arrays used in the
> contast
> arrays <- which(design[,labels(which(cont.matrix[,mycoef] != 0))] !=0)
>
> png(file = paste("HP_", gsub(":", "_", mycoef), ".png", sep=""), type =
> "cairo1", width=10, height=10, units="in", res=300)
> heatplot(eSet[index,arrays], labCol=sampleNames(eSet)[arrays],
> labRow=TT$Symbol, margins=c(10,10), dend="row")
> dev.off()
> }
> }
>
>
> In the section dealing with creating the heatplot, the following code pulls
> out a subset of the design matrix:
>> design[,labels(which(cont.matrix[,mycoef] != 0))]
> H.N.N H.Y.N
> 1 1 0
> 2 1 0
> 3 1 0
> 4 0 0
> 5 0 0
> 6 0 0
> 7 0 1
> 8 0 1
> 9 0 1
> 10 0 0
> 11 0 0
> 12 0 0
> 13 0 0
> 14 0 0
> 15 0 0
> 16 0 0
> 17 0 0
> 18 0 0
>
> I now want to pull out the row names which have a non-zero value in any one
> of the columns......can anyone help??
> a <- matrix(c(1,0,0,0,0,1),nc=2)
> a
[,1] [,2]
[1,] 1 0
[2,] 0 0
[3,] 0 1
> which(rowSums(a)>0)
[1] 1 3
Does that do approximately what you want?
Sean
More information about the Bioconductor
mailing list