[Bioc-sig-seq] extract non-zero rows

Dario Strbenac D.Strbenac at garvan.org.au
Fri Aug 26 02:00:15 CEST 2011


Hi Estefania,

If you want every column to be non-zero, you can do

row.positive.counts <- apply(dup.data$counts, 1, function(a.row) sum(a.row > 0))
filtered <- dup.data[row.positive.counts == ncol(dup.data$counts), ]

The comparison a.row > 0 gives a logical vector for each row. Summing it counts how many columns are greater than zero, because TRUE counts as 1 and FALSE as 0. The rows with as many positive counts as there are columns in the counts matrix are then kept.
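
As a rough illustration of the same idea, here is a small sketch on a made-up count matrix (the values and the names counts, n.positive, keep and keep.total are only for illustration, not from your data). It uses rowSums() on the logical matrix, which should give the same per-row count as the apply() call, and it also shows why filtering on the row total does not remove rows that have a zero in only some columns.

```r
# Toy count matrix standing in for dup.data$counts (values are made up).
counts <- matrix(c(5, 0, 3,
                   2, 1, 4,
                   0, 0, 0,
                   7, 9, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("tag", 1:4), c("s1", "s2", "s3")))

# Per tag, the number of samples with at least one read.
# counts > 0 is a logical matrix; TRUE is summed as 1.
n.positive <- rowSums(counts > 0)

# Keep only the tags that have reads in every sample.
keep <- n.positive == ncol(counts)
filtered.counts <- counts[keep, , drop = FALSE]

# For comparison: a filter on the row total only drops tags that are
# zero in all samples, so tag1 (one zero column) still passes.
keep.total <- rowSums(counts) >= 1
```

With a DGEList, the same logical vector can be used as in the code above, i.e. dup.data[keep, ].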

To find unchanged genes, you might do 

unchanged <- dup.de.com$table[dup.de.com$table[, "logFC"] > -0.2 & dup.de.com$table[, "logFC"] < 0.2, ]

replacing 0.2 with the largest absolute log fold change you think an unchanged gene might have.
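
The two-sided condition can also be written with abs(). The sketch below is only an illustration: tab is a toy data frame holding the five logFC values shown in your output, and max.abs.logfc is a hypothetical name for the cutoff.

```r
# Toy table standing in for dup.de.com$table, using the five logFC
# values visible in the quoted output.
tab <- data.frame(
  logFC = c(0.26176050, 0.03020441, -0.12108619, 0.03036675, 0.14416487),
  row.names = c("Glyma13g11940.8", "Glyma13g11900.1", "Glyma09g24780.1",
                "Glyma13g12050.1", "Glyma13g12070.1")
)

# Hypothetical cutoff: largest absolute logFC still treated as unchanged.
max.abs.logfc <- 0.2

# abs(x) < cutoff is equivalent to x > -cutoff & x < cutoff.
unchanged <- tab[abs(tab$logFC) < max.abs.logfc, , drop = FALSE]
```

Here only Glyma13g11940.8 is dropped, since its logFC of about 0.26 is above the cutoff. Assuming logFC is on the log2 scale, as it normally is in edgeR, a cutoff of 0.2 corresponds to roughly a 1.15-fold change.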

---- Original message ----
>Date: Thu, 25 Aug 2011 11:39:03 -0300 (ART)
>From: bioc-sig-sequencing-bounces at r-project.org (on behalf of Estefania Mancini <estefania.mancini at indear.com>)
>Subject: [Bioc-sig-seq] extract non-zero rows  
>To: bioc-sig-sequencing at r-project.org
>
>Dear all
>I have properly loaded and analyzed four 454 datasets, corresponding to control and stress samples with their biological replicates.
>I would like to know if it is possible to filter, in my DGEList object:
>
>- which tags don't have a zero in any column,
>- which of these tags could be considered "housekeeping" (at least with logFC near 0)
> 
>The object  DGEList  looks like this:
>
>>dup.data
>An object of class "DGEList"
>$samples
>             group lib.size norm.factors
>A8_control control    77953            1
>A8_stress   stress   176860            1
>mq_control control    98109            1
>mq_stress   stress   145839            1
>pi_control control   132479            1
>pi_stress   stress   142484            1
>tj_control control    65827            1
>tj_stress   stress   144278            1
>
>I have tried to filter using the suggested function:
>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 0, ]
>or with
>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 1, ] 
>but neither changes anything. I have many rows with 0 or 1 reads in some column, which should be excluded.
>
>Also:
>dup.de.com
>An object of class "DGEExact"
>$table
>                  logConc       logFC   p.value
>Glyma13g11940.8 -2.588833  0.26176050 0.7348221
>Glyma13g11900.1 -2.875548  0.03020441 0.9688072
>Glyma09g24780.1 -3.501041 -0.12108619 0.8754371
>Glyma13g12050.1 -3.224648  0.03036675 0.9691009
>Glyma13g12070.1 -3.743064  0.14416487 0.8521188
>19860 more rows ...
>
>$comparison
>[1] "control" "stress" 
>$genes
>NULL
>
>Thanks in advance,
>Estefania
>
>_______________________________________________
>Bioc-sig-sequencing mailing list
>Bioc-sig-sequencing at r-project.org
>https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia


