[R] Removing the rows where all the elements are 0
arun
smartpink111 at yahoo.com
Mon Aug 5 20:00:19 CEST 2013
Not sure I understand the problem.
dat1<- read.table(text="
gene ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
XLOC_000001 3516 626 1277 770 4309 9030
XLOC_000002 342 82 185 72 835 1095
XLOC_000003 2000 361 867 438 454 687
XLOC_000004 143 30 67 37 90 236
XLOC_000005 0 0.21 0.1 0 0 0
XLOC_000006 0 0.1 0 0.01 0 0
XLOC_000007 0 0 0 0 1 3
XLOC_000008 0 0 0 0 0.15 0
XLOC_000009 0 0 0.12 0 0 0
XLOC_000010 7 1 5 3 0 1
XLOC_000011 63 10 19 15 92 228
",sep="",stringsAsFactors=FALSE,header=TRUE)
mat1<- as.matrix(dat1[,-1])
row.names(mat1)<- dat1[,1]
mat1[rowSums(mat1<=0.2)!=ncol(mat1),]
# ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
#XLOC_000001 3516 626.00 1277.0 770 4309 9030
#XLOC_000002 342 82.00 185.0 72 835 1095
#XLOC_000003 2000 361.00 867.0 438 454 687
#XLOC_000004 143 30.00 67.0 37 90 236
#XLOC_000005 0 0.21 0.1 0 0 0 ## row is kept because at least one element is >0.2
#XLOC_000007 0 0.00 0.0 0 1 3
#XLOC_000010 7 1.00 5.0 3 0 1
#XLOC_000011 63 10.00 19.0 15 92 228
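One way to check why a particular row survives the filter is to count, per row, how many values fall at or below the cutoff. The sketch below rebuilds three rows of the example by hand; the names m, cutoff, n_low and keep are only for illustration. XLOC_000005 has 5 of its 6 values at or below 0.2 (0.21 is above it), so it does not count as an all-low row.

```r
# Three rows copied from the example data above; object names are illustrative
m <- matrix(c(0, 0.21, 0.1, 0,    0,    0,
              0, 0.1,  0,   0.01, 0,    0,
              0, 0,    0,   0,    0.15, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("XLOC_000005", "XLOC_000006", "XLOC_000008"),
                            c("ZPT.1", "ZPT.0", "ZPT.2", "ZPT.3", "PDGT.1", "PDGT.0")))

cutoff <- 0.2
n_low <- rowSums(m <= cutoff)   # number of values at or below the cutoff, per row
n_low

keep <- n_low != ncol(m)        # TRUE when at least one value exceeds the cutoff
m[keep, , drop = FALSE]         # drop = FALSE keeps the result a matrix
```

A row is removed only when its count equals the number of columns, which is the case for XLOC_000006 and XLOC_000008 here.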
# indices of the rows in which every value is <= 0.2 (the rows that get removed)
as.vector(which(rowSums(mat1<=0.2)==ncol(mat1)))
#[1] 6 8 9
mat1[c(6,8,9),]
# ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
#XLOC_000006 0 0.1 0.00 0.01 0.00 0
#XLOC_000008 0 0.0 0.00 0.00 0.15 0
#XLOC_000009 0 0.0 0.12 0.00 0.00 0
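If the cutoff is likely to change, the same idea could be wrapped in a small function. This is only a sketch: drop_low_rows is a made-up name, not a function from any package, and it assumes non-negative values (as with counts) and treats NA as "not above the cutoff".

```r
# Hypothetical helper: drop rows in which no value is above `cutoff`.
# Assumptions: values are non-negative; NA entries are ignored, so a row
# containing only NAs and low values is dropped.
drop_low_rows <- function(x, cutoff = 0) {
  x <- as.matrix(x)
  keep <- rowSums(x > cutoff, na.rm = TRUE) > 0  # at least one value above cutoff
  x[keep, , drop = FALSE]                        # stay a matrix even with one row left
}

# Small made-up matrix to try it on
m <- matrix(c(7, 1,   5,
              0, 0.1, 0.01,
              0, 0,   0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("s1", "s2", "s3")))

res0  <- drop_low_rows(m)        # removes only the all-zero row "c"
res02 <- drop_low_rows(m, 0.2)   # also removes "b", whose values are all <= 0.2
```

Counting values above the cutoff and keeping rows with a count greater than 0 selects the same rows as comparing rowSums(x <= cutoff) with ncol(x), as long as there are no NAs.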
A.K.
________________________________
From: Vivek Das <vd4mmind at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, August 5, 2013 1:05 PM
Subject: Re: Removing the rows where all the elements are 0
Hi Arun,
This seems to work only if the values are exactly 0. If the rows contain values like 0.01, 0.08 and 0.05 and I use the command
res2<-mat1[rowSums(mat1<=0.2)!=ncol(mat1),]
then it does not work. Can you tell me why? Let's say I want to remove the rows whose values are less than 0.2 across the columns. What should the condition be?
----------------------------------------------------------
Vivek Das
PhD Student in Computational Biology
Giuseppe Testa's Lab
European School of Molecular Medicine
IFOM-IEO Campus
Via Adamello, 16
Milan, Italy
emails: vivek.das at ieo.eu
vchris_05 at yahoo.co.in
vd4mmind at gmail.com
On Mon, Aug 5, 2013 at 2:31 PM, arun <smartpink111 at yahoo.com> wrote:
Hi Vivek,
>
>dat1<- read.table(text="
>
>gene ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>XLOC_000001 3516 626 1277 770 4309 9030
>XLOC_000002 342 82 185 72 835 1095
>XLOC_000003 2000 361 867 438 454 687
>XLOC_000004 143 30 67 37 90 236
>XLOC_000005 0 0 0 0 0 0
>XLOC_000006 0 0 0 0 0 0
>XLOC_000007 0 0 0 0 1 3
>XLOC_000008 0 0 0 0 0 0
>XLOC_000009 0 0 0 0 0 0
>XLOC_000010 7 1 5 3 0 1
>XLOC_000011 63 10 19 15 92 228
>",sep="",stringsAsFactors=FALSE,header=TRUE)
>
>res<- dat1[rowSums(dat1[,-1]==0)!=(ncol(dat1)-1),]
>res
># gene ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>#1 XLOC_000001 3516 626 1277 770 4309 9030
>#2 XLOC_000002 342 82 185 72 835 1095
>#3 XLOC_000003 2000 361 867 438 454 687
>#4 XLOC_000004 143 30 67 37 90 236
>#7 XLOC_000007 0 0 0 0 1 3
>#10 XLOC_000010 7 1 5 3 0 1
>#11 XLOC_000011 63 10 19 15 92 228
>
>If it is a matrix:
>mat1<- as.matrix(dat1[,-1])
> row.names(mat1)<- dat1[,1]
>
>
> res2<-mat1[rowSums(mat1==0)!=ncol(mat1),]
> res2
># ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>#XLOC_000001 3516 626 1277 770 4309 9030
>#XLOC_000002 342 82 185 72 835 1095
>#XLOC_000003 2000 361 867 438 454 687
>#XLOC_000004 143 30 67 37 90 236
>#XLOC_000007 0 0 0 0 1 3
>#XLOC_000010 7 1 5 3 0 1
>#XLOC_000011 63 10 19 15 92 228
>
>
>#I don't have an account on Stack Overflow, so it must be somebody else.
>A.K.
>
>
>
>________________________________
>From: Vivek Das <vd4mmind at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Monday, August 5, 2013 6:31 AM
>Subject: Removing the rows where all the elements are 0
>
>
>
>
>Hi Arun,
>I am using a matrix of gene expression fragment counts to calculate differentially expressed genes. I would like to know how to remove the rows in which the values are 0. That would make my data set more compact, and the downstream analysis I do using this matrix would give fewer spurious results.
>Input
>gene ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>XLOC_000001 3516 626 1277 770 4309 9030
>XLOC_000002 342 82 185 72 835 1095
>XLOC_000003 2000 361 867 438 454 687
>XLOC_000004 143 30 67 37 90 236
>XLOC_000005 0 0 0 0 0 0
>XLOC_000006 0 0 0 0 0 0
>XLOC_000007 0 0 0 0 1 3
>XLOC_000008 0 0 0 0 0 0
>XLOC_000009 0 0 0 0 0 0
>XLOC_000010 7 1 5 3 0 1
>XLOC_000011 63 10 19 15 92 228
>Desired output
>gene ZPT.1 ZPT.0 ZPT.2 ZPT.3 PDGT.1 PDGT.0
>XLOC_000001 3516 626 1277 770 4309 9030
>XLOC_000002 342 82 185 72 835 1095
>XLOC_000003 2000 361 867 438 454 687
>XLOC_000004 143 30 67 37 90 236
>XLOC_000007 0 0 0 0 1 3
>XLOC_000010 7 1 5 3 0 1
>XLOC_000011 63 10 19 15 92 228
>For now I only want to remove the rows where all the frag count columns are 0. If a row has some values that are 0 and others that are non-zero, I would like to keep that row intact, as you can see in my example above.
>Please let me know how to do this.
>
>Hey Arun, I did not understand the command you wrote in the R Stack Overflow forum. Can you please write it here and help me out?