[R] How to filter data in R based on first column ID

Rui Barradas ruipbarradas using sapo.pt
Sun Jun 23 17:40:02 CEST 2019


Hello,

The previous two e-mails show that there is no match between otu.list and 
the row.names of data. Now I can see why: otu.list is a data frame, not a 
vector. Try instead

i <- row.names(data) %in% otu.list[[1]]
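A minimal sketch of why the vector matters, using made-up IDs. The objects toy_data and toy_otu are hypothetical stand-ins, not the poster's files. When the right-hand side of %in% is a multi-row data frame, it is treated as a list of columns rather than as a vector of IDs, so no individual ID is matched.

```r
# Hypothetical toy objects standing in for `data` and `otu.list`
toy_data <- data.frame(s1 = c(0.1, 0, 0.3), s2 = c(0, 0.2, 0),
                       row.names = c("idA", "idB", "idC"))
toy_otu <- data.frame(V1 = c("idC", "idA"), stringsAsFactors = FALSE)

# Matching against the data frame itself: no ID is found
i_bad <- row.names(toy_data) %in% toy_otu
sum(i_bad)

# Matching against the first column (a plain character vector) works
i_good <- row.names(toy_data) %in% toy_otu[[1]]
sum(i_good)

# Rows come back in the order of toy_data, not in the order of the ID list
toy_data[i_good, ]
```

If the rows are wanted in the order of the ID list instead, indexing by name, e.g. toy_data[toy_otu[[1]], ], is one alternative, assuming every ID is present in the row names.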

I assume that otu.list is a df with just *one* column.
Check what dim(otu.list) returns and in the code line above use 
otu.list[[n]] where n is the column number where the values you want can 
be found.
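A short sketch of that check, again on a hypothetical stand-in since the real file is not available here. It also illustrates something the quoted head(otu.list) output suggests: the column name there looks like an ID, so the file may have no header line, in which case header = TRUE would turn the first ID into a column name. That is an assumption about the file, not something confirmed in this thread.

```r
# Hypothetical ID file contents: one ID per line, no header line (an assumption)
id_lines <- c("idC", "idA", "idZ")

# With header = TRUE the first ID is consumed as the column name
with_header <- read.table(text = id_lines, header = TRUE,
                          stringsAsFactors = FALSE)
dim(with_header)       # rows and columns actually read
names(with_header)     # the first ID ends up here

# With header = FALSE every ID is kept as data
no_header <- read.table(text = id_lines, header = FALSE,
                        stringsAsFactors = FALSE)
dim(no_header)
ids <- no_header[[1]]  # column n = 1, extracted as a character vector
is.character(ids)
```

Comparing dim() of the two reads is a quick way to tell whether an ID was lost to the header.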

Hope this helps,

Rui Barradas

At 16:27 on 23/06/19, Yogesh Gupta wrote:
> I have read the data table and otu.list into R like this:
> 
>> data = read.table("Mymensingh_root_relative.percent.COUNT.txt",header=T,sep='\t',check.names=F)
> 
>> otu.list = read.table("root_differential.OTU.list.txt",header=T,sep='\t',check.names=F)
> 
> 
> Both are tab-delimited text files.
> 
> 
> *Thanks*
> 
> *Yogesh*
> 
> 
> 
> 
> On Sun, Jun 23, 2019 at 4:22 PM Yogesh Gupta <nabiyogesh using gmail.com> wrote:
> 
>     Hi Rui,
> 
>     Thanks for your help, but I still was not able to get the data.
> 
>     > dput(head(data))
>     structure(list(
>     `Root-1.S35.L001` = c(0, 0, 0, 0, 0, 0.0467945718296678),
>     `Root-10.S75.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-13.S5.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-14.S16.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-17.S26.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-18.S36.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-19.S46.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-22.S56.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-24.S66.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-25.S76.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-26.S6.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-27.S17.L001` = c(0.0293745480838756, 0, 0, 0, 0, 0),
>     `Root-3.S45.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-30.S27.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-32.S37.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-34.S47.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-39.S57.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-4.S55.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-40.S67.L001` = c(0.0189409882986783, 0, 0, 0, 0, 0),
>     `Root-41.S77.L001` = c(0.0171320884015762, 0, 0, 0.0202470135654991, 0, 0),
>     `Root-43.S7.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-45.S18.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-47.S28.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-50.S38.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-51.S48.L001` = c(0, 0, 0, 0, 0, 0),
>     `Root-54.S58.L001` = c(0.0080481457966323, 0, 0, 0, 0, 0),
>     `Root-9.S65.L001` = c(0.00744125132082211, 0, 0, 0, 0, 0),
>     Root.11.S5 = c(0, 0, 0, 0, 0, 0),
>     Root.12.S16 = c(0, 0, 0, 0, 0, 0),
>     Root.15.S26 = c(0, 0, 0, 0, 0, 0),
>     Root.16.S36 = c(0, 0, 0, 0, 0, 0),
>     Root.2.S35 = c(0, 0, 0, 0, 0, 0),
>     Root.20.S46 = c(0, 0, 0, 0, 0, 0),
>     Root.21.S56 = c(0, 0, 0, 0, 0, 0),
>     Root.23.S66 = c(0, 0, 0, 0.0123632317487791, 0, 0),
>     Root.28.S76 = c(0, 0, 0, 0, 0, 0),
>     Root.29.S6 = c(0, 0, 0, 0, 0, 0),
>     Root.31.S17 = c(0, 0, 0, 0, 0, 0),
>     Root.33.S27 = c(0.0165520894254218, 0, 0, 0, 0, 0),
>     Root.35.S37 = c(0, 0, 0, 0, 0, 0),
>     Root.36.S47 = c(0, 0.00731421884142774, 0, 0, 0, 0),
>     Root.37.S57 = c(0.00668627975394491, 0, 0, 0.00780065971293572, 0, 0),
>     Root.38.S67 = c(0, 0, 0, 0, 0, 0),
>     Root.42.S77 = c(0.00368785956630772, 0, 0, 0, 0.00553178934946157, 0),
>     Root.44.S7 = c(0, 0, 0.0306968177632252, 0, 0, 0),
>     Root.46.S18 = c(0, 0, 0, 0, 0, 0),
>     Root.48.S28 = c(0, 0, 0, 0, 0, 0),
>     Root.49.S38 = c(0, 0.0133380016339052, 0, 0, 0, 0),
>     Root.5.S45 = c(0, 0, 0, 0, 0, 0),
>     Root.52.S48 = c(0, 0, 0, 0, 0, 0),
>     Root.53.S58 = c(0, 0, 0, 0, 0, 0),
>     Root.6.S55 = c(0.0150072892547809, 0, 0, 0, 0, 0),
>     Root.7.S65 = c(0, 0, 0, 0, 0, 0),
>     Root.8.S75 = c(0, 0, 0, 0.0062986835751328, 0, 0)),
>     row.names = c("71f84e7910006f22684121564206e8ca",
>     "03b167b9f86f2519b4263b4125377eed", "54e204fb99c80764e964456dadd6a0e5",
>     "55cd1fc570879d645bbf7a3642e9b0a8", "65f5c31e12c277aec319e2096463f9d2",
>     "7bed62f0fef250fd831dcf13bf43f4fc"), class = "data.frame")
> 
> 
>     > i <- row.names(data) %in% otu.list
>     > data[i, ]
>      [1] Root-1.S35.L001  Root-10.S75.L001 Root-13.S5.L001  Root-14.S16.L001
>      [5] Root-17.S26.L001 Root-18.S36.L001 Root-19.S46.L001 Root-22.S56.L001
>      [9] Root-24.S66.L001 Root-25.S76.L001 Root-26.S6.L001  Root-27.S17.L001
>     [13] Root-3.S45.L001  Root-30.S27.L001 Root-32.S37.L001 Root-34.S47.L001
>     [17] Root-39.S57.L001 Root-4.S55.L001  Root-40.S67.L001 Root-41.S77.L001
>     [21] Root-43.S7.L001  Root-45.S18.L001 Root-47.S28.L001 Root-50.S38.L001
>     [25] Root-51.S48.L001 Root-54.S58.L001 Root-9.S65.L001  Root.11.S5
>     [29] Root.12.S16      Root.15.S26      Root.16.S36      Root.2.S35
>     [33] Root.20.S46      Root.21.S56      Root.23.S66      Root.28.S76
>     [37] Root.29.S6       Root.31.S17      Root.33.S27      Root.35.S37
>     [41] Root.36.S47      Root.37.S57      Root.38.S67      Root.42.S77
>     [45] Root.44.S7       Root.46.S18      Root.48.S28      Root.49.S38
>     [49] Root.5.S45       Root.52.S48      Root.53.S58      Root.6.S55
>     [53] Root.7.S65       Root.8.S75
>     <0 rows> (or 0-length row.names)
> 
> 
>     Thanks
> 
> 
> 
> 
>     On Sun, Jun 23, 2019 at 4:20 PM Yogesh Gupta <nabiyogesh using gmail.com> wrote:
> 
>         Hi Rui,
> 
>         Thanks for your help, but I still was not able to get the data.
> 
> 
>         Thanks
> 
>         Yogesh
> 
> 
> 
> 
>         On Sun, Jun 23, 2019 at 3:34 PM Rui Barradas
>         <ruipbarradas using sapo.pt> wrote:
> 
>             Hello,
> 
>             Please always cc the list.
> 
>             The data you have posted has the values you want to match as
>             *row.names*, not as the first column values.
> 
>             This is why it is important to follow the posting guide and
>             post datasets as it suggests, in dput() format.
> 
>             dput(head(data))    # post the output of this
> 
> 
>             In the meantime, try
> 
>             i <- row.names(data) %in% otu.list
>             data[i, ]
> 
> 
>             Also, dim(i) doesn't make sense: i is a vector, and in R a
>             vector is not expected to have a dim attribute. What would
>             have made sense is
> 
>             length(i)
>             sum(i)
> 
> 
>             Hope this helps,
> 
>             Rui Barradas
> 
> 
>             At 11:41 on 23/06/19, Yogesh Gupta wrote:
>              > Hi Rui,
>              >
>              > I used the code as you suggested, but it does not return the
>              > row values; it only shows the column headers.
>              >
>              >> head(data)
>              >
>              >                                  Root-1.S35.L001 Root-10.S75.L001
>              > 71f84e7910006f22684121564206e8ca      0.00000000                0
>              > 03b167b9f86f2519b4263b4125377eed      0.00000000                0
>              > 54e204fb99c80764e964456dadd6a0e5      0.00000000                0
>              > 55cd1fc570879d645bbf7a3642e9b0a8      0.00000000                0
>              > 65f5c31e12c277aec319e2096463f9d2      0.00000000                0
>              > 7bed62f0fef250fd831dcf13bf43f4fc      0.04679457                0
>              >
>              >
>              > .....................................................
>              >
>              >
>              >> head(otu.list)
>              >
>              >    9d94ce7d60e59034941e3a12bc37865e
>              >
>              > 1 8e60d301122d7aa359eb6b0b00f37f62
>              >
>              > 2 bfad6370d28182cc6304844e9bec7fb6
>              >
>              > 3 088571139af8e0a63fb6652df8a7438b
>              >
>              > 4 ccd70d57890f276a542ef4b5e5142d4c
>              >
>              > 5 cab4ca278af34bf722e0c7cb3219370b
>              >
>              > 6 72f30f3780145f16ac882eb8e2d189a5
>              >
>              >
>              > ....................................
>              >
>              >
>              >> i <- data[[1]] %in% otu.list
>              >
>              >> dim(i)
>              >
>              > NULL
>              >
>              >> data[i, ]
>              >
>              >  [1] Root-1.S35.L001  Root-10.S75.L001 Root-13.S5.L001  Root-14.S16.L001
>              >  [5] Root-17.S26.L001 Root-18.S36.L001 Root-19.S46.L001 Root-22.S56.L001
>              >  [9] Root-24.S66.L001 Root-25.S76.L001 Root-26.S6.L001  Root-27.S17.L001
>              > [13] Root-3.S45.L001  Root-30.S27.L001 Root-32.S37.L001 Root-34.S47.L001
>              > [17] Root-39.S57.L001 Root-4.S55.L001  Root-40.S67.L001 Root-41.S77.L001
>              > [21] Root-43.S7.L001  Root-45.S18.L001 Root-47.S28.L001 Root-50.S38.L001
>              >
>              >
>              > .............
>              >
>              >
>              > Could you please suggest how I can get the row values
>              > along with the header?
>              >
>              >
>              > Kind Regards
>              >
>              > Yogesh
>              >
>              >
>              >
>              >
>              > *Yogesh Gupta*
>              > *Research Fellow*
>              > *Institute for Global Food Security*
>              > *Queen's University*
>              > *Belfast, UK*
>              >
>              >
>              >
>              > On Fri, Jun 21, 2019 at 9:29 PM Rui Barradas
>              > <ruipbarradas using sapo.pt> wrote:
>              >
>              >     Hello,
>              >
>              >     Please don't post in HTML, as the posting guide asks;
>              >     the data is unreadable.
>              >
>              >     Not tested:
>              >
>              >     i <- data[[1]] %in% ID.list
>              >     data[i, ]
>              >
>              >
>              >     That's it.
>              >
>              >     Hope this helps,
>              >
>              >     Rui Barradas
>              >
>              >     At 21:09 on 21/06/19, Yogesh Gupta wrote:
>              >      > Hi,
>              >      >
>              >      > I need to filter data based on the first column.
>              >      > Could you please suggest how I can do it in R?
>              >      >
>              >      >
>              >      >
>              >      >
>              >      >
>              >      >
>              >      >
>              >      > head(data)
>              >      >
>              >      >
>              >      >
>              >      >                                   Root-1.S35.L001 Root-10.S75.L001  Root-13.S5.L001 Root-14.S16.L001 Root-17.S26.L001
>              >      > 719e20e1e3d19fbc5dd50844c6d29a8f      0.049304262                0                0                0      0.033765858
>              >      > bfffa74c0971b30ccf3b19f65911801f                0                0      0.017210717                0                0
>              >      > 80087a003188288a9b0e3990d72113e3                0                0                0      0.010248632                0
>              >      > aed34f36d535976c247278bec289fd12      0.580694642      0.113445793       0.56221674       0.06764097      0.525782644
>              >      > 42df8f30203380e9a604cb5c4000411b                0                0                0      0.004099453                0
>              >      > d47199c9d9c5417c7cb82ff4159cf497                0      0.019729703                0                0                0
>              >      > 6272cdc38bd517a77ba3887619c36b9f                0                0                0      0.016397811      0.019294776
>              >      > 62a5c6f5e0adda662739a7ac521ea790      0.076695519                0      0.271546861      0.254166069      0.463074623
>              >      > 3b5d8860f6a03668df80d364e9591bab      0.125999781      0.019729703      0.061193659      0.038944801       0.02411847
>              >      >
>              >      > I need to extract data for the below IDs
>              >      >
>              >      > head(ID.list)
>              >      >
>              >      > 719e20e1e3d19fbc5dd50844c6d29a8f
>              >      > bfffa74c0971b30ccf3b19f65911801f
>              >      > 62a5c6f5e0adda662739a7ac521ea790
>              >      >
>              >      > I will be thankful for your help.
>              >      >
>              >      > Thanks
>              >      > Yogesh
>              >      >
>              >      > *Yogesh Gupta*
>              >      > *Research Fellow*
>              >      > *Institute for Global Food Security*
>              >      > *Queen's University*
>              >      > *Belfast, UK*
>              >      >
>              >      >       [[alternative HTML version deleted]]
>              >      >
>              >      > ______________________________________________
>              >      > R-help using r-project.org <mailto:R-help using r-project.org>
>             <mailto:R-help using r-project.org <mailto:R-help using r-project.org>>
>             mailing list
>              >     -- To UNSUBSCRIBE and more, see
>              >      > https://stat.ethz.ch/mailman/listinfo/r-help
>              >      > PLEASE do read the posting guide
>              > http://www.R-project.org/posting-guide.html
>              >      > and provide commented, minimal, self-contained,
>             reproducible code.
>              >      >
>              >
>


