[BioC] newbie subset question

Ben Tupper btupper at bigelow.org
Mon Feb 27 22:04:56 CET 2012


Hi,

You can use %in% 

cap1 = traces[traces$well.id %in% c("H1","H3","H5","H7","H9","H11"), ]

or %in% with subset() 

cap1 <- subset(traces, well.id %in% c("H1","H3","H5","H7","H9","H11"))
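
To see why %in% does the job where == does not: == compares element by element (recycling the shorter vector), whereas %in% asks, for each row, whether its value appears anywhere in the set. Here is a minimal sketch on a toy data frame. The six rows and their values are made up for illustration and only stand in for the real traces object.

```r
# Toy stand-in for 'traces' (hypothetical values, not the real data)
traces <- data.frame(
  well.id = factor(c("A1", "H1", "B2", "H3", "H11", "G2")),
  num.high.quality.bases = c(907L, 897L, 358L, 923L, 840L, 0L)
)

wells <- c("H1", "H3", "H5", "H7", "H9", "H11")

# %in% returns one TRUE/FALSE per row: is this row's well.id in 'wells'?
keep <- traces$well.id %in% wells

# Use the logical vector as a row index; the empty column slot keeps all columns
cap1 <- traces[keep, ]

# The column to compare, restricted to the selected wells
cap1$num.high.quality.bases
```

Wells listed in the set but absent from the data (H5, H7 and H9 in this toy case) are simply ignored, so the same call is safe on any plate.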

Cheers,
Ben

P.S.  The easiest way to share example data is to paste the output of dput(traces) in your email.  If it is very large, then consider using dput() on a small subset of the original data.  Others can then cut-and-paste into their own R session - you'll get waaaaay better assistance by doing that than by simply dumping your data into the email.  dput() is a great tool and fits the purpose perfectly!
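
As a rough sketch of what that looks like in practice (again with a made-up toy data frame rather than the real traces): take a few rows, drop the unused factor levels so the output stays short, and dput() the result. The last lines only check that the pasted text rebuilds an equivalent object; the names small, txt and copy are just for this illustration.

```r
# Toy stand-in for 'traces' (hypothetical values)
traces <- data.frame(
  sample.name = c("leechi_CH001_", "leechi_CH002", "leechi_CH003"),
  well.id = factor(c("A1", "B1", "C1"), levels = c("A1", "B1", "C1", "D1")),
  stringsAsFactors = FALSE
)

# Keep a few rows and drop factor levels that no longer occur;
# otherwise every level of the original factor is printed by dput()
small <- droplevels(head(traces, 2))

# This prints R code that recreates 'small'; paste that into the email
dput(small)

# Round trip: what a reader does after copying the text into their session
txt <- paste(capture.output(dput(small)), collapse = "\n")
copy <- eval(parse(text = txt))
all.equal(copy, small)
```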


On Feb 27, 2012, at 3:49 PM, Tom Keller wrote:

> Greetings,
> I have a dataframe:
>> str(traces)
> 'data.frame': 2366 obs. of  14 variables:
> $ sample.name             : chr  "leechi_CH001_" "leechi_CH002" "leechi_CH003" "leechi_CH004" ...
> $ well.id                 : Factor w/ 96 levels "A1","A10","A11",..: 1 13 25 37 49 61 73 85 5 17 ...
> $ clear.range.length      : int  807 188 825 779 853 864 0 776 369 50 ...
> $ signal.noise            : num  195.98 9.22 169.21 126.44 158.65 ...
> $ contiguous.read.length  : int  976 502 990 923 976 979 -1 966 439 621 ...
> $ clear.range.start       : int  15 168 14 27 8 11 0 11 12 268 ...
> $ clear.range.stop        : int  822 356 839 806 861 875 0 787 381 318 ...
> $ num.low.quality.bases   : int  155 286 181 242 144 161 5 192 470 216 ...
> $ num.high.quality.bases  : int  907 343 923 832 918 918 0 897 389 358 ...
> $ num.medium.quality.bases: int  42 46 30 56 42 19 0 35 14 73 ...
> $ sample.score            : num  53.6 41.9 53.7 44.2 54.8 ...
> $ comment                 : Factor w/ 1787 levels "","162194","162195",..: 2 3 4 5 6 7 8 9 10 11 ...
> $ container_name          : Factor w/ 37 levels "111201a","111201arr",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ file.name               : chr  "/Users/kellert/Desktop/1112/111201a/leechi_CH001__A01.ab1" "/Users/kellert/Desktop/1112/111201a/leechi_CH002_B01.ab1" "/Users/kellert/Desktop/1112/111201a/leechi_CH003_C01.ab1" "/Users/kellert/Desktop/1112/111201a/leechi_CH004_D01.ab1" ...
> 
> I would like to compare the $ num.high.quality.bases for all rows where $ well.id is for example a member of
> c("H1","H3","H5","H7","H9","H11")
> 
> I thought this would work:
> cap1 = traces[traces$well.id = c("H1","H3","H5","H7","H9","H11"), ]
> or
> cap1 = traces[traces$well.id == match("H1","H3","H5","H7","H9","H11"), ]
> but both give errors.
> The data itself looks like:
>     sample.name well.id clear.range.length signal.noise contiguous.read.length clear.range.start clear.range.stop num.low.quality.bases num.high.quality.bases num.medium.quality.bases sample.score comment container_name
> 1  leechi_CH001_      A1                807      195.983                    976                15              822                   155                    907                       42       53.629  162194        111201a
> 2   leechi_CH002      B1                188        9.220                    502               168              356                   286                    343                       46       41.940  162195        111201a
> 3   leechi_CH003      C1                825      169.206                    990                14              839                   181                    923                       30       53.665  162196        111201a
> 4   leechi_CH004      D1                779      126.441                    923                27              806                   242                    832                       56       44.197  162197        111201a
> 5   leechi_CH005      E1                853      158.646                    976                 8              861                   144                    918                       42       54.815  162198        111201a
> 6   leechi_CH006      F1                864      161.874                    979                11              875                   161                    918                       19       54.474  162199        111201a
> 7   leechi_CH007      G1                  0        3.916                     -1                 0                0                     5                      0                        0        0.000  162200        111201a
> 8   leechi_CH008      H1                776      156.605                    966                11              787                   192                    897                       35       53.025  162201        111201a
> 9   leechi_CH009      A2                369      177.872                    439                12              381                   470                    389                       14       52.632  162202        111201a
> 10  leechi_CH010      B2                 50        6.514                    621               268              318                   216                    358                       73       33.080  162203        111201a
> 11  leechi_CH011      C2                853      154.255                    998                12              865                   177                    917                       42       53.154  162204        111201a
> 12  leechi_CH012      D2                773      121.261                    933                32              805                   232                    840                       57       43.304  162205        111201a
> 13  leechi_CH013      E2                850      201.700                    923                10              860                   176                    872                       29       55.949  162206        111201a
> 14  leechi_CH014      F2                863      186.988                    980                11              874                   162                    922                       30       53.485  162207        111201a
> 15  leechi_CH015      G2                  0        4.001                     -1                 0                0                     5                      0                        0        0.000  162208        111201a
> ...........
> How do I subset based on a match to specific values of $well.id?
> thanks,
> Tom
> kellert at ohsu.edu
> 503-494-2442
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine   04575-0475 
http://www.bigelow.org