[R] How to subset() from data frame using specific rows

R. Michael Weylandt michael.weylandt at gmail.com
Tue Oct 4 20:46:52 CEST 2011


This isn't going to be the most elegant, but it should work:

## Get the factors as characters

ff <- as.character(chemdata$site)

## Identify those that match what you want
ff <- grepl("BC-", ff)  # pattern first, then the vector to search

## Now use this logical vector to subset

chemdata[ff, ]
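
For a quick check, here is a small self-contained sketch of the same steps. The toy data frame is made up for illustration (only the column names come from the str() output below; the rows and the "JC-"/"MC-" site names are assumptions), so it is not the real chemdata:

```r
## Toy stand-in for chemdata; all values are invented for illustration
chemdata <- data.frame(
  site  = factor(c("BC-0.5", "BC-1", "JC-1", "BC-2", "MC-3")),
  param = factor(c("As", "As", "Ca", "Cl", "As")),
  quant = c(0.06, 0.01, 0.02, 0.01, 0.03)
)

## Logical vector: TRUE where the site name contains the literal text "BC-"
keep <- grepl("BC-", as.character(chemdata$site), fixed = TRUE)

## Keep the matching rows; droplevels() discards the now-unused factor levels
bc <- droplevels(chemdata[keep, ])

## The same row selection written with subset(); unused levels are kept here
bc2 <- subset(chemdata, grepl("BC-", site, fixed = TRUE))
```

Without droplevels(), the subset would still carry all of the original site levels, which can show up as empty groups in later tables or plots.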

Can't test, but this should be good to go, assuming that "BC-" by itself
identifies exactly the sites you want. If other site names also contain
"BC-", read through the ?regex documentation, which I think describes how
to write more selective patterns.
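
As a rough sketch of what a more selective pattern might look like: anchoring with ^ restricts the match to names that begin with "BC-". The site names here are invented, and "ABC-1" is a hypothetical near-miss included only to show the difference:

```r
## Invented site names; "ABC-1" is a hypothetical near-miss
site <- c("BC-0.5", "BC-1", "ABC-1", "BC-6", "JC-2")

## Unanchored: matches "BC-" anywhere in the name, so "ABC-1" is included
loose <- grepl("BC-", site)

## Anchored: matches only names that begin with "BC-"
strict <- grepl("^BC-", site)

## Names kept by each pattern
site[loose]
site[strict]
```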

Michael

On Tue, Oct 4, 2011 at 2:39 PM, Rich Shepard <rshepard at appl-ecosys.com> wrote:
>  I have a data frame called chemdata with this structure:
>
>> str(chemdata)
>
> 'data.frame':   14886 obs. of  4 variables:
>  $ site    : Factor w/ 148 levels "BC-0.5","BC-1",..: 104 145 126 115 114
> 128 124 2 3 3 ...
>  $ sampdate: Date, format: "1996-12-27" "1996-08-22" ...
>  $ param   : Factor w/ 8 levels "As","Ca","Cl",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ quant   : num  0.06 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 ...
>
>  I've looked in the R Cookbook and Dalgaard's intro book without finding a
> way to use wildcards (e.g., like "BC-*") or explicitly writing each site ID
> when subsetting a data frame.
>
>  I need to create subsets (as data frames) based on sites, but including
> all sites on each stream. For example, using the initial site factor shown
> above, I want a subset containing all data for sites "BC-0.5", "BC-1",
> "BC-2", "BC-3", "BC-4", "BC-5", and "BC-6".
>
> Pointers appreciated,
>
> Rich
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


