[R] Subset using grepl
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Jan 29 11:54:46 CET 2011
The grep condition is "[A-J]"
BTW, there are lots of unnecessary steps in your code, including using
cbind() and subset(). A simpler version:
x <- rep(LETTERS[1:20],3)
y <- rep(1:3, 20)
z <- paste(x,y, sep="")
random.data <- rnorm(60)
data <- data.frame(z, random.data)
data[grepl("[A-J]", z), ]
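To see why the two attempts in the original question did not work, here is a minimal sketch using the same data. The set.seed() call and the names first.only, exact, keep and res are assumptions added for illustration only; they are not part of the original code.

```r
set.seed(1)  # hypothetical seed, only to make the sketch reproducible
x <- rep(LETTERS[1:20], 3)
y <- rep(1:3, 20)
z <- paste(x, y, sep = "")
data <- data.frame(z, random.data = rnorm(60))

# grepl() uses only the first element of a pattern vector (with a warning),
# so grepl(LETTERS[1:10], z) behaves like this call and finds only the A rows
first.only <- grepl(LETTERS[1], z)

# %in% tests whole-string equality, and no value of z is a bare letter
exact <- z %in% LETTERS[1:10]

# a character class matches any single letter from A to J
keep <- grepl("[A-J]", z)
res <- data[keep, ]

sum(first.only)  # rows matched by the first attempt
sum(exact)       # rows matched by the second attempt
nrow(res)        # rows matched by the character class
```

Building the data frame with data.frame() directly also keeps random.data numeric, whereas cbind(z, random.data) first produces a character matrix.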
Now (for the paranoid, and not needed in this example): in general the
effect of "[A-J]" depends on the locale, so you could write out
"[ABCDEFGHIJ]" or create it by
cond <- paste("[", paste(LETTERS[1:10], collapse=""), "]", sep="")
Or use grepl("[A-J]", z, perl=TRUE).
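As a rough check, the three forms can be compared on the example data. This is only a sketch: the names by.range, by.list and by.perl are hypothetical, and whether the plain range agrees with the other two depends on the locale the session is running in.

```r
z <- paste(rep(LETTERS[1:20], 3), rep(1:3, 20), sep = "")

# build the explicit character class from the first ten letters
cond <- paste("[", paste(LETTERS[1:10], collapse = ""), "]", sep = "")
cond

by.range <- grepl("[A-J]", z)               # range, interpretation may be locale-dependent
by.list  <- grepl(cond, z)                  # explicit list of letters
by.perl  <- grepl("[A-J]", z, perl = TRUE)  # PCRE engine

# compare the explicit list against the other two forms
identical(by.list, by.perl)
identical(by.list, by.range)
```

The explicit list is the most defensive choice, since it does not rely on how a range is interpreted.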
On Sat, 29 Jan 2011, Kang Min wrote:
> Hi all,
>
> I would like to subset a dataframe by using part of the level name.
>
> x <- rep(LETTERS[1:20],3)
> y <- rep(1:3, 20)
> z <- paste(x,y, sep="")
> random.data <- rnorm(60)
> data <- as.data.frame(cbind(z, random.data))
>
> I need rows that contain the letters A to J, so I tried:
>
> subset(data, grepl(LETTERS[1:10], z)) # got only rows with A
> subset(data, z %in% LETTERS[1:10]) # got no rows
>
> I think I'm getting close to the solution but need a little bit of
> help here, thanks in advance.
>
> Kang Min
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595