[R] subset using noncontiguous variables by name (not index)
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Aug 26 23:09:40 CEST 2007
Using the built-in data frame anscombe, try this. First we set up a one-row
data frame, anscombe.seq, in which each column holds its own position
1, 2, 3, ... . Then we select from that data frame with subset and unlist
the result to get the desired index vector.
> anscombe.seq <- replace(anscombe[1,], TRUE, seq_along(anscombe))
> idx <- unlist(subset(anscombe.seq, select = c(x1, x3:x4, y2)))
> anscombe[idx]
   x1 x3 x4   y2
1  10 10  8 9.14
2   8  8  8 8.14
3  13 13  8 8.74
4   9  9  8 8.77
5  11 11  8 9.26
6  14 14  8 8.10
7   6  6  8 6.13
8   4  4 19 3.10
9  12 12  8 9.13
10  7  7  8 7.26
11  5  5  8 4.74
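
As for why c(x1, x3:x5, x7) works inside subset at all: subset evaluates the
select argument in a setting where each column name stands for its column
position, so x3:x5 is just 3:5. The sketch below applies the same idea by hand
to the mydata example from the question quoted below. The names name_to_pos,
my_vars, idx and selected are made up for illustration, and 1:5 is a compact
stand-in for the c(1,2,3,4,5) columns in the original post.

```r
# Example data modelled on the quoted question: seven columns x1..x7.
mydata <- data.frame(x1 = 1:5, x2 = 1:5, x3 = 1:5, x4 = 1:5,
                     x5 = 1:5, x6 = 1:5, x7 = 1:5)

# Map each column name to its position, so that x3:x5 means 3:5.
name_to_pos <- as.list(seq_along(mydata))
names(name_to_pos) <- names(mydata)

# Store the selection unevaluated; quote() keeps it as an expression
# instead of looking up x1, x3, ... as ordinary variables.
my_vars <- quote(c(x1, x3:x5, x7))

# Evaluate the stored expression with names standing for positions.
idx <- eval(my_vars, name_to_pos)

# idx is a plain index vector and can be reused for ordinary indexing.
selected <- mydata[idx]
names(selected)
```

Because idx is an ordinary vector, it can be stored and passed around freely,
e.g. summary(mydata[idx]), without needing attach(mydata).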
On 8/26/07, Muenchen, Robert A (Bob) <muenchen at utk.edu> wrote:
> Hi All,
>
> I'm using the subset function to select a list of variables, some of
> which are contiguous in the data frame, and others of which are not. It
> works fine when I use the form:
>
> subset(mydata,select=c(x1,x3:x5,x7) )
>
> In reality, my list is far more complex. So I would like to store it in
> a variable to substitute in for c(x1,x3:x5,x7) but cannot get it to
> work. That use of the c function seems to violate R rules, so I'm not
> sure how it works at all. A small simulation of the problem is below.
>
> If the variable names & orders were really this simple, I could use
> indices like
>
> summary( mydata[ ,c(1,3:5,7) ] )
>
> but alas, they are not.
>
> How does the c function work this way in the first place, and how can I
> make this substitution?
>
> Thanks,
> Bob
>
> mydata <- data.frame(
> x1=c(1,2,3,4,5),
> x2=c(1,2,3,4,5),
> x3=c(1,2,3,4,5),
> x4=c(1,2,3,4,5),
> x5=c(1,2,3,4,5),
> x6=c(1,2,3,4,5),
> x7=c(1,2,3,4,5)
> )
> mydata
>
> # This does what I want.
> summary(
> subset(mydata,select=c(x1,x3:x5,x7) )
> )
>
> # Can I substitute myVars?
> attach(mydata)
> myVars1 <- c(x1,x3:x5,x7)
>
> # Not looking good!
> myVars1
>
> # This doesn't do the right thing.
> summary(
> subset(mydata,select=myVars1 )
> )
>
> # Total desperation on this attempt:
> myVars2 <- "x1,x3:x5,x7"
> myVars2
>
> # This doesn't work either.
> summary(
> subset(mydata,select=myVars2 )
> )
>
>
>
> =========================================================
> Bob Muenchen (pronounced Min'-chen), Manager
> Statistical Consulting Center
> U of TN Office of Information Technology
> 200 Stokely Management Center, Knoxville, TN 37996-0520
> Voice: (865) 974-5230
> FAX: (865) 974-4810
> Email: muenchen at utk.edu
> Web: http://oit.utk.edu/scc,
> News: http://listserv.utk.edu/archives/statnews.html