[R] subset using noncontiguous variables by name (not index)

Gabor Grothendieck ggrothendieck at gmail.com
Sun Aug 26 23:09:40 CEST 2007


Using the built-in data frame anscombe, try this. First set up a one-row
data frame, anscombe.seq, that holds the column positions 1, 2, 3, ...
under the original column names. Then select from that data frame with
subset() and unlist the result to get the desired index vector.

> anscombe.seq <- replace(anscombe[1,], TRUE, seq_along(anscombe))
> idx <- unlist(subset(anscombe.seq, select = c(x1, x3:x4, y2)))
> anscombe[idx]
   x1 x3 x4   y2
1  10 10  8 9.14
2   8  8  8 8.14
3  13 13  8 8.74
4   9  9  8 8.77
5  11 11  8 9.26
6  14 14  8 8.10
7   6  6  8 6.13
8   4  4 19 3.10
9  12 12  8 9.13
10  7  7  8 7.26
11  5  5  8 4.74
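
As for how c(x1, x3:x5, x7) can work at all: subset() does not evaluate
its select argument in the usual way. It evaluates the expression in a
list that maps each column name to its position, so x3:x5 becomes 3:5.
The sketch below reproduces that idea by hand, assuming the mydata frame
from the question; the names myVars, pos, idx, selected and selected2 are
only illustrative. It is one possible way to store the selection in a
variable, not the only one.

```r
# Example data mirroring the question (values are placeholders).
mydata <- data.frame(x1 = 1:5, x2 = 1:5, x3 = 1:5, x4 = 1:5,
                     x5 = 1:5, x6 = 1:5, x7 = 1:5)

# Store the selection unevaluated, so x1, x3, ... are not looked up yet.
myVars <- quote(c(x1, x3:x5, x7))

# Map each column name to its position, much as subset() does internally.
pos <- as.list(seq_along(mydata))
names(pos) <- names(mydata)

# Evaluating the stored expression in that list yields column indices.
idx <- eval(myVars, pos)
selected <- mydata[idx]

# Alternatively, splice the stored expression into a subset() call.
selected2 <- eval(bquote(subset(mydata, select = .(myVars))))
```

Either result can then be passed to summary(). Note that no attach() is
needed; with attach(), c(x1, x3:x5, x7) is evaluated on the column
values themselves, which is why that attempt does not give column names.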


On 8/26/07, Muenchen, Robert A (Bob) <muenchen at utk.edu> wrote:
> Hi All,
>
> I'm using the subset function to select a list of variables, some of
> which are contiguous in the data frame, and others of which are not. It
> works fine when I use the form:
>
> subset(mydata,select=c(x1,x3:x5,x7) )
>
> In reality, my list is far more complex. So I would like to store it in
> a variable to substitute in for c(x1,x3:x5,x7) but cannot get it to
> work. That use of the c function seems to violate R rules, so I'm not
> sure how it works at all. A small simulation of the problem is below.
>
> If the variable names & orders were really this simple, I could use
> indices like
>
> summary( mydata[ ,c(1,3:5,7) ] )
>
> but alas, they are not.
>
> How does the c function work this way in the first place, and how can I
> make this substitution?
>
> Thanks,
> Bob
>
> mydata <- data.frame(
>  x1=c(1,2,3,4,5),
>  x2=c(1,2,3,4,5),
>  x3=c(1,2,3,4,5),
>  x4=c(1,2,3,4,5),
>  x5=c(1,2,3,4,5),
>  x6=c(1,2,3,4,5),
>  x7=c(1,2,3,4,5)
> )
> mydata
>
> # This does what I want.
> summary(
>  subset(mydata,select=c(x1,x3:x5,x7) )
> )
>
> # Can I substitute myVars?
> attach(mydata)
> myVars1 <- c(x1,x3:x5,x7)
>
> # Not looking good!
> myVars1
>
> # This doesn't do the right thing.
> summary(
>  subset(mydata,select=myVars1 )
> )
>
> # Total desperation on this attempt:
> myVars2 <- "x1,x3:x5,x7"
> myVars2
>
> # This doesn't work either.
> summary(
>  subset(mydata,select=myVars2 )
> )
>
>
>
> =========================================================
> Bob Muenchen (pronounced Min'-chen), Manager
> Statistical Consulting Center
> U of TN Office of Information Technology
> 200 Stokely Management Center, Knoxville, TN 37996-0520
> Voice: (865) 974-5230
> FAX: (865) 974-4810
> Email: muenchen at utk.edu
> Web: http://oit.utk.edu/scc,
> News: http://listserv.utk.edu/archives/statnews.html
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


