[R] Listing Variables

Marc Schwartz (via MN) mschwartz at mn.rr.com
Wed May 3 17:18:16 CEST 2006


On Wed, 2006-05-03 at 10:46 -0400, Farrel Buchinsky wrote:
> How does one create a vector whose contents are the names of the variables
> in a dataframe that match a particular pattern?
> This is so simple but I cannot find a straightforward answer.
> I want to be able to pass the contents of that list to a "for" loop.
> 
> So let us assume that one has a dataframe whose name is Data. And let us
> assume one had the height of a group of people measured at various ages.
> 
> It could be made up of vectors Data$PersonalID, Data$FirstName,
> Data$LastName, Data$Height.1, Data$Height.5, Data$Height.9,
> Data$Height.10, Data$Height.12, Data$Height.20 ... many many more variables.
> 
> How would one create a vector of all the Height variable names?
> 
> The simple workaround is to not bother creating the vector "Data$Height.1"
> "Data$Height.5" "Data$Height.9" "Data$Height.10" "Data$Height.12"
> "Data$Height.20"... but rather just to use the sapply function. However,
> with some functions sapply will not work, and it is necessary to supply
> each variable name to the function (see the thread "Repeating tdt function
> on thousands of variables").
> 
> 
> This is such a core capability. I would like to see it in the R-Wiki but 
> could not find it there.


I may be misunderstanding what you want to do, but to simply get the
names of the columns in Data that contain "Height", you can do this:

> grep("Height", names(Data), value = TRUE)
[1] "Height.1"  "Height.5"  "Height.9"  "Height.10" "Height.12"
[6] "Height.20"
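
Note that the pattern above matches "Height" anywhere in a column name.
As a minimal sketch, assuming a small made-up data frame (the values and
the MaxHeight column are invented purely for illustration), the pattern
can be anchored so that only names of the form "Height.<number>" come
back:

```r
# Hypothetical toy data frame; the values are invented for illustration only.
Data <- data.frame(PersonalID = 1:3,
                   FirstName  = c("A", "B", "C"),
                   Height.1   = c(75, 80, 78),
                   Height.5   = c(110, 112, 108),
                   MaxHeight  = c(180, 175, 170))

# Unanchored: matches "Height" anywhere in the name, so MaxHeight is included.
loose <- grep("Height", names(Data), value = TRUE)

# Anchored: only names that start with "Height." followed by digits.
strict <- grep("^Height\\.[0-9]+$", names(Data), value = TRUE)
```

Whether the anchored form is needed depends on what other column names
are present in your own data.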


Now you could use something like the following:

  for (i in grep("Height", names(Data), value = TRUE))
    YourFunctionHere(Data[[i]])

If it makes for easier reading, you could first assign the matching
column names to a vector and then use that vector in the for() loop,
rather than calling grep() in the loop header.
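
A rough sketch of that two-step version might look like the following.
The toy data frame and the name height.vars are made up for
illustration, and mean() merely stands in for YourFunctionHere():

```r
# Hypothetical toy data frame; the values are invented for illustration only.
Data <- data.frame(PersonalID = 1:3,
                   Height.1   = c(75, 80, 78),
                   Height.5   = c(110, 112, 108))

# Store the matching column names once, then reuse them.
height.vars <- grep("Height", names(Data), value = TRUE)

# Preallocate a named result vector; mean() stands in for YourFunctionHere().
results <- numeric(length(height.vars))
names(results) <- height.vars

for (i in height.vars) {
  # Data[[i]] extracts the column whose name is stored in i.
  results[i] <- mean(Data[[i]])
}
```

Since Data[height.vars] is itself a data frame holding only the matching
columns, sapply(Data[height.vars], mean) should give the same values when
the function in question works with sapply().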

HTH,

Marc Schwartz



