[R] Problem subsetting: undefined columns

R. Michael Weylandt michael.weylandt at gmail.com
Fri Dec 2 18:30:05 CET 2011


How about this?

d[, v[v %in% colnames(d)]]
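
A minimal sketch of that one-liner on the example data from the original post, in case it helps. Two things here are my assumptions rather than part of the thread: v is written directly as a plain character vector, and the helper name keep is only for illustration. I have also added drop = FALSE, because when only one name matches, d[, ...] otherwise returns a bare vector instead of a data frame.

```r
# Rebuild the example data frame from the original post
d <- data.frame(
  yyyymmdd   = c("19720601", "19720602", "19720605"),
  sret.10006 = c(1, 2, 3),
  sret.10014 = c(5, 9, 7),
  sret.10065 = c(10, 2, 11)
)

# Assumption: v is a plain character vector of the wanted column names
v <- c("sret.10006", "sret.10090")

# Keep only the names that really are columns of d
# ("keep" is a hypothetical helper name)
keep <- v[v %in% colnames(d)]

# drop = FALSE keeps the result a data frame even if only one column matches
sub <- d[, keep, drop = FALSE]
```

The earlier attempt, subset(d, colnames(d) %in% v), most likely went wrong because the second positional argument of subset() is the row filter, not the column selection; the filtered names would have to go in select= instead.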

Michael

On Dec 2, 2011, at 12:01 PM, Aurélien PHILIPPOT <aurelien.philippot at gmail.com> wrote:

> Hi Paul and Jim,
> Thanks for your messages.
> 
> I just wanted R to give me the columns of my data frame d, whose names
> appear in v. I do not care about the names of v that are not in d. In
> addition, every time, there will be at least one element of v that has a
> corresponding column in d, for sure, so I know there is at least one match
> between the 2.
> 
> Initially, I tried something in this spirit:
> sub<- subset(d, colnames(d) %in% v)
> 
> but I could not make it work properly.
> 
> 
> Best,
> Aurelien
> 
> 2011/12/2 Paul Hiemstra <paul.hiemstra at knmi.nl>
> 
>> On 12/02/2011 07:20 AM, Aurélien PHILIPPOT wrote:
>>> Dear R-users,
>>> -I am new to R, and I am struggling with the following problem.
>>> 
>>> -I am repeating the following  operations hundreds of times, within a
>> loop:
>>> I want to subset a data frame by columns. I am interested in the columns
>>> names that are given by the rows of another data frame that was built in
>>> parallel. The solution I have so far works well as long as the elements
>> of
>>> the second data frame are included in the column names of the first data
>>> frame but if an element from the second object is not a column name of
>> the
>>> first one, then it bugs.
>> 
>> Hi Aurelien,
>> 
>> I would call this a feature, not a bug. I think R does what it should
>> do: you request a non-existent column and it throws an error. What kind
>> of behavior are you looking for instead of this error?
>> 
>> regards,
>> Paul
>> 
>>> 
>>> -More concretely, I have the following data frames d and v:
>>> yyyymmdd<-c("19720601", "19720602", "19720605")
>>> sret.10006<-c(1,2,3)
>>> sret.10014<-c(5,9,7)
>>> sret.10065<-c(10,2,11)
>>> 
>>> 
>>> d<- data.frame(yyyymmdd=yyyymmdd, sret.10006=sret.10006,
>>> sret.10014=sret.10014, sret.10065=sret.10065)
>>> 
>>> v<- data.frame(V1="sret.10006", V2="sret.10090")
>>> v<- sapply(v, function(x) levels(x)[x])
>>> 
>>> -I want to do the following subsetting:
>>> sub<- subset(d, select=c(v))
>>> 
>>> 
>>> and I get the following error message:
>>> Error in `[.data.frame`(x, r, vars, drop = drop) :
>>>  undefined columns selected
>>> 
>>> 
>>> 
>>> Any help would be very much appreciated,
>>> 
>>> Best,
>>> Aurelien
>>> 
>>>      [[alternative HTML version deleted]]
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> 
>> --
>> Paul Hiemstra, Ph.D.
>> Global Climate Division
>> Royal Netherlands Meteorological Institute (KNMI)
>> Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
>> P.O. Box 201 | 3730 AE | De Bilt
>> tel: +31 30 2206 494
>> 
>> http://intamap.geo.uu.nl/~paul
>> http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
>> 
>> 


