[R] subsetting data-frame by vector of characters
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Fri Jun 13 16:51:21 CEST 2008
james perkins wrote:
> Thanks a lot for that. It's the %in% I needed to work out mainly
>
> large didn't mean anything in particular, just that it gets quite long
> with the real data.
> I did mean: names = c("John", "Phil", "Robert")
>
> The only problem with the method you suggest is that I lose
> the indexing, i.e. in the example, instead of:
>
> (index) Name   Fave.Number
> 1       John   7
> 2       Phil   14
> 3       Robert 23
>
>
> I end up with
>
>
> (index) Name   Fave.Number
> 1       John   7
> 3       Phil   14
> 5       Robert 23
>
> This isn't a problem at the moment but I guess it could be if I used
> the table later in loops. Is there an easy way to re-index the table?
>
Notice that these are row names, not numbers: result[2,1] is "Phil" in both
cases. If it bothers you, just set

rownames(result) <- NULL

BTW, are your names unique? In that case you could set them as row names
and use them for indexing:

rownames(names.and.numbers) <- names.and.numbers$Name
names.and.numbers[names, ]
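
To make the row-name behaviour concrete, here is a minimal sketch. It assumes the data frame from the quoted example is rebuilt by hand (the `data.frame()` call is a reconstruction, not the poster's actual code) and that `names` is the three-element character vector. The objects `result`, `by.name`, `picked`, `one.string` and `split.names` are illustrative names only.

```r
# Example data, reconstructed from the table quoted in this thread.
names.and.numbers <- data.frame(
  Name        = c("John", "Tony", "Phil", "Adam", "Robert"),
  Fave.Number = c(7, 12, 14, 22, 23),
  stringsAsFactors = FALSE
)
names <- c("John", "Phil", "Robert")

# Subsetting with %in% keeps the row names of the rows that were selected.
result <- names.and.numbers[names.and.numbers$Name %in% names, ]
rownames(result)   # "1" "3" "5"
result[2, 1]       # "Phil": positional indexing does not use the row names

# Dropping the row names renumbers the rows 1..n.
rownames(result) <- NULL
rownames(result)   # "1" "2" "3"

# Alternative when the names are unique: use them as row names and
# index by name. Rows come back in the order given in `names`.
by.name <- names.and.numbers
rownames(by.name) <- by.name$Name
picked <- by.name[names, ]

# If the names arrive as one comma-separated string, strsplit() returns
# a list, so take its first element to get a character vector.
one.string  <- "John,Phil,Robert"
split.names <- strsplit(one.string, ",")[[1]]
```

Indexing by row name fails quietly for a name that is not present (you get a row of NAs), so the %in% form may be the safer choice when the lookup vector is not guaranteed to match.
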
> Kind regards
>
> Jim
>
> Wacek Kusnierczyk wrote:
>> james perkins wrote:
>>
>>> Hi,
>>>
>>> I have a very simple problem but I can't think how to solve it without
>>> using a for loop and creating a large logical vector. However given
>>> the nature of the problem I am sure there is a "1-liner" that could do
>>> the same thing much more efficiently.
>>>
>>> Basically I have a data frame with character data in it, e.g.
>>>
>>>
>>>> names.and.numbers
>>>>
>>> (index) Name   Fave.Number
>>> 1       John   7
>>> 2       Tony   12
>>> 3       Phil   14
>>> 4       Adam   22
>>> 5       Robert 23
>>>
>>>
>>> Now, imagine I have a vector of names, ie:
>>>
>>>
>>>> names = c("John,Phil,Robert")
>>>>
>>
>> this is a one-element vector whose single string contains the
>> comma-separated names.
>> or do you mean: names = c("John", "Phil", "Robert")
>>
>>
>>
>>> All I want to do is get the subset of the data frame which corresponds
>>> to the names in the vector "names", i.e.
>>>
>>> (index) Name   Fave.Number
>>> 1       John   7
>>> 2       Phil   14
>>> 3       Robert 23
>>>
>>
>> this should do:
>> names.and.numbers[names.and.numbers$Name %in% names,]
>>
>> if names is as you say above, do
>> names.and.numbers[names.and.numbers$Name %in% strsplit(names,",")[[1]], ]
>>
>> you do create a logical vector here (what does 'large' mean?), but no
>> loop is involved at the surface.
>>
>> vQ
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907