[R] choosing multiple columns
David Winsemius
dwinsemius at comcast.net
Sun Aug 12 19:37:15 CEST 2012
On Aug 11, 2012, at 6:01 AM, Ista Zahn wrote:
> On Sat, Aug 11, 2012 at 8:51 AM, Sachinthaka Abeywardana
> <sachin.abeywardana at gmail.com> wrote:
>> I should have mentioned that I do not know the number index of the
>> columns,
>> but regardless, thanks for the responses
>
> Right, so use my first method. This does not depend on the position of
> the columns.
I would counsel greater consideration of the possible range of the
column names. Even a variation on Ista Zahn's method intended to
deliver the first 8 will fail if there are more than 10 possible
values or if the numbers do not start from 1.
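To make the failure concrete, here is a small sketch. The vectors
from_one and from_three are made-up column names for illustration, not
the poster's data, and the [1-8] pattern is only one possible variation
on the regex approach:

```r
# Hypothetical column names: 12 numbered from 1, and 12 numbered from 3.
from_one   <- paste0("OFB", 1:12)
from_three <- paste0("OFB", 3:14)

# The unanchored pattern tests only the first digit, so "OFB10", "OFB11"
# and "OFB12" match as well.
loose <- grep("^OFB[1-8]", from_one, value = TRUE)
print(loose)

# Truncating to eight hits only helps when the numbering starts at 1;
# here "OFB9" is skipped and "OFB10", "OFB11" are picked up instead.
shifted <- grep("^OFB[1-8]", from_three, value = TRUE)[1:8]
print(shifted)

# Anchoring the end of the name restricts the match to OFB1 .. OFB8.
strict <- grep("^OFB[1-8]$", from_one, value = TRUE)
print(strict)
```

The anchored pattern is only appropriate when the wanted columns really
are numbered 1 through 8.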
If the numbers of the columns do start from 1, you could try this:
grep("^OFB[1-8]", paste0("OFB", 1:100) , value=TRUE )[1:8]
Otherwise consider these efforts:
> set.seed(123); test <- sample( paste0("OFB", 1:100), 20)
> sort(test)[1:8]
[1] "OFB21" "OFB27" "OFB29" "OFB4" "OFB41" "OFB42" "OFB5" "OFB50"
> grep("^OFB[1-8]", test , value=TRUE )[1:8]
[1] "OFB29" "OFB79" "OFB41" "OFB86" "OFB5" "OFB50" "OFB83" "OFB51"
Note that even this does not get what you want, which is:
> test[order(as.numeric( sub("OFB", "", test)))][1:8]
[1] "OFB4" "OFB5" "OFB9" "OFB21" "OFB27" "OFB29" "OFB41" "OFB42"
There is also a function named mixedsort in Greg Warnes' package gtools
which automatically splits the alpha and numeric components of an
alphanumeric vector and then orders by the two of them separately.
Something like this might achieve a similar result:
> test[ order( sub("[0-9]+","", test),  # an alpha sort, followed by a numeric sort
+     as.numeric(gsub("[[:alpha:]]*([[:digit:]]*)", '\\1', test) ) )]
 [1] "OFB4"  "OFB5"  "OFB9"  "OFB21" "OFB27" "OFB29" "OFB41" "OFB42" "OFB50" "OFB51" "OFB60" "OFB77" "OFB78"
[14] "OFB79" "OFB83" "OFB86" "OFB87" "OFB91" "OFB94" "OFB98"
gtools::mixedsort is based on gtools::mixedorder and is more
sophisticated; for instance, it attempts to identify spaces and
delimiters.
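Applied to the original question, the numeric ordering above could be
wrapped up roughly as follows. This is only a sketch in base R: the
helper name first_n_numbered and the toy initial_data are invented for
illustration, and it assumes the prefix contains no regex
metacharacters.

```r
# Hypothetical data frame whose OFB columns are out of order and mixed
# with an unrelated column.
initial_data <- data.frame(OFB10 = 1, id = 2, OFB2 = 3, OFB1 = 4, OFB21 = 5)

# Return the n lowest-numbered columns whose names are prefix + digits.
first_n_numbered <- function(df, prefix = "OFB", n = 8) {
  # Keep only names of the form <prefix><digits>.
  hits <- grep(paste0("^", prefix, "[0-9]+$"), names(df), value = TRUE)
  # Strip the prefix and order by the numeric part, not alphabetically.
  nums <- as.numeric(sub(prefix, "", hits, fixed = TRUE))
  df[head(hits[order(nums)], n)]
}

a <- first_n_numbered(initial_data, n = 3)
print(names(a))
```

Because the columns are chosen by the number in the name, neither their
position in the data frame nor the starting number matters.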
--
David.
>
> Best,
> Ista
>
>>
>>
>> On Sat, Aug 11, 2012 at 10:46 PM, Ista Zahn <istazahn at gmail.com>
>> wrote:
>>>
>>> Hi Sachin,
>>>
>>> There are at least two ways. The safer way is to use a regular
>>> expression to find the matching columns, like this:
>>>
>>> a <- initial_data[grep("^OFB[0-9]+", names(initial_data))]
>>>
>>> Alternatively, if you know that the columns you want are the first 8
>>> you can select them by position, like this:
>>>
>>> a <- initial_data[1:8]
>>>
>>> Best,
>>> Ista
>>>
>>> On Sat, Aug 11, 2012 at 7:59 AM, Sachinthaka Abeywardana
>>> <sachin.abeywardana at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I have a data frame that has the columns OFB1, OFB2, OFB3,...
>>>> OFB10.
>>>>
>>>> How do I select the first 8 columns efficiently without typing
>>>> each and
>>>> every one of them. i.e. I want something like:
>>>>
>>>> a<-data.frame(initial_data$OFB1-10) #i know this is wrong, what
>>>> would be
>>>> the correct syntax?
>>>>
>>>> Thanks,
>>>> Sachin
David Winsemius, MD
Alameda, CA, USA