[R] Control the variable order after multiple declarations using within

Sebastien Bihorel sebastien.bihorel using cognigencorp.com
Thu Jul 4 12:14:59 CEST 2019


Thanks, all, for your input.

----- Original Message -----
From: "Duncan Murdoch" <murdoch.duncan using gmail.com>
To: "Jeff Newmiller" <jdnewmil using dcn.davis.ca.us>, r-help using r-project.org, "Eric Berger" <ericjberger using gmail.com>, "Richard O'Keefe" <raoknz using gmail.com>
Cc: "Sebastien Bihorel" <sebastien.bihorel using cognigencorp.com>
Sent: Wednesday, July 3, 2019 12:52:55 PM
Subject: Re: [R] Control the variable order after multiple declarations using within

On 03/07/2019 12:42 p.m., Jeff Newmiller wrote:
> Dummy columns do have some drawbacks though, if you find yourself working with large data frames. The dummy columns waste memory and time as compared to either reorganizing columns after the `within` or using separate sequential `with` expressions as I previously suggested. I think mutate avoids this overhead also.
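
For concreteness, the two alternatives compared above might look like the following minimal sketch. It assumes that the "sequential" approach means one within() call per new column, and it uses NA as an arbitrary dummy value; neither detail is spelled out in the quoted message, and the names dummy and seq_res are only illustrative.

```r
df <- data.frame(a = 1)

# Dummy-column approach: create placeholders in the desired order first,
# then fill them in whatever order the calculation requires.
dummy <- df
dummy$b <- NA   # NA is an arbitrary placeholder (assumption)
dummy$c <- NA
dummy <- within(dummy, {b <- a * 2; c <- b * 3})

# Sequential approach (assumed meaning): one within() call per new column,
# so each new column is appended after the columns that already exist.
seq_res <- within(df, b <- a * 2)
seq_res <- within(seq_res, c <- b * 3)

names(dummy)
names(seq_res)
```

Both variants keep the columns in the order a, b, c; the dummy variant pays for it with the extra placeholder assignments.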

I think mutate() has only a very small advantage over within().  Neither 
one of them is flexible about the order of columns in the final result. 
In the OP's example, mutate creates the variables in the desired order,
but it would be no better if the desired order had been a, c, b, because 
b is needed for the calculation of c, so it would be created first.

Eric's suggestion

within(df, {b<-a*2; c<-b*3})[c("a","b","c")]

is the best so far, though I'd probably write it as

within(df, {b<-a*2; c<-b*3})[, c("a","b","c")]

just to avoid confusing my future self and make clear that I'm talking 
about specifying an order for the columns.
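
A minimal sketch of that reordering step, using the data frame from the original post. The new_cols vector and the ordered2 variant are hypothetical additions, shown only as one way to avoid retyping the existing column names.

```r
df <- data.frame(a = 1)

# within() leaves the new columns in the order a, c, b ...
res <- within(df, {b <- a * 2; c <- b * 3})

# ... so select the columns by name to impose the order explicitly.
ordered <- res[, c("a", "b", "c")]

# Hypothetical generalisation: original columns first, then the new ones
# in the order given in new_cols.
new_cols <- c("b", "c")
ordered2 <- res[, c(names(df), new_cols)]

names(ordered)
```

One difference between the two indexing forms: with a single column name, res[, "a"] drops to a plain vector unless drop = FALSE is given, whereas res["a"] always returns a data frame.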

And if you really, really want everything to happen within the call, 
just create the variables in the reverse order to what you want, e.g.

within(df, {c <- a; b<-a*2; c<-b*3})

but to me that is a lot less clear than Eric's solution.
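
Spelled out, the reverse-order trick might look like this sketch. It relies on the behaviour shown in the original post, namely that within() places new variables in the reverse of the order in which they were first created, so treat it as an illustration of that behaviour rather than a guarantee.

```r
df <- data.frame(a = 1)

res <- within(df, {
  c <- a       # placeholder: creates c first so that it ends up last
  b <- a * 2   # b is created second, so it lands before c
  c <- b * 3   # overwrite the placeholder with the real value
})

names(res)
```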

Duncan Murdoch


> 
> On July 3, 2019 8:25:32 AM PDT, Eric Berger <ericjberger using gmail.com> wrote:
>> Nice suggestion, Richard.
>>
>> On Wed, Jul 3, 2019 at 4:28 PM Richard O'Keefe <raoknz using gmail.com>
>> wrote:
>>
>>> Why not set all the new columns to dummy values to get the order you
>>> want and then set them to their final values in the order that works
>>> for that?
>>>
>>>
>>> On Thu, 4 Jul 2019 at 00:12, Kevin Thorpe <kevin.thorpe using utoronto.ca>
>>> wrote:
>>>
>>>>
>>>>> On Jul 3, 2019, at 3:15 AM, Sebastien Bihorel <sebastien.bihorel using cognigencorp.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> The within function can be used to modify data.frames (among other
>>>>> objects). One can even provide multiple expressions to modify the
>>>>> data.frame by more than one expression. However, when new variables are
>>>>> created, they seem to be inserted in the data.frame in the opposite order
>>>>> they were declared:
>>>>>
>>>>> > df <- data.frame(a=1)
>>>>> > within(df, {b<-a*2; c<-b*3})
>>>>>   a c b
>>>>> 1 1 6 2
>>>>>
>>>>> Is there a way to insert the variables in an order consistent with the
>>>>> order of declaration (ie, a, b, c)?
>>>>>
>>>>
>>>> One way is to use mutate() from the dplyr package.
>>>>
>>>>
>>>>> Thanks
>>>>>
>>>>> Sebastien
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>> --
>>>> Kevin E. Thorpe
>>>> Head of Biostatistics,  Applied Health Research Centre (AHRC)
>>>> Li Ka Shing Knowledge Institute of St. Michael's
>>>> Assistant Professor, Dalla Lana School of Public Health
>>>> University of Toronto
>>>> email: kevin.thorpe using utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016
>>>>
>>>
>>
>


