[Rd] oddity in transform
Gabor Grothendieck
ggrothendieck at gmail.com
Tue Jul 24 13:59:10 CEST 2018
The idea is that one wants to write the line of code below in a general
way that works the same whether ix specifies one column or multiple
columns. However, the naming changes entirely between the two cases, and
BOD[, 1], transform(BOD, X=..., Y=...), or other hard-coded solutions
still require writing multiple cases.
ix <- 1:2
transform(BOD, X = BOD[ix] * seq(6))
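One possible workaround, sketched here only as an illustration, is to name the new columns explicitly before binding them, so that the result does not depend on how data.frame treats one-column versus multi-column arguments. The helper name add_prefixed and its arguments are hypothetical and are not part of base R or of transform.

```r
# Hypothetical helper (not part of base R): appends df[ix] * mult to df,
# always naming the new columns "<prefix>.<original column name>".
add_prefixed <- function(df, ix, prefix = "X", mult = seq_len(nrow(df))) {
  new_cols <- df[ix] * mult   # df[ix] stays a data.frame for any length of ix
  names(new_cols) <- paste(prefix, names(new_cols), sep = ".")
  cbind(df, new_cols)         # names are already unique, so they are kept as-is
}

names(add_prefixed(BOD, 1))    # "Time" "demand" "X.Time"
names(add_prefixed(BOD, 1:2))  # "Time" "demand" "X.Time" "X.demand"
```

This sidesteps the naming rules rather than changing them, so the same call works for a single column or for several.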
On Tue, Jul 24, 2018 at 7:14 AM, Emil Bode <emil.bode using dans.knaw.nl> wrote:
> I think you meant to call BOD[,1]
> From ?transform, the ... arguments are supposed to be vectors, and BOD[1] is still a data.frame (with one column). So I don't think it's surprising transform gets confused by which name to use (X, or Time?), and kind of compromises on the name "Time". It's also in a note in ?transform: "If some of the values are not vectors of the appropriate length, you deserve whatever you get!"
> And if you want to do it with multiple extra columns (and are not satisfied with these labels), I think the proper way to go would be "transform(BOD, X=BOD[,1]*seq(6), Y=BOD[,2]*seq(6))".
>
> If you want to trace it back further, it's not in transform but in data.frame. Column-names are prepended with a higher-level name if the object has more than one column.
> And it uses the tag-name if simply supplied with a vector:
> data.frame(BOD[1:2], X=BOD[1]*seq(6)) takes the name of the only column of BOD[1], Time. Only because that column name is already present, it's changed to Time.1
> data.frame(BOD[1:2], X=BOD[,1]*seq(6)) gives third column-name X (as X is now a vector)
> data.frame(BOD[1:2], X=BOD[1:2]*seq(6)) or with BOD[,1:2] gives column names X.Time and X.demand, to show these (multiple) columns are coming from X
>
> So I don't think there's much to fix here. In this case, having X.Time in all cases would have been better, but in general the column-naming of data.frame works, and changing it would likely cause a lot of problems.
> You can always change the column-names later.
>
> Best regards,
> Emil Bode
>
> On 23/07/2018, 16:52, "R-devel on behalf of Gabor Grothendieck" <r-devel-bounces using r-project.org on behalf of ggrothendieck using gmail.com> wrote:
>
> Note the inconsistency in the names in these two examples. X.Time in
> the first case and Time.1 in the second case.
>
> > transform(BOD, X = BOD[1:2] * seq(6))
>   Time demand X.Time X.demand
> 1    1    8.3      1      8.3
> 2    2   10.3      4     20.6
> 3    3   19.0      9     57.0
> 4    4   16.0     16     64.0
> 5    5   15.6     25     78.0
> 6    7   19.8     42    118.8
>
> > transform(BOD, X = BOD[1] * seq(6))
>   Time demand Time.1
> 1    1    8.3      1
> 2    2   10.3      4
> 3    3   19.0      9
> 4    4   16.0     16
> 5    5   15.6     25
> 6    7   19.8     42
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>