[Rd] transform
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Sep 8 14:38:58 CEST 2024
I suggest looking at dplyr::mutate, where this functionality is widely
used and has proven useful.
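
For comparison, a minimal sketch of how dplyr::mutate treats data frame arguments. It assumes the dplyr package (version 1.0.0 or later) is installed; the object names extra, spliced and packed are made up for illustration.

```r
# Sketch only; assumes the dplyr package (>= 1.0.0) is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  extra <- data.frame(y = 1:6, z = 6:1)   # hypothetical example data

  # Untagged data frame: its columns are added under their own names.
  spliced <- dplyr::mutate(BOD, extra)
  print(names(spliced))

  # Tagged data frame: kept as a single data-frame column named x.
  packed <- dplyr::mutate(BOD, x = extra)
  print(names(packed))
}
```

In this scheme, leaving the argument untagged is how one asks for the data frame's own column names, which is the behaviour I expected from transform.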
On Tue, Aug 27, 2024 at 9:16 AM Sebastian Meyer <seb.meyer at fau.de> wrote:
>
> Am 27.08.24 um 11:55 schrieb peter dalgaard:
> > Yes. A quirk, rather than a bug I'd say. One issue is that the internal logic of transform() relies on
> >
> > e <- eval(substitute(list(...)), `_data`, parent.frame())
> > tags <- names(e)
> >
> > so untagged entries in ... will not be included.
>
> ... unless at least one is tagged:
>
> R> transform(BOD, 0:5, 1:6)
> Time demand
> 1 1 8.3
> 2 2 10.3
> 3 3 19.0
> 4 4 16.0
> 5 5 15.6
> 6 7 19.8
>
> R> transform(BOD, 0:5, 1:6, foo = 1)
> Time demand 0:5 1:6 foo
> 1 1 8.3 0 1 1
> 2 2 10.3 1 2 1
> 3 3 19.0 2 3 1
> 4 4 16.0 3 4 1
> 5 5 15.6 4 5 1
> 6 7 19.8 5 6 1
>
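
The difference between the two calls above seems to come down to what names() returns for the argument list. A small sketch using plain list() outside of transform(); the variable names untagged and mixed are only for illustration.

```r
# Sketch only: mirrors the names() step quoted above, outside transform().
untagged <- list(0:5, 1:6)            # no argument carries a tag
mixed    <- list(0:5, 1:6, foo = 1)   # one tagged argument

print(names(untagged))   # NULL: there are no tags at all
print(names(mixed))      # "" "" "foo": untagged entries get empty tags

# Consequence for transform(): the all-untagged call adds no columns,
# while a single tag is enough to carry the untagged vectors along.
print(names(transform(BOD, 0:5, 1:6)))
print(ncol(transform(BOD, 0:5, 1:6, foo = 1)))
```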
> But as transform.data.frame is only documented for tagged vector
> expressions, all examples provided in this thread were formal misuses.
> (It might make sense to warn about untagged entries.)
>
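
A rough sketch of what such a warning could look like, written as a wrapper rather than a change to transform.data.frame itself. The function name transform_checked is hypothetical and not part of base R.

```r
# Sketch only: transform_checked() is a hypothetical wrapper, not base R.
transform_checked <- function(`_data`, ...) {
  # Capture the unevaluated ... expressions together with their tags.
  dots <- as.list(substitute(list(...)))[-1L]
  tags <- names(dots)
  if (is.null(tags)) tags <- rep("", length(dots))

  untagged <- !nzchar(tags)
  if (any(untagged)) {
    warning(sum(untagged), " untagged argument(s) passed to transform()")
  }

  # Re-issue the original call in the caller's frame so that variables
  # are looked up exactly as in a direct transform() call.
  eval.parent(substitute(transform(`_data`, ...)))
}

# Tagged expression: no warning, behaves like transform().
print(names(transform_checked(BOD, ratio = demand / Time)))

# Untagged vectors: same result as transform(), plus a warning.
res <- transform_checked(BOD, 0:5, 1:6)
```

The wrapper leaves the result of transform() untouched and only reports the untagged entries; whether a warning or an error is more appropriate is a separate design question.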
> Personally, I'd be quite confused about what to expect from syntax like
>
> transform(BOD, data.frame(y = 1:6))
>
> as really no transformation is specified. Looks like cbind() or
> data.frame() was meant.
>
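
For reference, a minimal sketch of the cbind() and data.frame() alternatives mentioned above; the object name extra is only for illustration.

```r
# Sketch only: appending a data frame without going through transform().
extra <- data.frame(y = 1:6)   # hypothetical extra column

print(names(cbind(BOD, extra)))
print(names(data.frame(BOD, extra)))
```

Both calls take the untagged data frame and append its column under its own name.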
> Sebastian
>
>
> > The other part is a direct consequence of a quirk in data.frame:
> >
> >> data.frame(head(airquality), y=data.frame(x=rnorm(6)))
> > Ozone Solar.R Wind Temp Month Day x
> > 1 41 190 7.4 67 5 1 0.3075402
> > 2 36 118 8.0 72 5 2 0.7765265
> > 3 12 149 12.6 74 5 3 0.3909341
> > 4 18 313 11.5 62 5 4 0.4733170
> > 5 NA NA 14.3 56 5 5 -0.6947709
> > 6 28 NA 14.9 66 5 6 0.1126040
> >
> > whereas (the wisdom of this escapes me)
> >
> >> data.frame(head(airquality), y=data.frame(x=rnorm(6),z=rnorm(6)))
> > Ozone Solar.R Wind Temp Month Day y.x y.z
> > 1 41 190 7.4 67 5 1 -0.9250228 0.46483406
> > 2 36 118 8.0 72 5 2 -0.5035793 0.28822668
> > ...
> >
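
A deterministic version of the two data.frame() calls above, without rnorm(), to make the naming difference easy to reproduce. The object names left, one and two are made up for this sketch.

```r
# Sketch only: the same naming quirk on small fixed data.
left <- data.frame(id = 1:3)
one  <- data.frame(x = c(10, 20, 30))
two  <- data.frame(x = c(10, 20, 30), z = c(1, 2, 3))

# One-column data frame: the tag y is dropped, the inner name x is kept.
print(names(data.frame(left, y = one)))

# Two-column data frame: the tag y becomes a prefix (y.x, y.z).
print(names(data.frame(left, y = two)))
```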
> > On the whole, I think that transform was never designed (nor documented) to take data frame arguments, so caveat emptor.
> >
> > - Peter
> >
> >
> >> On 24 Aug 2024, at 16:41 , Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> >>
> >> One oddity in transform that I recently noticed. It seems that to include
> >> a one-column data frame in the arguments one must name it even though the
> >> name is ignored. If the data frame has more than one column then it must
> >> also be named but in that case it is not ignored and the names are made up of
> >> a combination of that name and the data frame's names. I would have thought
> >> that if we did not want a combination of names we would just not name the
> >> argument.
> >>
> >> # ignores second argument returning BOD unchanged
> >> transform(BOD, data.frame(y = 1:6)) |> names()
> >> ## [1] "Time" "demand"
> >>
> >> # ignores second argument returning BOD unchanged
> >> transform(BOD, data.frame(y = 1:6, z = 6:1)) |> names()
> >> ## [1] "Time" "demand"
> >>
> >> # with one column in data frame it adds the column and names it y ignoring x
> >> transform(BOD, x = data.frame(y = 1:6)) |> names()
> >> ## [1] "Time" "demand" "y"
> >>
> >> # with multiple columns in data frame it uses x.y and x.z as names
> >> transform(BOD, x = data.frame(y = 1:6, z = 6:1)) |> names()
> >> ## [1] "Time" "demand" "x.y" "x.z"
> >>
> >>
> >> --
> >> Statistics & Software Consulting
> >> GKX Group, GKX Associates Inc.
> >> tel: 1-877-GKX-GROUP
> >> email: ggrothendieck at gmail.com
> >>
> >
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com