[R] is this a bug?
Brian Diggs
diggsb at ohsu.edu
Fri Jun 17 23:58:44 CEST 2011
On 6/17/2011 2:24 PM, (Ted Harding) wrote:
> And the extra twist in the tale is exemplified by this
> mini-version of Albert-Jan's first example:
> DF<- data.frame(A=c(1,2,3))
> DF$B<- c(4,5,6)
> DF$C<- c(7,8,9)
> DF
> #   A B C
> # 1 1 4 7
> # 2 2 5 8
> # 3 3 6 9
> DF$D<- DF["A"]/DF["B"]
> DF
> #   A B C    A
> # 1 1 4 7 0.25
> # 2 2 5 8 0.40
> # 3 3 6 9 0.50
> ##And why:
> DF["A"]/DF["B"]
> #      A
> # 1 0.25
> # 2 0.40
> # 3 0.50
> ##So the ratio DF["A"]/DF["B"] comes out with the name of
> ##the numerator, "A". This is then the name given to DF$D
It's even slightly weirder than that:
str(DF)
# 'data.frame': 3 obs. of 4 variables:
#  $ A: num 1 2 3
#  $ B: num 4 5 6
#  $ C: num 7 8 9
#  $ D:'data.frame': 3 obs. of 1 variable:
#   ..$ A: num 0.25 0.4 0.5
There is a column D in DF which is itself a data frame with a single
column whose name is A (because of what Ted said). When formatted for
printing out, the column name of the inner data frame is used (as a
result of how data.frame() itself handles named arguments when the
argument is itself a data.frame: "If a list or data frame or matrix is
passed to data.frame it is as if each component or column had been
passed as a separate argument...").
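A small sketch of that documented behaviour, in case it helps. The object names (inner, flat1, flat2) are made up for illustration; only base R is used, and I am assuming current data.frame() naming rules.

```r
# Hypothetical object names; base R only.
inner <- data.frame(A = c(0.25, 0.40, 0.50))

# A one-column data frame passed as a named argument: the inner column
# name is kept and the argument name D is dropped.
flat1 <- data.frame(D = inner)
names(flat1)

# With more than one inner column, the argument name is kept as a prefix.
flat2 <- data.frame(D = data.frame(A = 1:3, B = 4:6))
names(flat2)
```

This is the same renaming that shows up in the printed header of DF above.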
So not a bug, but a convoluted set of circumstances that can happen when
non-atomic vectors are assigned to columns of a data.frame. That's one
of those /you shouldn't do that even though it is technically legal or
at least you shouldn't be surprised when things don't work the way you
thought they would/ things.
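For what it is worth, here is a minimal sketch of the nested case next to one way of avoiding it, using Ted's small example. The names DF_nested and DF_plain are mine, and extracting with [[ ]] is only one of several ways to get a plain vector.

```r
# Ted's example data; DF_nested and DF_plain are hypothetical names.
DF <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6), C = c(7, 8, 9))

# Assigning a one-column data frame: column D is itself a data frame.
DF_nested <- DF
DF_nested$D <- DF["A"] / DF["B"]
class(DF_nested$D)   # the inner object is a data frame
names(DF_nested$D)   # and it still carries the name of the numerator

# Extracting the columns as vectors first: column D is plain numeric.
DF_plain <- DF
DF_plain$D <- DF[["A"]] / DF[["B"]]
class(DF_plain$D)
```

With the second form, printing DF_plain shows the header D, as one would expect.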
> Thus Albert-Jan's
> df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
> comes through with name "weight".
> Ted.
> On 17-Jun-11 21:06:42, William Dunlap wrote:
>> df$varname is a column of df.
>> df["varname"] is a one-column df containing that column.
>> df[["varname"]] is a column of df (same as df$varname).
>> df[,"varname"] is a column of df (same as df$varname).
>> df[,"varname",drop=FALSE] is a one-column df (same as df["varname"]).
>> df$newVarname<- df["varname"] inserts a new component
>> into df, the component being a one-column data.frame,
>> not the column in that data.frame.
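The distinctions in Bill's list can be checked directly. This is only an illustrative sketch: the data frame and the name cls are invented here, and class() is used as a rough stand-in for "column" versus "one-column data frame".

```r
# Hypothetical example data; base R only.
df <- data.frame(varname = c(10, 20, 30), other = c("a", "b", "c"))

# Collect the class returned by each extraction form.
cls <- c(
  dollar       = class(df$varname),
  single       = class(df["varname"]),
  double       = class(df[["varname"]]),
  matrix_style = class(df[, "varname"]),
  no_drop      = class(df[, "varname", drop = FALSE])
)
print(cls)
```

The forms that return a data frame are the ones that lead to a nested column when used on the right-hand side of df$newVarname<-.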
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org
>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Albert-Jan Roskam
>>> Sent: Friday, June 17, 2011 1:49 PM
>>> To: R Mailing List
>>> Subject: [R] is this a bug?
>>> Hello,
>>> Is the following a bug? I always thought that df$varname<-
>>> does the same as
>>> df["varname"]<-
>>>> df<- data.frame(weight=round(runif(10, 10, 100)),
>>>                  sex=round(runif(100, 0, 1)))
>>>> df$pct<- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
>>>> names(df)
>>> [1] "weight" "sex" "pct" ### ----------> ok
>>>> head(df)
>>>   weight sex    weight ### ----------> huh!?!
>>> 1     86   0 2.4002233
>>> 2     19   1 0.5643006
>>> 3     32   0 0.8931063
>>> 4     87   0 2.4281328
>>> 5     45   0 1.2559308
>>> 6     95   0 2.6514094
>>>> rm(df)
>>>> df<- data.frame(weight=round(runif(10, 10, 100)),
>>>                  sex=round(runif(100, 0, 1)))
>>>> df["pct"]<- df["weight"] / ave(df["weight"], df["sex"],
>>>      FUN=sum)*100 ### -----> this does work
>>>> names(df)
>>> [1] "weight" "sex" "pct"
>>>> head(df)
>>>   weight sex       pct
>>> 1     15   0 0.5246590
>>> 2     43   0 1.5040224
>>> 3     17   1 0.9284544
>>> 4     44   1 2.4030584
>>> 5     76   1 4.1507373
>>> 6     59   0 2.0636586
>>>> do.call(c, R.Version())
>>>                        platform                            arch
>>>             "i686-pc-linux-gnu"                          "i686"
>>>                              os                          system
>>>                     "linux-gnu"               "i686, linux-gnu"
>>>                          status                           major
>>>                              ""                             "2"
>>>                           minor                            year
>>>                          "11.1"                          "2010"
>>>                           month                             day
>>>                            "05"                            "31"
>>>                         svn rev                        language
>>>                         "52157"                             "R"
>>>                  version.string
>>> "R version 2.11.1 (2010-05-31)"
>>>> # Thanks!
>>> Cheers!!
>>> Albert-Jan
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> All right, but apart from the sanitation, the medicine,
>>> education, wine, public
>>> order, irrigation, roads, a fresh water system, and public
>>> health, what have the
>>> Romans ever done for us?
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> --------------------------------------------------------------------
> E-Mail: (Ted Harding)<ted.harding at wlandres.net>
> Fax-to-email: +44 (0)870 094 0861
> Date: 17-Jun-11 Time: 22:24:41
> ------------------------------ XFMail ------------------------------
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University