[R] Bug in colnames of data.frames?
Arne Henningsen
ahenningsen at email.uni-kiel.de
Tue Aug 17 18:57:19 CEST 2004
Thank you for all your answers!
I agree with you that it is not a bug. My mistake was to assume that a data
frame behaves like a matrix, whereas ?data.frame says that data frames
"... share many of the properties of matrices and of lists".
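To make the matrix/list difference concrete, here is a minimal sketch using the data from my original post. The helper names asList, asMatrix and sumVec are only for illustration, and I am assuming a reasonably current R version:

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)

# List-style indexing (no comma) returns a one-column data frame.
asList <- myData["var1"]
is.data.frame(asList)     # TRUE

# Matrix-style indexing (with a comma) drops to a plain vector.
asMatrix <- myData[, "var1"]
is.data.frame(asMatrix)   # FALSE

# [[ ]] extracts the column itself, so the sum is a plain vector
# that can safely be stored as a new column.
sumVec <- myData[["var1"]] + myData[["var2"]]
myData$var3 <- sumVec
```

So adding two one-column data frames (the version without commas) yields a data frame, and that is what ended up nested inside my column var4.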
So far I have never used the feature that a column of a data.frame can itself
be an object with a rectangular structure (a matrix or another data frame),
and I wonder whether this feature is more useful than confusing.
I think Peter's idea (see below) to show the structure of a data frame is very
appealing, e.g.:
> myData[,"var4"] <- cbind(xyzzy=5:2)
> myData
  var1 var2 var3  var4
                 xyzzy
1    1    5    6     5
2    2    6    8     4
3    3    7   10     3
4    4    8   12     2
I think the current presentation
> myData
  var1 var2 var3 xyzzy
1    1    5    6     5
2    2    6    8     4
3    3    7   10     3
4    4    8   12     2
is confusing because, without another command such as str(), it is not
apparent why myData[[ "var1" ]] works fine while myData[[ "xyzzy" ]] does
not.
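A minimal sketch of what I mean, assuming a reasonably current R version. Note that I use `$` assignment here rather than the myData[,"var4"] form from Peter's example, because I am not sure the latter keeps the matrix in every R version; the names inner and isRect are only for illustration:

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)
myData$var3 <- myData$var1 + myData$var2

# Assumption: `$<-` stores the one-column matrix as a single column.
myData$var4 <- cbind(xyzzy = 5:2)

names(myData)        # outer names, including "var4"
myData[["xyzzy"]]    # NULL, because "xyzzy" is not a column of myData

# The inner name only exists one level down.
inner <- myData[["var4"]]
colnames(inner)      # "xyzzy"
inner[, "xyzzy"]

# Flag columns that are themselves rectangular (two dimensions).
isRect <- vapply(myData, function(col) length(dim(col)) == 2L, logical(1))
```

Something like isRect at least tells me which columns need a second level of indexing without having to read the str() output.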
Best wishes,
Arne
On Tuesday 17 August 2004 16:24, Peter Dalgaard wrote:
> Arne Henningsen <ahenningsen at email.uni-kiel.de> writes:
> > Hi,
> >
> > I am using R 1.9.1 on an i686 PC with SuSE Linux 9.0.
> >
> > I have a data.frame, e.g.:
> > > myData <- data.frame( var1 = c( 1:4 ), var2 = c (5:8 ) )
> >
> > If I add a new column by
> >
> > > myData$var3 <- myData[ , "var1" ] + myData[ , "var2" ]
> >
> > everything is fine, but if I omit the commas:
> > > myData$var4 <- myData[ "var1" ] + myData[ "var2" ]
> >
> > the name shown above the 4th column is not "var4":
> > > myData
> >
> >   var1 var2 var3 var1
> > 1    1    5    6    6
> > 2    2    6    8    8
> > 3    3    7   10   10
> > 4    4    8   12   12
> >
> > but names() and colnames() return the expected name:
> > > names( myData )
> >
> > [1] "var1" "var2" "var3" "var4"
> >
> > > colnames( myData )
> >
> > [1] "var1" "var2" "var3" "var4"
> >
> > And it is even worse: I am not able to change the name shown above the
> > 4th column:
> > > names( myData )[ 4 ] <- "var5"
> > > myData
> >
> >   var1 var2 var3 var1
> > 1    1    5    6    6
> > 2    2    6    8    8
> > 3    3    7   10   10
> > 4    4    8   12   12
> >
> > I guess that this is a bug, isn't it?
>
> Nope:
> > str(myData)
>
> `data.frame': 4 obs. of 4 variables:
> $ var1: int 1 2 3 4
> $ var2: int 5 6 7 8
> $ var3: int 6 8 10 12
> $ var4:`data.frame': 4 obs. of 1 variable:
> ..$ var1: int 6 8 10 12
>
> It's slightly peculiar, but if a column of a data frame is itself a
> rectangular structure (data frame or matrix), then the "innermost" names
> are used. Cf.
>
> > myData[,"var4"] <- cbind(xyzzy=5:2)
> > myData
>
>   var1 var2 var3 xyzzy
> 1    1    5    6     5
> 2    2    6    8     4
> 3    3    7   10     3
> 4    4    8   12     2
>
>
> Arguably, one might prefer
>
>   var1 var2 var3  var4
>                  xyzzy
> 1    1    5    6     5
> 2    2    6    8     4
> 3    3    7   10     3
> 4    4    8   12     2
>
> or something like that, but it's hardly a bug.