[R] Bug in colnames of data.frames?

Arne Henningsen ahenningsen at email.uni-kiel.de
Tue Aug 17 18:57:19 CEST 2004


Thank you for all your answers!

I agree with you that it is not a bug. My mistake was to assume that a data 
frame behaves like a matrix, whereas ?data.frame says that data frames 
"... share many of the properties of matrices and of lists".
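
To make the difference concrete, here is a minimal sketch of the two indexing 
styles from my original post. The helper names a, b and s are only for 
illustration, and I assume that current R versions behave like the 1.9.1 
output quoted below:

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)

# Matrix-style indexing (with a comma) drops to a plain vector
a <- myData[, "var1"]

# List-style indexing (no comma) keeps a one-column data frame
b <- myData["var1"]

is.data.frame(a)  # FALSE
is.data.frame(b)  # TRUE

# Adding two one-column data frames yields another data frame,
# which keeps the column name of the first operand
s <- myData["var1"] + myData["var2"]
is.data.frame(s)  # TRUE
names(s)          # "var1"
```

So assigning s to myData$var4 stores a whole data frame inside one column, 
which is where the inner name "var1" comes from.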

So far I have never used the feature that a column of a data.frame can itself 
be an object with a rectangular structure (a matrix or another data frame), 
and I wonder whether this feature is more useful than confusing.

I think Peter's idea (see below) to show the structure of a data frame is very 
appealing, e.g.:

> myData[,"var4"] <- cbind(xyzzy=5:2)
> myData
  var1 var2 var3  var4
                 xyzzy
1    1    5    6     5
2    2    6    8     4
3    3    7   10     3
4    4    8   12     2

I think the current presentation
> myData
  var1 var2 var3 xyzzy
1    1    5    6     5
2    2    6    8     4
3    3    7   10     3
4    4    8   12     2

is confusing because, without another command such as str(), it is not 
apparent why myData[[ "var1" ]] works fine while myData[[ "xyzzy" ]] does 
not.
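
A small sketch of what I mean. Note one assumption: I use `$<-` here instead 
of the `[<-` assignment shown above, because I am only sure that `$<-` keeps 
the one-column matrix as a single column in current R versions:

```r
myData <- data.frame(var1 = 1:4, var2 = 5:8)
myData$var3 <- myData$var1 + myData$var2

# Assumption: `$<-` (not `[<-` as in the thread) stores the matrix as one column
myData$var4 <- cbind(xyzzy = 5:2)

# The top-level name is still "var4"
names(myData)

# The inner name is not a column of the data frame, so this lookup gives NULL
is.null(myData[["xyzzy"]])

# The inner name belongs to the matrix stored in column "var4"
colnames(myData[["var4"]])
myData[["var4"]][, "xyzzy"]
```

The values are reachable only by going through "var4" first, which the 
printed header does not reveal.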

Best wishes,
Arne

On Tuesday 17 August 2004 16:24, Peter Dalgaard wrote:
> Arne Henningsen <ahenningsen at email.uni-kiel.de> writes:
> > Hi,
> >
> > I am using R 1.9.1 on an i686 PC with SuSE Linux 9.0.
> >
> > I have a data.frame, e.g.:
> > > myData <- data.frame( var1 = c( 1:4 ), var2 = c (5:8 ) )
> >
> > If I add a new column by
> >
> > > myData$var3 <- myData[ , "var1" ] + myData[ , "var2" ]
> >
> > everything is fine, but if I omit the commas:
> > > myData$var4 <- myData[ "var1" ] + myData[ "var2" ]
> >
> > the name shown above the 4th column is not "var4":
> > > myData
> >
> >   var1 var2 var3 var1
> > 1    1    5    6    6
> > 2    2    6    8    8
> > 3    3    7   10   10
> > 4    4    8   12   12
> >
> > but names() and colnames() return the expected name:
> > > names( myData )
> >
> > [1] "var1" "var2" "var3" "var4"
> >
> > > colnames( myData )
> >
> > [1] "var1" "var2" "var3" "var4"
> >
> > And it is even worse: I am not able to change the name shown above the
> > 4th column:
> > > names( myData )[ 4 ] <- "var5"
> > > myData
> >
> >   var1 var2 var3 var1
> > 1    1    5    6    6
> > 2    2    6    8    8
> > 3    3    7   10   10
> > 4    4    8   12   12
> >
> > I guess that this is a bug, isn't it?
>
> Nope:
> > str(myData)
>
> `data.frame':   4 obs. of  4 variables:
>  $ var1: int  1 2 3 4
>  $ var2: int  5 6 7 8
>  $ var3: int  6 8 10 12
>  $ var4:`data.frame':   4 obs. of  1 variable:
>   ..$ var1: int  6 8 10 12
>
> It's slightly peculiar, but if a column of a data frame is itself a
> rectangular structure (data frame or matrix), then the "innermost" names
> are used. Cf.
>
> > myData[,"var4"] <- cbind(xyzzy=5:2)
> > myData
>
>   var1 var2 var3 xyzzy
> 1    1    5    6     5
> 2    2    6    8     4
> 3    3    7   10     3
> 4    4    8   12     2
>
>
> Arguably, one might prefer
>
>   var1 var2 var3  var4
>                  xyzzy
> 1    1    5    6     5
> 2    2    6    8     4
> 3    3    7   10     3
> 4    4    8   12     2
>
> or something like that, but it's hardly a bug.



