[R] confusion about what to expect?

Marc Schwartz MSchwartz at medanalytics.com
Wed Sep 24 02:02:07 CEST 2003


On Tue, 2003-09-23 at 18:08, A.J. Rossini wrote:
> In playing around with data.frames (and wanting a simple, cheap way to
> use the variable and case names in plots; but I've solved that with
> some hacks, yech), I noticed the following behavior with subsetting. 
> 
> 
> testdata <- data.frame(matrix(1:20,nrow=4,ncol=5))
> names(testdata) ## expect labels, get them
> names(testdata[2,]) ## expect labels, get them
> names(testdata[,2]) ## expect labels, but NOT --  STRIPPED OFF??
> testdata[,2]  ## would have expected a name (X2) in the front? NOT EXPECTED
> testdata[2,]  ## get what I expect
> testdata[2,2]  ## just a number, not a sub-data.frame? unexpected
> testdata[2,2:3] ## this is a data.frame
> testdata[2:3,2:3] ## and this is, too.
> 
> > version
>          _                
> platform i386-pc-linux-gnu
> arch     i386             
> os       linux-gnu        
> system   i386, linux-gnu  
> status   alpha            
> major    1                
> minor    8.0              
> year     2003             
> month    09               
> day      20               
> language R                
> > 
> 
> I don't have 1.7.1 handy at this location to test, but I would've
> expected a data.frame-like object upon subsetting; should I have
> expected otherwise?  (granted, a data.frame with just a single
> variable could be thought of as silly, but it does have some extra
> information that might be worthwhile, on occasion?)
> 
> I'm not sure that it is a bug, but I was caught by surprise.  If it
> isn't a bug, and someone has a concise way to think through this, for
> my future reference, I'd appreciate hearing about it.
> 
> best,
> -tony


Tony,

A quick review of what is returned when you subset the data.frame
testdata:

> str(testdata[,2])
 int [1:4] 5 6 7 8

> str(testdata[2,])
`data.frame':   1 obs. of  5 variables:
 $ X1: int 2
 $ X2: int 6
 $ X3: int 10
 $ X4: int 14
 $ X5: int 18

> dim(testdata[,2])
NULL

> dim(testdata[2,])
[1] 1 5


Quoting from ?Extract:

"When [.data.frame is used for subsetting rows of a data.frame, it
returns a data frame with unique (and non-missing) row names, if
necessary transforming the names using make.names( * , unique = TRUE)"

What is unstated there, but covered by R FAQ 7.7 ("Why do my matrices
lose dimensions?"), is that a single column resulting from subsetting a
data.frame is by default turned into a plain vector. Hence, no names.
The same default is why testdata[2,2] comes back as just a number.
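
If you want to keep the data.frame (and with it the column name), the
usual way around this default is the drop argument of "[". A minimal
sketch, reusing your testdata; the names col_vec, col_df and cell_df
are only for illustration:

```r
testdata <- data.frame(matrix(1:20, nrow = 4, ncol = 5))

# Default (drop = TRUE): a single column collapses to a plain vector
col_vec <- testdata[, 2]
is.data.frame(col_vec)   # FALSE
names(col_vec)           # NULL

# drop = FALSE keeps the data.frame structure, so the name survives
col_df <- testdata[, 2, drop = FALSE]
is.data.frame(col_df)    # TRUE
names(col_df)            # "X2"

# The same applies to a single cell
cell_df <- testdata[2, 2, drop = FALSE]
dim(cell_df)             # 1 1

# List-style indexing with a single index also returns a data.frame
names(testdata[2])       # "X2"
```

Whether carrying a one-column data.frame around is worth it depends on
what you do with the result; for plot labels it may be all you need.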

HTH,

Marc Schwartz



