[R] subset of matrix vs data frame
Denis White
denis at mail.cor.epa.gov
Thu Jun 1 21:09:44 CEST 2000
On Wed, 31 May 2000, Peter Dalgaard BSA wrote:
> Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:
>
> > On Wed, 31 May 2000, Denis White wrote:
> >
> > > In Splus this works,
> > >
> > > > a <- data.frame(matrix(round(runif(10),0),nrow=2,ncol=5))
> > > > a[a == 1] <- 2
> > >
> > > but in R only if a is a matrix,
> > >
> > > > a[a == 1] <- 2
> > > Error in [<-.data.frame(*tmp*, a == 1, value = 2) :
> > > matrix subscripts not allowed in replacement
> > >
> > > Was this a design decision? Sorry if I missed it in
> > > An Introduction to R.
> >
> > As I understand it, S-PLUS differs from the original S here, so this
> > was an S-PLUS design decision. S-PLUS says:
> >
> > else if(nargs() == 3) {
> > # really ambiguous, but follow common use as if list,
> > # except when one subscript is a logical matrix the shape of x, then treat
> > # as if x were a matrix.
>
> Semantically it is a rather strange thing to do, since elements
> in different columns of a data frame can be of different types.
> And some really weird stuff *does* happen in Splus 3.4:
>
> > a<-data.frame(a=1:10,b=factor(1:10),c=I(as.character(1:10)))
> > a[a==5]<-"x"
> Warning messages:
> replacement values not all in levels(x): NA's generated in:
>    "[<-.factor"(.A0, i[, k, drop = T], value = .A1)
> > a
>     a  b  c
>  1  1  1  1
>  2  2  2  2
>  3  3  3  3
>  4  4  4  4
>  5  x NA  x
>  6  6  6  6
>  7  7  7  7
>  8  8  8  8
>  9  9  9  9
> 10 10 10 10
> > a[a==4]<-"y"
> Warning messages:
> 1: Data length is not an even multiple of group length in:
>    split(Value, factor(col(i)[i], levels = seq(len = ncol(i))))
> 2: replacement values not all in levels(x): NA's generated in:
>    "[<-.factor"(.A0, i[, k, drop = T], value = .A1)
> 3: replacement values not all in levels(x): NA's generated in:
>    "[<-.factor"(.A0, i[, k, drop = T], value = .A1)
> > a
>     a  b  c
>  1  1  1  1
>  2  2  2  2
>  3  3  3  3
>  4 NA NA  4
>  5  x NA  x
>  6  6  6  6
>  7  7  7  7
>  8  8  8  8
>  9  9  9  9
> 10 10 10 10
>
Your argument is well taken.

I'm preparing a package for R with accompanying data. As I
understand data(), either of the transportable formats (.tab, .csv)
results in a data frame. One of my data objects is probably best
modeled as a matrix, but the R applications I'm using (clustering)
are flexible, so should I just go ahead with this object as a data
frame?
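
To make the question concrete, here is a minimal sketch of moving between the two representations. It assumes the object is entirely numeric; the inline text and the names csv_text, df, m and d are illustrative stand-ins, not the actual package data.

```r
# Hypothetical data standing in for a package's .csv file.
csv_text <- "x1,x2,x3
1,0,1
0,1,1"

# read.csv() returns a data frame, whatever the contents.
df <- read.csv(text = csv_text)

# With all-numeric columns, as.matrix() gives a numeric matrix.
m <- as.matrix(df)

# dist() coerces its input with as.matrix(), so either form works as
# input to distance-based clustering.
d <- dist(m)
```

If any column were a factor or character vector, as.matrix() would return a character matrix instead, so the conversion is only safe when every column has the same type.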

On page 64 of the white book, John Chambers says "Data frames
can be treated as matrices in calls to most of the basic functions
treating arrays: subsets and elements, dim(), ..." Perhaps he
had not contemplated the semantic problems?
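
For the replacement in the original example, two ways around a matrix subscript on a data frame can be sketched as follows. This assumes every column is numeric; the seed and the names m, a_via_matrix and a_by_column are illustrative only.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
a <- data.frame(matrix(round(runif(10), 0), nrow = 2, ncol = 5))

# Option 1: convert to a matrix, use the logical-matrix subscript there,
# then convert back. Only sensible when all columns share one type.
m <- as.matrix(a)
m[m == 1] <- 2
a_via_matrix <- as.data.frame(m)

# Option 2: work column by column, so each column keeps its own type
# and no matrix subscript is applied to the data frame itself.
a_by_column <- a
for (j in seq_along(a_by_column)) {
  col <- a_by_column[[j]]
  hit <- !is.na(col) & col == 1
  a_by_column[[j]][hit] <- 2
}
```

The column-wise form extends more naturally to mixed-type data frames, since the test and the replacement value can be chosen per column.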