[Rd] Inconsistencies in subassignment (PR#7210)

tlumley at u.washington.edu
Sat Sep 4 01:47:28 CEST 2004


I have made the 3-d case do the same as the vector case, which is what the
C code clearly intended (a goto label was in the wrong place).
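
As a rough illustration of the two results reported in the session quoted
below, the following sketch mimics each rule with a small helper. Both
function names are hypothetical and are not part of R; they only index with
non-NA subscripts, so they do not depend on how any particular R version
treats NA in a subscripted assignment.

```r
# Hypothetical helpers (not part of R); they only mimic the two results
# shown in the quoted session.

# Vector-case rule: values are lined up with index positions first, then
# positions whose index is NA are dropped, so their values are skipped.
assign_skip_values <- function(x, idx, value) {
  ok <- !is.na(idx)
  value <- rep_len(value, length(idx))
  x[idx[ok]] <- value[ok]
  x
}

# Matrix-case rule as reported: NA indices are dropped first, so they
# do not use up any values.
assign_keep_values <- function(x, idx, value) {
  ok <- !is.na(idx)
  x[idx[ok]] <- value[seq_len(sum(ok))]
  x
}

idx <- c(1, NA, NA, 4, 5, 6)
B <- c(10, 20, 30, 40, 50, 60)
A <- rep(NA_real_, 6)

res_vector <- assign_skip_values(A, idx, B)  # 10 NA NA 40 50 60
res_matrix <- assign_keep_values(A, idx, B)  # 10 NA NA 20 30 40
```

The first result matches the A[idx] <- B output in the quoted session and
the second matches the AA[idx,1] <- B output.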

This leaves the bigger question of the right thing to do. I note that data
frames give an error when any indices are NA.
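
A minimal sketch of the data frame case, assuming a reasonably current R
(the exact wording of the error message may differ between versions). The
variable names are chosen here to mirror the quoted session.

```r
# Same index and values as in the quoted session, applied to a data frame.
df <- data.frame(x = rep(NA_real_, 6))
idx <- c(1, NA, NA, 4, 5, 6)
B <- c(10, 20, 30, 40, 50, 60)

# Capture the outcome instead of stopping the script.
res <- tryCatch({
  df[idx, 1] <- B
  "no error"
}, error = function(e) conditionMessage(e))

print(res)  # expected to be an error message rather than "no error"
```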

	-thomas

On Fri, 3 Sep 2004 ripley at stats.ox.ac.uk wrote:

> Apart from the inconsistencies, there are two clear bugs here:
>
> 1) miscalculating the number of values needed, in the matrix case.  E.g.
>
> > AA[idx, 1] <- B[1:4]
> Error in "[<-"(`*tmp*`, idx, 1, value = B[1:4]) :
>         number of items to replace is not a multiple of replacement length
>
> although only 4 values are replaced by AA[idx, 1] <- B.
>
> 2) the behaviour of the 3D case.
>
> ---------- Forwarded message ----------
> Date: Fri, 3 Sep 2004 16:40:24 +0100 (BST)
> From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
> To: "Yao, Minghua" <myao at ou.edu>
> Cc: R Help <r-help at stat.math.ethz.ch>
> Subject: Re: [R] Different Index behaviors of Array and Matrix
>
> [I will copy a version of this to R-bugs: please be careful when you reply
> to only copy to R-bugs a version with a PR number in the subject.]
>
> On Fri, 3 Sep 2004, Yao, Minghua wrote:
>
> >  I found a difference between the indexing of an array and that of a
> > matrix when there are NA's in the index array. The screen copy is as
> > follows.
> >
> > > A <- array(NA, dim=6)
> > > A
> > [1] NA NA NA NA NA NA
>
> > > idx <- c(1,NA,NA,4,5,6)
> > > B <- c(10,20,30,40,50,60)
> > > A[idx] <- B
> > > A
> > [1] 10 NA NA 40 50 60
> > > AA <- matrix(NA,6,1)
> > > AA
> >      [,1]
> > [1,]   NA
> > [2,]   NA
> > [3,]   NA
> > [4,]   NA
> > [5,]   NA
> > [6,]   NA
> > > AA[idx,1] <- B
> > > AA
> >      [,1]
> > [1,]   10
> > [2,]   NA
> > [3,]   NA
> > [4,]   20
> > [5,]   30
> > [6,]   40
> > >
> >  In the case of an array, we miss the elements (20 and 30) in B
> > corresponding to the NA's in the index array. In the case of a matrix,
> > 20 and 30 are assigned to the elements indexed by the indexes following
> > the NA's. Is this reasonable behavior? Thanks in advance for any
> > explanation.
>
> A is a 1D array but it behaves just like a vector.
> Weirder things happen with multi-dimensional arrays:
>
> > A <- array(NA, dim=c(6,1,1))
> > A[idx,1,1] <- B
> > A
> , , 1
>
>      [,1]
> [1,]   10
> [2,]   NA
> [3,]   NA
> [4,]   NA
> [5,]   NA
> [6,]   NA
>
> One problem with what happens for matrices is that
>
> > idx <- c(1,4,5,6)
> > AA <- matrix(NA,6,1)
> > AA[idx,1] <- B
> Error in "[<-"(`*tmp*`, idx, 1, value = B) :
>         number of items to replace is not a multiple of replacement length
>
> is an error, so it is not counting the values consistently.
>
> The only discussion I could find (Blue Book p.103, which is also
> discussing LHS subscripting) just says
>
> 	If a subscript is NA, an NA is returned.
>
> S normally does not use up values when encountering an NA in an index set,
> although it does for logical matrix indexing of data frames.
>
> I can see two possible interpretations.
>
> 1) The NA indicates the value was lost after assignment. We don't know
> what index the first NA was, so 20 got assigned somewhere.  And as we
> don't know where, all the elements had better be NA. However, that is
> unless the NA was 0, when no assignment took place and no value was used.
>
> 2) The NA indicates the value was lost before assignment, so no assignment
> took place and no value was used.
>
> R does neither of those.  I suspect the correct course of action is to ban
> NAs in subscripted assignments.
>
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


