[Rd] Inconsistencies in subassignment with NA index. (PR#7210)

ripley at stats.ox.ac.uk ripley at stats.ox.ac.uk
Fri Sep 3 18:34:46 CEST 2004


Apart from the inconsistencies, there are two clear bugs here:

1) miscalculating the number of values needed, in the matrix case.  E.g.

> AA[idx, 1] <- B[1:4]
Error in "[<-"(`*tmp*`, idx, 1, value = B[1:4]) :
        number of items to replace is not a multiple of replacement length

although only 4 values are replaced by AA[idx, 1] <- B.

2) the behaviour of the 3D case.

---------- Forwarded message ----------
Date: Fri, 3 Sep 2004 16:40:24 +0100 (BST)
From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
To: "Yao, Minghua" <myao at ou.edu>
Cc: R Help <r-help at stat.math.ethz.ch>
Subject: Re: [R] Different Index behaviors of Array and Matrix

[I will copy a version of this to R-bugs: please be careful when you reply
to only copy to R-bugs a version with a PR number in the subject.]

On Fri, 3 Sep 2004, Yao, Minghua wrote:

>  I found a difference between the indexing of an array and that of a
> matrix when there are NA's in the index array. The screen copy is as
> follows.
>  
> > A <- array(NA, dim=6)
> > A
> [1] NA NA NA NA NA NA

> > idx <- c(1,NA,NA,4,5,6)
> > B <- c(10,20,30,40,50,60)
> > A[idx] <- B
> > A
> [1] 10 NA NA 40 50 60
> > AA <- matrix(NA,6,1)
> > AA
>      [,1]
> [1,]   NA
> [2,]   NA
> [3,]   NA
> [4,]   NA
> [5,]   NA
> [6,]   NA
> > AA[idx,1] <- B
> > AA
>      [,1]
> [1,]   10
> [2,]   NA
> [3,]   NA
> [4,]   20
> [5,]   30
> [6,]   40
> > 
>  In the case of an array, we miss the elements (20 and 30) in B
> corresponding to the NA's in the index array. In the case of a matrix,
> 20 and 30 are assigned to the elements indexed by the indexes following
> the NA's. Is this reasonable behavior? Thanks in advance for an
> explanation.

A is a 1D array but it behaves just like a vector.
Weirder things happen with multi-dimensional arrays:

> A <- array(NA, dim=c(6,1,1))
> A[idx,1,1] <- B
> A
, , 1

     [,1]
[1,]   10
[2,]   NA
[3,]   NA
[4,]   NA
[5,]   NA
[6,]   NA

One problem with what happens for matrices is that

> idx <- c(1,4,5,6)
> AA <- matrix(NA,6,1)
> AA[idx,1] <- B
Error in "[<-"(`*tmp*`, idx, 1, value = B) :
        number of items to replace is not a multiple of replacement length

is an error, so it is not counting the values consistently.

The only discussion I could find (Blue Book p.103, which is also
discussing LHS subscripting) just says

	If a subscript is NA, an NA is returned.

S normally does not use up values when encountering an NA in an index set, 
although it does for logical matrix indexing of data frames.

I can see two possible interpretations.

1) The NA indicates the value was lost after assignment. We don't know
what index the first NA was, so 20 got assigned somewhere.  And as we
don't know where, all the elements had better be NA. However, that holds
unless the NA was 0, in which case no assignment took place and no value
was used.

2) The NA indicates the value was lost before assignment, so no assignment 
took place and no value was used.

R does neither of those.  I suspect the correct course of action is to ban 
NAs in subscripted assignments.
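
For illustration only, interpretation 2 can be spelled out by hand: drop the
NA subscripts together with the values at the same positions before
assigning. This is a minimal sketch of that reading, not a description of
what R does internally; `keep` is a name introduced here, and idx and B are
the vectors from the example above.

```r
idx <- c(1, NA, NA, 4, 5, 6)
B <- c(10, 20, 30, 40, 50, 60)

# Positions whose subscript is known (assumed helper name: keep)
keep <- !is.na(idx)

# Matrix case: only the four known rows are assigned, and the values
# paired with the NA subscripts (20 and 30) are not used up.
AA <- matrix(NA, 6, 1)
AA[idx[keep], 1] <- B[keep]

# 3D case, handled the same way
A3 <- array(NA, dim = c(6, 1, 1))
A3[idx[keep], 1, 1] <- B[keep]
```

Under these assumptions both AA[, 1] and A3[, 1, 1] end up as
10 NA NA 40 50 60, matching the result shown above for the 1D array.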


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


