[R] Q re: logical indexing with is.na
Richard M. Heiberger
rmh @end|ng |rom temp|e@edu
Sun Mar 10 03:30:03 CET 2019
From ?Arithmetic:
the elements of shorter
vectors are recycled as necessary (with a ‘warning’ when they are
recycled only _fractionally_).
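For comparison, here is a small sketch of the arithmetic case that the help page describes. The vectors x and s are made up for illustration and are not from the original post; tryCatch is used only to capture the text of the warning.

```r
x <- 1:4   # longer operand (hypothetical example data)
s <- 1:3   # shorter operand; 4 is not a multiple of 3

# Capture the warning raised by fractional recycling.
msg <- tryCatch(x + s, warning = function(w) conditionMessage(w))

# The sum is still computed: s is reused as 1, 2, 3, 1.
res <- suppressWarnings(x + s)

print(msg)
print(res)
```

So arithmetic recycles the shorter vector and complains when the lengths do not divide evenly.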
> tmp <- !is.na(y[1:3])
> tmp
[1] TRUE TRUE FALSE
> c(tmp, tmp)
[1] TRUE TRUE FALSE TRUE TRUE FALSE
> c(tmp, tmp)[1:4]
[1] TRUE TRUE FALSE TRUE
> y[c(tmp, tmp)[1:4]]
[1] 0.3534253 -1.6731597 -0.2079209
>
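The behavior is as documented: the three-element logical index is
recycled to the length of y, so its first element (TRUE) is reused
for y[4] and the fourth value is selected. I am surprised that there
is no warning about partial recycling.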
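A minimal sketch of the same thing in script form, with two possible ways to get only the non-missing values among the first three elements. The values of y are copied from the question below; the names idx, recycled, first3, kept and kept2 are mine and purely illustrative.

```r
y <- c(0.3534253, -1.6731597, NA, -0.2079209)

idx <- !is.na(y[1:3])                 # TRUE TRUE FALSE (length 3)

# What the subsetting effectively does with the short logical index:
recycled <- rep_len(idx, length(y))   # TRUE TRUE FALSE TRUE
same <- identical(y[idx], y[recycled])

# Option 1: take the subset first, then filter within it.
first3 <- y[1:3]
kept <- first3[!is.na(first3)]

# Option 2: convert the logical index to positions, which are not recycled.
kept2 <- y[which(idx)]

print(same)
print(kept)
print(kept2)
```

Both options return only the first two values, because neither of them indexes the full-length y with a logical vector that is shorter than y.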
On Sat, Mar 9, 2019 at 9:03 PM David Goldsmith
<eulergaussriemann using gmail.com> wrote:
>
> Hi! Newbie (self-)learning R using P. Dalgaard's "Intro Stats w/ R"; not
> new to statistics (have had grad-level courses and work experience in
> statistics) or vectorized programming syntax (have extensive experience
> with MatLab, Python/NumPy, and IDL, and even a smidgen--a long time ago--of
> experience w/ S-plus).
>
> In exploring the use of is.na in the context of logical indexing, I've come
> across the following puzzling-to-me result:
>
> > y; !is.na(y[1:3]); y[!is.na(y[1:3])]
> [1] 0.3534253 -1.6731597 NA -0.2079209
> [1] TRUE TRUE FALSE
> [1] 0.3534253 -1.6731597 -0.2079209
>
> As you can see, y is a four element vector, the third element of which is
> NA; the next line gives what I would expect--T T F--because the first two
> elements are not NA but the third element is. The third line is what
> confuses me: why is the result not the two element vector consisting of
> simply the first two elements of the vector (or, if vectorized indexing in
> R is implemented to return a vector the same length as the logical index
> vector, which appears to be the case, at least the first two elements and
> then either NA or NaN in the third slot, where the logical indexing vector
> is FALSE): why does the implementation "go looking" for an element whose
> index in the "original" vector, 4, is larger than BOTH the largest index
> specified in the inner-most subsetting index AND the size of the resulting
> indexing vector? (Note: at first I didn't even understand why the result
> wasn't simply
>
> 0.3534253 -1.6731597 NA
>
> but then I realized that the third logical index being FALSE, there was no
> reason for *any* element to be there; but if there is, due to some
> overriding rule regarding the length of the result relative to the length
> of the indexer, shouldn't it revert back to *something* that indicates the
> "FALSE"ness of that indexing element?)
>
> Thanks!
>
> DLG
>
> > sessionInfo()
> R version 3.5.2 (2018-12-20)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS High Sierra 10.13.6
>
> Matrix products: default
> BLAS:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
> LAPACK:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] ISwR_2.0-7
>
> loaded via a namespace (and not attached):
> [1] compiler_3.5.2 tools_3.5.2