[Rd] str(<1d-array>)
Marc Schwartz
marc_schwartz at comcast.net
Fri Jan 23 15:41:38 CET 2009
on 01/23/2009 07:36 AM Martin Maechler wrote:
>>>>>> "TP" == Tony Plate <tplate at acm.org>
>>>>>> on Thu, 22 Jan 2009 11:01:21 -0700 writes:
>
> TP> Martin Maechler wrote:
> >>>>>>> "TP" == Tony Plate <tplate at acm.org>
> >>>>>>> on Fri, 16 Jan 2009 13:10:04 -0700 writes:
> >>>>>>>
> >>
> TP> Martin Maechler wrote:
> >> >>>>>>> "PatB" == Patrick Burns <pburns at pburns.seanet.com>
> >> >>>>>>> on Tue, 13 Jan 2009 17:00:40 +0000 writes:
> >> >>>>>>>
> >> >>
> PatB> Henrik Bengtsson wrote:
> >> >> >> Hi.
> >> >> >>
> >> >> >> On Mon, Jan 12, 2009 at 11:58 PM, Prof Brian Ripley
> >> >> >> <ripley at stats.ox.ac.uk> wrote:
> >> >> >>
> >> >> >>> What you have is a one-dimensional array: they crop up
> >> >> >>> in R most often from table() in my experience.
> >> >> >>>
> >> >> >>>
> >> >> >>>> f <- table(rpois(100, 4))
> >> >> >>>> str(f)
> >> >> >>>>
> >> >> >>> 'table' int [, 1:10] 2 6 18 21 13 16 13 4 3 4 - attr(*,
> >> >> >>> "dimnames")=List of 1 ..$ : chr [1:10] "0" "1" "2" "3"
> >> >> >>> ...
> >> >> >>>
> >> >> >>> and yes, f is an atomic vector and yes, str()'s notation
> >> >> >>> is confusing here but if it did [1:10] you would not
> >> >> >>> know it was an array. I recall discussing this with
> >> >> >>> Martin Maechler (str's author) last century, and I've
> >> >> >>> just checked that R 2.0.0 did the same.
> >> >> >>>
> >> >> >>> The place in which one-dimensional arrays differ from
> >> >> >>> normal vectors is how names are handled: notice that my
> >> >> >>> example has dimnames not names, and ?names says
> >> >> >>>
> >> >> >>> For a one-dimensional array the 'names' attribute really
> >> >> >>> is 'dimnames[[1]]'.
> >> >> >>>
> >> >> >>
> >> >> >> Thanks for this explanation. One could then argue that
> >> >> >> [1:10,] is somewhat better than [,1:10], but that is just polish.
> >> >>
> >> >> yes. And honestly I don't remember anymore why I chose the
> >> >> "[,1:n]" notation. It definitely was there already before R
> >> >> came into existence, as S also has had one-dimensional arrays,
> >> >> and I programmed the first version of str() in 1990.
> >> >>
> PatB> Perhaps it could be:
> >> >>
> PatB> [1:10(,)]
>
> PatB> That is weird enough that it should not lead people to
> PatB> believe that it is a matrix. But might prompt them a
> PatB> bit in that direction.
> >> >>
> >> >> Well, str() was always aimed a bit at experienced S (and R)
> >> >> users, and I had always aimed somewhat to keep its output
> >> >> "compact". I'm quite astonished that the OP didn't know about
> >> >> 1D arrays in spite of the many years he's been using R.
> >> >> Would a weird solution like the above have helped?
> >> >>
> >> >> At the moment, I'd tend to keep it "as is" if only just for
> >> >> historical reminiscence, but I can be convinced to change the
> >> >> current "tendency" ...
> >> >>
> >> >> Martin Maechler, ETH Zurich
> >> >>
> TP> What about just including "(1d-array)", something like this
> >> >> str(f)
> TP> 'table' int [1:10](1d array) 5 5 9 23 26 16 9 4 2 1
> TP> - attr(*, "dimnames")=List of 1
> TP> ..$ : chr [1:10] "0" "1" "2" "3" ...
> >> >>
> TP> only 9 extra characters for a rare case, and much, much less cryptic?
> >>
> >> well,.. the next text request is to use
> >> "character" instead of "chr", only 6 extra characters ....
> >>
> >> -> no way: str() has its very concise "style" and should keep that.
> >>
> TP> Brevity is good, but clarity is important too. The output of str is
> TP> usually decipherable, but not so much in this case. It's easy to
> TP> dismiss suggestions like replacing "chr" with "character" - the increase
> TP> in clarity would be minimal. However, the potential increase in clarity
> TP> for a 1-d array is significant - the decrease in brevity is at question
> TP> here. Given the rarity of the case it seems like a decent tradeoff to
> TP> add "(1d-array)" (one could even just write "(1d)"). 1-d arrays are
> TP> sufficiently rare that no concise and clear method of indicating them
> TP> using brackets or other symbols has arisen. You did say you "can be
> TP> convinced to change" it, but I won't attempt beyond this! :-)
>
> well, "still can be .." .....
>
> So you currently propose to replace
> "int [,1:10] 5 5 9 23 26 16 9 4 2 1"
> by
> "int [1:10](1d) 5 5 9 23 26 16 9 4 2 1"
> where Pat had
> "int [1:10(,)] 5 5 9 23 26 16 9 4 2 1"
>
> Since the [.....] is where we specify the dimensionality of all
> arrays in str(), I'd like to try something where things remain
> inside "[....]" as with Pat's version or e.g., with
>
> "int [1:10/1d] 5 5 9 23 26 16 9 4 2 1"
>
> Opinions, further proposals ?
Recognizing that I am coming to this discussion quite late, how about:
int [1:10(1d)] 5 5 9 23 26 16 9 4 2 1
?
I do think that any str() representation that includes a ',' would
continue to reinforce the current misunderstandings pertaining to a 1d
array.
Since using str() is a common response to posts on r-help regarding how
to access components of an object, there will be naive users who would
see something like (using Prof. Ripley's example):
> str(f)
'table' int [, 1:11] 1 9 15 21 15 17 13 5 1 2 ...
- attr(*, "dimnames")=List of 1
..$ : chr [1:11] "0" "1" "2" "3" ...
and then think that they could do:
> f[, 1]
Error in f[, 1] : incorrect number of dimensions
which of course they cannot.
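A minimal sketch of that indexing behaviour, for anyone who wants to try it. The data here are made up for reproducibility (not the rpois() draw above), and the variable name 'failed' is only illustrative:

```r
# A small 1-d array built deterministically; the values are illustrative only
f <- table(c(0, 1, 1, 2, 2, 2))

length(dim(f))   # the object has exactly one dimension
f[1]             # vector-style positional indexing works
f["2"]           # indexing by the dimnames entry works too

# Matrix-style indexing fails because there is no second dimension;
# try() is used so the script keeps running after the error
failed <- inherits(try(f[, 1], silent = TRUE), "try-error")
failed
```

The try() wrapper is only there so that the error can be inspected rather than stopping the session.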
I think that the above change would help to reinforce the notion that a
1d array can, for the most part, be treated as an atomic vector.
However, as Prof. Ripley has noted, there is a subtle difference in how
names/dimnames are treated. The use of '(1d)' in the str() output would
make it clear that this object is not quite a simple atomic vector, but
when indexing, can be treated as such.
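The names/dimnames difference can be seen with a short comparison. This is only a sketch on made-up data; 'v' is a hypothetical plain named vector added here for contrast:

```r
# 1-d array from table() versus a plain named integer vector
f <- table(c(0, 1, 1, 2, 2, 2))
v <- c("0" = 1L, "1" = 2L, "2" = 3L)

names(attributes(f))   # lists the attributes actually stored on the 1-d array
names(attributes(v))   # lists the attributes stored on the plain vector

# For a 1-d array, names() reports dimnames[[1]], as ?names describes
same_names <- identical(names(f), dimnames(f)[[1]])
same_names

is.atomic(f)           # the 1-d array is still an atomic vector
```

So the labels are reachable through names() in both cases, but they are stored differently, which is the distinction a '(1d)' marker would flag.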
Regards,
Marc Schwartz
<snip of content below this point>