[R] ave(x, y, FUN=length) produces character output when x is character
Nordlund, Dan (DSHS/RDA)
NordlDJ at dshs.wa.gov
Wed Dec 24 21:06:15 CET 2014
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Mike Miller
> Sent: Wednesday, December 24, 2014 11:31 AM
> To: R-Help List
> Subject: [R] ave(x, y, FUN=length) produces character output when x is character
>
> R 3.0.1 on Linux 64...
>
> I was working with someone else's code. They were using ave() in a way
> that I guess is nonstandard: Isn't FUN always supposed to be a variant
> of mean()? The idea was to count for every element of a factor vector
> how many times the level of that element occurs in the factor vector.
>
>
> gl() makes a factor:
>
> > gl(2,2,5)
> [1] 1 1 2 2 1
> Levels: 1 2
>
>
> ave() applies FUN to produce the desired count, and it works:
>
> > ave( 1:5, gl(2,2,5), FUN=length )
> [1] 3 3 2 2 3
>
>
> The elements of the first vector are irrelevant because they are only
> counted, so we should get the same result if it were a character
> vector, but we don't:
>
> > ave( as.character(1:5), gl(2,2,5), FUN=length )
> [1] "3" "3" "2" "2" "3"
>
> The output has character type, but it is supposed to be a collection of
> vector lengths.
>
>
> Two questions:
>
> (1) Is that a bug in ave()? It certainly is unexpected.
>
> (2) What is the best way to do this sort of thing?
>
> The truth is that we start with a character vector and we want to
> create an integer vector that tells us for every element of the
> character vector how many times that string occurs. Here are two
> vectors of length 6 that should give the same result:
>
> > intvec <- c(4,5,6,5,6,6)
> > charvec <- c("A","B","C","B","C","C")
>
> The code was used like this with integer vectors and it seemed to work:
>
> > ave( intvec, intvec, FUN=length )
> [1] 1 2 3 2 3 3
>
> When a character vector came along, it would fail by producing a
> character vector as output:
>
> > ave( charvec, charvec, FUN=length )
> [1] "1" "2" "3" "2" "3" "3"
>
> This seems more appropriate, and it might always work, but is it OK?:
>
> > ave( rep(1, length(charvec)), as.factor(charvec), FUN=sum )
> [1] 1 2 3 2 3 3
>
> I suspect that ave() isn't the best choice, but what is the best way
> to do this?
>
>
> Thanks in advance.
>
> Mike
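On question (1), the character output is probably not a bug. As far as I can tell, ave() writes the per-group results of FUN back into a copy of x, so the result keeps the type of x, and integer lengths assigned into a character vector are coerced to character. A minimal sketch of that behavior and one possible workaround follows; the names chr_counts and int_counts are hypothetical, and using seq_along() as the first argument is my suggestion, not something from the original code.

```r
charvec <- c("A", "B", "C", "B", "C", "C")

# ave() assigns FUN's results back into x, so a character x
# gives character output even though length() returns integers.
chr_counts <- ave(charvec, charvec, FUN = length)

# Possible workaround: pass an integer vector as x and use the
# character vector only for grouping.
int_counts <- ave(seq_along(charvec), charvec, FUN = length)
print(int_counts)
```

Since FUN only counts elements, the values in x do not matter; only its type does.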
For your character vector example, this will get you the counts:
table(charvec)[charvec]
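Note that the result of this indexing is still a table object carrying the strings as names. If a bare integer vector is wanted, something like the following sketch should work; tab and counts are hypothetical names, and as.vector() is used here only to drop the names and the table class.

```r
charvec <- c("A", "B", "C", "B", "C", "C")

# Count how often each distinct string occurs.
tab <- table(charvec)

# Look up each element's count by name, then strip the
# names and the table class to leave plain integers.
counts <- as.vector(tab[charvec])
print(counts)
```

The lookup has to be by name, so for a numeric vector such as intvec the index would need to be as.character(intvec); indexing with the numbers themselves would select by position instead.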
Hope this is helpful,
Dan
Daniel J. Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services