[R] grouping function

Sarah Goslee sarah.goslee at gmail.com
Tue May 8 20:49:50 CEST 2012


Sorry, yes: I changed it before posting to more closely match the
default value in the pseudocode. That's a very minor issue: the very
last value in the nested ifelse() statements is what's returned when
none of the conditions match.
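
To make the point about the last value concrete, here is a minimal
sketch. makegroup_default() and its default argument are hypothetical
names, not part of the code I posted; the conditions are the same as in
makegroup2(), and begin/end are rebuilt from the original example data.

```r
# Hypothetical helper: the same nested ifelse() as makegroup2(), with the
# innermost "no" value exposed as an argument (an assumption for illustration).
makegroup_default <- function(x, y, default = 0) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, default)))
}

# Example data from the original question
begin <- c(seq(1990, 1994), seq(1991, 1995), seq(1992, 1996))
end <- c(seq(1995, 1999), seq(1995, 1999), seq(1996, 2000))

# Rows matching none of the three conditions get the fallback value
with_zero <- makegroup_default(begin, end)      # fallback is 0
with_na <- makegroup_default(begin, end, NA)    # fallback is NA
```

The two results should differ only in the rows that match no condition:
those are 0 in with_zero and NA in with_na.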

Sarah

On Tue, May 8, 2012 at 2:46 PM, arun <smartpink111 at yahoo.com> wrote:
> Hi Sarah,
>
> I ran the same code from your reply email.  For makegroup2, the results are 0 in place of NA.
>
>> makegroup1 <- function(x,y) {
> + group <- numeric(length(x))
> + group[x <= 1990 & y > 1990] <- 1
> + group[x <= 1991 & y > 1991] <- 2
> + group[x <= 1992 & y > 1992] <- 3
> + group
> + }
>> makegroup2 <- function(x, y) {
> +   ifelse(x <= 1990 & y > 1990, 1,
> +       ifelse(x <= 1991 & y > 1991, 2,
> +         ifelse(x <= 1992 & y > 1992, 3, 0)))
> + }
>> makegroup1(df$begin,df$end)
>  [1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
>> makegroup2(df$begin,df$end)
>  [1] 1 2 3 0 0 2 3 0 0 0 3 0 0 0 0
>
>
> A. K.
>
>
>
>
> ----- Original Message -----
> From: Sarah Goslee <sarah.goslee at gmail.com>
> To: gps at asu.edu
> Cc: "r-help at r-project.org" <r-help at r-project.org>
> Sent: Tuesday, May 8, 2012 2:33 PM
> Subject: Re: [R] grouping function
>
> Hi,
>
> On Tue, May 8, 2012 at 2:17 PM, Geoffrey Smith <gps at asu.edu> wrote:
>> Hello, I would like to write a function that makes a grouping variable for
>> some panel data.  The grouping variable is made conditional on the begin
>> year and the end year.  Here is the code I have written so far.
>>
>> name <- c(rep('Frank',5), rep('Tony',5), rep('Edward',5));
>> begin <- c(seq(1990,1994), seq(1991,1995), seq(1992,1996));
>> end <- c(seq(1995,1999), seq(1995,1999), seq(1996,2000));
>>
>> df <- data.frame(name, begin, end);
>> df;
>
> Thanks for providing reproducible data. Two minor points: you don't
> need ; at the end of lines, and calling your data frame df is
> confusing because there's a df() function.
>
>> #This is the part I am stuck on;
>>
>> makegroup <- function(x,y) {
>>  group <- 0
>>  if (x <= 1990 & y > 1990) {group==1}
>>  if (x <= 1991 & y > 1991) {group==2}
>>  if (x <= 1992 & y > 1992) {group==3}
>>  return(x,y)
>> }
>>
>> makegroup(df$begin,df$end);
>>
>> #I am looking for output where each observation belongs to a group
>> conditional on the begin year and end year.  I would also like to use a for
>> loop for programming accuracy as well;
>
> This isn't a clear specification:
> 1990, 1994 for instance fits into all three groups. Do you want to
> extend this to more start years, or are you only interested in those
> three? Assuming end is always >= start, you don't even need to
> consider the end years in your grouping.
>
> Here are two methods, one that "looks like" your pseudocode, and one
> that is more R-ish. They give different results because of different
> handling of cases that fit all three groups. Rearranging the
> statements in makegroup1() from broadest to most restrictive would
> make it give the same result as makegroup2().
>
>
> makegroup1 <- function(x,y) {
> group <- numeric(length(x))
> group[x <= 1990 & y > 1990] <- 1
> group[x <= 1991 & y > 1991] <- 2
> group[x <= 1992 & y > 1992] <- 3
> group
> }
>
> makegroup2 <- function(x, y) {
>    ifelse(x <= 1990 & y > 1990, 1,
>       ifelse(x <= 1991 & y > 1991, 2,
>           ifelse(x <= 1992 & y > 1992, 3, 0)))
> }
>
>> makegroup1(df$begin,df$end)
> [1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
>> makegroup2(df$begin,df$end)
> [1]  1  2  3 NA NA  2  3 NA NA NA  3 NA NA NA NA
>> df
>
>
> But really, it's a better idea to develop an unambiguous statement of
> your desired output.
>
> Sarah
>
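
The remark in the quoted message about rearranging the statements in
makegroup1() can be sketched as follows. makegroup1_reordered() is a
hypothetical name used only for illustration. It assumes that "broadest
to most restrictive" means assigning group 3 first and group 1 last, so
that later assignments overwrite earlier ones in favour of the lowest
group number.

```r
# Hypothetical variant of makegroup1(): same conditions, reversed order.
# Each later assignment overwrites the earlier ones for rows it matches.
makegroup1_reordered <- function(x, y) {
  group <- numeric(length(x))
  group[x <= 1992 & y > 1992] <- 3
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1990 & y > 1990] <- 1
  group
}

# makegroup2() as posted: the first matching condition wins
makegroup2 <- function(x, y) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, 0)))
}

# Example data from the original question
begin <- c(seq(1990, 1994), seq(1991, 1995), seq(1992, 1996))
end <- c(seq(1995, 1999), seq(1995, 1999), seq(1996, 2000))

reordered <- makegroup1_reordered(begin, end)
nested <- makegroup2(begin, end)
```

With the example data, reordered and nested should hold the same
values, since both now resolve overlapping matches to the lowest group
number.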

-- 
Sarah Goslee
http://www.functionaldiversity.org


