[R] grouping function
Sarah Goslee
sarah.goslee at gmail.com
Tue May 8 20:49:50 CEST 2012
Sorry, yes: I changed it before posting to more closely match the
default value in the pseudocode. That's a very minor issue: the
very last value in the nested ifelse() statements is what's returned
when none of the conditions is TRUE.
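
A minimal sketch of the difference, assuming the earlier version simply
had NA as the innermost "no" value. The name makegroup2_default and its
default argument are made up for illustration and are not from the
thread:

```r
# Data from the original post
begin <- c(seq(1990, 1994), seq(1991, 1995), seq(1992, 1996))
end   <- c(seq(1995, 1999), seq(1995, 1999), seq(1996, 2000))

# Same tests as makegroup2(), but the fallback value is an argument
# (hypothetical helper, for illustration only)
makegroup2_default <- function(x, y, default = 0) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, default)))
}

res_zero <- makegroup2_default(begin, end)      # unmatched rows become 0
res_na   <- makegroup2_default(begin, end, NA)  # unmatched rows become NA
```

Only the rows that match none of the three conditions differ between
the two results.
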
Sarah
On Tue, May 8, 2012 at 2:46 PM, arun <smartpink111 at yahoo.com> wrote:
> Hi Sarah,
>
> I ran the same code from your reply. For makegroup2, the results are 0 in place of NA.
>
>> makegroup1 <- function(x,y) {
> + group <- numeric(length(x))
> + group[x <= 1990 & y > 1990] <- 1
> + group[x <= 1991 & y > 1991] <- 2
> + group[x <= 1992 & y > 1992] <- 3
> + group
> + }
>> makegroup2 <- function(x, y) {
> + ifelse(x <= 1990 & y > 1990, 1,
> + ifelse(x <= 1991 & y > 1991, 2,
> + ifelse(x <= 1992 & y > 1992, 3, 0)))
> + }
>> makegroup1(df$begin,df$end)
> [1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
>> makegroup2(df$begin,df$end)
> [1] 1 2 3 0 0 2 3 0 0 0 3 0 0 0 0
>
>
> A. K.
>
>
>
>
> ----- Original Message -----
> From: Sarah Goslee <sarah.goslee at gmail.com>
> To: gps at asu.edu
> Cc: "r-help at r-project.org" <r-help at r-project.org>
> Sent: Tuesday, May 8, 2012 2:33 PM
> Subject: Re: [R] grouping function
>
> Hi,
>
> On Tue, May 8, 2012 at 2:17 PM, Geoffrey Smith <gps at asu.edu> wrote:
>> Hello, I would like to write a function that makes a grouping variable for
>> some panel data. The grouping variable is made conditional on the begin
>> year and the end year. Here is the code I have written so far.
>>
>> name <- c(rep('Frank',5), rep('Tony',5), rep('Edward',5));
>> begin <- c(seq(1990,1994), seq(1991,1995), seq(1992,1996));
>> end <- c(seq(1995,1999), seq(1995,1999), seq(1996,2000));
>>
>> df <- data.frame(name, begin, end);
>> df;
>
> Thanks for providing reproducible data. Two minor points: you don't
> need ; at the end of lines, and calling your data frame df is
> confusing because there's a df() function.
>
>> #This is the part I am stuck on;
>>
>> makegroup <- function(x,y) {
>> group <- 0
>> if (x <= 1990 & y > 1990) {group==1}
>> if (x <= 1991 & y > 1991) {group==2}
>> if (x <= 1992 & y > 1992) {group==3}
>> return(x,y)
>> }
>>
>> makegroup(df$begin,df$end);
>>
>> #I am looking for output where each observation belongs to a group
>> conditional on the begin year and end year. I would also like to use a for
>> loop for programming accuracy;
>
> This isn't a clear specification: a begin year of 1990 and an end year
> of 1994, for instance, fits into all three groups. Do you want to
> extend this to more start years, or are you only interested in those
> three? In your data every end year is later than 1992, so you don't
> even need to consider the end years in your grouping.
>
> Here are two methods, one that "looks like" your pseudocode, and one
> that is more R-ish. They give different results because they handle
> cases that fit more than one group differently: makegroup1() keeps the
> last matching group, makegroup2() the first. Rearranging the
> statements in makegroup1() from broadest to most restrictive would
> make it give the same result as makegroup2().
>
>
> makegroup1 <- function(x, y) {
>   group <- numeric(length(x))
>   group[x <= 1990 & y > 1990] <- 1
>   group[x <= 1991 & y > 1991] <- 2
>   group[x <= 1992 & y > 1992] <- 3
>   group
> }
>
> makegroup2 <- function(x, y) {
>   ifelse(x <= 1990 & y > 1990, 1,
>     ifelse(x <= 1991 & y > 1991, 2,
>       ifelse(x <= 1992 & y > 1992, 3, 0)))
> }
>
>> makegroup1(df$begin,df$end)
> [1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
>> makegroup2(df$begin,df$end)
> [1] 1 2 3 NA NA 2 3 NA NA NA 3 NA NA NA NA
>> df
>
>
> But really, it's a better idea to develop an unambiguous statement of
> your desired output.
>
> Sarah
>
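
The remark in the quoted message about reordering the statements in
makegroup1() can be checked with a sketch like the one below. It uses
the data from the original post; makegroup1_rev is a hypothetical name
for the reordered variant, and the check only covers this data set
(conditions containing NA may behave differently in the two approaches):

```r
# Data from the original post
begin <- c(seq(1990, 1994), seq(1991, 1995), seq(1992, 1996))
end   <- c(seq(1995, 1999), seq(1995, 1999), seq(1996, 2000))

# As posted: later assignments overwrite earlier ones,
# so the last matching condition wins
makegroup1 <- function(x, y) {
  group <- numeric(length(x))
  group[x <= 1990 & y > 1990] <- 1
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1992 & y > 1992] <- 3
  group
}

# Hypothetical reordered variant: broadest condition first, most
# restrictive last, so the most restrictive match wins
makegroup1_rev <- function(x, y) {
  group <- numeric(length(x))
  group[x <= 1992 & y > 1992] <- 3
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1990 & y > 1990] <- 1
  group
}

# As posted: nested ifelse(), so the first matching condition wins
makegroup2 <- function(x, y) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, 0)))
}

# TRUE on this data: the reordered version agrees with makegroup2()
same <- isTRUE(all.equal(makegroup1_rev(begin, end), makegroup2(begin, end)))
```
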
--
Sarah Goslee
http://www.functionaldiversity.org