Brijesh Mishra
Thu Dec 15 15:43:32 CET 2016

```
Wow, Mr Petr. The placing of diff(fyear1) was very clever indeed. Just
to confirm that I understand the steps you intended:

exp(diff(log(sales1))/diff(fyear1)) - 1
= exp(log(sales1(t)/sales1(t-1)) / (fyear1(t) - fyear1(t-1))) - 1
= exp(log((sales1(t)/sales1(t-1))^(1/delta(fyear1)))) - 1
= (sales1(t)/sales1(t-1))^(1/delta(fyear1)) - 1

This gives the CAGR across the gap, so observations that follow a
missing year are kept rather than set to NA (in my dataset, that may
prove a big boon). I spent a significant amount of time today trying
to figure out something like this, which you did so easily.

Many Thanks,

Brijesh

On Thu, Dec 15, 2016 at 7:21 PM, PIKAL Petr wrote:
> Hi
>
> Maybe you do not need if or ifelse; just divide by the difference in years.
>
> d2<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1))/diff(fyear1))- 1)*100)
>
> Cheers
> Petr
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Berend
>> Hasselman
>> Sent: Thursday, December 15, 2016 1:18 PM
>> To: Brijesh Mishra <brijeshkmishra at gmail.com>
>> Cc: r-help mailing list <r-help at r-project.org>
>> Subject: Re: [R] Computing growth rate
>>
>>
>> > On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmishra at gmail.com>
>> wrote:
>> >
>> > Hi,
>> >
>> > I am trying to calculate growth rate (say, sales, though it is to be
>> > computed for many variables) in a panel data set. Problem is that I
>> > have missing data for many firms for many years. To put it simply, I
>> > have created this short dataframe (original df is much bigger)
>> >
>> > df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
>> > fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
>> >
>> > # this gives me
>> > co_code1 fyear1 sales1
>> > 1      1100   1990   1000
>> > 2      1100   1991   1100
>> > 3      1100   1992   1200
>> > 4      1100   1993   1300
>> > 5      1100   1994   1400
>> > 6      1100   1995   1500
>> > 7      1100   1996   1600
>> > 8      1200   1990   1000
>> > 9      1200   1991   1100
>> > 10     1200   1992   1200
>> > 11     1200   1993   1300
>> > 12     1200   1994   1400
>> > 13     1200   1995   1500
>> > 14     1200   1996   1600
>> > 15     1300   1990   1000
>> > 16     1300   1991   1100
>> > 17     1300   1992   1200
>> > 18     1300   1993   1300
>> > 19     1300   1994   1400
>> > 20     1300   1995   1500
>> > 21     1300   1996   1600
>> >
>> > # I am now removing a couple of rows
>> > df1<-df1[-c(5, 8), ]
>> > # the result is
>> >   co_code1 fyear1 sales1
>> > 1      1100   1990   1000
>> > 2      1100   1991   1100
>> > 3      1100   1992   1200
>> > 4      1100   1993   1300
>> > 6      1100   1995   1500
>> > 7      1100   1996   1600
>> > 9      1200   1991   1100
>> > 10     1200   1992   1200
>> > 11     1200   1993   1300
>> > 12     1200   1994   1400
>> > 13     1200   1995   1500
>> > 14     1200   1996   1600
>> > 15     1300   1990   1000
>> > 16     1300   1991   1100
>> > 17     1300   1992   1200
>> > 18     1300   1993   1300
>> > 19     1300   1994   1400
>> > 20     1300   1995   1500
>> > 21     1300   1996   1600
>> > # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
>> > # removed. If I try,
>> > d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)
>> >
>> > # this apparently gives wrong results for the year 1995 (as shown
>> > below) as growth rates are computed considering yearly increment.
>> >
>> >   co_code1 fyear1 sales1    growth
>> > 1      1100   1990   1000        NA
>> > 2      1100   1991   1100 10.000000
>> > 3      1100   1992   1200  9.090909
>> > 4      1100   1993   1300  8.333333
>> > 5      1100   1995   1500 15.384615
>> > 6      1100   1996   1600  6.666667
>> > 7      1200   1991   1100        NA
>> > 8      1200   1992   1200  9.090909
>> > 9      1200   1993   1300  8.333333
>> > 10     1200   1994   1400  7.692308
>> > 11     1200   1995   1500  7.142857
>> > 12     1200   1996   1600  6.666667
>> > 13     1300   1990   1000        NA
>> > 14     1300   1991   1100 10.000000
>> > 15     1300   1992   1200  9.090909
>> > 16     1300   1993   1300  8.333333
>> > 17     1300   1994   1400  7.692308
>> > 18     1300   1995   1500  7.142857
>> > 19     1300   1996   1600  6.666667
>> > # I thought of applying the formula only when the increment of fyear1
>> > # within a co_code1 is exactly 1, using this code
>> >
>> > d<-ddply(df1,
>> >         "co_code1",
>> >         transform,
>> >         if(diff(fyear1)==1){
>> >           growth=(exp(diff(log(df1$sales1)))-1)*100
>> >         } else{
>> >           growth=NA
>> >         })
>> >
>> > But this doesn't work. I am getting the following warning.
>> >
>> > In if (diff(fyear1) == 1) { :
>> >  the condition has length > 1 and only the first element will be used
>> > (repeated a few times).
>> >
>> > # I have searched for a solution, but somehow couldn't get one. Hope
>> > that some kind soul will guide me here.
>> >
>>
>> In your case use ifelse() as explained by Rui.
>> But it can be done more easily since the fyear1 and co_code1 are
>> synchronized.
>> Add a new column to df1 like this
>>
>> df1$growth <- c(NA,
>>          ifelse(diff(df1$fyear1)==1,
>>                     (exp(diff(log(df1$sales1)))-1)*100,
>>                     NA
>>                     )
>>         )
>>
>> and display df1. From your request I cannot determine if this is what you
>> want.
>>
>> regards,
>>
>> Berend Hasselman
>>
>
```
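
To check the derivation at the top of this message, here is a minimal sketch in base R that recomputes the annualised growth on the example data from the thread and compares the row after the gap with the closed-form CAGR. The helper name cagr_pct and the split/lapply approach are my own assumptions for illustration (the thread itself uses ddply from plyr), and the sketch assumes rows are already sorted by fyear1 within each co_code1.

```r
# Rebuild the example data from the thread and drop the same two rows.
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Annualised growth in percent within one firm; the first row has no
# predecessor, so it gets NA. cagr_pct is a hypothetical helper name.
cagr_pct <- function(sales, year) {
  c(NA, (exp(diff(log(sales)) / diff(year)) - 1) * 100)
}

# Apply per firm with base R instead of plyr (an assumption made here
# only to keep the sketch free of package dependencies).
pieces <- lapply(split(df1, df1$co_code1), function(g) {
  g$growth <- cagr_pct(g$sales1, g$fyear1)
  g
})
d2 <- do.call(rbind, pieces)
rownames(d2) <- NULL

# Firm 1100 in 1995 follows the missing 1994, so the gap is two years.
gap <- d2[d2$co_code1 == 1100 & d2$fyear1 == 1995, ]
closed_form <- ((1500 / 1300)^(1 / 2) - 1) * 100

print(gap)
print(all.equal(gap$growth, closed_form))
```

For rows where consecutive years differ by exactly 1, diff(fyear1) is 1 and the result is the same as the original formula without the division, so only the rows after a gap change.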