[R] create a dummy variables for companies with complete history.
David L Carlson
dcarlson at tamu.edu
Wed Jun 24 22:36:59 CEST 2015
You may want to consider another way of getting your answer that takes advantage of some of R's features:
> # Make some example data
> cods <- LETTERS[1:10] # Ten companies
> yrs <- 2010:2014 # 5 years
> set.seed(42) # Set random seed so we all get the same values
> # Chances of revenue for a given year are 95%
> rev <- round(rbinom(50, 1, .95)*runif(50, 25, 50), 2)
> z <- data.frame(expand.grid(year=yrs, cod=cods)[, 2:1], rev)
> # Remove years with missing (0) revenue
> z <- z[z$rev > 1, ]
> str(z)
'data.frame': 45 obs. of 3 variables:
$ cod : Factor w/ 10 levels "A","B","C","D",..: 1 1 1 1 1 2 2 2 2 2 ...
$ year: int 2010 2011 2012 2013 2014 2010 2011 2012 2013 2014 ...
$ rev : num 33.3 33.7 35 44.6 26 ...
>
> # Construct the dummy variable
> tbl <- xtabs(~cod+year, z)
> tbl
     year
cod   2010 2011 2012 2013 2014
  A      1    1    1    1    1
  B      1    1    1    1    1
  C      1    1    1    1    1
  D      1    0    1    1    1
  E      1    1    0    1    1
  F      1    1    1    1    1
  G      1    1    1    1    1
  H      1    1    1    1    1
  I      1    1    1    0    1
  J      0    1    1    0    1
> dummy <- as.integer(apply(tbl, 1, all))
> dummy
[1] 1 1 1 0 0 1 1 1 0 0
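The dummy above has one entry per company, in the row order of the table. If you need it on every observation, or want the subset of complete companies, something along these lines should work. This is only a sketch on a small made-up data set (company B lacks 2012); the names complete, d1 and panel are introduced here for illustration:

```r
# Small deterministic example (hypothetical data): company B lacks 2012
z <- data.frame(
  cod  = rep(c("A", "B", "C"), times = c(5, 4, 5)),
  year = c(2010:2014, c(2010, 2011, 2013, 2014), 2010:2014),
  rev  = c(30, 31, 32, 33, 34, 40, 41, 43, 44, 50, 51, 52, 53, 54)
)

# Company-by-year table of observation counts
tbl <- xtabs(~ cod + year, z)

# One logical per company: TRUE when every year has at least one observation
complete <- apply(tbl > 0, 1, all)

# Look each row's company code up in the named vector to get a per-row dummy
z$d1 <- as.integer(complete[as.character(z$cod)])

# Keep only the companies with a complete history
panel <- z[z$d1 == 1, ]
```

Indexing the named vector by company code avoids a separate merge step; merge() on a company-level data frame would be an equally reasonable choice.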
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Dewey
Sent: Wednesday, June 24, 2015 2:12 PM
To: giacomo begnis; r-help at r-project.org
Subject: Re: [R] create a dummy variables for companies with complete history.
Comments below
On 24/06/2015 19:26, giacomo begnis wrote:
> Hi, I have a dataset (728 obs) containing three variables: the code of a company, year and revenue. Some companies have a complete history of 5 years; others do not (for instance, observations for only three or four years). I would like to identify the companies with a complete history using a dummy variable. I have written the following program, but there is something wrong because the dummy variable that I have created is always equal to zero. Can somebody help me? Thanks, gm
>
> z<-read.table(file="c:/Rp/cddat.txt", sep="", header=T)
> attach(z)
> n<-length(z$cod) // number of obs dataset
>
Could also use nrow(z)
> d1<-numeric(n) // dummy variable
>
> for (i in 5:n) {
>   if (z$cod[i]==z$cod[i-4]) // cod is the code of a company
>   { d1[i]<=1} else { d1[i]<=0} // d1=1 for a company with complete history, d1=0 if the history is not complete
> }
> d1
Did you really type <=, which means less than or equal to? If so, try
replacing it with <- (assignment) and see what happens.
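To illustrate: inside the loop, d1[i] <= 1 only compares d1[i] with 1 and throws the result away, so d1 keeps the zeros it got from numeric(n). A minimal sketch with made-up data follows, assuming the rows are sorted by company and year and no company has more than five rows. Note that even with <- the loop flags only the fifth row of a complete company, so the last line uses a per-company count via ave() instead; that is one possible alternative, not the original poster's method, and all_zero and d2 are names introduced here:

```r
# Hypothetical data: company A has five years, company B has three
z <- data.frame(cod  = rep(c("A", "B"), times = c(5, 3)),
                year = c(2010:2014, 2010:2012))
n <- nrow(z)

# Original form: <= is a comparison, so nothing is ever stored in d1
d1 <- numeric(n)
for (i in 5:n) {
  if (z$cod[i] == z$cod[i - 4]) d1[i] <= 1 else d1[i] <= 0
}
all_zero <- all(d1 == 0)   # d1 was never modified

# With <- the assignment happens, but only row 5 of a company is flagged
d1 <- numeric(n)
for (i in 5:n) {
  d1[i] <- if (z$cod[i] == z$cod[i - 4]) 1 else 0
}

# Flag every row of a company that has all five years
d2 <- as.integer(ave(z$year, z$cod, FUN = length) == 5)
```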
> When I run the program d1 is always equal to zero. Why?
> Once I have created the dummy variable, I use subset to obtain the codes of the companies with a complete history, and finally with a merge I determine a panel of companies with a complete history. But how do I determine d1 correctly? My best regards, gm
>
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Michael
http://www.dewey.myzen.co.uk/home.html