[R] Creating dummy variables in r

peter dalgaard pdalgd at gmail.com
Wed Jan 30 09:35:40 CET 2013


On Jan 30, 2013, at 04:58 , Bert Gunter wrote:

> You almost never need dummy variables in R. R creates them
> automatically from factors given model and possibly contrasts
> specification.
> 
> ?contrasts  ## for some technical details.
> 
> If you have not read "An Introduction to R" do so now. Pay particular
> attention to the chapter on modeling and categorical variables. You
> can also google around to find appropriate tutorials. Here is one:
> 
> http://www.ats.ucla.edu/stat/r/modules/dummy_vars.htm
> 
> I repeat: Do not create dummy variables by hand in R unless you have
> understood the above and have good reason to do so.

In this case it's a cutpoint-type situation, and the user might be excused for not wanting to deal with the mysteries of cut() (yet). 
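
For what it is worth, a minimal sketch of the cut() route. Only the name ret1 comes from the original post; the values, the name prev1f and the labels "low"/"high" are made up for illustration:

```r
# Hypothetical returns; only the name ret1 comes from the original post.
ret1 <- c(0.2, 0.5, 0.7, -0.1)

# right = FALSE makes the intervals closed on the left, so the second
# interval is [0.5, Inf), matching the condition ret1 >= .5.
prev1f <- cut(ret1, breaks = c(-Inf, 0.5, Inf), right = FALSE,
              labels = c("low", "high"))

# Modelling functions expand the factor into a 0/1 column by themselves;
# model.matrix() shows what they would build.
mm <- model.matrix(~ prev1f)
```

With the default treatment contrasts, mm has an intercept column plus one indicator column for the "high" level, which is the dummy variable that would otherwise be built by hand.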

More importantly, the main issue here seems to be a lack of understanding of where new variables are located. I.e., if the data set is called dd, you need

dd$prev1 <- (etc)

and if you use attach(), do it _after_ modifying the data (or detach() and reattach).

Otherwise, new variables end up in the global environment. (This is logical enough once you realize that the result of a computation does not necessarily "fit" into the dataset.)
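
A small sketch of the difference, assuming a toy data frame (dd and ret1 follow the names used in this thread; the values and the in_dd_* helper names are invented):

```r
# Hypothetical data frame; the names dd and ret1 follow the thread.
dd <- data.frame(ret1 = c(0.2, 0.5, 0.7, -0.1))

# Computed from a column of dd but assigned to a free-standing name:
# the result lives in the workspace, not in dd.
prev1 <- ifelse(dd$ret1 >= .5, 1, 0)
in_dd_before <- "prev1" %in% names(dd)   # FALSE

# Assigning into dd makes it a column, so names(dd) lists it.
dd$prev1 <- ifelse(dd$ret1 >= .5, 1, 0)
in_dd_after <- "prev1" %in% names(dd)    # TRUE
```

Once prev1 is a column of dd, anything that exports dd (write.csv(), say) will include it as well.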

By the way, you don't need ifelse() here: as.numeric(ret1 >= .5), or even just (ret1 >= .5), works.
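
To illustrate with made-up values (the names a, b, d, same_ab and sum_d are just for this sketch):

```r
ret1 <- c(0.2, 0.5, 0.7, -0.1)   # made-up values

a <- ifelse(ret1 >= .5, 1, 0)    # numeric 0/1 via ifelse()
b <- as.numeric(ret1 >= .5)      # numeric 0/1 without ifelse()
d <- ret1 >= .5                  # logical TRUE/FALSE

same_ab <- identical(a, b)       # the two numeric versions agree
sum_d <- sum(d)                  # logicals count as 0/1 in arithmetic
```

The logical version is coerced to 0/1 wherever numbers are needed, so it can usually stand in for the numeric dummy.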

> 
> -- Bert
> 
> On Tue, Jan 29, 2013 at 7:21 PM, Joseph Norman Thomson
> <thomsonj at email.arizona.edu> wrote:
>> Hello,
>> 
>> Semi-new R user here and still learning the ropes. I am creating dummy
>> variables for a dataset on stock prices in R. One dummy variable is
>> called prev1 and is:
>> 
>> prev1 <- ifelse(ret1 >= .5, 1, 0)
>> 
>> where ret1 is the previous day's return.
>> 
>> The variable "prev1" is created fine and works in my regression model
>> and for running conditional statistics. However, when I call the
>> names() function on the dataset the freshly created variable (prev1)
>> doesn't show up; also, when I export the dataset the prev1 variable
>> doesn't show up in the exported file. Is there a way to make the
>> variable show up both in the names() call and, more importantly, in the
>> exported file? Or am I forced to create dummy variables elsewhere (much
>> tougher)?
>> 
>> 
>> Thanks,
>> 
>> Joe
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> -- 
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


