[R] help for stata user

Marwan Khawaja marwan.khawaja at aub.edu.lb
Mon Sep 27 18:48:50 CEST 2004


> R is more of a statistical programming language whereas Stata is a
> statistical package. R is more powerful but also has a steeper
> learning curve. I like R's builtin matrix facilities, which are much
> better than Stata's (although there are user-written extensions to
> the Stata matrix facilities). R has good debugging tools and
> user-written help files are integrated in the help.search() system.

Well -- Stata does have a powerful macro facility -- and you can program
virtually anything you need for statistics.

Marwan

-------------------------------------------------------------------
Marwan Khawaja         http://staff.aub.edu.lb/~mk36/
-------------------------------------------------------------------



>
> An area where Stata has the advantage is converting between strings
> and variables. R has constructs like
> "eval(parse(text=paste(object,"string",sep="")))", which are very
> confusing. See 7.21 of the R FAQ for some examples.
>
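
A small sketch of the string-to-object point above, offered as an illustration rather than the recommended way: for the common case of creating or looking up a variable by name, base R's assign() and get() usually avoid eval(parse()) altogether. The names "prefix" and "x1" are made up for this example.

```r
# Hypothetical names, for illustration only
prefix <- "x"
name <- paste(prefix, 1, sep = "")   # builds the string "x1"

assign(name, c(2, 4, 6))             # creates a variable called x1
via.get <- get(name)                 # looks x1 up by its name

# The eval(parse()) route gives the same result, just less readably
via.parse <- eval(parse(text = name))

identical(via.get, via.parse)        # TRUE
```

Storing related objects in a named list and indexing with [[name]] is often simpler still.
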
> Below are two sample jobs in R and in Stata. They read in a
> space-delimited data file, define labels, summarize the data, run
> tables, and estimate linear regression, multinomial logit, and
> ordered logit models. The data are available from the "catspec"
> package if you'd like to experiment with them yourself. Hope they get
> you started in R. See "An introduction to R" to help you along a
> little further (included in the R distribution). I also found John
> Maindonald's "Using R for Data Analysis and Graphics" very helpful;
> it's available at http://cran.r-project.org/other-docs.html
>
> Good luck,
> John Hendrickx
>
> ---- sampjob.do ----------------------------------------------------
> /* Sample data used in Logan (1983: 332-333)
>    Data are from the 1972-1978 merged General Social Surveys (GSS)
>    of the National Opinion Research Center (NORC). The selection is
>    restricted to males, aged 25 to 34 at the time of the interview,
>    who were in the labour force at the time of interview, and who
>    had non-missing values for respondent's education and occupation
>    and for father's occupation and education. N=838.
>
>    Logan, J. (1983). "A multivariate model for mobility tables."
>    American Journal of Sociology 89: 324-349.
> */
> #delimit ;
> label define occs 1 "Farm"          /* occupations */
>                   2 "Operatives"    /* service, and laborers */
>                   3 "Craftsmen"     /* and kindred workers */
>                   4 "Sales"         /* and clerical */
>                   5 "Professional"; /* technical, and managerial */
> #delimit cr
> label define race 1 "black" 0 "non-black"
>
> infile byte(occ focc educ black) using logan.dat
> label variable occ "occupation"
> label variable focc "father's occupation at age 16"
> label variable educ "education in years"
> label variable black "race"
> label values occ occs
> label values focc occs
> label values black race
>
> summarize
>
> tab focc
> tab occ
> tab focc occ
> tab black
> tab occ black
> tab occ black, col
> tab focc occ, row col
>
> regress occ educ black
> mlogit occ educ black, base(1)
> ologit occ educ black
> --------------------------------------------------------------------
>
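
For anyone mapping the Stata tab options onto R: "tab occ black, col" reports column percentages, and as far as I can tell the base R counterpart is prop.table() with margin 2, which is what the mytab helper in the R script wraps. A minimal sketch on toy data (the six observations are invented here and are not the Logan data):

```r
# Toy data, invented for this sketch (not the Logan data)
occ   <- factor(c("farm", "sales", "sales", "farm", "sales", "farm"))
black <- factor(c("black", "black", "non-black",
                  "non-black", "non-black", "non-black"))

counts  <- table(occ, black)             # cell counts, like `tab occ black`
col.pct <- prop.table(counts, 2) * 100   # margin 2 = columns, like `, col`
row.pct <- prop.table(counts, 1) * 100   # margin 1 = rows, like `, row`

colSums(col.pct)   # each column adds up to 100
```
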
> --- sampjob.R -------------------------------------------------------
> mytab <- function (x,y) {
> 	prop.table(table(x,y),2)*100
> }
>
> logan <- read.table("logan.dat")
> names(logan) <- c("occ", "focc", "educ", "black")
> attach(logan)
> occ.codes <- c("farm", "operatives", "craftsmen", "sales",
> "professional")
> occ <- factor(occ,label=occ.codes)
> focc <- factor(focc,label=occ.codes)
> black <- factor(black,label=c("non-black", "black"))
>
> summary(logan)
>
> table(focc)
> table(occ)
> dit<-mytab(focc,occ)
> dit
> table(black)
> mytab(occ,black)
>
> # CrossTable() lives in gmodels, formerly part of the gregmisc bundle
> library(gmodels)
> CrossTable(occ,black,prop.t=FALSE,prop.r=FALSE,fisher=FALSE)
> CrossTable(focc,occ,prop.t=FALSE,fisher=FALSE)
> detach(package:gmodels)
>
> fm <- lm(occ~ educ+black, data=logan)
> summary(fm)
> anova(fm)
>
> library(nnet)
> mnl.logit<-multinom(occ ~ educ+black, data=logan)
> summary(mnl.logit,correlation=FALSE)
> detach(package:nnet)
>
> library(MASS)
> or.logit <- polr(occ ~ educ+black)
> summary(or.logit)
> detach(package:MASS)
>
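
One thing to watch in sampjob.R: the factor() calls create new top-level objects and leave the columns of logan unchanged, so a model fitted with data=logan sees the original numeric codes, while a call without data= picks up the factors. A minimal sketch of converting the column inside the data frame instead; the tiny data frame "d" and its labels are invented for illustration.

```r
# Invented toy data frame standing in for logan
d <- data.frame(occ  = c(1, 2, 3, 2, 1, 3),
                educ = c(8, 12, 16, 10, 9, 14))

occ <- factor(d$occ, labels = c("farm", "sales", "professional"))
is.factor(occ)     # TRUE: the new top-level copy is a factor
is.factor(d$occ)   # FALSE: the column in d is still numeric

# Converting in place makes the factor visible to data = d
d$occ <- factor(d$occ, labels = c("farm", "sales", "professional"))
is.factor(d$occ)   # TRUE
```
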