[R] Models with ordered and unordered factors
Paul Johnson
pauljohn32 at gmail.com
Tue Nov 15 17:54:28 CET 2011
On Tue, Nov 15, 2011 at 9:00 AM, Catarina Miranda
<catarina.miranda at gmail.com> wrote:
> Hello;
>
> I am having problems with the interpretation of models using ordered or
> unordered predictors.
> I am running models in lmer but I will try to give a simplified example
> data set using lm.
> Both in the example and in my real data set I use a predictor variable
> referring to 3 consecutive days of an experiment. It is a factor, and I
> thought it would be more correct to consider it ordered.
> Below is my example code, with my comments/ideas alongside it.
> Can someone help me to understand what is happening?
Dear Catarina:
I have had the same question, and I hope my answers help you
understand what's going on.
The short version:
http://pj.freefaculty.org/R/WorkingExamples/orderedFactor-01.R
The longer version, "Working with Ordinal Predictors"
http://pj.freefaculty.org/ResearchPapers/MidWest09/Midwest09.pdf
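
In case the links are not handy, here is a minimal sketch of the main
point, using your data. With an ordered factor, R's default coding is
polynomial contrasts (contr.poly), so Day.L and Day.Q are not "Day 2" and
"Day 3". They are a linear trend and a quadratic (curvature) term across
the three days. The names m, fit_ord and by_hand are just mine for
illustration.

```r
# Catarina's data, with Day built as a plain (unordered) factor first.
y <- c(72, 25, 24, 2, 18, 38, 62, 30, 78, 34, 67, 21, 97, 79, 64, 53, 27, 81)
Day <- factor(rep(c("Day 1", "Day 2", "Day 3"), each = 6))

# Coding R applies to a 3-level ordered factor: a linear (.L) and a
# quadratic (.Q) column, not one dummy variable per day.
print(contr.poly(3))

m <- unname(tapply(y, Day, mean))   # the three day means
fit_ord <- lm(y ~ ordered(Day))

# The .L and .Q columns are orthonormal, so each coefficient is a
# weighted sum of the day means.
by_hand <- c(mean(m),                             # intercept: mean of the day means
             (m[3] - m[1]) / sqrt(2),             # .L: Day 3 minus Day 1, rescaled
             (m[1] - 2 * m[2] + m[3]) / sqrt(6))  # .Q: departure of Day 2 from the line
print(cbind(lm = unname(coef(fit_ord)), by_hand))
```

So Day.L tests the same thing as "DayDay 3" in your first fit (Day 3
versus Day 1), which is why both have t = 2.682 and p = 0.017. Day.Q near
zero says the Day 2 mean sits almost exactly on the straight line between
Day 1 and Day 3.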
HTH
pj
>
> Thanks a lot in advance;
>
> Catarina Miranda
>
>
> y<-c(72,25,24,2,18,38,62,30,78,34,67,21,97,79,64,53,27,81)
>
> Day<-c(rep("Day 1",6),rep("Day 2",6),rep("Day 3",6))
>
> dataf<-data.frame(y,Day)
>
> str(dataf) #Day is not ordered
> #'data.frame': 18 obs. of 2 variables:
> # $ y : num 72 25 24 2 18 38 62 30 78 34 ...
> # $ Day: Factor w/ 3 levels "Day 1","Day 2",..: 1 1 1 1 1 1 2 2 2 2 ...
>
> summary(lm(y~Day,data=dataf)) #Day 2 is not significantly different from
> #Day 1, but Day 3 is.
> #
> #Call:
> #lm(formula = y ~ Day, data = dataf)
> #
> #Residuals:
> # Min 1Q Median 3Q Max
> #-39.833 -14.458 -3.833 13.958 42.167
> #
> #Coefficients:
> # Estimate Std. Error t value Pr(>|t|)
> #(Intercept) 29.833 9.755 3.058 0.00797 **
> #DayDay 2 18.833 13.796 1.365 0.19234
> #DayDay 3 37.000 13.796 2.682 0.01707 *
> #---
> #Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> #
> #Residual standard error: 23.9 on 15 degrees of freedom
> #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234
> #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297
> #
>
> dataf$Day<-ordered(dataf$Day)
>
> str(dataf) # "Day 1"<"Day 2"<"Day 3"
> #'data.frame': 18 obs. of 2 variables:
> # $ y : num 72 25 24 2 18 38 62 30 78 34 ...
> # $ Day: Ord.factor w/ 3 levels "Day 1"<"Day 2"<..: 1 1 1 1 1 1 2 2 2 2 ...
>
> summary(lm(y~Day,data=dataf)) #Significances reversed (or are "Day.L" and
> #"Day.Q" not synonymous with "Day 2" and "Day 3"?): Day 2 (".L") is
> #significantly different from Day 1, but Day 3 (".Q") isn't.
>
> #Call:
> #lm(formula = y ~ Day, data = dataf)
> #
> #Residuals:
> # Min 1Q Median 3Q Max
> #-39.833 -14.458 -3.833 13.958 42.167
> #
> #Coefficients:
> # Estimate Std. Error t value Pr(>|t|)
> #(Intercept) 48.4444 5.6322 8.601 3.49e-07 ***
> #Day.L 26.1630 9.7553 2.682 0.0171 *
> #Day.Q -0.2722 9.7553 -0.028 0.9781
> #---
> #Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> #
> #Residual standard error: 23.9 on 15 degrees of freedom
> #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234
> #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297
>
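Note that the two summaries report identical residuals, R-squared and F
statistic: ordering the factor changes how the Day effect is coded, not
the fitted model. If you want to keep Day ordered but still read the
coefficients as differences from Day 1, one option is to ask lm() for
treatment contrasts on that variable. A rough sketch, where DayOrd,
fit_unord, fit_ord and fit_trt are names I made up for the example:

```r
y <- c(72, 25, 24, 2, 18, 38, 62, 30, 78, 34, 67, 21, 97, 79, 64, 53, 27, 81)
# factor() is called explicitly because data.frame() in R >= 4.0 no longer
# converts strings to factors by default.
dataf <- data.frame(y, Day = factor(rep(c("Day 1", "Day 2", "Day 3"), each = 6)))
dataf$DayOrd <- ordered(dataf$Day)

fit_unord <- lm(y ~ Day, data = dataf)     # treatment (dummy) coding
fit_ord   <- lm(y ~ DayOrd, data = dataf)  # polynomial coding (.L, .Q)

# Same model, different parameterization: the fitted values agree.
print(all.equal(fitted(fit_unord), fitted(fit_ord)))

# Ordered factor, but coefficients expressed as differences from Day 1.
fit_trt <- lm(y ~ DayOrd, data = dataf,
              contrasts = list(DayOrd = "contr.treatment"))
print(coef(fit_trt))
```

The last fit should reproduce the estimates from the unordered model
(29.833, 18.833 and 37.000) while leaving DayOrd stored as an ordered
factor.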
--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas