[R-sig-ME] Does changing the reference level cause any difference in results?

Dan McCloy drmccloy at uw.edu
Wed Apr 6 19:55:08 CEST 2016


To answer your specific questions:

1. changing the reference level should not change the overall model fit
itself (the fitted values and the likelihood stay the same), but it can
change the magnitude and sign of the coefficient estimates, including the
intercept (because when you change the reference level the new coefficients
represent different comparisons).

2. you should always code factor variables in a way that makes sense given
the scientific question you are trying to answer.  A very good discussion
of factor coding is here:  http://talklab.psy.gla.ac.uk/tvw/catpred/
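
To make point 1 concrete, here is a minimal sketch.  It uses simulated
stand-in data (not the attached 'qaaf' file), invented group probabilities,
and a plain glm() rather than a mixed model, so treat it as an illustration
under those assumptions only.

```r
# Simulated stand-in data; the real 'qaaf' data set is not reproduced here.
set.seed(1)
n <- 300
d <- data.frame(
  residence = factor(sample(c("migrant", "urbanite", "villager"), n,
                            replace = TRUE))
)
# Hypothetical probabilities of using the CA form in each group
p <- c(migrant = 0.5, urbanite = 0.7, villager = 0.3)[as.character(d$residence)]
d$convergence <- factor(ifelse(runif(n) < p, "CA", "MA"),
                        levels = c("MA", "CA"))

# Default (alphabetical) level order: "migrant" is the reference level
m1 <- glm(convergence ~ residence, data = d, family = binomial)

# Same model, but with "villager" as the reference level
d$residence2 <- relevel(d$residence, ref = "villager")
m2 <- glm(convergence ~ residence2, data = d, family = binomial)

# Same fit: log-likelihood and fitted probabilities agree
all.equal(as.numeric(logLik(m1)), as.numeric(logLik(m2)), tolerance = 1e-6)
all.equal(unname(fitted(m1)), unname(fitted(m2)), tolerance = 1e-6)

# Different coefficients: each set is a comparison against its own baseline
coef(m1)
coef(m2)
```

The villager-vs-migrant coefficient in m1 and the migrant-vs-villager
coefficient in m2 are the same comparison with the sign flipped, which is
the sense in which only the comparisons change, not the fit.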

General comments: Coding the ordinal variables (like age cohort) in their
proper order does not in fact capture their ordered-ness: the model will
not necessarily enforce the assumption that the change from middle-aged to
old should be the same magnitude and direction as the change from young to
middle-aged.  In all of the examples you've given, the first level of the
factor is treated as the baseline, and the coefficient estimates for the
other levels are differences between each of them and the baseline (on the
log-odds scale, since this is logistic regression).  For example, the
coefficient for the "migrant" level of residence (R would label it
"residencemigrant") will tell you how much higher or lower the log-odds of
using the "CA" form rather than the "MA" form are for migrants, when
compared to villagers (the baseline level).
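
A rough sketch of what that baseline coding looks like in practice, again
with simulated data and made-up probabilities (chosen so that the
young-to-middle-aged step is much smaller than the middle-aged-to-old step),
and with glm() standing in for the mixed model:

```r
# Simulated stand-in data; probabilities below are hypothetical.
set.seed(2)
n <- 600
ages <- c("young", "middle-aged", "old")
d <- data.frame(
  age.group = factor(sample(ages, n, replace = TRUE), levels = ages)
)
p <- c(young = 0.2, `middle-aged` = 0.3, old = 0.8)[as.character(d$age.group)]
d$convergence <- factor(ifelse(runif(n) < p, "CA", "MA"),
                        levels = c("MA", "CA"))

# Treatment coding: one indicator column per non-baseline level
contrasts(d$age.group)

m <- glm(convergence ~ age.group, data = d, family = binomial)

# Observed log-odds of CA in each age group
prop_ca  <- sapply(split(d$convergence == "CA", d$age.group), mean)
log_odds <- qlogis(prop_ca)

# Each coefficient is a difference from the baseline ("young") in log-odds
coef(m)[["age.groupmiddle-aged"]]
log_odds[["middle-aged"]] - log_odds[["young"]]
coef(m)[["age.groupold"]]
log_odds[["old"]] - log_odds[["young"]]

# The two steps are estimated separately; nothing forces them to be equal
step1 <- coef(m)[["age.groupmiddle-aged"]]
step2 <- coef(m)[["age.groupold"]] - coef(m)[["age.groupmiddle-aged"]]
c(young_to_middle = step1, middle_to_old = step2)

# For comparison, an ordered factor gets polynomial contrasts by default
contrasts(factor(ages, levels = ages, ordered = TRUE))
```

Putting the levels in their natural order only decides which level is the
baseline and the order in which the coefficients are listed.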

-- dan

Daniel McCloy
http://dan.mccloy.info/
Postdoctoral Research Associate
Institute for Learning and Brain Sciences
University of Washington




On Wed, Apr 6, 2016 at 6:22 AM, Saudi Sadiq <ss1272 at york.ac.uk> wrote:

> Hi all,
> I am analysing a dataset 'qaaf' (attached) using logistic regression.
> The dataset includes:
> 1. speaker: participants in my study
> 2. item: words as used by my participants
> 3. gender: independent variable (2 levels: 'female' and 'male')
> 4. age.group: independent variable (3 levels:  'middle-aged',  'old' and
> 'young')
> 5. education: independent variable (3 levels:  'postgraduate', 'secondary
> or below' and 'university')
> 6. residence: independent variable (3 levels: 'migrant', 'urbanite' and
> 'villager')
> 7. convergence: the dependent variable (whether a speaker uses a CA or MA
> form).  Here, I am testing whether my participants use the CA form or not.
> This is the form of the prestigious dialect in Egypt. If they use MA, this
> means that they use their traditional dialect. I am trying to find out
> which factor (independent variable) is responsible or more responsible for
> using the CA form.
> As the target is CA and this (alphabetically) takes the 0 value,
> I re-levelled the dependent variable (convergence) to change the value of
> CA from 0 to 1,  as follows:
> (a) attach(qaaf)
> (b) qaaf$convergence= factor(convergence, levels=c('MA', 'CA'))
> I also re-levelled these variables:
> (c) qaaf$education=factor(education, levels=c("secondary or below",
> "university",  "postgraduate"))
> (d) qaaf$residence = factor(residence, levels=c('villager', 'migrant',
> 'urbanite'))
> (e) qaaf$age.group = factor(age.group, levels=c('young', 'middle-aged',
> 'old'))
>
> I re-levelled the variables in (c), (d) and (e) because these are ordinal
> variables (e.g. old people were middle-aged one day and before that had
> been young). My question may be general:
> Q: Does changing the reference level cause any difference in results?
> or
> Q: Is leaving the variable levels alphabetically arranged good or bad? Put
> another way, when should levels be left alphabetically arranged and when
> should they be re-levelled?
>
> Best
>
>
> --
> Saudi Sadiq,
> Assistant Lecturer, English Department,
> Faculty of Al-Alsun,Minia University,
> Minia City, Egypt &
> PhD Student, Language and Linguistic Science Department,
> University of York, York, North Yorkshire, UK,
> YO10 5DD
> http://york.academia.edu/SaudiSadiq
> https://www.researchgate.net/profile/Saudi_Sadiq
> Certified Translator by Egyptian Translation Association (Egyta)
> <http://www.egyta.com/>
> Certified Interpreter by Pearl Linguistics
> <http://www.pearllinguistics.com/>
> Verified Teacher at https://lingos.co/users/saudi-sadiq
> Verified Teacher at
> https://www.firsttutors.com/uk/languages/teacher/saudi.arabic.english
