[R-sig-ME] Does changing the reference level cause any difference in results?

Saudi Sadiq ss1272 at york.ac.uk
Thu Apr 7 21:06:10 CEST 2016


Many thanks, Dan. This is really helpful.

On 6 April 2016 at 18:55, Dan McCloy <drmccloy at uw.edu> wrote:

> To answer your specific questions:
>
> 1. Changing the reference level should not change the overall model fit
> itself, but it will change the magnitude, and possibly the sign, of the
> coefficient estimates (because the coefficients under the new reference
> level represent different comparisons).
>
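A minimal sketch of point 1, using simulated data rather than the attached
'qaaf' file. The data frame `d`, its column names and the log-odds values
are invented for illustration, and plain glm() is used only to keep the
example self-contained. It fits the same logistic regression twice, once
with each reference level for residence, and compares fit and coefficients.

```r
set.seed(1)
n <- 300
d <- data.frame(
  residence = factor(sample(c("migrant", "urbanite", "villager"),
                            n, replace = TRUE))
)

# Invented log-odds of using the CA form in each group (assumption)
log_odds <- c(migrant = 0.5, urbanite = 1.0, villager = -0.5)
d$convergence <- rbinom(n, 1, plogis(log_odds[as.character(d$residence)]))

# Default (alphabetical) coding: 'migrant' is the reference level
m1 <- glm(convergence ~ residence, data = d, family = binomial)

# Same model with 'villager' as the reference level
d$residence2 <- relevel(d$residence, ref = "villager")
m2 <- glm(convergence ~ residence2, data = d, family = binomial)

logLik(m1); logLik(m2)  # the fit is the same
coef(m1); coef(m2)      # the coefficients describe different comparisons
```

The villager-versus-migrant coefficient in m1 and the migrant-versus-villager
coefficient in m2 should have the same size and opposite sign.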
> 2. You should always code factor variables in a way that makes sense given
> the scientific question you are trying to answer.  A very good discussion
> of factor coding is here:  http://talklab.psy.gla.ac.uk/tvw/catpred/
>
> General comments: Listing the levels of an ordinal variable (like age
> cohort) in their proper order does not in fact capture their ordered-ness:
> the model will not necessarily enforce the assumption that the change from
> middle-aged to old should be the same magnitude and direction as the change
> from young to middle-aged.  In all of the examples you've given, what you
> will get is the first level of the factor treated as baseline, and the
> coefficient estimates for the other levels as differences between them and
> the baseline. For example, a coefficient "residencemigrant" will tell you
> how much higher (or lower) the log-odds of using the "CA" form rather than
> the "MA" form are for migrants compared to villagers (the baseline level).
>
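A small sketch of the baseline coding described above, on a toy vector
(the values are invented). It shows that giving the levels in order only
sets the baseline; the factor remains unordered. The as.ordered() step is
not something proposed in this thread; it is included only to show that R
treats ordered factors differently by default.

```r
age <- factor(c("young", "old", "middle-aged", "young"),
              levels = c("young", "middle-aged", "old"))

contrasts(age)   # treatment contrasts: 'young' is the baseline row of zeros
is.ordered(age)  # FALSE: the level order alone does not make it ordinal

# An explicitly ordered factor gets polynomial contrasts by default
age_ord <- as.ordered(age)
contrasts(age_ord)  # columns .L (linear) and .Q (quadratic)
```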
> -- dan
>
> Daniel McCloy
> http://dan.mccloy.info/
> Postdoctoral Research Associate
> Institute for Learning and Brain Sciences
> University of Washington
>
>
>
>
> On Wed, Apr 6, 2016 at 6:22 AM, Saudi Sadiq <ss1272 at york.ac.uk> wrote:
>
>> Hi all,
>> I am analysing a dataset 'qaaf' (attached) using logistic regression.
>> The dataset includes:
>> 1. speaker: participants in my study
>> 2. item: words as used by my participants
>> 3. gender: independent variable (2 levels: 'female' and 'male')
>> 4. age.group: independent variable (3 levels:  'middle-aged',  'old' and
>> 'young')
>> 5. education: independent variable (3 levels:  'postgraduate', 'secondary
>> or below' and 'university')
>> 6. residence: independent variable (3 levels: 'migrant', 'urbanite' and
>> 'villager')
>> 7. convergence: the dependent variable (whether a speaker uses a CA or MA
>> form).  Here, I am testing whether my participants use the CA form or not.
>> This is the form of the prestigious dialect in Egypt. If they use MA, this
>> means that they use their traditional dialect. I am trying to find out
>> which factor (independent variable) is responsible or more responsible for
>> using the CA form.
>> As the target is CA and this (alphabetically) takes the 0 value,
>> I re-levelled the dependent variable (convergence) to change the value of
>> CA from 0 to 1,  as follows:
>> (a) attach(qaaf)
>> (b) qaaf$convergence = factor(convergence, levels=c('MA', 'CA'))
>> I also re-levelled these variables:
>> (c) qaaf$education=factor(education, levels=c("secondary or below",
>> "university",  "postgraduate"))
>> (d) qaaf$residence = factor(residence, levels=c('villager', 'migrant',
>> 'urbanite'))
>> (e) qaaf$age.group = factor(age.group, levels=c('young', 'middle-aged',
>> 'old'))
>>
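A sketch of the same re-levelling without attach(), on a toy stand-in for
'qaaf' (the rows are invented; only the column and level names come from the
message). Referring to columns through the data frame avoids any mismatch
between the attached copies and the modified data frame.

```r
# Toy stand-in for the 'qaaf' data (values invented)
qaaf <- data.frame(
  convergence = c("CA", "MA", "MA", "CA"),
  residence   = c("migrant", "villager", "urbanite", "villager"),
  stringsAsFactors = TRUE
)
levels(qaaf$convergence)  # alphabetical order: CA first, then MA

# Put 'MA' first so that 'CA' is the outcome being modelled
qaaf$convergence <- factor(qaaf$convergence, levels = c("MA", "CA"))

# relevel() moves one level to the front and keeps the rest in order
qaaf$residence <- relevel(qaaf$residence, ref = "villager")

levels(qaaf$convergence)
levels(qaaf$residence)
```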
>> I re-levelled the variables in (c), (d) and (e) because these are ordinal
>> variables (e.g. old people were middle-aged one day and before that had
>> been young). My question may be general:
>> Q: Does changing the reference level cause any difference in results?
>> or
>> Q: Is leaving the variable levels alphabetically arranged good or bad? Put
>> another way, when should levels be left alphabetically arranged and when
>> should they be re-levelled?
>>
>> Best
>>
>>
>> --
>> Saudi Sadiq,
>> Assistant Lecturer, English Department,
>> Faculty of Al-Alsun,Minia University,
>> Minia City, Egypt &
>> PhD Student, Language and Linguistic Science Department,
>> University of York, York, North Yorkshire, UK,
>> YO10 5DD
>> http://york.academia.edu/SaudiSadiq
>> https://www.researchgate.net/profile/Saudi_Sadiq
>> Certified Translator by Egyptian Translation Association (Egyta)
>> <http://www.egyta.com/>
>> Certified Interpreter by Pearl Linguistics
>> <http://www.pearllinguistics.com/>
>> Verified Teacher at https://lingos.co/users/saudi-sadiq
>> Verified Teacher at
>> https://www.firsttutors.com/uk/languages/teacher/saudi.arabic.english
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
