[R-sig-ME] Question concerning the reduction of random effects

Sat Feb 4 17:33:30 CET 2017

Hi,

perhaps I should add for greater clarity that I have calculated the values below for each level of the random effect "noun". Altogether, the random effect has a standard deviation of 3.85 on an intercept of 6.95. 

I have assumed that a level of the random effect has no effect if the interval given by its value ± 1.96*sqrt(conditional variance) includes 0 (because then, in a binary model, it could shift the intercept upwards as well as downwards). I am only interested in those levels that show an unambiguously negative influence. Thus the level with the most negative value (-7.93) may reduce the probability of a positive outcome from 99.9 % to a mere 27.3 %.
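
The two percentages can be checked with a few lines of base R. This is only a sketch of the arithmetic, assuming that both the intercept (6.95) and the noun's value (-7.93) are on the logit scale; the variable names are chosen here for illustration.

```r
# Fixed intercept and the most negative noun value, both assumed to be
# on the logit scale, as quoted in this message.
intercept <- 6.95
noun_effect <- -7.93

# plogis() is the inverse logit: 1 / (1 + exp(-x)).
p_baseline <- plogis(intercept)
p_with_noun <- plogis(intercept + noun_effect)

# Percentages, rounded to one decimal place.
round(100 * c(baseline = p_baseline, with_noun = p_with_noun), 1)
```

This should print roughly 99.9 for the baseline and 27.3 for the noun in question.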

With kind regards

Tibor

On 04.02.2017 at 17:22, Tibor Kiss <tibor at linguistics.rub.de> wrote:

> Hi John,
> 
> no, I have extracted the conditional variances with ranef(MODEL, condVar = TRUE) as well as attr(MODEL.randoms[[1]], "postVar"), computed sqrt(variance)*1.96, and checked whether abs(intercept) - sqrt(variance)*1.96 > 0.
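
The check described here can be sketched in base R as follows. The helper flag_levels() and the three example values are hypothetical and not taken from the actual model; the commented lme4 lines only indicate where the inputs would presumably come from.

```r
# Flag levels whose approximate 95% interval (mode +/- z * sd) excludes 0.
# flag_levels() is a hypothetical helper, not part of lme4.
flag_levels <- function(cond_mode, cond_var, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)      # about 1.96 for level = 0.95
  half_width <- z * sqrt(cond_var)
  data.frame(
    mode = cond_mode,
    lower = cond_mode - half_width,
    upper = cond_mode + half_width,
    excludes_zero = abs(cond_mode) - half_width > 0,
    negative_only = cond_mode + half_width < 0
  )
}

# Made-up conditional modes and variances for three nouns.
# With lme4 and a random intercept for "noun" one would instead use
# something like:
#   re        <- ranef(MODEL, condVar = TRUE)
#   cond_mode <- re$noun[["(Intercept)"]]
#   cond_var  <- attr(re$noun, "postVar")[1, 1, ]
cond_mode <- c(-5.0, 0.4, 2.1)
cond_var <- c(4.00, 1.00, 0.25)

flags <- flag_levels(cond_mode, cond_var)
flags
```

The column negative_only picks out the levels whose whole interval lies below 0, which are the ones of interest here.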
> 
> I treat nouns only as random effects, since they are sampled from an infinite population.
> 
> With kind regards
> 
> Tibor
> 
> 
> 
>  
> 
> On 04.02.2017 at 17:10, Poe, John <jdpo223 at g.uky.edu> wrote:
> 
>> Tibor, 
>> 
>> If I understand you correctly, you've included each group as a fixed effect to get the confidence intervals and done an enormous number of hypothesis tests. If that's the case, you really can't trust the results. That many categorical fixed effects for a nonlinear outcome will produce biased coefficients and standard errors. Even with as few as ten groups you start to see bias. So just because something has zero in the CI on a group fixed effect doesn't mean that the group does not, in reality, have a significant mean difference from the population average.
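
The concern about the number of tests can be made concrete with a rough sketch in base R. It assumes, as a simplification, that each of the 1,302 levels from the original question is tested independently at the 5 % level and that no level has a true effect; the z-scores further down are made up purely to show a Holm correction with p.adjust().

```r
n_levels <- 1302   # number of noun levels reported in the original question
alpha <- 0.05

# Expected number of intervals excluding 0 purely by chance, ASSUMING
# independent tests and no true noun effects (a simplification).
expected_by_chance <- n_levels * alpha
expected_by_chance

# Hypothetical z-scores (value / standard error) for five levels.
z <- c(-3.9, -2.1, 0.3, 1.7, 2.4)
p_raw <- 2 * pnorm(-abs(z))                   # two-sided p-values
p_holm <- p.adjust(p_raw, method = "holm")    # Holm correction
keep <- p_holm < alpha

data.frame(z, p_raw, p_holm, keep)
```

Under these assumptions only the first of the five made-up levels survives the correction, although three of them have an unadjusted p-value below 0.05.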
>> 
>> On Feb 4, 2017 10:38 AM, "Tibor Kiss" <tibor at linguistics.rub.de> wrote:
>> Hello everybody,
>> 
>> I have a question that might perhaps sound weird, but I haven't found anything on this yet, so maybe someone can enlighten me here.
>> 
>> I am working with a BGLMM (random intercept) that contains the random effect "noun" (basically, a noun occurring in a natural language sentence from a sample) with 1,302 levels. The high number of levels is due to the fact that the sample contains this many different nouns in the relevant position of the clause.
>> 
>> After determining the variances of the individual 1,302 nouns, only 54 nouns remain whose 95 % confidence interval does not contain 0. So, these are the nouns that are actually interesting. I have a hunch now that this reduced number of nouns also influences some of the slopes, but I cannot test this. The dataset contains only 2,284 observations. Thus, the total number of random effects is larger than the number of observations when tested against a binary feature.
>> 
>> I would like to find a way to reduce the random effects to the ones which have proven relevant in the random intercept model, and would like to use the reduced set for a random slope model. It strikes me that this is tampering with the data, unless there is a principled way of selecting a subset from the set of random effects. So, if there is a principled way, I would appreciate learning about it.
>> 
>> 
>> 
>> With kind regards
>> 
>> Tibor
>> 
>> 
>> 
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
