[R-sig-ME] Level 2 outcome and 'Downdated VtV' error

Thierry Onkelinx thierry.onkelinx using inbo.be
Tue Jul 7 09:02:30 CEST 2020

Dear Matthew,

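I recommend aggregating the data into one record per healthcare facility,
as you did when calculating the outcome variable. Because PopCov is
computed per facility, it has no variability at the patient level: every
patient within a facility has the same value. With a dataset this large,
that forces the residual (error) variance close to zero, which is most
likely what triggers the error.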
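
As a rough sketch of that aggregation, you could collapse the patient
records to facility means and fit an ordinary linear model. Only Station,
PopCov and DISP come from your message; the simulated data and the columns
Age, AgeMean and n are made up for illustration.

```r
# Simulated stand-in for DISP; Age is a hypothetical patient-level covariate.
set.seed(1)
n_fac <- 50
size <- sample(100:300, n_fac, replace = TRUE)
DISP <- data.frame(
  Station = factor(rep(seq_len(n_fac), times = size)),
  Age = rnorm(sum(size), mean = 50, sd = 10)
)
# PopCov is constant within a facility, as in the original data.
DISP$PopCov <- rep(rnorm(n_fac), times = size)

# One record per facility: mean of each variable within Station.
FAC <- aggregate(cbind(PopCov, Age) ~ Station, data = DISP, FUN = mean)
names(FAC)[names(FAC) == "Age"] <- "AgeMean"
# Keep the facility size in case you want to use it as a weight.
FAC$n <- as.vector(table(DISP$Station)[as.character(FAC$Station)])

# Facility-level outcome regressed on a facility-level predictor.
fit <- lm(PopCov ~ AgeMean, data = FAC)
summary(fit)
```

Patient-level variables enter such a model only as facility summaries
(means, proportions), which is all a facility-level outcome can respond to.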

Another option is to use an outcome variable at the patient level.
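
With a patient-level outcome the response varies within facilities, so the
random-intercept model has a residual variance to estimate. A minimal
sketch, assuming a hypothetical patient-level outcome PatCov and a
hypothetical covariate Age (neither is in your data as described), with Age
split into a facility mean and a within-facility deviation:

```r
library(lme4)

# Simulated data; PatCov, Age, AgeMean and AgeDev are illustrative names.
set.seed(2)
n_fac <- 40
size <- sample(50:150, n_fac, replace = TRUE)
dat <- data.frame(Station = factor(rep(seq_len(n_fac), times = size)))
dat$Age <- rnorm(nrow(dat), mean = 50, sd = 10)

# Patient-level outcome: facility effect + age effect + patient noise.
fac_eff <- rnorm(n_fac, sd = 0.5)
dat$PatCov <- fac_eff[as.integer(dat$Station)] + 0.02 * dat$Age +
  rnorm(nrow(dat))

# Facility mean of Age and each patient's deviation from that mean.
dat$AgeMean <- ave(dat$Age, dat$Station)
dat$AgeDev <- dat$Age - dat$AgeMean

# Random intercept per facility; predictors at both levels.
fit <- lmer(PatCov ~ AgeDev + AgeMean + (1 | Station), data = dat)
fixef(fit)
VarCorr(fit)
```

The coefficient of AgeDev describes the within-facility association and
that of AgeMean the between-facility association.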

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey


On Tue, 7 Jul 2020 at 00:19, Matthew Boden <matthew.t.boden using gmail.com> wrote:

> Good afternoon,
> I am looking for advice regarding a multi-level model I am trying to
> implement using lme4. My two-level random-effects model won’t run, perhaps
> due to one or two issues.
> Background: Level 1 is patients, which are clustered in healthcare
> facilities (‘Station’). The outcome is a continuous variable (‘PopCov’)
> that is calculated at the facility-level, and is thus a Level 2 variable
> that does not vary at the patient level.
> The aim of this analysis is to examine whether PopCov is predicted by (a)
> patient-level (e.g., race/ethnicity, age, symptom severity), and (b)
> facility-level variables (e.g., overall racial/ethnic composition, average
> age). It is important to examine factors such as race/ethnicity at both
> the patient and facility levels because patients with different racial/ethnic
> backgrounds tend to differ in terms of age, symptom severity, etc.
> Each record/row in my data is a patient, with facility-level variables
> (including PopCov) having identical values among patients within a given
> facility.
> An error is thrown when I run a basic model.
> A1 <- lmer(PopCov ~ (1 | Station), data = DISP)
> Error in fn(nM$xeval()) : Downdated VtV is not positive definite
> I obtain the same error when I add to the model either a patient-level or
> facility-level predictor.
> An internet search suggested that I have complete separation of my data
> and/or poorly scaled variables.
> I assume this issue has to do with the fact that the outcome is a level 2
> variable. Perhaps compounding the issue is the large and unbalanced nature
> of the data. I have ~6 million patients clustered in ~1000 healthcare
> facilities. Individual facilities have anywhere from 100 to 30000 patients
> clustered in them.
> I could use some advice regarding how to specify the model to predict a
> facility-level variable (level 2) from both patient (level 1) and
> facility-level (level 2) variables with these data.
> Thank you in advance.
> Matt
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models