[R-sig-ME] Does corSymm() require balanced data?

Mon Mar 15 19:09:01 CET 2021

Dear Thierry, Ben and Wolfgang,

Thank you all so much! My sense of `corSymm()` is that there need to be
"enough" observations at each unique time point index for the model to
converge and/or be trusted.

However, we can't exactly say what constitutes "enough" observations (maybe
power analysis helps) other than by some kind of trial and error.
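
One rough way to make "enough" concrete is to count, for every pair of time
points, how many subjects were measured at both, since each pair gets its own
correlation parameter under `corSymm()`. The sketch below uses a small made-up
data frame (the names `toy`, `inc` and `pair_counts` are illustrative, not from
the real data) and only shows the bookkeeping:

```r
# Made-up unbalanced data: ids 1-2 measured at times 0..3, ids 3-5 at times 0..2.
toy <- data.frame(
  id   = c(rep(1:2, each = 4), rep(3:5, each = 3)),
  time = c(rep(0:3, times = 2), rep(0:2, times = 3))
)

# Subject-by-time incidence matrix (1 = subject observed at that time).
inc <- (unclass(table(toy$id, toy$time)) > 0) * 1L

# Entry [j, k] = number of subjects observed at both time j and time k;
# the diagonal gives the number of subjects observed at each time.
pair_counts <- crossprod(inc)
print(pair_counts)

# Number of correlation parameters in an unstructured matrix over all times.
n_times <- ncol(inc)
n_par <- n_times * (n_times - 1) / 2
print(n_par)
```

Any pair with a small count (in this toy case, the pairs involving time 3) is a
correlation the model has very little information about.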

The other thing that I noticed was that for `corSymm()` we cannot use
`corSymm(form = ~ time | id)`. Rather, it must be `corSymm(form = ~ 1 | id)`,
even though our goal is to allow observations under all unique `time` point
indices to be correlated with each other.

Is there a specific reason `corSymm(form = ~ time | id)` cannot be used
(while `corAR1(form = ~ time | id)` works fine)?
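
One thing that may be relevant here, stated as an assumption rather than
something confirmed in this thread: `corSymm()` may expect its covariate to be
a within-subject position index of consecutive integers starting at 1, which a
`time` variable coded from 0 would not be. A minimal sketch for checking this
without fitting a model, using `Initialize()` on the correlation structure
directly (the toy data and the `idx` column are made up for illustration):

```r
library(nlme)

# Made-up data: id 1 measured 4 times, ids 2-3 measured 3 times; time coded from 0.
toy <- data.frame(
  id   = factor(c(rep(1, 4), rep(2, 3), rep(3, 3))),
  time = c(0:3, 0:2, 0:2)
)
# Hypothetical helper column: position of each observation within its subject.
toy$idx <- toy$time + 1L

# Try the 0-based time covariate; capture any error message instead of stopping.
res_time <- tryCatch(
  Initialize(corSymm(form = ~ time | id), data = toy),
  error = function(e) conditionMessage(e)
)
print(res_time)

# Initialize with the 1-based position index instead.
cs <- Initialize(corSymm(form = ~ idx | id), data = toy)
n_par <- length(coef(cs))   # unconstrained correlation parameters
print(n_par)

# Dimension of the correlation matrix used for each subject.
print(sapply(corMatrix(cs), nrow))
```

If the assumption holds, recoding `time` to start at 1 (or adding an index
column like `idx`) would be one way to keep the time covariate explicit in the
formula.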

Thanks a million,
Joe

On Mon, Mar 15, 2021 at 12:04 PM Thierry Onkelinx <thierry.onkelinx using inbo.be>
wrote:

> Dear Joe,
>
> corSymm() needs n * (n - 1) / 2 parameters, with n the number of distinct
> time points (the maximum number of observations per subject). n = 4 implies
> 6 parameters for the correlation alone, so you'll need plenty of data to
> fit such a model. I'd recommend that the data contain at least 20 subjects
> for every combination of time points: 20 subjects with measurements at
> time 0 and time 1, and so on. In a balanced case you'll need at least 20
> subjects measured at every time point. If some combinations are missing in
> some subjects, you'll need extra subjects with those combinations. The 9
> and 7 subjects in your data are simply not enough for such a complex
> correlation structure.
>
> corAR1(form = ~time) is equivalent to corAR1(form = ~time | id) if random
> = ~1|id.
> I think that corAR1(form = ~1 | id) will use the order of the data. So if,
> and only if, your data is ordered along time, it is equivalent to
> corAR1(form = ~time | id). I recommend using the more verbose
> corAR1(form = ~time | id), which is clearer about the structure you want.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx using inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
>
>
> ///////////////////////////////////////////////////////////////////////////////////////////
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> ///////////////////////////////////////////////////////////////////////////////////////////
>
> <https://www.inbo.be>
>
>
> On Mon 15 Mar 2021 at 15:56, Tip But <fswfswt using gmail.com> wrote:
>
>> Dear Thierry,
>>
>> Thank you so much for your insightful comments. May I follow-up on them
>> below in-line:
>>
>>
>> ***"You have too few subjects with 4 observations. Either drop those
>> fourth observations."
>>
>> >>>> Does the above mean that for an unstructured residual correlation
>> matrix, the groups of subjects defined by their number of measurements
>> (e.g., 3 times, 4 times) must be of roughly equal size (e.g., 9 subjects
>> with 3 times, 7 subjects with 4 times)?
>>
>> ***"Or use a different correlation structure. E.g. an AR1:
>>
>> fit_alt <- lme(opp ~ time * ccog, random = ~1 | id,
>>   correlation = corAR1(form = ~ time), data = dat)
>> "
>>
>> >>>> In your above R code, is it necessary to use `corAR1(form = ~
>> time)`? It seems `corAR1(form = ~1 | id)` gives the same result?
>>
>> On Mon, Mar 15, 2021 at 2:37 AM Thierry Onkelinx <
>> thierry.onkelinx using inbo.be> wrote:
>>
>>> Dear Joe,
>>>
>>> You have too few subjects with 4 observations. Either drop those fourth
>>> observations, or use a different correlation structure, e.g. an AR1:
>>>
>>> fit <- lme(
>>>   opp ~ time * ccog, random = ~1 | id,
>>>   correlation = corSymm(), data = dat, subset = time < 3
>>> )
>>>
>>> fit_alt <- lme(
>>>   opp ~ time * ccog, random = ~1 | id,
>>>   correlation = corAR1(form = ~ time), data = dat
>>> )
>>> Best regards,
>>>
>>>
>>> ir. Thierry Onkelinx
>>>
>>> On Mon 15 Mar 2021 at 03:27, Tip But <fswfswt using gmail.com> wrote:
>>>
>>>> Dear Members,
>>>>
>>>> In my longitudinal data below, the first couple of subjects were
>>>> measured 4 times but the rest of the subjects were measured 3 times
>>>> (see data below).
>>>>
>>>> We intend to use an unstructured residual correlation matrix in
>>>> `nlme::lme()`. But our model fails to converge.
>>>>
>>>> Question: Given our data is unbalanced with respect to our grouping
>>>> variable (i.e., `id`), can we use `corSymm()`? And if we do, what would
>>>> be the dimensions of the resultant unstructured residual correlation
>>>> matrix for our data; a 3x3 or a 4x4 matrix?
>>>>
>>>> Thank you for your expertise,
>>>> Joe
>>>>
>>>> # Data and R Code
>>>> dat <- read.csv("https://raw.githubusercontent.com/hkil/m/master/un.csv")
>>>>
>>>> library(nlme)
>>>>
>>>> fit <- lme(opp ~ time * ccog, random = ~1 | id,
>>>>            correlation = corSymm(form = ~ 1 | id),
>>>>            data = dat)
>>>>
>>>> Error:
>>>>   nlminb problem, convergence error code = 1
>>>>   message = false convergence (8)
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>
