[R-sig-ME] Mixed model effect for unreplicated design

Thierry Onkelinx th|erry@onke||nx @end|ng |rom |nbo@be
Fri Sep 22 14:57:03 CEST 2023


Dear Gianni,

After a quick look at the code, I think you miscoded the Sample. They
should have a unique identifier. You are reusing the identifiers between
fields.
df$Sample <- interaction(df$Field, df$Sample) should fix that.

Note that your data can only calculate the differences between these three
specific fields. You cannot generalise the differences as effects from the
treatment of the fields. One sample at eight different fields per treatment
would have been a better option than eight samples at one field per
treatment.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

<https://www.inbo.be>


Op vr 22 sep 2023 om 14:45 schreef Gianni Micucci <GVM962 using student.bham.ac.uk
>:

> Dear all,
> Thanks for taking the time to read and help if you can,
> I'll try to be as clear as I can, I am not a stat pro.
>
> I basically study three different fields (agricultural, herbal, pasture)
> and measured nitrate content (8 replicate per field) in 8 different month,
> which results in 192 observations.
> I am able to plot the evolution of nitrate content as a function of time
> for the three fields, I would like to know if these fields differ
> statistically in terms of Nitrate content.
> I was told I cannot do ANOVA since these 3 types of fields are not
> replicated and thus my measures are not independent. I was advise to go for
> mixed models.
> I am afraid the model is however misspecified and don't know if the
> between field contrasts are being generated.
> I wrote a code that generates data, plot them and do the mixed model,
> please see below.
> If anyone could advise me that would be great.
>
> Many thanks again for your help
> Best regards
> Gianni Micucci
>
>
> P.S. In addition I get the following warning message:
> boundary (singular) fit: see help('isSingular')
> when I run the model over the simulated data but not on the real ones, not
> sure why.
>
>
>
> ####### Load necessary libraries #####
> library(tibble)
> library(tidyr)
> library(dplyr)
>
> ###### Generate data #####
> # Set random seed for reproducibility
>
>
> # Define the number of Field, time points, and unique samples per pond
> num_fields <- 3
> num_time_points <- 8
> num_unique_samples_per_field <- 8
>
> # Set random seed for reproducibility
> set.seed(123)
>
>
> # Create an empty data frame
> simulated_data <- data.frame()
>
> # Generate random data
> for (field in 1:num_fields) {
>   for (time in 1:num_time_points) {
>     sampled_samples <- sample(1:num_unique_samples_per_field,
> num_unique_samples_per_field, replace = FALSE)
>     nitrate <- rnorm(num_unique_samples_per_field, mean = 50, sd = 10)
>     field_data <- data.frame(
>       Field = factor(field),
>       Sample = factor(sampled_samples),
>       Time = factor(time),
>       Nitrate = nitrate
>     )
>     simulated_data <- rbind(simulated_data, field_data)
>   }
> }
>
> # Add interaction effect to simulate differences between fields over time
> simulated_data$Nitrate <- simulated_data$Nitrate +
>   as.numeric(simulated_data$Field) * 5 +    # Field-specific effect
>   as.numeric(simulated_data$Time) * 0.5     # Time-specific effect
> df <- simulated_data
>
> # Re-organize as per my data
> df <- df %>%
>   arrange(Field, Time, Sample) %>%
>   mutate(Field = factor(Field, levels = c(1, 2, 3), labels = c("Herbal",
> "Arable", "Pasture"))) %>%
>   rename(Month = Time) %>%
>   mutate(Month = factor(Month, levels = 1:8, labels = c("April", "May",
> "June", "July", "August", "September", "October", "November"))) %>%
>   select(Field, Month, Sample, Nitrate)
>
>
> # View the first few rows of the simulated dataset
> head(df)
>
>
> str(df)
>
> # Descriptors
> field_summary <- df %>%
>   group_by(Field) %>%
>   summarise(
>     Count_Samples = n(),
>     Mean_Nitrate = mean(Nitrate),
>     Variance_Nitrate = var(Nitrate)
>   )
>
> # Print the summary
> print(field_summary)
>
>
> ##### Plot ######
> library(ggplot2)
> # field with time coloured
> ggplot(df, aes(x = Field, y = Nitrate)) +
>   geom_boxplot(outlier.shape = NA, width = 0.5) +  # Create a single
> boxplot for each field
>   geom_jitter(aes(color = Time), width = 0.2, alpha = 0.5) +  # Overlay
> data points with jitter and color by Time
>   labs(x = "Field", y = "Nitrate")
>
> #Time with field coloured
>
> ggplot(df, aes(x = Time, y = Nitrate)) +
>   geom_boxplot(outlier.shape = NA, width = 0.5) +  # Create a single
> boxplot for each field
>   geom_jitter(aes(color = Field), width = 0.2, alpha = 0.5) +  # Overlay
> data points with jitter and color by Time
>   labs(x = "Time", y = "Nitrate")
>
> #plot errorbar x time with three fields on the plot
> # Load necessary libraries
>
> # Calculate means and standard errors for Nitrate by Time and Field
> summary_data <- df %>%
>   group_by(Month, Field) %>%
>   summarize(
>     Mean_Nitrate = mean(Nitrate),
>     SE_Nitrate = sd(Nitrate) / sqrt(n())
>   )
>
> # Create the error bar plot
> ggplot(summary_data, aes(x = Month, y = Mean_Nitrate, group = Field, color
> = Field)) +
>   geom_errorbar(aes(ymin = Mean_Nitrate - SE_Nitrate, ymax = Mean_Nitrate
> + SE_Nitrate), width = 0.2) +
>   geom_line() +
>   geom_point(size = 3) +
>   labs(
>     title = "Error Bar Plot of Nitrate by Month and Field",
>     x = "Month",
>     y = "Mean Nitrate",
>     color = "Field"
>   ) +
>   theme(legend.position = "top")
>
>
> ###### Mixed model #####
>
> modele_mixte <- lmer(Nitrate ~ Field + (1 | Month) + (1|Sample), data = df)
> summary(modele_mixte)
>
>
> library(car)
> Pvals <- Anova(modele_mixte)
>
> print(Pvals)
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list