LING 3300 - Week 9: Linear mixed effects modelling

First step: let’s load in the data and required packages. This is the standard McMurray & Jongman dataset of fricative and vowel measures that we’ve been using this term.

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.1     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.2
## ✔ purrr     1.2.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(yarrr)

## Loading required package: jpeg
## Loading required package: BayesFactor
## Loading required package: coda
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## 
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
## 
## ************
## Welcome to BayesFactor 0.9.12-4.7. If you have questions, please contact Richard Morey (richarddmorey@gmail.com).
## 
## Type BFManual() to open the manual.
## ************
## Loading required package: circlize
## ========================================
## circlize version 0.4.17
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: https://jokergoo.github.io/circlize_book/book/
## 
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization
##   in R. Bioinformatics 2014.
## 
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(circlize))
## ========================================
## 
## 
## Attaching package: 'yarrr'
## 
## The following object is masked from 'package:ggplot2':
## 
##     diamonds

mcm <- read_csv("/Users/Thomas/Library/CloudStorage/OneDrive-YorkUniversity/LING 3300/Datasets/McMurrayJongmanFricativeAcoustics.csv")

## Rows: 2880 Columns: 29
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (3): Talker, Fricative, Vowel
## dbl (26): ID, Rep, F0, F1, F2, F3, F4, F5, maxpf, rms_f, rms_v, dur_f, dur_v...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Let’s also mark all the fricatives for place of articulation and voicing:

# Assigning 4 place of articulation levels
mcm$poa <- ifelse(mcm$Fricative %in% c("f", "v"), "labiodental", ifelse(mcm$Fricative %in% c("th", "dh"), "interdental", ifelse(mcm$Fricative%in% c("s", "z"), "alveolar", "postalveolar")))

# Assigning voiced vs. voiceless in a column
mcm$voicing <- ifelse(mcm$Fricative == "dh" | mcm$Fricative == "v" | mcm$Fricative == "z" | mcm$Fricative == "zh", "voiced", "voiceless")

Up to now, we’ve been taking our raw datasets and wrangling them into independent observations that fulfill the assumptions of linear regressions. In the fricatives experiment, participants uttered each fricative many times; we therefore have been finding the by-talker, by-fricative means before running our models, like so:

talker_means <- mcm %>% 
  group_by(Talker, Fricative) %>% 
  summarise(mean_maxpf = mean(na.omit(maxpf)),
            mean_M1 = mean(na.omit(M1)))

## `summarise()` has regrouped the output.
## ℹ Summaries were computed grouped by Talker and Fricative.
## ℹ Output is grouped by Talker.
## ℹ Use `summarise(.groups = "drop_last")` to silence this message.
## ℹ Use `summarise(.by = c(Talker, Fricative))` for per-operation grouping
##   (`?dplyr::dplyr_by`) instead.

While this gets our data into a format where all the data points are independent, a drawback of this is that we lose statistical power because we now have far fewer datapoints than were present in the raw dataset. Mixed effects modelling introduces random effects which allow us to use the raw dataset (which is much more powerful because of its greater degrees of freedom) but also account for the inter-dependence of data points by talker.

Let’s just look at alveolar and interdental fricatives, just as in Tutorial 8.

# Filter so we just have the alveolars and interdentals for now
mcm2 <- mcm %>%
  filter(poa == "alveolar" | poa == "interdental")

# Let's look at it as a pirate plot
pirateplot(M1 ~ voicing * poa, mcm2, inf.method = 'ci')

Now, instead of using our dataframe of by-speaker means, let’s use the raw dataset to make our model. The difference from our previous models is that we’ll be using the lmer() function from the lmerTest package. If you don’t already have it, install the package; then load it.

# Install lmerTest
# install.packages("lmerTest")

# Load lmerTest
library(lmerTest)

## Loading required package: lme4

## 
## Attaching package: 'lmerTest'

## The following object is masked from 'package:lme4':
## 
##     lmer

## The following object is masked from 'package:stats':
## 
##     step

According to Schertz Chapter 11, you should include a random intercept if you have multiple observations per participant, but each participant is present in only one level of predictor. That would look like this:

# Making model with random intercept
model <- lmer(M1 ~ voicing * poa + (1|Talker), mcm2)
summary(model)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: M1 ~ voicing * poa + (1 | Talker)
##    Data: mcm2
## 
## REML criterion at convergence: 24301.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.6675 -0.4645  0.0382  0.5831  3.8037 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Talker   (Intercept)  393817   627.5  
##  Residual             1279920  1131.3  
## Number of obs: 1437, groups:  Talker, 20
## 
## Fixed effects:
##                                 Estimate Std. Error       df t value Pr(>|t|)
## (Intercept)                      6768.82     152.50    24.26  44.386   <2e-16
## voicingvoiceless                   92.19      84.44  1414.01   1.092    0.275
## poainterdental                  -1874.56      84.44  1414.01 -22.199   <2e-16
## voicingvoiceless:poainterdental  2066.27     119.38  1414.01  17.308   <2e-16
##                                    
## (Intercept)                     ***
## voicingvoiceless                   
## poainterdental                  ***
## voicingvoiceless:poainterdental ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) vcngvc pntrdn
## voicngvclss -0.277              
## poaintrdntl -0.277  0.500       
## vcngvclss:p  0.196 -0.707 -0.707

According to Schertz, you should include a random slope if you have multiple observations per participant, and each participant is in both levels of the predictor. That’s the case here – each participant produced both voiced and voiceless fricatives, and both interdental and alveolar fricatives. Ideally, your model might look something like this:

# Making model with random slope
model <- lmer(M1 ~ voicing * poa + (1 + voicing|Talker) + (1 + poa|Talker), mcm2)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.00711232 (tol = 0.002, component 1)
##   See ?lme4::convergence and ?lme4::troubleshooting.

summary(model)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: M1 ~ voicing * poa + (1 + voicing | Talker) + (1 + poa | Talker)
##    Data: mcm2
## 
## REML criterion at convergence: 23727.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.3298 -0.4250  0.0460  0.4941  4.5366 
## 
## Random effects:
##  Groups   Name             Variance Std.Dev. Corr 
##  Talker   (Intercept)       697660   835.3        
##           voicingvoiceless  414425   643.8   -1.00
##  Talker.1 (Intercept)      1063570  1031.3        
##           poainterdental   1534061  1238.6   -0.99
##  Residual                   809103   899.5        
## Number of obs: 1437, groups:  Talker, 20
## 
## Fixed effects:
##                                 Estimate Std. Error       df t value Pr(>|t|)
## (Intercept)                      6770.86     300.53    34.23  22.530  < 2e-16
## voicingvoiceless                   90.25     158.84    22.90   0.568    0.575
## poainterdental                  -1876.81     284.98    20.10  -6.586 1.99e-06
## voicingvoiceless:poainterdental  2068.42      94.92  1376.06  21.792  < 2e-16
##                                    
## (Intercept)                     ***
## voicingvoiceless                   
## poainterdental                  ***
## voicingvoiceless:poainterdental ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) vcngvc pntrdn
## voicngvclss -0.608              
## poaintrdntl -0.768  0.050       
## vcngvclss:p  0.079 -0.299 -0.167
## optimizer (nloptwrap) convergence code: 0 (OK)
## Model failed to converge with max|grad| = 0.00711232 (tol = 0.002, component 1)
##   See ?lme4::convergence and ?lme4::troubleshooting.

However, you’ll note the warning here that this model fails to converge. Oops! This is because we’ve stuffed too much into the model, given the amount of data that we have. It would also be bad if we got a “boundary (singular) fit” message when we run the model. In this case, we need to make a decision about what might be most useful to keep in. Since we can see from the visualization that we still want the interaction of the main effects, it makes sense here for us to just choose one of the random slopes and delete the other one. Hard to say which is better, but you can see that the p-values slightly change depending on which one you run:

# Making model with one random slope
model1 <- lmer(M1 ~ voicing * poa + (1 + poa|Talker), mcm2)
summary(model1)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: M1 ~ voicing * poa + (1 + poa | Talker)
##    Data: mcm2
## 
## REML criterion at convergence: 23878.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0332 -0.4385  0.0745  0.5089  4.1508 
## 
## Random effects:
##  Groups   Name           Variance Std.Dev. Corr 
##  Talker   (Intercept)    1220197  1104.6        
##           poainterdental 1530454  1237.1   -0.88
##  Residual                 910550   954.2        
## Number of obs: 1437, groups:  Talker, 20
## 
## Fixed effects:
##                                 Estimate Std. Error       df t value Pr(>|t|)
## (Intercept)                      6771.74     252.08    19.78  26.863  < 2e-16
## voicingvoiceless                   89.62      71.23  1395.01   1.258    0.208
## poainterdental                  -1877.82     285.65    20.24  -6.574 1.98e-06
## voicingvoiceless:poainterdental  2069.17     100.69  1395.01  20.549  < 2e-16
##                                    
## (Intercept)                     ***
## voicingvoiceless                   
## poainterdental                  ***
## voicingvoiceless:poainterdental ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) vcngvc pntrdn
## voicngvclss -0.141              
## poaintrdntl -0.871  0.125       
## vcngvclss:p  0.100 -0.707 -0.176

# Making model with one random slope
model2 <- lmer(M1 ~ voicing * poa + (1 + voicing|Talker), mcm2)
summary(model2)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: M1 ~ voicing * poa + (1 + voicing | Talker)
##    Data: mcm2
## 
## REML criterion at convergence: 24216.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7007 -0.4555  0.0484  0.5365  4.4238 
## 
## Random effects:
##  Groups   Name             Variance Std.Dev. Corr 
##  Talker   (Intercept)       793354   890.7        
##           voicingvoiceless  395592   629.0   -0.89
##  Residual                  1184335  1088.3        
## Number of obs: 1437, groups:  Talker, 20
## 
## Fixed effects:
##                                 Estimate Std. Error       df t value Pr(>|t|)
## (Intercept)                      6768.01     207.28    20.55  32.651   <2e-16
## voicingvoiceless                   92.81     162.41    24.83   0.571    0.573
## poainterdental                  -1873.60      81.23  1395.03 -23.065   <2e-16
## voicingvoiceless:poainterdental  2065.49     114.84  1395.03  17.986   <2e-16
##                                    
## (Intercept)                     ***
## voicingvoiceless                   
## poainterdental                  ***
## voicingvoiceless:poainterdental ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) vcngvc pntrdn
## voicngvclss -0.836              
## poaintrdntl -0.196  0.250       
## vcngvclss:p  0.139 -0.354 -0.707

In most cases, we proceed with interpreting our model outputs the same way as with our lm() regressions. No need to interpret the outputs regarding the random effects, but see Winter (2020) for more details if you wish.

The above models are dummy-coded (R’s default) but random effects can also be put into models with any other type of coding.

Practice with mixed models

With the fricatives dataset, consider just /f/ and /v/ (as in the Tutorial 8 practice). Instead of modelling the by-speaker means, try doing your modelling with the raw dataset and integrating a random intercept of Talker. Also try a random slope of Fricative by Talker.

LING 3300 - Week 9: Linear mixed effects modelling

Thomas Kettig

March 26, 2026

Practice with mixed models