Week 19. Workbook introduction to Generalized Linear Mixed-effects Models

Written by Rob Davies


Welcome to your overview of the work we will do together in Week 19.

We extend our understanding and skills by moving to examine data where the outcome variable is categorical: this is a context that requires the use of Generalized Linear Mixed-effects Models (GLMMs).

Important

Categorical outcomes cannot be analyzed using linear models, in whatever form, without making some important compromises.

You need to do something about the categorical nature of the outcome.

Targets

Here, we look at Generalized Linear Mixed-effects Models (GLMMs): we can use these models to analyze outcome variables of different kinds, including outcome variables like response accuracy that are coded using discrete categories (e.g. correct vs. incorrect).

In this workbook, and in our conceptual introduction, our aims are to:

  1. Recognize the limitations of alternative (traditional) methods for analyzing such outcomes.
  2. Understand the practical reasons for using GLMMs when we analyze discrete outcome variables.
  3. Practice running GLMMs with varying random effects structures.
  4. Practice reporting the results of GLMMs, including through the use of model plots.

Learning resources

Next, you will see the lectures we share to explain the concepts you will learn about and the practical data analysis skills you will develop. Then you will see information about the practical materials you can use to build and practise your skills.

Every week, you will learn best if you first watch the lectures and then do the practical exercises.

Linked resources
  1. We learned about multilevel structured data in the conceptual introduction to multilevel data and the workbook introduction to multilevel data.
  2. We then deepened our understanding by looking at the analysis of data from studies with repeated-measures designs in the conceptual introduction to linear mixed-effects models and the workbook introduction to mixed-effects models.
  3. We further extended our understanding and practice skills in the chapter on developing linear mixed-effects models and the corresponding workbook.

This workbook introduction on Generalized Linear Mixed-effects Models (GLMMs) is linked to the corresponding chapter where the explanation of ideas or of practical analysis steps is set out more extensively or in more depth.

Lectures

The lecture materials for this week are presented in three short parts.

Click on a link and your browser should open a tab showing the Panopto video for the lecture part.

  1. Part 1 (20 minutes): Understand the reasons for using Generalized Linear Mixed-effects Models (GLMMs); categorical outcomes like response accuracy, and the bad compromises involved when traditional methods are used instead of GLMMs to analyze them; understanding what the generalized part of GLMM involves.
  2. Part 2 (16 minutes): The questions, design and methods of the working data example study; category coding practicalities; specifying a GLMM appropriate to the design; the GLMM results summary, reading the results, and visualizing the results.
  3. Part 3 (19 minutes): What random effects should we include? Recognizing when a model has got into trouble, and what to do about it; general approaches to model specification; reporting model results.

Lecture slides

Download the lecture slides

You can download the lecture slides in two different versions:

The GLMM.pdf version is the version delivered for the lecture recordings. I also produced a six-slide-per-page version, GLMM_6pp.pdf, which should be easier to download and print out if that is what you want to do.

Practical materials: data and RStudio

We will be working with data collected for a study investigating word learning in children, reported by Ricketts et al. (2021). You will see that the study design has both a repeated measures aspect because each child is asked to respond to multiple stimuli, and a longitudinal aspect because responses are recorded at two time points. Because responses were observed to multiple stimuli for each child, and because responses were recorded at multiple time points, the data have a multilevel structure. These features require the use of mixed-effects models for analysis.

We will see, also, that the study involves the factorial manipulation of learning conditions. This means that, when you see the description of the study design, you will see embedded in it a 2 x 2 factorial design. You will be able to generalize from our work here to many other research contexts where psychologists conduct experiments in which conditions are manipulated according to a factorial design.

However, our focus now is on the fact that the outcome for analysis is the accuracy of the responses made by children to word targets in a spelling task. The categorical nature of accuracy as an outcome is the reason why we now turn to use Generalized Linear Mixed-effects Models.

You can read more about these data in the conceptual introduction chapter on GLMMs.

We addressed three research questions and tested predictions in relation to each question.

  1. Does the presence of orthography promote greater word learning?
  • We predicted that children would demonstrate greater orthographic learning for words that they had seen (orthography present condition) versus not seen (orthography absent condition).
  2. Will orthographic facilitation be greater when the presence of orthography is emphasized explicitly during teaching?
  • We expected to observe an interaction between instructions and orthography, with the highest levels of learning when the orthography present condition was combined with explicit instructions.
  3. Does word consistency moderate the orthographic facilitation effect?
  • For orthographic learning, we expected that the presence of orthography might be particularly beneficial for words with higher spelling-sound consistency, with learning highest when children saw and heard the word, and these codes provided overlapping information.
Important

Get the data: get the data file and the .R script you can use to do the exercises that will support your learning.

  • You can download the files folder for this chapter by clicking on the link 04-glmm.zip.

The practical materials folder includes data files and an .R script:

In this chapter, we will be working with the data about the orthographic post-test outcome for the Ricketts word learning study:

  • long.orth_2020-08-11.csv

The data file is collected together with the .R script:

  • 04-glmm-workbook.R – the workbook script you will need to do the practical exercises.

The data come from the Ricketts et al. (2021) study, and you can access the analysis code and data for that study, in full, at the OSF repository here.

Important

You can access the sign-in page for RStudio Server here.

Practical materials guide

The aims of the practical work are to:

  • Understand the reasons for using Generalized Linear Mixed-effects models (GLMMs) when we analyze discrete outcome variables.
  • Recognize the limitations of alternative methods for analyzing such outcomes.
  • Practice running GLMMs with varying random effects structures.
  • Practice reporting the results of GLMMs, including through the use of model plots.

My recommendations for learning are that you should aim to:

  • run GLMMs of demonstration data;
  • run GLMMs of alternate data sets;
  • play with the .R code used to create examples for the lecture;
  • and edit example code to create alternate visualizations.

The practical exercises

Now you will progress through a series of tasks, and challenges, to aid your learning.

Warning

We will work with the data file:

  • long.orth_2020-08-11.csv

We again split the steps into parts, tasks and questions.

We are going to work through the following workflow steps: each step is labelled as a practical part.

  1. Set-up
  2. Load the data
  3. Tidy data
  4. Analyze the data: random intercepts
  5. Analyze the data: model comparisons

In the following, we will guide you through the tasks and questions step by step.

Important

An answers version of the workbook will be provided after the practical class.

Practical Part 1: Set-up

To begin, we set up our environment in R.

Practical Task 1 – Run code to load relevant libraries

Use the library() function to make the functions we need available to you.

library(broom)
library(effects)
Loading required package: carData
lattice theme set by effectsTheme()
See ?effectsTheme for details.
library(gridExtra)
library(here)
here() starts at /Users/robdavies/Dropbox/teaching-statistics-year-MSc
library(lattice)
library(knitr)
library(lme4)
Loading required package: Matrix
library(MuMIn)
Registered S3 method overwritten by 'MuMIn':
  method        from 
  nobs.multinom broom
library(sjPlot)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::combine() masks gridExtra::combine()
✖ tidyr::expand()  masks Matrix::expand()
✖ dplyr::filter()  masks stats::filter()
✖ dplyr::lag()     masks stats::lag()
✖ tidyr::pack()    masks Matrix::pack()
✖ tidyr::unpack()  masks Matrix::unpack()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Practical Part 2: Load the data

Practical Task 2 – Read in the data file we will be using

Read the data file into R:

long.orth <- read_csv("long.orth_2020-08-11.csv", 
                      col_types = cols(
                        Participant = col_factor(),
                        Time = col_factor(),
                        Study = col_factor(),
                        Instructions = col_factor(),
                        Version = col_factor(),
                        Word = col_factor(),
                        Orthography = col_factor(),
                        Measure = col_factor(),
                        Spelling.transcription = col_factor()
                      )
                    )

You can see, here, that within the read_csv() function call, I specify col_types, instructing R how to treat a number of different variables.

  • You can read more about this convenient way to control the read-in process here.
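As a quick check after reading, you can ask readr to report the column specification it applied; a minimal sketch, assuming long.orth was read in as above:

# spec() lists each column with the type readr used, so you can confirm
# that the factors were read in as factors
spec(long.orth)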

Practical Part 3: Tidy the data

The data are already tidy: each column in long.orth_2020-08-11.csv corresponds to a variable and each row corresponds to an observation. However, we need to do a bit of work, before we can run any analyses, to fix the coding of the categorical predictor (or independent) variables: the factors Orthography, Instructions, and Time.

Practical Task 3 – Inspect the data

It is always a good idea to inspect what you have got when you read a data file into R.

summary(long.orth)

Some of the variables included in the .csv file are listed below, with information about value coding or calculation.

  • Participant – Participant identity codes were used to anonymize participation.
  • Time – Test time was coded 1 (time 1) or 2 (time 2). For the Study 1 longitudinal data, it can be seen that each participant identity code is associated with observations taken at test times 1 and 2.
  • Instructions – Variable coding for whether participants undertook training in the explicit or incidental conditions.
  • Word – Letter string values showing the words presented as stimuli to the children.
  • Orthography – Variable coding for whether participants had seen a word in training in the orthography absent or present conditions.
  • Consistency_H – Calculated orthography-to-phonology consistency value for each word.
  • zConsistency_H – Standardized Consistency_H scores.
  • Score – Outcome variable – for the orthographic post-test, responses were scored as 1 (correct, if the target spelling was produced in full) or 0 (incorrect, if the target spelling was not produced).

The summary will show you that we have a number of other variables available, including measures of individual differences in reading or reading-related abilities or knowledge, but we do not need to pay attention to them for our exercises.
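If the full summary feels like a lot, you can restrict it to just the variables we will use; a sketch, assuming the column names used in the model code later in this workbook:

# Summarize only the variables needed for the analyses that follow
long.orth %>%
  select(Participant, Time, Instructions, Word, Orthography, zConsistency_H, Score) %>%
  summary()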

Pract.Q.1. Look at the summaries for the variables Time, Instructions and Orthography. Assuming that the read_csv() action did what was required, you will see how R presents factor summaries by default. What do the variable summaries tell you about the factor level coding, and the number of observations in each level?

For a factor like Orthography, the column values code for whether the observations in a data row are associated with the condition absent or the condition present.

It is simpler to talk about absent or present as being levels of Orthography condition.

  • You can ask R what the levels of a factor are using: levels(long.orth$Orthography)

Pract.A.1. The summary shows:

Time (1, 655; 2, 608); Instructions (explicit, 592; incidental, 671); and Orthography (absent, 631; present, 632).

Pract.Q.2. Are there any surprises in the summary of factors?

We would hope to see equal numbers of observations at different levels for a factor.

Pract.A.2. Yes, it should be surprising: we do not have equal numbers of observations at the levels of each factor.
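To look at the imbalance more closely, we can cross-tabulate observations by condition; a quick check using base R:

# Counts of observations at each combination of factor levels
xtabs(~ Time + Instructions, data = long.orth)
xtabs(~ Orthography + Instructions, data = long.orth)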

Practical Task 4 – Fit a Generalized Linear Model

Fit a simple Generalized Linear Model with outcome (accuracy of response) Score, and Instructions as the predictor.

  • Note that you are ignoring random effects for this task.
summary(glm(Score ~ Instructions, family = "binomial", data = long.orth))

Pract.Q.3. What is the estimated effect of Instructions on Score?

Pract.A.3. The estimate is: Instructionsincidental -0.05932

Pract.Q.4. Can you briefly explain in words what the estimate says about how log odds Score (response correct vs. incorrect) changes in association with different Instructions conditions?

Pract.A.4. We can say, simply, that the log odds that a response would be correct were lower in the incidental compared to the explicit Instructions condition.

This task and these questions are designed to alert you to the challenges involved in estimating the effect of categorical variables, and of interpreting the effects estimates.

Tip

This model (and default coding) gives us an estimate of how the log odds of a child getting a response correct changes if we compare the responses in the explicit condition (here, treated as the baseline or reference level) with responses in the incidental condition.

R tells us about the estimate by adding the name of the factor level that is not the reference level, here, incidental to the name of the variable Instructions whose effect is being estimated.
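One way to make the estimate easier to interpret is to exponentiate it, turning the log-odds difference into an odds ratio; a minimal sketch, assuming the glm() shown above:

# Refit the simple GLM, then exponentiate the coefficients: exp() of a
# log-odds coefficient gives the multiplicative change in the odds of a
# correct response
simple.glm <- glm(Score ~ Instructions, family = "binomial", data = long.orth)
exp(coef(simple.glm))
# exp(-0.05932) is about 0.94: the odds of a correct response are about 6%
# lower in the incidental than in the explicit condition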

Practical Task 5 – Use the {memisc} library and recode the factors to use contrast sum (effect) coding

However, it is better not to use R’s default dummy coding scheme when we are analyzing data from a study involving two or more factors, and we want to estimate not just the main effects of the factors but also the effect of the interaction between them.

In our analyses, we want the coding that allows us to get estimates of the main effects of factors, and of the interaction effects, somewhat like what we would get from an ANOVA. This requires us to use effect coding.

We can code whether a response was recorded in the absent or present condition using numbers. In dummy coding, for any observation, we would use a column of zeroes or ones to code condition: i.e., absent (0) or present (1). In effect coding, for any observation, we would use a column of ones or minus ones to code condition: i.e., absent (-1) or present (1). (With a factor with more than two levels, we would use more than one column to do the coding: the number of columns we would use would equal the number of factor condition levels minus one.) In effect coding, observations coded -1 are in the reference level.

With effect coding, the constant (i.e., the intercept for our model) is equal to the grand mean of all the observed responses. And the coefficient of each of the effect variables is equal to the difference between the mean of the group coded 1 and the grand mean.

  • You can read more about effect coding here or here.
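If it helps to see the two schemes side by side, here is a toy illustration (not part of the analysis) using base R’s contrast functions and a small made-up dataset:

# Dummy vs. effect coding matrices for a two-level factor
stats::contr.treatment(2)  # dummy coding: level 1 = 0, level 2 = 1
stats::contr.sum(2)        # effect coding: level 1 = +1, level 2 = -1

# With balanced groups and effect coding, the intercept equals the grand mean
toy <- data.frame(
  condition = factor(rep(c("absent", "present"), each = 3)),
  y = c(2, 3, 4, 6, 7, 8)
)
contrasts(toy$condition) <- stats::contr.sum(2)
coef(lm(y ~ condition, data = toy))
# Intercept = 5, the grand mean of y; slope = -2, the difference between
# the mean of the group coded +1 (absent, mean 3) and the grand mean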

We follow recommendations to use sum contrast coding for the experimental factors. Further, to make interpretation easier, we want the coding to work so that, for both the Orthography and Instructions conditions, doing something is the “high” level of the factor – hence:

  • Orthography, absent (-1) vs. present (+1)
  • Instructions, incidental (-1) vs. explicit (+1)
  • Time, test time 1 (-1) vs. time 2 (+1)

We use a modified version of the contr.sum() function (provided in the {memisc} library) that allows us to define the base or reference level for the factor manually (see documentation).

We need to start by loading the {memisc} library.

library(memisc)

In some circumstances, this can create warnings.

  • You can see information about the potential warnings here.

Pract.Q.5. Can you work out how to use the contr.sum() function to change the coding of the categorical variables (factors) in the example data-set from dummy to effects coding?

  • It would be sensible to examine how R sees the coding of the different levels for each factor, before and then after you use contr.sum() to use that coding.

In the following sequence, I first check how R codes the levels of each factor by default, I then change the coding, and check that the change gets me what I want.

We want effects coding for the orthography condition factor, with orthography condition coded as -1, +1. Check the coding.

contrasts(long.orth$Orthography)
        present
absent        0
present       1

You can see that Orthography condition is initially coded, by default, using dummy coding: absent (0); present (1). We want to change the coding, then check that we have got what we want.

contrasts(long.orth$Orthography) <- contr.sum(2, base = 1)
contrasts(long.orth$Orthography)
         2
absent  -1
present  1

We want effects coding for the Instructions condition factor, with Instructions condition coded as -1, +1. Check the coding.

contrasts(long.orth$Instructions)
           incidental
explicit            0
incidental          1

Change it.

contrasts(long.orth$Instructions) <- contr.sum(2, base = 2)
contrasts(long.orth$Instructions)
            1
explicit    1
incidental -1

We want effects coding for the Time factor, with Time coded as -1, +1. Check the coding.

contrasts(long.orth$Time)
  2
1 0
2 1

Change it.

contrasts(long.orth$Time) <- contr.sum(2, base = 1)
contrasts(long.orth$Time)
   2
1 -1
2  1

In these chunks of code, I use contr.sum(a, base = b) to do the coding, where a is the number of levels in a factor (replace a with the right number), and b tells R which level to use as the baseline or reference level (replace b with the right number). I usually need to check the coding before and after I specify it.

Pract.Q.6. Experiment: what happens if you change the first number for one of the factors?

Run the following code example, and reflect on what you see:

contrasts(long.orth$Time) <- contr.sum(3, base = 1)
contrasts(long.orth$Time)

Pract.A.6. Changing the first number from 2 will result in an error, because the contrast matrix must have one row per factor level: the first number must match the number of levels in the factor.

Pract.Q.7. Experiment: what happens if you change the base number for one of the factors?

Run the following code example, and reflect on what you see:

contrasts(long.orth$Time) <- contr.sum(2, base = 1)
contrasts(long.orth$Time)

Pract.A.7. Changing the base number changes which level is treated as the reference, and hence which level gets coded as -1 vs. +1. (Here, base = 1 simply reproduces the coding we already set for Time; try base = 2 to see the coding flip.)

Practical Part 4: Analyze the data: random intercepts

Practical Task 6 – Take a quick look at a random intercepts model

Specify a model with:

  1. The correct function
  2. Outcome = Score
  3. Predictors including:
  • Time + Orthography + Instructions + zConsistency_H +
  • Orthography:Instructions +
  • Orthography:zConsistency_H +
  4. Plus random effects:
  • Random effect of participants (Participant) on intercepts
  • Random effect of items (Word) on intercepts
  5. Using the "bobyqa" optimizer

Run the code to fit the model and get a summary of results.

long.orth.min.glmer <- glmer(Score ~ 
                               
                          Time + Orthography + Instructions + zConsistency_H + 
                               
                          Orthography:Instructions +
                               
                          Orthography:zConsistency_H +
                               
                          (1 | Participant) + 
                               
                          (1 |Word),
                             
                          family = "binomial", 
                          glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=2e5)),
                             
                          data = long.orth)

summary(long.orth.min.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (1 |  
    Participant) + (1 | Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1040.4    1086.7    -511.2    1022.4      1254 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-5.0994 -0.4083 -0.2018  0.2019  7.4940 

Random effects:
 Groups      Name        Variance Std.Dev.
 Participant (Intercept) 1.840    1.357   
 Word        (Intercept) 2.224    1.491   
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.878464   0.443942  -4.231 2.32e-05 ***
Time2                        0.050136   0.083325   0.602    0.547    
Orthography2                 0.455009   0.086813   5.241 1.59e-07 ***
Instructions1                0.042290   0.230335   0.184    0.854    
zConsistency_H              -0.618092   0.384002  -1.610    0.107    
Orthography2:Instructions1   0.005786   0.083187   0.070    0.945    
Orthography2:zConsistency_H  0.014611   0.083105   0.176    0.860    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.008                                   
Orthogrphy2 -0.059  0.008                            
Instructns1  0.014  0.025  0.001                     
zCnsstncy_H  0.016 -0.002 -0.029 -0.001              
Orthgrp2:I1 -0.002 -0.001  0.049 -0.045  0.000       
Orthgr2:C_H -0.027  0.001  0.179  0.000 -0.035 -0.007

Pract.Q.8. Experiment: replace the fixed effects with another variable, e.g. mean_z_read (an aggregate measure of reading skill), to get a feel for how the code works.

  • What is the estimate for the mean_z_read effect?
long.orth.min.glmer.expt <- glmer(Score ~ mean_z_read +

                               (1 | Participant) +

                               (1 |Word),

                             family = "binomial",
                             glmerControl(optimizer="bobyqa", 
                                          optCtrl=list(maxfun=2e5)),

                             data = long.orth)

summary(long.orth.min.glmer.expt)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ mean_z_read + (1 | Participant) + (1 | Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1000.4    1021.0    -496.2     992.4      1259 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.1454 -0.4180 -0.2111  0.2543  7.3457 

Random effects:
 Groups      Name        Variance Std.Dev.
 Participant (Intercept) 0.1363   0.3692  
 Word        (Intercept) 2.3481   1.5324  
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -1.7967     0.4027  -4.461 8.14e-06 ***
mean_z_read   1.5090     0.1522   9.913  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
mean_z_read -0.121

Pract.A.8. The estimate is: mean_z_read 1.5090

Pract.Q.9. Can you briefly indicate in words what the estimate says about how log odds Score (response correct vs. incorrect) changes in association with mean_z_read score?

Pract.A.9. We can say, simply, that log odds that a response would be correct were higher for increasing values of mean_z_read.
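Again, we can express the log-odds estimate as an odds ratio to aid interpretation; a quick check, assuming the model object fitted above:

# fixef() extracts the fixed-effect estimates from a glmer() model
exp(fixef(long.orth.min.glmer.expt)["mean_z_read"])
# exp(1.509) is about 4.5: each one-unit increase in mean_z_read is
# associated with roughly 4.5 times the odds of a correct response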

Practical Task 7 – Visualize the effects of the Orthography and consistency predictors

Fit the first model you were asked to fit.

long.orth.min.glmer <- glmer(Score ~ 
                               
                          Time + Orthography + Instructions + zConsistency_H + 
                               
                          Orthography:Instructions +
                               
                          Orthography:zConsistency_H +
                               
                          (1 | Participant) + 
                               
                          (1 |Word),
                             
                          family = "binomial", 
                          glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=2e5)),
                             
                          data = long.orth)

summary(long.orth.min.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (1 |  
    Participant) + (1 | Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1040.4    1086.7    -511.2    1022.4      1254 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-5.0994 -0.4083 -0.2018  0.2019  7.4940 

Random effects:
 Groups      Name        Variance Std.Dev.
 Participant (Intercept) 1.840    1.357   
 Word        (Intercept) 2.224    1.491   
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.878464   0.443942  -4.231 2.32e-05 ***
Time2                        0.050136   0.083325   0.602    0.547    
Orthography2                 0.455009   0.086813   5.241 1.59e-07 ***
Instructions1                0.042290   0.230335   0.184    0.854    
zConsistency_H              -0.618092   0.384002  -1.610    0.107    
Orthography2:Instructions1   0.005786   0.083187   0.070    0.945    
Orthography2:zConsistency_H  0.014611   0.083105   0.176    0.860    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.008                                   
Orthogrphy2 -0.059  0.008                            
Instructns1  0.014  0.025  0.001                     
zCnsstncy_H  0.016 -0.002 -0.029 -0.001              
Orthgrp2:I1 -0.002 -0.001  0.049 -0.045  0.000       
Orthgr2:C_H -0.027  0.001  0.179  0.000 -0.035 -0.007

Then produce plots to show the predicted change in outcome, given the model effects estimates.

  • Can you work out how to do this?

There are different ways to complete this task. Two convenient methods involve:

  • using the {sjPlot} plot_model() function;
  • using the {effects} library effect() function.

First, using the {sjPlot} plot_model() function:

porth <- plot_model(long.orth.min.glmer,
           type="pred",
           terms = "Orthography") +
         theme_bw() +
         ggtitle("Predicted probability") +
         ylim(0,1)
You are calculating adjusted predictions on the population-level (i.e.
  `type = "fixed"`) for a *generalized* linear mixed model.
  This may produce biased estimates due to Jensen's inequality. Consider
  setting `bias_correction = TRUE` to correct for this bias.
  See also the documentation of the `bias_correction` argument.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
pzconsH <- plot_model(long.orth.min.glmer,
           type="pred",
           terms = "zConsistency_H") +
         theme_bw() +
         ggtitle("Predicted probability") +
         ylim(0,1)
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
grid.arrange(porth, pzconsH,
            ncol=2)

Second, using the {effects} library effect() function:

porth <- plot(effect("Orthography", mod = long.orth.min.glmer))
NOTE: Orthography is not a high-order term in the model
pzconsH <- plot(effect("zConsistency_H", mod = long.orth.min.glmer))
NOTE: zConsistency_H is not a high-order term in the model
grid.arrange(porth, pzconsH,
            ncol=2)

Practical Task 8 – Pick a different predictor variable to visualize

Notice that in the preceding code chunks, I assign the plot objects to names and then use grid.arrange() to present the named plots in grids.

  • To produce and show a single plot, don’t do that: just adapt and use the plot functions, as shown in the next code example.
plot_model(long.orth.min.glmer,
           type="pred",
           terms = "Instructions") +
  theme_bw() +
  ggtitle("Predicted probability") +
  ylim(0,1)
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

Practical Task 9 – Can you figure out how to plot interaction effects?

We can get a plot showing the predicted impact of the Instructions x Orthography interaction effect as follows.

plot_model(long.orth.min.glmer,
           type="pred",
           terms = c("Instructions", "Orthography")) +
  theme_bw() +
  ggtitle("Predicted probability") +
  ylim(0,1)
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

Practical Part 5: Analyze the data: model comparisons

Practical Task 10 – We are now going to fit a series of models and evaluate their relative fit

Note that the models all have the same fixed effects.

Note also that, while we will be comparing models varying in random effects, we are not going to use REML=TRUE.

  • See here for why:

https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#reml-for-glmms

First, fit the minimal random intercepts model.

Specify a model with:

  1. The correct function
  2. Outcome = Score
  3. Predictors =
  • Time + Orthography + Instructions + zConsistency_H +
  • Orthography:Instructions +
  • Orthography:zConsistency_H +
  4. Plus random effects:
  • Random effect of participants (Participant) on intercepts
  • Random effect of items (Word) on intercepts
  5. Using the "bobyqa" optimizer

Run the code to fit the model and get a summary of results, as shown in the chapter.

long.orth.min.glmer <- glmer(Score ~ 
                               
                            Time + Orthography + Instructions + zConsistency_H + 
                               
                            Orthography:Instructions +
                               
                            Orthography:zConsistency_H +
                               
                               (1 | Participant) + 
                               
                               (1 |Word),
                             
                            family = "binomial", 
                            glmerControl(optimizer="bobyqa", 
                                         optCtrl=list(maxfun=2e5)),
                             
                             data = long.orth)

summary(long.orth.min.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (1 |  
    Participant) + (1 | Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1040.4    1086.7    -511.2    1022.4      1254 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-5.0994 -0.4083 -0.2018  0.2019  7.4940 

Random effects:
 Groups      Name        Variance Std.Dev.
 Participant (Intercept) 1.840    1.357   
 Word        (Intercept) 2.224    1.491   
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.878464   0.443942  -4.231 2.32e-05 ***
Time2                        0.050136   0.083325   0.602    0.547    
Orthography2                 0.455009   0.086813   5.241 1.59e-07 ***
Instructions1                0.042290   0.230335   0.184    0.854    
zConsistency_H              -0.618092   0.384002  -1.610    0.107    
Orthography2:Instructions1   0.005786   0.083187   0.070    0.945    
Orthography2:zConsistency_H  0.014611   0.083105   0.176    0.860    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.008                                   
Orthogrphy2 -0.059  0.008                            
Instructns1  0.014  0.025  0.001                     
zCnsstncy_H  0.016 -0.002 -0.029 -0.001              
Orthgrp2:I1 -0.002 -0.001  0.049 -0.045  0.000       
Orthgr2:C_H -0.027  0.001  0.179  0.000 -0.035 -0.007

Second, fit the maximal model: this will take several seconds to run.

Specify a model with:

  1. The correct function
  2. Outcome = Score
  3. Predictors =
  • Time + Orthography + Instructions + zConsistency_H +
  • Orthography:Instructions +
  • Orthography:zConsistency_H +
  4. Plus random effects:
  • Random effect of participants (Participant) on intercepts
  • Random effect of items (Word) on intercepts
  5. Plus:
  • Random effects of participants on the slopes of within-participant effects
  • Random effects of items on the slopes of within-item effects
  6. Plus:
  • Allow for covariance between random intercepts and random slopes
  7. Using the "bobyqa" optimizer

Run the code to fit the model and get a summary of results, as shown in the chapter.

long.orth.max.glmer <- glmer(Score ~ 
                           
                      Time + Orthography + Instructions + zConsistency_H + 
                           
                      Orthography:Instructions +
                           
                      Orthography:zConsistency_H +
                           
                      (Time + Orthography + zConsistency_H + 1 | Participant) + 
                           
                      (Time + Orthography + Instructions + 1 |Word),
                         
                      family = "binomial",
                      glmerControl(optimizer="bobyqa", 
                                   optCtrl=list(maxfun=2e5)),
                         
                         data = long.orth)
boundary (singular) fit: see help('isSingular')
summary(long.orth.max.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (Time +  
    Orthography + zConsistency_H + 1 | Participant) + (Time +  
    Orthography + Instructions + 1 | Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1053.6    1192.4    -499.8     999.6      1236 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.1014 -0.4027 -0.1723  0.2037  7.0331 

Random effects:
 Groups      Name           Variance Std.Dev. Corr             
 Participant (Intercept)    2.043127 1.42938                   
             Time2          0.005675 0.07533   0.62            
             Orthography2   0.079980 0.28281   0.78 -0.01      
             zConsistency_H 0.065576 0.25608   0.49  0.99 -0.16
 Word        (Intercept)    2.793448 1.67136                   
             Time2          0.046736 0.21618   0.14            
             Orthography2   0.093740 0.30617  -0.68 -0.81      
             Instructions1  0.212706 0.46120  -0.74 -0.05  0.38
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                            Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -2.10099    0.49589  -4.237 2.27e-05 ***
Time2                        0.02077    0.12285   0.169 0.865740    
Orthography2                 0.52480    0.15496   3.387 0.000708 ***
Instructions1                0.24467    0.27281   0.897 0.369805    
zConsistency_H              -0.67818    0.36311  -1.868 0.061803 .  
Orthography2:Instructions1  -0.05133    0.10004  -0.513 0.607907    
Orthography2:zConsistency_H  0.05850    0.11634   0.503 0.615064    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.097                                   
Orthogrphy2 -0.280 -0.187                            
Instructns1 -0.278  0.023  0.109                     
zCnsstncy_H  0.062  0.005 -0.064  0.120              
Orthgrp2:I1  0.024  0.005 -0.071  0.212 -0.065       
Orthgr2:C_H -0.062  0.049  0.246 -0.031 -0.440  0.001
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

Then fit models, building on the first minimal (random intercepts) model: adding one extra random effect at a time.

Add code to include a random effect of participants on the slopes of the effect of Orthography.

  • And add code to include a random effect of items on the slopes of the effect of Orthography.

Run the code to fit the model and get a summary of results, as shown in the chapter.

long.orth.2.glmer <- glmer(Score ~ 
                             Time + Orthography + Instructions + zConsistency_H + 
                             
                             Orthography:Instructions +
                             
                             Orthography:zConsistency_H +
                             
                             (dummy(Orthography) + 1 || Participant) + 
                             
                             (dummy(Orthography) + 1 || Word),
                           
                           family = "binomial", 
                           glmerControl(optimizer="bobyqa", 
                                        optCtrl=list(maxfun=2e5)),
                           
                           data = long.orth)

summary(long.orth.2.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (dummy(Orthography) +  
    1 || Participant) + (dummy(Orthography) + 1 || Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1041.0    1097.6    -509.5    1019.0      1252 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9373 -0.4132 -0.1952  0.1826  6.9614 

Random effects:
 Groups        Name               Variance Std.Dev.
 Participant   (Intercept)        1.57092  1.2534  
 Participant.1 dummy(Orthography) 0.57624  0.7591  
 Word          (Intercept)        2.36284  1.5372  
 Word.1        dummy(Orthography) 0.02101  0.1450  
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.9025027  0.4512443  -4.216 2.49e-05 ***
Time2                        0.0501974  0.0837186   0.600 0.548775    
Orthography2                 0.4135727  0.1120792   3.690 0.000224 ***
Instructions1                0.0455920  0.2234615   0.204 0.838333    
zConsistency_H              -0.6254414  0.3958971  -1.580 0.114151    
Orthography2:Instructions1   0.0019343  0.1038694   0.019 0.985142    
Orthography2:zConsistency_H -0.0007112  0.0877740  -0.008 0.993535    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.007                                   
Orthogrphy2  0.034  0.007                            
Instructns1  0.013  0.025  0.007                     
zCnsstncy_H  0.017 -0.002 -0.024 -0.002              
Orthgrp2:I1  0.003  0.001  0.043  0.126  0.000       
Orthgr2:C_H -0.029  0.001  0.182  0.001 -0.024 -0.011

Add code to include a random effect of items on the slopes of the effect of Instructions.

Run the code to fit the model and get a summary of results, as shown in the chapter.

long.orth.3.glmer <- glmer(Score ~ 
                             Time + Orthography + Instructions + zConsistency_H + 
                             
                             Orthography:Instructions +
                             
                             Orthography:zConsistency_H +
                             
                          (dummy(Orthography) + 1 || Participant) + 
                             
                          (dummy(Orthography) + dummy(Instructions) + 1 || Word),
                           
                           family = "binomial", 
                           glmerControl(optimizer="bobyqa", 
                                        optCtrl=list(maxfun=2e5)),
                           
                           data = long.orth)

summary(long.orth.3.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (dummy(Orthography) +  
    1 || Participant) + (dummy(Orthography) + dummy(Instructions) +  
    1 || Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1036.5    1098.2    -506.2    1012.5      1251 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9951 -0.4068 -0.1920  0.1838  5.9308 

Random effects:
 Groups        Name                Variance Std.Dev.
 Participant   (Intercept)         1.64393  1.2822  
 Participant.1 dummy(Orthography)  0.55604  0.7457  
 Word          (Intercept)         1.94313  1.3940  
 Word.1        dummy(Orthography)  0.01607  0.1268  
 Word.2        dummy(Instructions) 0.86694  0.9311  
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.9621261  0.4400649  -4.459 8.25e-06 ***
Time2                        0.0507347  0.0843237   0.602  0.54740    
Orthography2                 0.4263703  0.1114244   3.827  0.00013 ***
Instructions1                0.1907423  0.2632605   0.725  0.46874    
zConsistency_H              -0.6270900  0.3669904  -1.709  0.08750 .  
Orthography2:Instructions1  -0.0265729  0.1048392  -0.253  0.79991    
Orthography2:zConsistency_H -0.0006305  0.0878823  -0.007  0.99428    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.007                                   
Orthogrphy2  0.028  0.008                            
Instructns1 -0.127  0.022  0.019                     
zCnsstncy_H  0.017 -0.002 -0.026  0.011              
Orthgrp2:I1  0.013  0.001  0.003  0.068 -0.009       
Orthgr2:C_H -0.030  0.002  0.182  0.002 -0.026 -0.018

Add code to include a random effect of participants on the slopes of the effect of consistency (zConsistency_H).

Run the code to fit the model and get a summary of results, as shown in the chapter.

long.orth.4.a.glmer <- glmer(Score ~ 
                             Time + Orthography + Instructions + zConsistency_H + 
                             
                             Orthography:Instructions +
                             
                             Orthography:zConsistency_H +
                             
                      (dummy(Orthography) + zConsistency_H + 1 || Participant) + 
                             
                      (dummy(Orthography) + dummy(Instructions) + 1 || Word),
                           
                           family = "binomial", 
                           glmerControl(optimizer="bobyqa", 
                                        optCtrl=list(maxfun=2e5)),
                           
                           data = long.orth)
boundary (singular) fit: see help('isSingular')
summary(long.orth.4.a.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (dummy(Orthography) +  
    zConsistency_H + 1 || Participant) + (dummy(Orthography) +  
    dummy(Instructions) + 1 || Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1038.5    1105.3    -506.2    1012.5      1250 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9952 -0.4068 -0.1920  0.1838  5.9309 

Random effects:
 Groups        Name                Variance  Std.Dev. 
 Participant   (Intercept)         1.644e+00 1.282e+00
 Participant.1 dummy(Orthography)  5.560e-01 7.456e-01
 Participant.2 zConsistency_H      2.090e-10 1.446e-05
 Word          (Intercept)         1.943e+00 1.394e+00
 Word.1        dummy(Orthography)  1.604e-02 1.266e-01
 Word.2        dummy(Instructions) 8.669e-01 9.311e-01
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.9621306  0.4400477  -4.459 8.24e-06 ***
Time2                        0.0507346  0.0843236   0.602  0.54740    
Orthography2                 0.4263730  0.1114188   3.827  0.00013 ***
Instructions1                0.1907330  0.2632583   0.725  0.46875    
zConsistency_H              -0.6270898  0.3669803  -1.709  0.08749 .  
Orthography2:Instructions1  -0.0265706  0.1048370  -0.253  0.79992    
Orthography2:zConsistency_H -0.0006352  0.0878782  -0.007  0.99423    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.007                                   
Orthogrphy2  0.028  0.008                            
Instructns1 -0.127  0.022  0.019                     
zCnsstncy_H  0.017 -0.002 -0.026  0.011              
Orthgrp2:I1  0.013  0.001  0.003  0.068 -0.009       
Orthgr2:C_H -0.030  0.002  0.182  0.002 -0.026 -0.018
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

Add code to include a random effect of participants on the slopes of the effect of Time.

  • And add code to include a random effect of items on the slopes of the effect of Time.

Run the code to fit the model and get a summary of results, as shown in the chapter.

long.orth.4.b.glmer <- glmer(Score ~ 
                             Time + Orthography + Instructions + zConsistency_H + 
                             
                             Orthography:Instructions +
                             
                             Orthography:zConsistency_H +
                             
          (dummy(Orthography) + dummy(Time) + 1 || Participant) + 
                             
          (dummy(Orthography) + dummy(Instructions) + dummy(Time) + 1 || Word),
                           
                           family = "binomial", 
                           glmerControl(optimizer="bobyqa", 
                                        optCtrl=list(maxfun=2e5)),
                           
                           data = long.orth)
boundary (singular) fit: see help('isSingular')
summary(long.orth.4.b.glmer)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (dummy(Orthography) +  
    dummy(Time) + 1 || Participant) + (dummy(Orthography) + dummy(Instructions) +  
    dummy(Time) + 1 || Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1040.5    1112.5    -506.2    1012.5      1249 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9952 -0.4068 -0.1920  0.1838  5.9309 

Random effects:
 Groups        Name                Variance  Std.Dev. 
 Participant   (Intercept)         1.644e+00 1.2821917
 Participant.1 dummy(Orthography)  5.560e-01 0.7456314
 Participant.2 dummy(Time)         0.000e+00 0.0000000
 Word          (Intercept)         1.943e+00 1.3939636
 Word.1        dummy(Orthography)  1.604e-02 0.1266475
 Word.2        dummy(Instructions) 8.669e-01 0.9310619
 Word.3        dummy(Time)         1.992e-08 0.0001411
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.9621304  0.4400607  -4.459 8.24e-06 ***
Time2                        0.0507347  0.0843238   0.602  0.54740    
Orthography2                 0.4263730  0.1114191   3.827  0.00013 ***
Instructions1                0.1907329  0.2632608   0.725  0.46876    
zConsistency_H              -0.6270910  0.3669875  -1.709  0.08750 .  
Orthography2:Instructions1  -0.0265706  0.1048370  -0.253  0.79992    
Orthography2:zConsistency_H -0.0006352  0.0878785  -0.007  0.99423    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.007                                   
Orthogrphy2  0.028  0.008                            
Instructns1 -0.127  0.022  0.019                     
zCnsstncy_H  0.017 -0.002 -0.026  0.011              
Orthgrp2:I1  0.013  0.001  0.003  0.068 -0.009       
Orthgr2:C_H -0.030  0.002  0.182  0.002 -0.026 -0.018
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

Pract.Q.10. Which models converge and which models do not converge?

Pract.A.10. The minimal model and models 2 and 3 fit cleanly; the maximal model and models 4.a. (consistency) and 4.b. (time) end in singular fits.

Pract.Q.11. How can you tell?

Pract.A.11. lme4 prints messages for the problem fits:

optimizer (bobyqa) convergence code: 0 (OK)

boundary (singular) fit: see help('isSingular')

Note that the optimizer itself reports convergence (code 0, OK); it is the singular-fit message that flags the problem: the estimated random effects lie on the boundary of the parameter space (for example, the participant random slope for Time in model 4.b. has a variance of zero).
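You can also check directly, because lme4 provides isSingular() for exactly this purpose; a quick check, assuming the model objects fitted above:

# TRUE indicates a singular fit: random-effects estimates on the boundary
# of the parameter space (e.g., a variance of zero or a correlation of +/-1)
isSingular(long.orth.4.a.glmer)
isSingular(long.orth.4.b.glmer)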

Practical Task 11 – Now compare the models that do converge: min vs. 2 vs. 3
anova(long.orth.min.glmer, long.orth.2.glmer)
Data: long.orth
Models:
long.orth.min.glmer: Score ~ Time + Orthography + Instructions + zConsistency_H + Orthography:Instructions + Orthography:zConsistency_H + (1 | Participant) + (1 | Word)
long.orth.2.glmer: Score ~ Time + Orthography + Instructions + zConsistency_H + Orthography:Instructions + Orthography:zConsistency_H + (dummy(Orthography) + 1 || Participant) + (dummy(Orthography) + 1 || Word)
                    npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)
long.orth.min.glmer    9 1040.4 1086.7 -511.20    1022.4                     
long.orth.2.glmer     11 1041.0 1097.6 -509.51    1019.0 3.3909  2     0.1835
anova(long.orth.min.glmer, long.orth.3.glmer)
Data: long.orth
Models:
long.orth.min.glmer: Score ~ Time + Orthography + Instructions + zConsistency_H + Orthography:Instructions + Orthography:zConsistency_H + (1 | Participant) + (1 | Word)
long.orth.3.glmer: Score ~ Time + Orthography + Instructions + zConsistency_H + Orthography:Instructions + Orthography:zConsistency_H + (dummy(Orthography) + 1 || Participant) + (dummy(Orthography) + dummy(Instructions) + 1 || Word)
                    npar    AIC    BIC  logLik -2*log(L)  Chisq Df Pr(>Chisq)  
long.orth.min.glmer    9 1040.4 1086.7 -511.20    1022.4                       
long.orth.3.glmer     12 1036.5 1098.2 -506.24    1012.5 9.9115  3    0.01933 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
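Model 2 is nested in model 3 (model 3 adds the by-word random slope for the effect of Instructions), so you can also compare those two models directly; a sketch, assuming the model objects fitted above:

# Likelihood ratio test of model 3 against model 2
anova(long.orth.2.glmer, long.orth.3.glmer)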
Practical Task 12 – Before we move on, check out some of the code adjustments we have used

Pract.Q.12. What does using bobyqa do? Delete the glmerControl() specification, refit, and report what impact this has.

Run the code to fit the model and get a summary of results.

long.orth.3.glmer.check <- glmer(Score ~
                            
                            Time + Orthography + Instructions + zConsistency_H +

                            Orthography:Instructions +

                            Orthography:zConsistency_H +

                            (dummy(Orthography) + 1 || Participant) +

                            (dummy(Orthography) + dummy(Instructions) + 1 || Word),

                           family = "binomial",

                           data = long.orth)

summary(long.orth.3.glmer.check)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (dummy(Orthography) +  
    1 || Participant) + (dummy(Orthography) + dummy(Instructions) +  
    1 || Word)
   Data: long.orth

      AIC       BIC    logLik -2*log(L)  df.resid 
   1036.5    1098.2    -506.2    1012.5      1251 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9949 -0.4068 -0.1920  0.1837  5.9304 

Random effects:
 Groups        Name                Variance Std.Dev.
 Participant   (Intercept)         1.64374  1.2821  
 Participant.1 dummy(Orthography)  0.55646  0.7460  
 Word          (Intercept)         1.94423  1.3944  
 Word.1        dummy(Orthography)  0.01615  0.1271  
 Word.2        dummy(Instructions) 0.86769  0.9315  
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.9617535  0.4401423  -4.457 8.31e-06 ***
Time2                        0.0507351  0.0843245   0.602  0.54740    
Orthography2                 0.4263842  0.1114435   3.826  0.00013 ***
Instructions1                0.1906959  0.2632724   0.724  0.46886    
zConsistency_H              -0.6275205  0.3670921  -1.709  0.08737 .  
Orthography2:Instructions1  -0.0265347  0.1048529  -0.253  0.80022    
Orthography2:zConsistency_H -0.0006278  0.0878892  -0.007  0.99430    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.007                                   
Orthogrphy2  0.028  0.008                            
Instructns1 -0.127  0.022  0.019                     
zCnsstncy_H  0.017 -0.002 -0.026  0.011              
Orthgrp2:I1  0.013  0.001  0.003  0.068 -0.009       
Orthgr2:C_H -0.030  0.002  0.182  0.002 -0.026 -0.018

Pract.A.12. Deleting the glmerControl() specification (so using the default optimizer) seems to have no substantive impact for a model that converges: the estimates are near-identical.
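For a more systematic check on optimizer sensitivity, lme4 provides allFit(), which refits a model using every available optimizer; a sketch, assuming the model object fitted earlier:

# Refit model 3 with all available optimizers and compare the fixed effects;
# near-identical estimates suggest the optimizer choice is innocuous here
all.fits <- allFit(long.orth.3.glmer)
summary(all.fits)$fixef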

Pract.Q.13. What does using dummy() in the random effects coding do?

  • Experiment: check out models 2 or 3, removing dummy() from the random effects coding to see what happens.

Run the code to fit the model and get a summary of results.

long.orth.3.glmer.check <- glmer(Score ~
                             Time + Orthography + Instructions + zConsistency_H +

                             Orthography:Instructions +

                             Orthography:zConsistency_H +

                             (Orthography + 1 || Participant) +

                             (Orthography + Instructions + 1 || Word),

                           family = "binomial",
                           glmerControl(optimizer="bobyqa", 
                                        optCtrl=list(maxfun=2e5)),

                           data = long.orth)
boundary (singular) fit: see help('isSingular')
summary(long.orth.3.glmer.check)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Score ~ Time + Orthography + Instructions + zConsistency_H +  
    Orthography:Instructions + Orthography:zConsistency_H + (Orthography +  
    1 || Participant) + (Orthography + Instructions + 1 || Word)
   Data: long.orth
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

      AIC       BIC    logLik -2*log(L)  df.resid 
   1039.1    1131.6    -501.5    1003.1      1245 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.3164 -0.4102 -0.1850  0.2049  6.3675 

Random effects:
 Groups        Name                   Variance  Std.Dev.  Corr
 Participant   (Intercept)            1.277e-14 1.130e-07     
 Participant.1 Orthographyabsent      1.235e+00 1.111e+00     
               Orthographypresent     2.637e+00 1.624e+00 1.00
 Word          (Intercept)            7.695e-10 2.774e-05     
 Word.1        Orthographyabsent      1.637e+00 1.279e+00     
               Orthographypresent     8.779e-01 9.370e-01 1.00
 Word.2        Instructionsexplicit   4.259e-01 6.526e-01     
               Instructionsincidental 2.373e+00 1.540e+00 1.00
Number of obs: 1263, groups:  Participant, 41; Word, 16

Fixed effects:
                            Estimate Std. Error z value Pr(>|z|)    
(Intercept)                 -1.98581    0.46206  -4.298 1.73e-05 ***
Time2                        0.04821    0.08442   0.571 0.567946    
Orthography2                 0.44338    0.12723   3.485 0.000492 ***
Instructions1                0.22950    0.26577   0.864 0.387847    
zConsistency_H              -0.61318    0.33663  -1.822 0.068524 .  
Orthography2:Instructions1  -0.03827    0.09513  -0.402 0.687475    
Orthography2:zConsistency_H  0.01581    0.09969   0.159 0.874019    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Time2  Orthg2 Instr1 zCns_H Or2:I1
Time2        0.007                                   
Orthogrphy2 -0.127  0.014                            
Instructns1 -0.259  0.022  0.007                     
zCnsstncy_H  0.020 -0.003 -0.059  0.041              
Orthgrp2:I1  0.020  0.004 -0.021  0.277 -0.031       
Orthgr2:C_H -0.041  0.003  0.256 -0.002 -0.410 -0.016
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

Pract.A.13. If we refit model 3 but with the random effects:

(Orthography + 1 || Participant) +

(Orthography + Instructions + 1 || Word),

we get:

  • a singular-fit warning;
  • random effects variances and covariances that we did not ask for, including perfect (1.00) correlations between random effects, which are a bad sign.
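To see what dummy() actually does, you can inspect its output directly; a quick check, assuming the data as loaded above:

# dummy() (from lme4) converts a two-level factor into a numeric 0/1
# indicator for the second level; the double-bar || syntax only suppresses
# random-effect correlations for numeric variables, not for raw factors
head(dummy(long.orth$Orthography))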

The answers

After the practical class, we will reveal the answers that are currently hidden.

The answers version of the webpage will present my answers to the questions, and some extra information where that is helpful.


References

Ricketts, J., Dawson, N., & Davies, R. (2021). The hidden depths of new word knowledge: Using graded measures of orthographic and semantic learning to measure vocabulary acquisition. Learning and Instruction, 74, 101468. https://doi.org/10.1016/j.learninstruc.2021.101468