Hypotheses, associations

Rob Davies

Department of Psychology, Lancaster University

2024-02-19

PSYC122: Classes in weeks 16-20

My name is Dr Rob Davies. I am an expert in communication, individual differences, and methods

Tip

Ask me anything:

questions during class in person or anonymously through slido;
all other questions on the discussion forum

Weeks 16-20

Introduction: our objectives, our methods and the benefits to you

Objectives: 1. Link together ideas on how to do psychological science

You are learning about:

the scientific method
measurement and hypothesis testing
modern reproducible open science

Our job now is to connect these ideas together

Picture shows a spider's web against a green background. — flickr, Khunal Gate ‘Web’

Objectives: 2. Strengthen your practice and build your independence

In PSYC121 and PSYC122, you have learned about working with data
In PSYC122, so far: you have learned about correlations and linear models
Our job now is to deepen and broaden your skills

Picture shows a group of climbers on a snow field, standing near some rocks. In the background, there is a mountain peak and blue cloudless skies. — flickr, Magryciak ‘Great weekend’

Objectives: 3. Show you how to join the credibility revolution

We have taught you about a revolution

Old ways: questionable research, closed practices
New ways: research integrity, open science

Our job now is to show you how to join in as critical thinkers

Picture shows a person with long hair boarding a train. They are holding a sign with the word 'revolution' and a rainbow painted on it. — flickr, Cesar Salvadeo ‘Revolution’

What we are going to do

Now, we will put the ideas into practice
In the context of a live investigation
We will work together for real world impact

Picture shows a sign 'because we can' drawn in light in front of a space with white garage doors to left and right, and houses behind. — flickr, Ben Matthews ‘Because we can’

Our mission: to make the world a bit better

Picture shows a gallery or picture space, with people wearing masks and old fresco pictures on the walls. Some of the people are looking towards the camera.

Our approach: Concepts, skills, levels

Each week, we focus on building concepts and skills
In conceptual work, we aim for deep and broad understanding of what you do and why
In practical work, we introduce, consolidate, or extend within a single problem set to grow your independence

The new idea: data analysis in context

Traditionally, psychologists have taught a different procedure each week: a-test-a-week
limited discussion of theory or measurement,
focus on rules about doing and reporting null hypothesis significance tests
but:

This risks focusing students (and researchers) on doing the significance test (when, what, how)
While the real challenges are located in figuring out what we want to find out, measure, and explain

Targets for weeks 16-19: Concepts

We are working together to develop concepts:

Week 16 — Hypotheses, measurement and associations
Week 17 — Predicting people using linear models
Week 18 — Everything is some kind of linear model
Week 19 — The real challenge in psychological science

Targets for weeks 16-19: Concepts

The real challenges we face as psychologists: our diversity
We examine the impact of diversity
And we explore how far we can ever reproduce or generalize our findings

Picture shows a crowd of people seen from above with a variety of colours of clothing. — flickr, Cat Walker ‘crowd’

Targets for weeks 16-19: Skills

We are working together to develop skills:

Week 16 — Visualizing, estimating, and reporting associations
Week 17 — Using data to predict people
Week 18 — Going deeper on linear models
Week 19 — Evaluating evidence across multiple studies

Targets for weeks 16-19: Skills

We revisit some ideas in new ways
Students – like all people – are diverse
We build visual and verbal ways of thinking

The benefits: critical thinking and you

Important

By the end of the first year, most students have learned to code in R:

The real challenge comes – in the second and third years – when you have to show that you can critically evaluate evidence

To get a B+ or an A you will need to show critical reflection
Our work here will build your ability to do this

Statistical rituals largely eliminate critical thinking

Traditionally, students learn statistical tests, and learn to identify if a test statistic is significant or not
If we do not also talk about what is actually observed, and whether or how it is compatible with theory-based predictions, then we do ritual, not science (Gigerenzer, 2004)
This is a problem: the focus on significance allows us to build or accommodate vague theories that can never be wrong

Open, reproducible methods are not enough

Now: we need to think causally about predictions and measurement

We need better theory so we can build clear testable predictions from explicit assumptions
We need better measurement because if we cannot reliably measure something then it is hard to build a theory about it

We need to think about the derivation chain

Figure 1: The derivation chain

Here’s a toolkit for thinking productively about your hypotheses

The derivation chain (Meehl, 1990; Scheel et al., 2021)

Develop your theory: the concepts, and the assumptions about causality
Specify how psychological concepts will be measured
Identify auxiliary assumptions about how we get from theoretical concepts to observable data
Identify theoretical predictions
Link theoretical predictions to specific statistical tests that may support or contradict them

Valid measures

We often teach and learn about different kinds of validity but the key idea is simple (Borsboom et al., 2004):

a test is valid for measuring an attribute if and only if (a) the attribute exists and (b) variations in the attribute causally produce variations in the outcomes of the measurement procedure
We want to work with valid measures but validity requires explaining: (Q.1) Does the thing exist in the world? (Q.2) Is variation in that thing reflected in variation in our measurement?

Summary: our critical thinking checklist

What is our (causal) theory?
What measures are we using, why?
What is our specific prediction, why?
Does the prediction relate to sign and to magnitude?
What analysis can test this prediction, why?
How will our results affect our beliefs, why?

Let’s take a break

End of part 1

Learning targets for this week:

Concepts: begin learning to think critically
Skills: identify how we build hypotheses

Case study: the health comprehension project

Because the real challenge concerns how psychologists ask and answer questions
We will work in the context of a live research project: What makes it easy or difficult to understand written health information?

flickr: Sasin Tipchair 'Senior woman in wheelchair talking to a nurse in a hospital'

Why this? We don’t really know what makes it easy or difficult to understand advice about health

flickr: WendyHarris1955 'COVID-19 Antibody test' packs and information leaflets

Health comprehension project: impacts

We are working to improve health communication
With partners at Vienna Business University, Kantar Public, and the London School of Economics
Our results could change: business and health communication; understanding reading development

Health comprehension project: questions and analyses

Our research questions are:

Note

What person attributes predict success in understanding?
Can people accurately evaluate whether they correctly understand written health information?

These kinds of research questions can be answered using methods like correlation and linear models

Health comprehension project – relevance: methods you will use in your professional work

We collect data using online Qualtrics questionnaire surveys
We test people on a range of dimensions using standardized ability and our own knowledge tests
Many of you will go on to work with online surveys, and with data from standardized ability measures

You can get involved

You can – if you choose – get involved
Complete the survey and contribute your responses
Forthcoming on PEP: be a named co-author assisting in the development of preprint and repository to share our data

Extract from Qualtrics survey showing a sample written health information text extract, a multiple choice question probing understanding of the information in the extract, and a rating scale allowing participants to self-evaluate their understanding — Extract from Qualtrics survey

Health comprehension project: why it is a case study

The health project has strengths and limitations
Watch how we identify these strengths and limitations, and how we critically evaluate the project
So you can do the same for your work

Cognitive process theory of comprehension success

When skilled adult readers read and try to understand written text (Kintsch, 1994)
They must recognize and access the meanings of words
Then use knowledge and reasoning to build an interpretation of what is in the text
Based on connecting the information in the text with what they already know

Individual differences theory of comprehension success

Successfully understanding text depends on (1.) language experience and (2.) reasoning ability (Freed et al., 2017)

Figure 2: Factors influencing comprehension success

Where the data come from: our measures

We measure reading comprehension: asking people to read text and then answer multiple choice questions
We measure background knowledge: vocabulary knowledge (Shipley); health literacy (HLVA)
We ask people to rate their own understanding of each text

Example critical evaluation questions

Are multiple choice questions good ways to probe understanding? – What alternatives are there?
Are tests like the Shipley good measures of language knowledge? – What do we miss?
Can a person accurately evaluate their own understanding? – Can we rely on subjective judgments?

Relevance to you

Tip

Even very good students sometimes do not question the validity of measures:
Not asking questions like this has a real impact on the value of the interpretation of results

Here, we are looking ahead to the critical thinking you will need to show in your second and third year essays

Let’s take a break

End of part 2

Learning targets for this week:

Concepts – associations: correlations, estimates and hypothesis tests
Skills – visualizing variation and covariation
Skills – writing the code
Skills – estimating correlations
Skills – interpreting and reporting correlations

Talking about the relationships between variables

Psychologists and people who work in related fields often want to know about associations
Is variation in observed values on one dimension (e.g., comprehension) related to variation in another dimension (e.g., vocabulary)?
Do values on both dimensions vary together?

The language in this area can vary: we will be consistent but you need to be aware of the different terms

Outcome $=$ response $=$ criterion $=$ dependent variable
Predictor $=$ covariate $=$ independent variable $=$ factor
Linear model $=$ regression analysis $=$ regression model $=$ multiple regression

Let’s look at the data we will use

The person in row 1 has ETHNICITY White, is AGE 34 years, scored 33 on Shipley vocabulary, scored 7 on HLVA health literacy
and, on average, self-rated their understanding of health information as 7.96 (so 8/9, mean.self)
while scoring 0.49 accuracy in tests of understanding (49% mean.acc)

# A tibble: 4 × 6
  mean.acc mean.self  HLVA SHIPLEY   AGE ETHNICITY
     <dbl>     <dbl> <dbl>   <dbl> <dbl> <fct>    
1     0.49      7.96     7      33    34 White    
2     0.85      7.28     7      33    25 White    
3     0.82      7.36     8      40    43 White    
4     0.94      7.88    11      33    46 White

Destination correlation: where the correlation number comes from

Covariance

\[COV_{xy} = \frac{\sum(x - \bar{x})(y - \bar{y})}{n -1}\]

If we want to estimate the correlation between two sets of numbers: $x$ and $y$
We want to know if variation in $x$ (given by $x - \bar{x}$)
Varies together with variation in $y$ (given by $y - \bar{y}$)
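
A minimal sketch of the covariance calculation, using only the four rows printed in the data extract above. This is a toy example, not the full sample, so the value will not match anything estimated on the whole dataset. The names x, y, dev_x and dev_y are our own labels, chosen for illustration.

```r
# Toy data: only the four rows shown in the tibble extract
x <- c(0.49, 0.85, 0.82, 0.94)  # mean.acc
y <- c(7.96, 7.28, 7.36, 7.88)  # mean.self

# Deviations of each score from its own sample mean
dev_x <- x - mean(x)
dev_y <- y - mean(y)

# Covariance: sum of the products of deviations, divided by n - 1
cov_by_hand <- sum(dev_x * dev_y) / (length(x) - 1)

# R's built-in function should give the same answer
cov_builtin <- cov(x, y)

print(c(by_hand = cov_by_hand, builtin = cov_builtin))
```

If a person is above average on both variables, or below average on both, their product of deviations is positive. If they are above average on one and below on the other, the product is negative.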

Destination correlation: where the correlation number comes from

Covariance divided by standard deviations

\[r = \frac{COV_{xy}}{s_xs_y}\]

Because the two sets of numbers can be on different scales: e.g., SHIPLEY out of 40; mean.acc (proportion, out of 1)
And because covariance values depend on the scales
To make correlations easier to compare, we remove scaling by dividing by the standard deviations of the variables
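
A small sketch of why we divide by the standard deviations, again using only the four printed rows as a toy example (not the full sample). We assume x holds the SHIPLEY scores and y holds the mean.acc scores. Rescaling y from a proportion to a percentage is our own illustration of a change of scale.

```r
# Toy data: only the four rows shown in the tibble extract
x <- c(33, 33, 40, 33)          # SHIPLEY
y <- c(0.49, 0.85, 0.82, 0.94)  # mean.acc as a proportion

# Correlation: covariance divided by the product of the standard deviations
r_by_hand <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y)

# Put accuracy on a different scale: percentage instead of proportion
y_percent <- y * 100

cov_original <- cov(x, y)
cov_rescaled <- cov(x, y_percent)  # covariance changes with the scale
r_rescaled   <- cor(x, y_percent)  # correlation does not

print(c(cov_original = cov_original, cov_rescaled = cov_rescaled))
print(c(r_by_hand = r_by_hand, r_builtin = r_builtin, r_rescaled = r_rescaled))
```

The covariance is 100 times larger after rescaling, while the correlation stays the same. This is what makes correlations comparable across variables measured on different scales.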

Let’s think about an example

Note

Research question: Can people accurately evaluate whether they correctly understand written health information?

Measurement: Someone with higher scores on tested accuracy of understanding will also present higher scores on their ratings of their own understanding
Statistical prediction: We predict that mean.acc and mean.self scores will be associated
Test: If the prediction is correct, mean.acc and mean.self scores will be correlated

Distributions: Let’s see what this means – how do scores vary?

There are two histograms, shown side by side: The 'mean accuracy' histogram shows how 'mean accuracy' scores vary between about 0.3 and 1.0, with a peak, indicated by a vertical red line, around .8; the 'mean self-rated accuracy' histogram shows how 'mean self-rated accuracy' scores vary between about 2.5 and 9.0, with a peak, indicated by a vertical red line, around 7.

Histograms showing the distribution of mean accuracy and mean self-rated accuracy scores in the ‘clearly.one.subjects’ dataset: means calculated for each participant over all their responses

A histogram is a useful way to show the distribution of values

We have a sample of accuracy scores:
Mean accuracy scores can vary between 0.0 and 1.0 (in this sample, they range from about 0.3 to 1.0)
We draw the plot by grouping together similar values in bins
Heights of bars represent numbers of cases with similar values in same bin

The 'mean accuracy' histogram shows how 'mean accuracy' scores vary between about 0.3 and 1.0, with a peak, indicated by a vertical red line, around .8. — Distribution of mean accuracy
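
A rough sketch of how a histogram groups values into bins. The scores here are simulated, NOT the real clearly.one.subjects data: the object name sim_acc, the seed, the mean, the standard deviation and the bin width are all assumptions made for illustration. We use base R so that the sketch needs no extra packages.

```r
set.seed(122)  # arbitrary seed so the sketch is repeatable

# Hypothetical scores: 170 simulated values, NOT the real data
sim_acc <- rnorm(170, mean = 0.8, sd = 0.12)
sim_acc <- pmin(pmax(sim_acc, 0), 1)  # keep values inside the 0-1 range

# Bins: 20 intervals of width 0.05 covering 0 to 1
bins <- seq(0, 1, by = 0.05)

# Draw the histogram; bar heights are the counts of scores in each bin
h <- hist(sim_acc, breaks = bins,
          main = "Simulated mean accuracy",
          xlab = "mean accuracy")

# Mark the sample mean with a vertical red line
abline(v = mean(sim_acc), col = "red", lwd = 2)

# The counts behind the bars, one per bin
print(h$counts)
```

Changing the bin width in seq() changes how coarse or fine the picture looks, so it is worth trying a few values.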

When we talk about variance we are talking about how values vary in relation to the mean for the sample

The average of these mean accuracy scores is marked with a red line where $\bar{x} =$ 0.8
The accuracy score for the person in row 1 is located at $x = .49$, marked in blue

The 'mean accuracy' histogram shows how 'mean accuracy' scores vary between about 0.3 and 1.0, with the average of mean accuracy scores, indicated by a vertical red line located near .8, and the score of the person in row 1 of the dataset, indicated by a blue line located at .49. — Distribution of mean accuracy

We are talking about how values vary in relation to the mean for the sample

In comparison, the mean accuracy score for the person in row 4 is located at $x = .94$, marked in blue

The basic question when we examine covariance: do values vary together?

If the person at row 1 has a mean.acc score of .49, lower than the average
And the person at row 4 has a mean.acc score of .94, higher than the average
What will their mean.self scores be: will they be higher or lower than the average mean.self score?

We can use scatterplots to examine associations

The figure shows two scatterplots, side by side: both plots show points where each point indicates the pair of scores corresponding to the 'mean accuracy' and the 'mean self-rated accuracy' recorded for each participant in the example data. The plot on the left orients the presentation with 'mean accuracy' on the y axis. The plot on the right orients the presentation with 'mean self-rated accuracy' on the y axis.

Scatterplots showing whether values on mean accuracy (mean.acc) vary together with values on mean self-rated accuracy (mean.self) for the participants in this sample

A scatterplot is a useful way to examine if the values of two or more variables vary together

Mean accuracy scores vary between 0.0 and 1.0
The height of each point shows the observed value of accuracy on the y-axis
Self-rated accuracy scores vary between 1 and 9
The horizontal position of each point shows the observed value of self-rated accuracy on the x-axis

The figure shows a scatterplot showing points where each point indicates the pair of scores corresponding to the 'mean accuracy' and the 'mean self-rated accuracy' recorded for each participant in the example data. The plot orients the presentation with 'mean accuracy' on the y axis. — Scatterplot showing how values on mean accuracy and mean self-rated accuracy vary together

A scatterplot is a useful way to examine if the values of two or more variables vary together

We have a sample of 170 people
For each person, we have a value for the mean accuracy and a paired value for the mean self-rated accuracy
Each point shows the paired data values for a person
In red: someone scored 3.48 on mean self-rated accuracy, 0.57 on mean accuracy
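
A minimal sketch of how a scatterplot like this can be drawn, using only the four rows printed in the data extract (a toy example, not the full sample). The data frame name toy and the choice to highlight row 1 are our own, for illustration. We use base R graphics to keep the sketch free of extra packages.

```r
# Toy data: only the four rows shown in the tibble extract
toy <- data.frame(
  mean.acc  = c(0.49, 0.85, 0.82, 0.94),
  mean.self = c(7.96, 7.28, 7.36, 7.88)
)

# Each point is one person: self-rating on the x-axis, accuracy on the y-axis
plot(toy$mean.self, toy$mean.acc,
     xlab = "mean self-rated accuracy",
     ylab = "mean accuracy",
     xlim = c(1, 9), ylim = c(0, 1),
     pch = 19)

# Highlight one person's pair of scores in red (here, row 1)
highlight <- toy[1, ]
points(highlight$mean.self, highlight$mean.acc,
       col = "red", pch = 19, cex = 1.5)
```

The axis limits are set to the possible range of each measure (1 to 9, and 0 to 1) so that we can see where the observed scores sit within that range.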

Let’s take a break

End of part 3

The R code for a correlation test, bit by bit

cor.test(clearly.one.subjects$mean.acc, 
         clearly.one.subjects$mean.self,
         method = "pearson")

We specify the cor.test function, and name one variable clearly.one.subjects$mean.acc
Then we name the second variable clearly.one.subjects$mean.self
Last we specify the correlation method = "pearson" because we have a choice
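
A sketch of the same call run on a toy data frame built from the four rows printed in the data extract, so that it can be run without the full dataset. The names toy and result are our own. With only four people, the estimate will not match the result for the full sample.

```r
# Toy data: only the four rows shown in the tibble extract
toy <- data.frame(
  mean.acc  = c(0.49, 0.85, 0.82, 0.94),
  mean.self = c(7.96, 7.28, 7.36, 7.88)
)

# Same structure as the call above: variable 1, variable 2, method
result <- cor.test(toy$mean.acc,
                   toy$mean.self,
                   method = "pearson")

# Storing the result lets us pull out the pieces we need
result$estimate   # the correlation, labelled cor
result$parameter  # the degrees of freedom, n - 2
result$p.value    # the p-value
```

Storing the output in an object is optional, but it helps when we want to reuse the numbers in a report.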

Identifying the key information in the results from one correlation test

We look at the value of the correlation (here, cor) and the p-value
We can see that the correlation statistic is positive, cor = .4863771, which we round to $cor = .49$
And p-value = 2.026e-11, indicating that the correlation is significant, $p < .001$

cor.test(clearly.one.subjects$mean.acc, 
         clearly.one.subjects$mean.self, 
         method = "pearson")


    Pearson's product-moment correlation

data:  clearly.one.subjects$mean.acc and clearly.one.subjects$mean.self
t = 7.1936, df = 167, p-value = 2.026e-11
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3619961 0.5937425
sample estimates:
      cor 
0.4863771
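
A short sketch showing how the other numbers in this output follow from the correlation and the degrees of freedom. It uses only the values printed above (cor and df), so it runs without the dataset. The formulas are the standard t statistic for a Pearson correlation and the Fisher z confidence interval; the object names are our own.

```r
# Values copied from the printed output
r  <- 0.4863771
df <- 167

# Degrees of freedom are n - 2, so this many complete pairs were analysed
n <- df + 2

# t statistic for testing whether the correlation differs from 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval using the Fisher z transformation
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)
ci   <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(round(t_stat, 4))  # compare with t in the output
print(p_value)           # compare with p-value in the output
print(round(ci, 4))      # compare with the 95 percent confidence interval
```

The point of the sketch is that the t value, the p-value and the interval are not separate findings: they all derive from the size of the correlation and the number of people.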

Reporting a correlation

Usually, we report a correlation like this:

Mean accuracy and mean self-rated accuracy were significantly correlated ($r (167) = .49, p < .001$). Higher mean accuracy scores are associated with higher mean self-rated accuracy scores.
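
A small sketch of one way to build the statistics part of this sentence from the numbers in the output. The helper function drop_zero and the other object names are hypothetical, written for this illustration; they are not part of any package.

```r
# Values copied from the printed cor.test output
r  <- 0.4863771
df <- 167
p  <- 2.026e-11

# Hypothetical helper: round to 2 decimals and drop the leading zero
drop_zero <- function(x) sub("0.", ".", sprintf("%.2f", x), fixed = TRUE)

# Very small p-values are reported as p < .001
p_text <- if (p < .001) {
  "p < .001"
} else {
  paste0("p = ", sub("0.", ".", sprintf("%.3f", p), fixed = TRUE))
}

# Put the pieces together: r(df) = value, p
report <- sprintf("r(%d) = %s, %s", as.integer(df), drop_zero(r), p_text)
print(report)
```

Building the text from the stored numbers reduces the risk of copying errors when we write up results.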

Interpreting correlations with the help of visualization

The correlation statistic is positive in sign and moderate in size, about $r = .49$
We can see that higher mean accuracy (mean.acc) scores are associated with higher mean self-rated accuracy (mean.self) scores

What will different kinds of correlations look like?

We can simulate data to demonstrate: (left) the correlation is positive, $r = .5$; (right) the correlation is negative, $r = -.5$

The figure shows two scatterplots showing how simulated data values on mean accuracy and mean self-rated accuracy *could* vary together given positive or negative correlations. Each plot shows points, where each point indicates the pair of scores corresponding to the 'mean accuracy' and the 'mean self-rated accuracy' recorded for each participant in a simulated dataset. The plot on the left shows the scatter of points when data are simulated assuming r = .5. The plot on the right shows the scatter of points when data are simulated assuming r = -.5.

Scatterplots showing how simulated data values on mean accuracy and mean self-rated accuracy could vary together given positive or negative correlations
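
A sketch of one simple way to simulate pairs of scores with a chosen correlation. This is not necessarily how the figures in these slides were made: the function simulate_pair, the seed and the sample size are our own assumptions, and the simulated scores are on a standardized scale rather than the scales of mean.acc and mean.self.

```r
# Hypothetical helper: simulate n pairs with population correlation r
simulate_pair <- function(n, r) {
  x     <- rnorm(n)                       # first variable
  noise <- rnorm(n)                       # independent random variation
  y     <- r * x + sqrt(1 - r^2) * noise  # second variable, related to x
  data.frame(x = x, y = y)
}

set.seed(16)  # arbitrary seed so the sketch is repeatable

pos <- simulate_pair(2000, 0.5)   # positive correlation
neg <- simulate_pair(2000, -0.5)  # negative correlation

# Draw the two scatterplots side by side
par(mfrow = c(1, 2))
plot(pos$x, pos$y, main = "r = .5",  xlab = "x", ylab = "y")
plot(neg$x, neg$y, main = "r = -.5", xlab = "x", ylab = "y")
par(mfrow = c(1, 1))

# Sample correlations will be close to, not exactly, the chosen values
print(c(positive = cor(pos$x, pos$y), negative = cor(neg$x, neg$y)))
```

Calling the same function with other values of r, such as .1, .3 or .8, shows how the scatter of points tightens as the correlation gets stronger.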

We can also imagine – again with simulated data – what correlations of increasing size might look like

The figure shows 4 scatterplots showing how simulated data values on mean accuracy and mean self-rated accuracy *could* vary together given positive correlations of increasing size. Each plot shows points, where each point indicates the pair of scores corresponding to the 'mean accuracy' and the 'mean self-rated accuracy' recorded for each participant in a simulated dataset. The plots show the scatter of points, from left to right, (1.) if r = .1; (2.) if r = .3; (3.) if r = .5; (4.) if r = .8

Scatterplots showing how simulated data values on mean accuracy and mean self-rated accuracy could vary together given positive correlations of increasing size

Summary

We are often interested in whether or how variation in the values of two variables is associated
We can visualize the distribution of values in any one variable using histograms
We visualize the association of values in two variables using scatterplots
We conduct correlation tests to examine the sign (positive or negative) and the strength of the association
But we always need to think about our research questions, about where our data come from and about whether our measures are any good

End of lecture

References

Borsboom, D., Mellenbergh, G. J., & Heerden, J. van. (2004). The concept of validity. Psychological Review, 111(4), 1061–1071. https://doi.org/10.1037/0033-295X.111.4.1061

Freed, E. M., Hamilton, S. T., & Long, D. L. (2017). Comprehension in proficient readers: The nature of individual variation. Journal of Memory and Language, 97, 135–153. https://doi.org/10.1016/j.jml.2017.07.008

Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33(5), 587–606. https://doi.org/10.1016/j.socec.2004.09.033

Kintsch, W. (1994). Text comprehension, memory, and learning. American Psychologist, 49(4), 294–303. https://doi.org/10.1037/0003-066x.49.4.294

Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195–244.

Scheel, A. M., Tiokhin, L., Isager, P. M., & Lakens, D. (2021). Why Hypothesis Testers Should Spend Less Time Testing Hypotheses. Perspectives on Psychological Science, 16(4), 744–755. https://doi.org/10.1177/1745691620966795