1Cademy - Test-Retest Reliability

Learn Before

Types of Reliability
Evaluating the Measure

Concept

Test-Retest Reliability

Test-retest reliability evaluates the extent to which a psychological measure yields consistent scores when it is administered to the same individuals at different points in time. This form of reliability is crucial when assessing constructs that are theoretically expected to remain stable, such as intelligence or self-esteem. However, it is not an appropriate standard for constructs that naturally vary over time, such as mood or immediate stress levels.
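One common way to quantify this consistency is to correlate the scores from the two administrations. The following is a minimal sketch of that calculation, not a prescribed procedure: the helper name `pearson_r` and the two score lists are hypothetical and invented for illustration, and Pearson's r is assumed as the correlation statistic.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each administration.
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: the same five people complete the same measure twice.
time1 = [22, 25, 18, 30, 27]  # scores at the first administration
time2 = [23, 24, 19, 29, 28]  # scores at the second administration

r = pearson_r(time1, time2)
print(f"test-retest correlation: r = {r:+.2f}")
```

A value of r near +1 means people kept roughly the same rank order across the two administrations. How to interpret a low value depends on the construct: for a stable trait it suggests a problem with the measure, while for a fluctuating state it may simply reflect real change.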

Updated 2026-06-17

References

OpenStax Psychology (2nd ed.) Textbook
KPU Research Methods in Psychology - 4th American Edition

Learn After

Test-Retest Correlation
Assessing Test-Retest Reliability
Example of Reliability Without Validity
Test-retest reliability is considered an appropriate standard of consistency for which type of psychological construct?
A psychological measure designed to assess immediate stress levels must demonstrate high test-retest reliability to be considered a useful instrument.
A researcher is deciding whether test-retest reliability is an appropriate metric to evaluate the consistency of several different psychological measures. Match each construct with the correct rationale for using (or not using) this form of reliability.
A research team is reviewing the quality of several new psychological instruments. Rank the following scientific evaluations from the least appropriate application of test-retest reliability to the most appropriate application based on the nature of the constructs and the evidence provided.
Which procedure is used to assess the test-retest reliability of a psychological measure?
Match each research scenario with the correct interpretation of its test-retest reliability based on the nature of the construct being measured.
A researcher finds that a measure of 'General Intelligence' and a measure of 'Immediate Mood' both yield a test-retest correlation of $r = +0.20$. Upon analysis, the researcher concludes that the 'Immediate Mood' measure may be functioning correctly, but the 'General Intelligence' measure is severely flawed because intelligence is theoretically a(n) _____ construct.
A clinical psychologist develops a new survey to measure 'state anxiety' (an individual's immediate, fluctuating level of anxiety in response to temporary stressors). To demonstrate that this new survey is a reliable and high-quality measure, the psychologist must show that it has high test-retest reliability (such as a correlation of $r = +0.80$ or higher) over a two-week interval.
A researcher is analyzing why a newly developed psychological scale of 'trait self-esteem' yielded an unexpectedly low test-retest reliability correlation of $r = +0.35$ over a three-week interval. To systematically diagnose the root cause of this low correlation, arrange the analytical steps in the most logical order from first to last.
A research panel is evaluating a newly proposed scale designed to measure 'immediate state of mindfulness' (a transient, rapidly fluctuating mental state). The creators of the scale boast that it is highly reliable, citing a test-retest correlation of $r = +0.85$ over a two-week interval. To critically evaluate this claim, the panel must determine if this reliability metric is actually appropriate. Because an immediate state of mindfulness is theoretically expected to change frequently, a high
What is the primary purpose of evaluating a psychological measure's test-retest reliability?
A psychological measure must demonstrate high test-retest reliability to be considered of high quality, even if the construct it measures naturally fluctuates over time, such as immediate stress levels.
Match each research scenario involving repeated measurements to the most accurate conclusion about the measure's test-retest reliability.
A researcher develops a new questionnaire to measure introversion, a construct theoretically expected to remain stable. However, when the same participants complete the questionnaire on a Monday and then again two weeks later, their scores are completely inconsistent. By analyzing this inconsistency across different times, the researcher can determine that the measure lacks ____ reliability.
A journal reviewer is evaluating a research paper that claims a new 'immediate exam stress' questionnaire is highly effective because it demonstrated high test-retest reliability over a six-month period. Arrange the logical steps the reviewer should take to critique and ultimately reject this methodological claim.
Which statement best describes the definition and appropriate application of test-retest reliability?
Why is test-retest reliability considered an inappropriate standard for evaluating a measure of momentary mood?
A clinical psychologist is creating a new observational tool to measure clients' immediate stress levels during public speaking tasks. To confirm the tool is reliable, the psychologist should expect it to demonstrate high test-retest reliability when administered to the same clients several weeks apart.
Analyze the following research scenarios involving repeated measurements. Match each scenario's empirical outcome to the most accurate analytical conclusion regarding its test-retest reliability.
A methodology consultant is evaluating a hospital's new 'Daily Mood Tracker'. The hospital administrators want to discard the measure because patients' scores change significantly from morning to evening. The consultant argues against this decision, explaining that because mood naturally fluctuates, it is methodologically inappropriate to evaluate the tracker using ____ reliability.
