Longitudinal Study Design

Julia Simkus

Editor at Simply Psychology

BA (Hons) Psychology, Princeton University

Julia Simkus is a graduate of Princeton University with a Bachelor of Arts in Psychology. She is currently studying for a Master's Degree in Counseling for Mental Health and Wellness in September 2023. Julia's research has been published in peer reviewed journals.

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

A longitudinal study is a type of observational and correlational study that involves monitoring a population over an extended period of time. It allows researchers to track changes and developments in the subjects over time.

What is a Longitudinal Study?

In longitudinal studies, researchers do not manipulate any variables or interfere with the environment. Instead, they simply conduct observations on the same group of subjects over a period of time.

These research studies can be as short as a week or as long as several years or even decades. Unlike cross-sectional studies, which measure a single moment in time, longitudinal studies follow subjects across time, enabling researchers to establish the order of events and gather stronger (though not conclusive) evidence about cause-and-effect relationships between variables.

They are beneficial for recognizing any changes, developments, or patterns in the characteristics of a target population. Longitudinal studies are often used in clinical and developmental psychology to study shifts in behaviors, thoughts, emotions, and trends throughout a lifetime.

For example, a longitudinal study could be used to examine the progress and well-being of children at critical age periods from birth to adulthood.

The Harvard Study of Adult Development is one of the longest longitudinal studies to date. Researchers in this study have followed the same group of men for over 80 years, observing psychosocial variables and biological processes for healthy aging and well-being in late life (see Harvard Second Generation Study).

When designing longitudinal studies, researchers must consider issues like sample selection and generalizability, attrition and selectivity bias, effects of repeated exposure to measures, selection of appropriate statistical models, and coverage of the necessary timespan to capture the phenomena of interest.

Panel Study

  • A panel study is a type of longitudinal study design in which the same set of participants is measured repeatedly over time.
  • Data is gathered on the same variables of interest at each time point using consistent methods. This allows studying continuity and changes within individuals over time on the key measured constructs.
  • Prominent examples include national panel surveys on topics like health, aging, employment, and economics. Panel studies are a type of prospective study.
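The repeated-measures structure of a panel study can be sketched in a few lines of Python. The participant IDs, wave numbers, and well-being scores below are hypothetical, and the helper `within_person_change` is an illustrative name rather than part of any library. The sketch stores one row per participant per wave (a "long" layout) and computes each person's change from their first to their last observed wave.

```python
from collections import defaultdict

# Hypothetical panel records: (participant_id, wave, wellbeing_score).
records = [
    ("p1", 1, 50), ("p1", 2, 54), ("p1", 3, 57),
    ("p2", 1, 62), ("p2", 2, 60), ("p2", 3, 61),
    ("p3", 1, 45), ("p3", 2, 49), ("p3", 3, 55),
]


def within_person_change(rows):
    """Return each participant's change from first to last observed wave."""
    by_person = defaultdict(dict)
    for pid, wave, score in rows:
        by_person[pid][wave] = score
    changes = {}
    for pid, waves in by_person.items():
        first, last = min(waves), max(waves)
        changes[pid] = waves[last] - waves[first]
    return changes


changes = within_person_change(records)
print(changes)
```

Because the same variable is measured the same way at every wave, the differences are comparable across participants, which is the property the panel design is meant to guarantee.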

Cohort Study

  • A cohort study is a type of longitudinal study that samples a group of people sharing a common experience or demographic trait within a defined period, such as year of birth.
  • Researchers observe a population based on the shared experience of a specific event, such as birth, geographic location, or historical experience. These studies are typically used among medical researchers.
  • Cohorts are identified and selected at a starting point (e.g. birth, starting school, entering a job field) and followed forward in time. 
  • As they age, data is collected on cohort subgroups to determine their differing trajectories. For example, investigating how health outcomes diverge for groups born in 1950s, 1960s, and 1970s.
  • Cohort studies do not require the same individuals to be assessed over time; they just require representation from the cohort.

Retrospective Study

  • In a retrospective study, researchers either collect data on events that have already occurred or use existing data from databases, medical records, or interviews to gain insights about a population.
  • Retrospective designs are appropriate when prospectively following participants from the past starting point is infeasible or unethical. For example, studying early origins of diseases emerging later in life.
  • Retrospective studies efficiently provide a “snapshot summary” of the past in relation to present status. However, quality concerns with retrospective data make careful interpretation necessary when inferring causality. Memory biases and selective retention influence quality of retrospective data.

Benefits

Allows researchers to look at changes over time

Because longitudinal studies observe variables over extended periods of time, researchers can use their data to study developmental shifts and understand how certain things change as we age.

High validity

Because the objectives and procedures of a long-term study are established before data collection begins, these studies tend to have high levels of validity.

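Reduces recall bias

Recall bias occurs when participants do not remember past events accurately or omit details from previous experiences. Because prospective longitudinal studies record information as events unfold, they rely far less on participants' memories than designs that ask people to look back.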

Flexibility

The variables in longitudinal studies can change throughout the study. Even if the study was created to study a specific pattern or characteristic, the data collection could show new data points or relationships that are unique and worth investigating further.

Limitations

Costly and time-consuming

Longitudinal studies can take months or years to complete, rendering them expensive and time-consuming. Because of this, researchers tend to have difficulty recruiting participants, leading to smaller sample sizes.

Large sample size needed

Longitudinal studies tend to be challenging to conduct because large samples are needed for any relationships or patterns to be meaningful. Researchers are unable to generate results if there is not enough data.

Participants tend to drop out

Not only is it a struggle to recruit participants, but subjects also tend to leave or drop out of the study due to various reasons such as illness, relocation, or a lack of motivation to complete the full study.

This tendency is known as attrition. When the participants who drop out differ systematically from those who remain, it is called selective attrition and can threaten the validity of a study. For this reason, researchers using this approach typically recruit many participants, expecting a substantial number to drop out before the end.
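The practice of over-recruiting can be turned into simple arithmetic. The sketch below assumes, purely for illustration, that the same fraction of participants is retained between each pair of consecutive waves; real dropout is rarely this regular. The function name `required_initial_sample` and the example numbers are hypothetical.

```python
import math


def required_initial_sample(target_final_n, retention_per_wave, n_waves):
    """Estimate how many participants to recruit at wave 1.

    Assumes a constant retention rate between consecutive waves,
    so the expected fraction remaining at the last wave is
    retention_per_wave ** (n_waves - 1).
    """
    if not 0 < retention_per_wave <= 1:
        raise ValueError("retention_per_wave must be in (0, 1]")
    if n_waves < 1:
        raise ValueError("n_waves must be at least 1")
    expected_fraction_left = retention_per_wave ** (n_waves - 1)
    return math.ceil(target_final_n / expected_fraction_left)


# Hypothetical plan: 200 completers wanted, 90% retained per wave, 4 waves.
initial_n = required_initial_sample(200, 0.9, 4)
print(initial_n)
```

A calculation like this only addresses sample size. It does not correct for selective attrition, which has to be handled in the analysis.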

Report bias is possible

Longitudinal studies will sometimes rely on surveys and questionnaires, which could result in inaccurate reporting as there is no way to verify the information presented.

Examples

  • The physical growth and health of postinstitutionalized Romanian adoptees (Le Mare & Audet, 2006): Data were collected for each child at three time points: 11 months after adoption, 4.5 years of age, and 10.5 years of age. The first two sets of results showed that the adoptees were behind the non-institutionalised group; however, by 10.5 years old there was no difference between the two groups. The Romanian orphans had caught up with the children raised in normal Canadian families.
  • The role of positive psychology constructs in predicting mental health and academic achievement in children and adolescents (Marques, Pais-Ribeiro, & Lopez, 2011)
  • The correlation between dieting behavior and the development of bulimia nervosa (Stice et al., 1998)
  • The stress of educational bottlenecks negatively impacting students’ wellbeing (Cruwys, Greenaway, & Haslam, 2015)
  • The effects of job insecurity on psychological health and withdrawal (Dekker & Schaufeli, 1995)
  • The relationship between loneliness, health, and mortality in adults aged 50 years and over (Luo et al., 2012)
  • The influence of parental attachment and parental control on early onset of alcohol consumption in adolescence (Van der Vorst et al., 2006)
  • The relationship between religion and health outcomes in medical rehabilitation patients (Fitchett et al., 1999)

Goals of Longitudinal Data and Longitudinal Research

The objectives of longitudinal data collection and research as outlined by Baltes and Nesselroade (1979):
  • Identify intraindividual change: Examine changes at the individual level over time, including long-term trends or short-term fluctuations. Requires multiple measurements and individual-level analysis.
  • Identify interindividual differences in intraindividual change: Evaluate whether changes vary across individuals and relate that to other variables. Requires repeated measures for multiple individuals plus relevant covariates.
  • Analyze interrelationships in change: Study how two or more processes unfold and influence each other over time. Requires longitudinal data on multiple variables and appropriate statistical models.
  • Analyze causes of intraindividual change: This objective refers to identifying factors or mechanisms that explain changes within individuals over time. For example, a researcher might want to understand what drives a person’s mood fluctuations over days or weeks. Or what leads to systematic gains or losses in one’s cognitive abilities across the lifespan.
  • Analyze causes of interindividual differences in intraindividual change: Identify mechanisms that explain within-person changes and differences in changes across people. Requires repeated data on outcomes and covariates for multiple individuals plus dynamic statistical models.
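The first two objectives can be illustrated with a minimal sketch: fit a straight line to each person's repeated scores (intraindividual change), then look at how much those slopes differ across people (interindividual differences in change). The data, wave coding, and the helper `ols_slope` are hypothetical, and a per-person least-squares slope is a simplification of the growth models researchers would normally use.

```python
from statistics import mean, variance


def ols_slope(times, scores):
    """Least-squares slope of scores regressed on measurement times."""
    t_bar, y_bar = mean(times), mean(scores)
    numerator = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, scores))
    denominator = sum((t - t_bar) ** 2 for t in times)
    return numerator / denominator


# Hypothetical data: four yearly waves (coded 0-3) for three people.
waves = [0, 1, 2, 3]
scores_by_person = {
    "p1": [10, 12, 14, 16],  # steady gain
    "p2": [20, 20, 20, 20],  # no change
    "p3": [15, 14, 13, 12],  # steady decline
}

# Intraindividual change: one slope per person.
slopes = {pid: ols_slope(waves, ys) for pid, ys in scores_by_person.items()}

# Interindividual differences in change: average slope and spread of slopes.
average_slope = mean(list(slopes.values()))
slope_variance = variance(list(slopes.values()))

print(slopes, average_slope, slope_variance)
```

A non-zero variance in the slopes is what motivates the later objectives: once people are known to change at different rates, covariates can be brought in to explain why.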

How to Perform a Longitudinal Study

When beginning to develop your longitudinal study, you must first decide if you want to collect your own data or use data that has already been gathered.

Using already collected data will save you time, but it will be more restricted and limited than collecting it yourself. When collecting your own data, you can choose to conduct either a retrospective or prospective study.

In a retrospective study, you are collecting data on events that have already occurred. You can examine historical information, such as medical records, in order to understand the past. In a prospective study, on the other hand, you are collecting data in real-time. Prospective studies are more common for psychology research.

Once you determine the type of longitudinal study you will conduct, you then must determine how, when, where, and on whom the data will be collected.

A standardized study design is vital for efficiently measuring a population. Once a study design is created, researchers must maintain the same study procedures over time to uphold the validity of the observation.

A schedule should be maintained, complete results should be recorded with each observation, and observer variability should be minimized.

Researchers must observe each subject under the same conditions to compare them. In this type of study design, each subject serves as their own control.

Methodological Considerations

Important methodological considerations include testing measurement invariance of constructs across time, appropriately handling missing data, and using accelerated longitudinal designs that sample different age cohorts over overlapping time periods.

Testing measurement invariance

Testing measurement invariance involves evaluating whether the same construct is being measured in a consistent, comparable way across multiple time points in longitudinal research.

This includes assessing configural, metric, and scalar invariance through confirmatory factor analytic approaches. Ensuring invariance gives more confidence when drawing inferences about change over time.

Missing data

Missing data can occur during initial sampling if certain groups are underrepresented or fail to respond.

Attrition over time is the main source – participants dropping out for various reasons. The consequences of missing data are reduced statistical power and potential bias if dropout is nonrandom.

Handling missing data appropriately in longitudinal studies is critical to reducing bias and maintaining power.

It is important to minimize attrition by tracking participants, keeping contact info up to date, engaging them, and providing incentives over time.

Techniques like maximum likelihood estimation and multiple imputation are better alternatives to older methods like listwise deletion. Assumptions about missing data mechanisms (e.g., missing at random) shape the analytic approaches taken.
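A small illustration shows why listwise deletion is risky when dropout is nonrandom. The two-wave dataset below is hypothetical, and the sketch does not implement maximum likelihood or multiple imputation. It only compares the baseline mean of the full sample with the baseline mean of the participants who would survive listwise deletion.

```python
from statistics import mean

# Hypothetical two-wave data; None marks a missed wave-2 assessment.
data = [
    {"id": 1, "w1": 30, "w2": 34},
    {"id": 2, "w1": 28, "w2": 33},
    {"id": 3, "w1": 12, "w2": None},
    {"id": 4, "w1": 25, "w2": 27},
    {"id": 5, "w1": 10, "w2": None},
    {"id": 6, "w1": 15, "w2": 18},
]

# Listwise deletion keeps only participants with complete data.
completers = [row for row in data if row["w2"] is not None]
dropouts = [row for row in data if row["w2"] is None]

full_baseline_mean = mean(row["w1"] for row in data)
complete_case_mean = mean(row["w1"] for row in completers)
dropout_baseline_mean = mean(row["w1"] for row in dropouts)

print(full_baseline_mean, complete_case_mean, dropout_baseline_mean)
```

In this made-up example the participants who dropped out scored lower at baseline, so the complete cases overstate the sample's starting level. A baseline comparison of completers and dropouts like this one is a common first check before choosing a missing-data method.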

Accelerated longitudinal designs

Accelerated longitudinal designs purposefully create missing data across age groups.

Accelerated longitudinal designs strategically sample different age cohorts at overlapping periods. For example, assessing 6th, 7th, and 8th graders at yearly intervals for three years would cover development from 6th through 10th grade in a 3-year study, whereas following a single cohort across the same grades would require five annual waves.

This increases the speed and cost-efficiency of longitudinal data collection and enables the examination of age/cohort effects. Appropriate multilevel statistical models are required to analyze the resulting complex data structure.
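To make the planned-missingness idea concrete, the sketch below lays out which grades each cohort would be observed in, assuming (for illustration) three cohorts that start in 6th, 7th, and 8th grade and are each assessed at three yearly waves. The function name `coverage` and the wave count are hypothetical choices, not part of any specific study.

```python
def coverage(start_grades, n_waves):
    """Map each cohort's starting grade to the grades it is observed in."""
    return {g: [g + wave for wave in range(n_waves)] for g in start_grades}


design = coverage(start_grades=[6, 7, 8], n_waves=3)

# Every grade observed by at least one cohort.
grades_covered = sorted({g for grades in design.values() for g in grades})

# Print a cohort-by-grade grid: "x" = observed, "." = missing by design.
for start, grades in design.items():
    row = " ".join("x" if g in grades else "." for g in grades_covered)
    print(f"cohort starting in grade {start}: {row}")
```

Under these assumptions the three cohorts together span grades 6 through 10, and each cohort is missing, by design, the grades it never reaches. The overlapping grades (for example, grade 8, observed in all three cohorts) are what allow cohort differences to be checked.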

In addition to those considerations, optimizing the time lags between measurements, maximizing participant retention, and thoughtfully selecting analysis models that align with the research questions and hypotheses are also vital in ensuring robust longitudinal research.

So, careful methodology is key throughout the design and analysis process when working with repeated-measures data.

Cohort effects

A cohort refers to a group born in the same year or time period. Cohort effects occur when different cohorts show differing trajectories over time.

Cohort effects can bias results if not accounted for, especially in accelerated longitudinal designs which assume cohort equivalence.

Detecting cohort effects is important but can be challenging as they are confounded with age and time of measurement effects.

Cohort effects can also interfere with estimating other effects like retest effects. This happens because comparing groups to estimate retest effects relies on cohort equivalence.

Overall, researchers need to test for and control cohort effects which could otherwise lead to invalid conclusions. Careful study design and analysis is required.
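The confounding of cohort with age and time of measurement follows from a simple identity: age equals measurement year minus birth year. The sketch below uses hypothetical birth cohorts and measurement years to show that, within any single measurement year, knowing a person's cohort fixes their age exactly.

```python
def age_at(birth_year, measurement_year):
    """Age is fully determined by cohort and time of measurement."""
    return measurement_year - birth_year


# Hypothetical birth cohorts and measurement occasions.
cohorts = (1950, 1960, 1970)
years = (2000, 2010, 2020)
observations = [(c, y, age_at(c, y)) for c in cohorts for y in years]

# Within one measurement year, older people always come from earlier cohorts.
same_year = [(c, age) for c, y, age in observations if y == 2010]

# The same age is reached by different cohorts in different years.
same_age = [(c, y) for c, y, age in observations if age == 50]

print(same_year)
print(same_age)
```

Because any two of the three quantities determine the third, a difference between 40-year-olds and 60-year-olds measured in the same year could reflect aging, cohort membership, or both. Separating them requires following several cohorts across several occasions.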

Retest effects

Retest effects refer to gains in performance that occur when the same or similar test is administered on multiple occasions.

For example, familiarity with test items and procedures may allow participants to improve their scores over repeated testing above and beyond any true change.

Specific examples include:

  • Memory tests – Learning which items tend to be tested can artificially boost performance over time
  • Cognitive tests – Becoming familiar with the testing format and particular test demands can inflate scores
  • Survey measures – Remembering previous responses can bias future responses over multiple administrations
  • Interviews – Comfort with the interviewer and process can lead to increased openness or recall

To estimate retest effects, performance of retested groups is compared to groups taking the test for the first time. Any divergence suggests inflated scores due to retesting rather than true change.

If unchecked in analysis, retest gains can be confused with genuine intraindividual change or interindividual differences.

This undermines the validity of longitudinal findings. Thus, testing and controlling for retest effects are important considerations in longitudinal research.
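The comparison described above can be sketched as a difference in group means. The scores are hypothetical, and the helper `estimate_retest_effect` is an illustrative name. The estimate is only meaningful under the assumption, noted in the section on cohort effects, that the retested group and the first-time group are otherwise equivalent.

```python
from statistics import mean

# Hypothetical scores collected at the same occasion, from people of the same age.
retested_scores = [105, 110, 108, 112, 115]    # have taken the test before
first_time_scores = [101, 104, 103, 107, 105]  # taking the test for the first time


def estimate_retest_effect(retested, first_time):
    """Mean advantage of retested participants over first-time test takers."""
    return mean(retested) - mean(first_time)


retest_effect = estimate_retest_effect(retested_scores, first_time_scores)
print(retest_effect)
```

An estimate like this can then be subtracted from observed gains, or modeled explicitly, so that practice-related improvement is not mistaken for developmental change.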

Data Analysis

Longitudinal data involves repeated assessments of variables over time, allowing researchers to study stability and change. A variety of statistical models can be used to analyze longitudinal data, including latent growth curve models, multilevel models, latent state-trait models, and more.

Latent growth curve models allow researchers to model intraindividual change over time. For example, one could estimate parameters related to individuals’ baseline levels on some measure, linear or nonlinear trajectory of change over time, and variability around those growth parameters. These models require multiple waves of longitudinal data to estimate.

Multilevel models are useful for hierarchically structured longitudinal data, with lower-level observations (e.g., repeated measures) nested within higher-level units (e.g., individuals). They can model variability both within and between individuals over time.

Latent state-trait models decompose the covariance between longitudinal measurements into time-invariant trait factors, time-specific state residuals, and error variance. This allows separating stable between-person differences from within-person fluctuations.
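The idea of separating stable between-person differences from within-person fluctuation can be illustrated with a much simpler descriptive analogue: the intraclass correlation (ICC) from a one-way ANOVA decomposition. This is not a latent state-trait or multilevel model, since it cannot separate state variance from measurement error, and the data and function name are hypothetical. It assumes every person has the same number of occasions.

```python
from statistics import mean

# Hypothetical repeated scores: three occasions per person.
scores = {
    "p1": [10, 12, 11],
    "p2": [20, 19, 21],
    "p3": [15, 15, 15],
}


def variance_components(data):
    """Estimate between-person and within-person variance (balanced data)."""
    n_people = len(data)
    n_occasions = len(next(iter(data.values())))
    person_means = {pid: mean(ys) for pid, ys in data.items()}
    grand_mean = mean(list(person_means.values()))

    # Mean squares from a one-way ANOVA with person as the grouping factor.
    ms_between = n_occasions * sum(
        (m - grand_mean) ** 2 for m in person_means.values()
    ) / (n_people - 1)
    ms_within = sum(
        (y - person_means[pid]) ** 2 for pid, ys in data.items() for y in ys
    ) / (n_people * (n_occasions - 1))

    var_between = max((ms_between - ms_within) / n_occasions, 0.0)
    return var_between, ms_within


var_between, var_within = variance_components(scores)
icc = var_between / (var_between + var_within)
print(round(icc, 3))
```

An ICC near 1 indicates that most variability lies in stable differences between people, while an ICC near 0 indicates that scores mostly fluctuate within people from one occasion to the next.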

There are many other techniques like latent transition analysis, event history analysis, and time series models that have specialized uses for particular research questions with longitudinal data. The choice of model depends on the hypotheses, timescale of measurements, age range covered, and other factors.

In general, these various statistical models allow investigation of important questions about developmental processes, change and stability over time, causal sequencing, and both between- and within-person sources of variability. However, researchers must carefully consider the assumptions behind the models they choose.

Longitudinal vs. Cross-Sectional Studies

Longitudinal studies and cross-sectional studies are two different observational study designs where researchers analyze a target population without manipulating or altering the natural environment in which the participants exist.

Yet, there are apparent differences between these two forms of study. One key difference is that longitudinal studies follow the same sample of people over an extended period of time, while cross-sectional studies look at the characteristics of different populations at a given moment in time.

Longitudinal studies tend to require more time and resources, but they can establish the order in which events occur, which helps researchers investigate possible cause-and-effect relationships and identify patterns among subjects.

On the other hand, cross-sectional studies tend to be cheaper and quicker but can only provide a snapshot of a point in time and thus cannot identify cause-and-effect relationships.

Both studies are valuable for psychologists to observe a given group of subjects. Still, cross-sectional studies are more beneficial for establishing associations between variables, while longitudinal studies are necessary for examining a sequence of events.

1. Are longitudinal studies qualitative or quantitative?

Longitudinal studies are typically quantitative. They collect numerical data from the same subjects to track changes and identify trends or patterns.

However, they can also include qualitative elements, such as interviews or observations, to provide a more in-depth understanding of the studied phenomena.

2. What’s the difference between a longitudinal and case-control study?

Case-control studies compare groups retrospectively and cannot be used to calculate relative risk. Longitudinal studies, though, can compare groups either retrospectively or prospectively.

In case-control studies, researchers study one group of people who have developed a particular condition and compare them to a sample without the disease.

Case-control studies select participants according to whether they already have the outcome and then look back at earlier exposures, whereas longitudinal studies follow a group of subjects over time.
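The point about relative risk can be made concrete with two small formulas. In a cohort followed over time, the proportion of people who develop the condition in each exposure group is observed, so relative risk can be computed. In a case-control study, the researcher fixes the numbers of cases and controls, so those proportions are not observed and the odds ratio is typically reported instead. All counts below are hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical cohort: 100 exposed and 100 unexposed people followed over time.
rr = relative_risk(exposed_cases=20, exposed_total=100,
                   unexposed_cases=10, unexposed_total=100)

# Hypothetical case-control sample: 100 cases and 100 controls.
or_estimate = odds_ratio(cases_exposed=40, cases_unexposed=60,
                         controls_exposed=20, controls_unexposed=80)

print(rr, or_estimate)
```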

3. Does a longitudinal study have a control group?

Yes, a longitudinal study can have a control group. In such a design, one group (the experimental group) would receive treatment or intervention, while the other group (the control group) would not.

Both groups would then be observed over time to see if there are differences in outcomes, which could suggest an effect of the treatment or intervention.

However, not all longitudinal studies have a control group, especially observational ones that do not test a specific intervention.

Baltes, P. B., & Nesselroade, J. R. (1979). History and rationale of longitudinal research. In J. R. Nesselroade & P. B. Baltes (Eds.), (pp. 1–39). Academic Press.

Cook, N. R., & Ware, J. H. (1983). Design and analysis methods for longitudinal research. Annual Review of Public Health, 4, 1–23.

Fitchett, G., Rybarczyk, B., Demarco, G., & Nicholas, J.J. (1999). The role of religion in medical rehabilitation outcomes: A longitudinal study. Rehabilitation Psychology, 44, 333-353.

Harvard Second Generation Study. (n.d.). Harvard Second Generation Grant and Glueck Study. Harvard Study of Adult Development. Retrieved from https://www.adultdevelopmentstudy.org.

Le Mare, L., & Audet, K. (2006). A longitudinal study of the physical growth and health of postinstitutionalized Romanian adoptees. Pediatrics & child health, 11 (2), 85-91.

Luo, Y., Hawkley, L. C., Waite, L. J., & Cacioppo, J. T. (2012). Loneliness, health, and mortality in old age: a national longitudinal study. Social science & medicine (1982), 74 (6), 907–914.

Marques, S. C., Pais-Ribeiro, J. L., & Lopez, S. J. (2011). The role of positive psychology constructs in predicting mental health and academic achievement in children and adolescents: A two-year longitudinal study. Journal of Happiness Studies: An Interdisciplinary Forum on Subjective Well-Being, 12 (6), 1049–1062.

Dekker, S. W. A., & Schaufeli, W. B. (1995). The effects of job insecurity on psychological health and withdrawal: A longitudinal study. Australian Psychologist, 30 (1), 57–63.

Stice, E., Mazotti, L., Krebs, M., & Martin, S. (1998). Predictors of adolescent dieting behaviors: A longitudinal study. Psychology of Addictive Behaviors, 12 (3), 195–205.

Cruwys, T., Greenaway, K. H., & Haslam, S. A. (2015). The stress of passing through an educational bottleneck: A longitudinal study of psychology honours students. Australian Psychologist, 50 (5), 372–381.

Thomas, L. (2020). What is a longitudinal study? Scribbr. Retrieved from https://www.scribbr.com/methodology/longitudinal-study/

Van der Vorst, H., Engels, R. C. M. E., Meeus, W., & Deković, M. (2006). Parental attachment, parental control, and early development of alcohol use: A longitudinal study. Psychology of Addictive Behaviors, 20 (2), 107–116.

Further Information

  • Schaie, K. W. (2005). What can we learn from longitudinal studies of adult development?. Research in human development, 2 (3), 133-158.
  • Caruana, E. J., Roman, M., Hernández-Sánchez, J., & Solli, P. (2015). Longitudinal studies. Journal of thoracic disease, 7 (11), E537.

10 Famous Examples of Longitudinal Studies

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.

A longitudinal study is a study that observes a subject or subjects over an extended period of time. Such studies may run for several weeks, months, or years. An example is the Up Series, which has been running since 1963.

Longitudinal studies are deployed most commonly in psychology and sociology, where the intention is to observe the changes in the subject over years, across a lifetime, and sometimes, even across generations.

There have been several famous longitudinal studies in history. Some of the most well-known examples are listed below.

Examples of Longitudinal Studies

1. Up Series

Duration: 1963 to Now

The Up Series is a continuing longitudinal study that studies the lives of 14 subjects in Britain at 7-year intervals.

The study is conducted in the form of interviews in which the subjects report the changes that have occurred in their lives in the last 7 years since the last interview.

The interviews are filmed and form the subject matter of the critically acclaimed Up series of documentary films directed by Michael Apted.

When it was first conceived, the aim of the study was to document the life progressions of a cross-section of British children through the second half of the 20th century in light of the rapid social, economic, political, and demographic changes occurring in Britain.

Fourteen children were selected from different socio-economic backgrounds for the first study in 1963, all of whom were 7 years old.

The latest installment was filmed in 2019, by which time the participants had reached 63 years of age.

The study noted that life outcomes of subjects were determined to a large extent by their socio-economic and demographic circumstances, and that chances for upward mobility remained limited in late 20th century Britain (Pearson, 2012).

2. Minnesota Twin Study

Duration: 1979 to 1990 (11 years)

Siblings who are twins not only look alike but often display similar behavioral and personality traits.

This raises an oft-asked question: how much of this similarity is genetic and how much of it is the result of the twins growing up together in a similar environment. 

The Minnesota twin study was a longitudinal study that set out to find an answer to this question by studying a group of twins from 1979 to 1990 under the supervision of Thomas J Bouchard.

The study found that identical twins who were reared apart in different environments did not display any greater chances of being different from each other than twins that were raised in the same environment.

The study concluded that the similarities and differences between twins are largely genetic in nature, rather than being the result of their environment (Bouchard et al., 1990).

3. Grant Study

Duration: 1942 – Present

The Grant Study is one of the most ambitious longitudinal studies. It attempts to answer a philosophical question that has been central to human existence since the beginning of time – what is the secret to living a good life? (Shenk, 2009).

It does so by studying the lives of 268 male Harvard graduates, who are surveyed at least every two years through questionnaires, personal interviews, and information about their physical and mental well-being gathered from their physicians.

Begun in 1942, the study continues to this day.

The study has provided researchers with several interesting insights into what constitutes the human quality of life. 

For instance:

  • It reveals that the quality of our relationships is more influential than IQ when it comes to our financial success.
  • It suggests that our relationships with our parents during childhood have a lasting impact on our mental and physical well-being until late into our lives.

In short, the results gleaned from the study (so far) strongly indicate that the quality of our relationships is one of the biggest factors in determining our quality of life. 

4. Terman Life Cycle Study

Duration: 1921 – Present

The Terman Life-Cycle Study, also called the Genetic Studies of Genius, is one of the longest studies ever conducted in the field of psychology.

Commenced in 1921, it continues to this day, over 100 years later!

The objective of the study at its commencement in 1921 was to study the life trajectories of exceptionally gifted children, as measured by standardized intelligence tests.

Lewis Terman, the principal investigator of the study, wanted to dispel the then-prevalent notion that intellectually gifted children tended to be:

  • socially inept, and
  • physically deficient

To this end, Terman selected 1528 students from public schools in California based on their scores on several standardized intelligence tests such as the Stanford-Binet Intelligence scales, National Intelligence Test, and the Army Alpha Test.

It was discovered that intellectually gifted children had the same social skills and the same level of physical development as other children.

As the study progressed, following the selected children well into adulthood and in their old age, it was further discovered that having higher IQs did not affect outcomes later in life in a significant way (Terman & Oden, 1959).

5. National Food Survey

Duration: 1940 to 2000 (60 years)

The National Food Survey was a British study that ran from 1940 to 2000. It attempted to study food consumption, dietary patterns, and household expenditures on food by British citizens.

Initially commenced to measure the effects of wartime rationing on the health of British citizens in 1940, the survey was extended and expanded after the end of the war to become a comprehensive study of British dietary consumption and expenditure patterns. 

After 2000, the survey was replaced by the Expenditure and Food Survey, which lasted till 2008. It was further replaced by the Living Costs and Food Survey post-2008. 

6. Millennium Cohort Study

Duration: 2000 to Present

The Millennium Cohort Study (MCS), conducted by the University of London, is similar to the Up Series.

Like the Up series, it aims to study the life trajectories of a group of British children relative to the socio-economic and demographic changes occurring in Britain. 

However, the subjects of the Millennium Cohort Study are children born in the UK in 2000-01.

Also unlike the Up Series, the MCS has a much larger sample size of 18,818 subjects representing a much wider ethnic and socio-economic cross-section of British society. 

7. The Study of Mathematically Precocious Youth

Duration: 1971 to Present

The Study of Mathematically Precocious Youth (SMPY) is a longitudinal study initiated in 1971 at Johns Hopkins University.

At the time of its inception, the study aimed to study children who were exceptionally gifted in mathematics as evidenced from their Scholastic Aptitude Test (SAT) scores.

Later the study shifted to Vanderbilt University and was expanded to include children who scored exceptionally high in the verbal section of the SATs as well.

The study has revealed several interesting insights into the life paths, career trajectories, and lifestyle preferences of academically gifted individuals. For instance, it revealed:

  • Children with exceptionally high mathematical scores tended to gravitate towards academic, research, or corporate careers in the STEM fields.
  • Children with better verbal abilities went into academic, research, or corporate careers in the social sciences and humanities.

8. Baltimore Longitudinal Study of Aging

Duration: 1958 to Present

The Baltimore Longitudinal Study of Aging (BLSA) was initiated in 1958 to study the effects of aging, making it the longest-running study on human aging in America.

With a sample size of over 3200 volunteer subjects, the study has revealed crucial information about the process of human aging.

For instance, the study has shown that:

  • The most common ailments associated with the elderly such as diabetes, hypertension, and dementia are not an inevitable outcome of growing old, but rather result from genetic and lifestyle factors.
  • Aging does not proceed uniformly in humans, and all humans age differently. 

9. Nurses’ Health Study

Duration: 1976 to Present

The Nurses’ Health Study began in 1976 to study the effects of oral contraceptives on women’s health.

The first commercially available birth control pill was approved by the Food and Drug Administration (FDA) in 1960, and the use of such pills rapidly spread across the US and the UK.

At the same time, a lot of misinformation prevailed about the perceived harmful effects of using oral contraceptives.

The Nurses’ Health Study aimed to study the long-term effects of the use of these pills by researching a sample composed of female nurses.

Nurses were specially chosen for the study because of their medical awareness and hence the ease of data collection that this enabled.

Over time, the study expanded to include not just oral contraceptives but also smoking, exercise, and obesity within the ambit of its research.

As its scope widened, so did the sample size and the resources required for continuing the research.

As a result, the study is now believed to be one of the largest and the most expensive observational health studies in history.

10. The Seattle 500 Study

Duration: 1974 to Present

The Seattle 500 Study is a longitudinal study being conducted by the University of Washington.

It observes a cohort of 500 individuals in the city of Seattle to determine the effects of prenatal habits on human health.

In particular, the study attempts to track patterns of substance abuse and mental health among the subjects and correlate them to the prenatal habits of the parents.  

From the examples above, it is clear that longitudinal studies are essential because they provide a unique perspective on certain issues that cannot be acquired through any other method.

Especially in research areas that study developmental or life span issues, longitudinal studies are almost unavoidable.

A major drawback of longitudinal studies is that because of their extended timespan, the results are likely to be influenced by epochal events. 

For instance, in the Genetic Studies of Genius described above, the life prospects of all the subjects would have been impacted by events such as the Great Depression and the Second World War.

The female participants in the study, despite their intellectual precocity, spent their lives as homemakers because of the cultural norms of the era. Thus, despite their scale and scope, longitudinal studies do not always succeed in controlling background variables.

Bouchard, T. J., Jr., Lykken, D. T., McGue, M., Segal, N. L., & Tellegen, A. (1990). Sources of human psychological differences: The Minnesota study of twins reared apart. Science, 250(4978), 223–228. https://doi.org/10.1126/science.2218526

Pearson, A. (2012, May). Seven Up!: A tale of two Englands that, shamefully, still exist. The Telegraph. https://www.telegraph.co.uk/comment/columnists/allison-pearson/9269805/Seven-Up-A-tale-of-two-Englands-that-shamefully-still-exist.html

Shenk, J. W. (2009, June). What makes us happy? The Atlantic. https://www.theatlantic.com/magazine/archive/2009/06/what-makes-us-happy/307439/

Terman, L. M., & Oden, M. (1959). The gifted group at mid-life: Thirty-five years’ follow-up of the superior child. Genetic Studies of Genius, Volume V. Stanford University Press.


What is a Longitudinal Study?: Definition and Explanation


In this article, we’ll cover all you need to know about longitudinal research. 

Let’s take a closer look at the defining characteristics of longitudinal studies, review the pros and cons of this type of research, and share some useful longitudinal study examples. 

Content Index

  • What is a longitudinal study?
  • Types of longitudinal studies
  • Advantages and disadvantages of conducting longitudinal surveys
  • Longitudinal studies vs. cross-sectional studies
  • Types of surveys that use a longitudinal study
  • Longitudinal study examples

A longitudinal study is research conducted over an extended period of time. It is mostly used in medical research and in other areas like psychology or sociology.

If you have the time to engage in a long-term research project, a longitudinal survey can pay off with actionable insights.

Longitudinal studies often use surveys to collect data that is either qualitative or quantitative. Additionally, in a longitudinal study, a survey creator does not interfere with survey participants. Instead, the survey creator distributes questionnaires over time to observe changes in participants' behaviors or attitudes.

Many medical studies are longitudinal; researchers note and collect data from the same subjects over what can be many years.


Longitudinal studies are versatile, repeatable, and able to account for quantitative and qualitative data. Consider the three major types of longitudinal studies for future research:

Types of longitudinal studies

Panel study: A panel survey involves a sample of people drawn from a larger population and is conducted at specified intervals over an extended period.

One of the panel study’s essential features is that researchers collect data from the same sample at different points in time. Most panel studies are designed for quantitative analysis, though they may also be used to collect qualitative data.
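To make the "same sample at different points in time" idea concrete, here is a minimal Python sketch. The participant IDs and scores are hypothetical, invented purely for illustration. The point is only that a panel design lets you compute change within each person, which a one-off survey cannot do.

```python
# Hypothetical panel data: the same three participants surveyed in two waves.
# IDs and scores are made up for illustration only.
wave_1 = {"p01": 12, "p02": 15, "p03": 9}
wave_2 = {"p01": 14, "p02": 15, "p03": 7}


def within_person_change(first, second):
    """Return each participant's change between two waves.

    Only participants present in both waves are included, because a
    panel comparison needs repeated measurements of the same person.
    """
    shared_ids = sorted(first.keys() & second.keys())
    return {pid: second[pid] - first[pid] for pid in shared_ids}


changes = within_person_change(wave_1, wave_2)
print(changes)  # {'p01': 2, 'p02': 0, 'p03': -2}
```

Matching on participant ID is the design choice that matters here: without a stable identifier across waves, the two surveys would only be two separate snapshots.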


Cohort Study: A cohort study samples a cohort (a group of people who typically experience the same event at a given point in time). Medical researchers tend to conduct cohort studies. Some might consider clinical trials similar to cohort studies. 

In cohort studies, researchers merely observe participants without intervention, unlike clinical trials in which participants undergo tests.

Retrospective study: A retrospective study uses already existing data, collected during previously conducted research with similar methodology and variables. 

While doing a retrospective study, the researcher uses an administrative database, pre-existing medical records, or one-to-one interviews.

As we’ve demonstrated, a longitudinal study is useful in science, medicine, and many other fields. There are many reasons why a researcher might want to conduct a longitudinal study. One of the most important is that longitudinal studies give unique insights that many other types of research fail to provide.

Advantages of longitudinal studies

  • Greater validation: For a long-term study to be successful, objectives and rules must be established from the beginning. Because the design is fixed and vetted in advance, the results tend to have a high level of validity.
  • Unique data: Most research studies collect short-term data to determine the cause and effect of what is being investigated. Longitudinal surveys follow the same principles, but the data collection period is much longer. Long-term relationships cannot be discovered in a short-term investigation, but short-term relationships can be monitored in a long-term investigation.
  • Allow identifying trends: Whether in medicine, psychology, or sociology, the long-term design of a longitudinal study enables trends and relationships to be found within the data as it is collected. Earlier data can then be used to anticipate future results.
  • Longitudinal surveys are flexible: Although a longitudinal study can be created to study a specific data point, the data collected can show unforeseen patterns or relationships that can be significant. Because this is a long-term study, the researchers have a flexibility that is not possible with other research formats. Additional data points can be collected to study unexpected findings, and the survey can be adjusted in response to the patterns that emerge.

Disadvantages of longitudinal studies

  • Research time: The main disadvantage of longitudinal surveys is that long-term research is more likely to give unpredictable results. For example, if the same participants cannot be located for a follow-up, the research cannot continue as planned. It may also take several years before the data begins to produce observable patterns or relationships that can be monitored.
  • An unpredictability factor is always present: It must be taken into account that part of the initial sample can be lost over time. Because longitudinal studies involve the same subjects over a long period of time, what happens to them outside of data collection times can influence the data that is collected in the future. Some people may decide to stop participating in the research. Others may no longer fit the demographic criteria for the research. If these factors are not accounted for in the initial research design, they could affect the findings that are generated.
  • Large samples are needed for the investigation to be meaningful: To identify relationships or patterns, a large amount of data must be collected and analyzed.
  • Higher costs: Without a doubt, the longitudinal survey is more complex and expensive. Being a long-term form of research, the costs of the study will span years or decades, compared to other forms of research that can be completed in a fraction of the time.


Longitudinal studies vs. Cross-sectional studies

Longitudinal studies are often confused with cross-sectional studies. Unlike longitudinal studies, where the research variables can change during a study, a cross-sectional study observes a single instance with all variables remaining the same throughout the study. A longitudinal study may follow up on a cross-sectional study to investigate the relationship between the variables more thoroughly.

Longitudinal studies | Cross-sectional studies
Longitudinal studies take a longer time, from years to even a few decades. | Cross-sectional studies are quick to conduct compared to longitudinal studies.
A longitudinal study requires an investigator to observe the participants at different time intervals. | A cross-sectional study is conducted over a specified period of time.
Longitudinal studies can offer researchers a cause-and-effect relationship. | Cross-sectional studies cannot offer researchers a cause-and-effect relationship.
In longitudinal studies, only one variable can be observed or studied. | With cross-sectional studies, different variables can be observed at a single moment.
Longitudinal studies tend to be more expensive. | Cross-sectional studies are more accessible for companies and researchers.

The design of the study is highly dependent on the nature of the research questions. Whenever a researcher decides to collect data by surveying their participants, what matters most are the questions that are asked in the survey.


Knowing what information a study should gather is the first step in determining how to conduct the rest of the study. 

With a longitudinal study, you can measure and compare various business and branding aspects by deploying surveys. Some of the classic examples of surveys that researchers can use for longitudinal studies are:

Market trends and brand awareness: Use a market research survey and marketing survey to identify market trends and develop brand awareness. Through these surveys, businesses or organizations can learn what customers want and what they will discard. This study can be carried over time to assess market trends repeatedly, as they are volatile and tend to change constantly.

Product feedback: If a business or brand launches a new product and wants to know how it is faring with consumers, product feedback surveys are a great option. Collect feedback from customers about the product over an extended time. Once you’ve collected the data, it’s time to put that feedback into practice and improve your offerings.

Customer satisfaction: Customer satisfaction surveys help an organization get to know the level of satisfaction or dissatisfaction among its customers. A longitudinal survey can gain feedback from new and regular customers for as long as you’d like to collect it, so it’s useful whether you’re starting a business or hoping to make some improvements to an established brand.

Employee engagement: When you check in regularly over time with a longitudinal survey, you’ll get a big-picture perspective of your company culture. Find out whether employees feel comfortable collaborating with colleagues and gauge their level of motivation at work.

Now that you know the basics of how researchers use longitudinal studies across several disciplines, let’s review the following examples:

Example 1: Identical twins

Consider a study conducted to understand the similarities or differences between identical twins who are brought up together versus identical twins who were not. The study observes several variables, but the constant is that all the participants have identical twins.

In this case, researchers would want to observe these participants from childhood to adulthood, to understand how growing up in different environments influences traits, habits, and personality.


Over many years, researchers can observe both sets of twins as they experience life without intervention. Because the participants share the same genes, it is assumed that any differences are due to environmental factors, but only a careful study can confirm that assumption.

Example 2: Violence and video games

A group of researchers is studying whether there is a link between violence and video game usage. They collect a large sample of participants for the study. To reduce the amount of interference with their natural habits, these individuals come from a population that already plays video games. The age group is focused on teenagers (13-19 years old).

The researchers record how prone to violence participants in the sample are at the onset, which creates a baseline for later comparisons. The researchers then give each participant a log to track how often and for how long they play video games. This study can go on for months or years. During this time, the researchers can compare video game-playing behaviors with violent tendencies and so investigate whether there is a link between violence and video games.

Conducting a longitudinal study with surveys is straightforward and applicable to almost any discipline. With our survey software you can easily start your own survey today. 

What’s a Longitudinal Study? Types, Uses & Examples

busayo.longe

Research can take anything from a few minutes to years or even decades to complete. When a systematic investigation goes on for an extended period, it’s most likely that the researcher is carrying out a longitudinal study of the sample population. So how does this work? 

In the simplest terms, a longitudinal study involves observing how the different variables in your research population interact over time as participants encounter various factors, and documenting the effects of that exposure. It’s a useful way to investigate possible causal relationships within your sample population.

In this article, we’ll show you several ways to adopt longitudinal studies for your systematic investigation and how to avoid common pitfalls. 

What is a Longitudinal Study? 

A longitudinal study is a correlational research method that helps discover the relationship between variables in a specific target population. It is pretty similar to a cross-sectional study, although in a longitudinal study the researcher observes the variables for a longer time, sometimes lasting many years.

For example, let’s say you are researching social interactions among wild cats. You go ahead to recruit a set of newly-born lion cubs and study how they relate with each other as they grow. Periodically, you collect the same types of data from the group to track their development. 

The advantage of this extended observation is that the researcher can witness the sequence of events leading to the changes in the traits of both the target population and the different groups. It can identify the causal factors for these changes and their long-term impact. 

Characteristics of Longitudinal Studies

1. Non-interference: In longitudinal studies, the researcher doesn’t interfere with the participants’ day-to-day activities in any way. When it’s time to collect their responses, the researcher administers a survey with qualitative and quantitative questions.

2. Observational: As we mentioned earlier, longitudinal studies involve observing the research participants throughout the study and recording any changes in traits that you notice. 

3. Timeline: A longitudinal study can span weeks, months, years, or even decades. This contrasts sharply with cross-sectional studies, which only last for a short time.

Cross-Sectional vs. Longitudinal Studies 

  • Definition 

A cross-sectional study is a type of observational study in which the researcher collects data from variables at a specific moment to establish a relationship among them. On the other hand, longitudinal research observes variables for an extended period and records all the changes in their relationship. 

Longitudinal studies take a longer time to complete. In some cases, the researchers can spend years documenting the changes among the variables plus their relationships. For cross-sectional studies, this isn’t the case. Instead, the researcher collects information in a relatively short time frame and makes relevant inferences from this data. 

While cross-sectional studies give you a snapshot of the situation in the research environment, longitudinal studies are better suited for contexts where you need to analyze a problem long-term. 

  • Sample Data

Longitudinal studies repeatedly observe the same sample population, while cross-sectional studies are conducted with different research samples. 

Because longitudinal studies span over a more extended time, they typically cost more money than cross-sectional observations. 

Types of Longitudinal Studies 

The three main types of longitudinal studies are: 

  • Panel Study
  • Retrospective Study
  • Cohort Study 

These methods help researchers to study variables and account for qualitative and quantitative data from the research sample. 

1. Panel Study 

In a panel study, the researcher uses data collection methods like surveys to gather information from a fixed sample at regular but widely spaced intervals, often spanning a few years. It’s primarily designed for quantitative research, although you can use this method for qualitative data analysis.

When To Use Panel Study

If you want to have first-hand, factual information about the changes in a sample population, then you should opt for a panel study. For example, medical researchers rely on panel studies to identify the causes of age-related changes and their consequences. 

Advantages of Panel Study  

  • It helps you identify the causal factors of changes in a research sample. 
  • It also allows you to witness the impact of these changes on the properties of the variables and information needed at different points of their existing relationship. 
  • Panel studies can be used to obtain historical data from the sample population. 

Disadvantages of Panel Studies

  • Conducting a panel study is pretty expensive in terms of time and resources. 
  • It might be challenging to gather the same quality of data from respondents at every interval. 

2. Retrospective Study

In a retrospective study, the researcher depends on existing information from previous systematic investigations to discover patterns leading to the study outcomes. In other words, a retrospective study looks backward. It examines exposures to suspected risk or protective factors in relation to an outcome established at the start of the study.

When To Use Retrospective Study 

Retrospective studies are best for research contexts where you want to quickly estimate an exposure’s effect on an outcome. It also helps you to discover preliminary measures of association in your data. 

Medical researchers adopt retrospective study methods when they need to research rare conditions. 

Advantages of Retrospective Study

  • Retrospective studies happen at a relatively smaller scale and do not require much time to complete. 
  • It helps you to study rare outcomes when prospective surveys are not feasible.

Disadvantages of Retrospective Study

  • It is easily affected by recall bias or misclassification bias.
  • It often depends on convenience sampling, which is prone to selection bias. 

3. Cohort Study  

A cohort study entails collecting information from a group of people who share specific traits or have experienced a particular occurrence simultaneously. For example, a researcher might conduct a cohort study on a group of Black school children in the U.K. 

During a cohort study, the researcher identifies which group members have been exposed to a specific characteristic or risk factor. Then, she records the outcomes over time and compares them between exposed and unexposed members.

When To Use Cohort Study

You should conduct a cohort study if you’re looking to establish a causal relationship within your data sets. For example, in medical research, cohort studies investigate the causes of disease and establish links between risk factors and effects. 

Advantages of Cohort Studies

  • It allows you to study multiple outcomes that can be associated with one risk factor. 
  • Cohort studies are designed to help you measure all variables of interest. 

Disadvantages of Cohort Studies

  • Cohort studies are expensive to conduct.
  • Throughout the process, the researcher has less control over variables. 

When Would You Use a Longitudinal Study? 

If you’re looking to discover the relationship between variables and the causal factors responsible for changes, you should adopt a longitudinal approach to your systematic investigation. Longitudinal studies help you to analyze change over a meaningful time. 

How to Perform a Longitudinal Study?

There are only two approaches you can take when performing a longitudinal study. You can either source your own data or use previously gathered data.

1. Sourcing your own data

Collecting your own data is a more verifiable method because you control how the data is gathered. The way you collect your data is also heavily dependent on the type of study you’re conducting.

If you’re conducting a retrospective study, you’d have to collect data on events that have already happened. An example is going through records to find patterns in cancer patients.

For a prospective study, you collect the data in real-time. This means finding a sample population, following them, and documenting your findings over the course of your study.

Irrespective of what study type you’re conducting, you need a versatile data collection tool to help you accurately record your data. One we strongly recommend is Formplus.

2. Using previously gathered data

Governmental and research institutes often carry out longitudinal studies and make the data available to the public. So you can pick up their previously collected data and use it for your own study. An example is the UK Data Service website.

Using previously gathered data isn’t just easy; it also allows you to carry out research over a long period of time.

The downside to this method is that it’s very restrictive because you can only use the data set available to you. You also have to thoroughly examine the source of the data given to you. 

Advantages of a Longitudinal Study 

  • Longitudinal studies help you discover variable patterns over time, leading to more precise causal relationships and research outcomes. 
  • When researching developmental trends, longitudinal studies allow you to discover changes across lifespans and arrive at valid research outcomes. 
  • They are highly flexible, which means the researcher can adjust the study’s focus while it is ongoing. 
  • Unlike other research methods, longitudinal studies collect unique, long-term data and highlight relationships that cannot be discovered in a short-term investigation. 
  • You can collect additional data to study unexpected findings at any point in your systematic investigation. 

Disadvantages and Limitations of a Longitudinal Study 

  • It’s difficult to predict the results of longitudinal studies because of the extended time frame. Also, it may take several years before the data begins to produce observable patterns or relationships that can be monitored. 
  • It costs lots of money to sustain a research effort for years. You’ll keep incurring costs every year compared to other forms of research that can be completed in a smaller fraction of the time.
  • Longitudinal studies require a large sample size which might be challenging to achieve. Without this, the entire investigation will have little or no impact. 
  • Longitudinal studies often experience panel attrition. This happens when some members of the research sample are unable to complete the study due to several reasons like changes in contact details, refusal, incapacity, and even death. 

Longitudinal Studies Examples

How does a longitudinal study work in the real world? To answer this, let’s consider a few typical scenarios. 

Example 1

A researcher wants to know the effects of a low-carb diet on weight loss. So, he gathers a group of obese men and kicks off the systematic investigation using his preferred longitudinal study method. He records information such as how much they weigh and the number of carbs in their diet at different points in time. These data help him arrive at valid research outcomes.

Example 2

A researcher wants to know whether there is any relationship between drinking milk before school and high classroom performance among children. First, he uses a sampling technique to gather a large research population.

Then, he conducts a baseline survey to establish a reference point for later comparison. Next, the researcher gives each participant a log to keep track of predetermined research variables.

Example 3  

You decide to study how a particular diet affects athletes’ performance over time. First, you gather your sample population, establish a baseline for the research, and observe and record the required data.

Longitudinal Studies Frequently Asked Questions (FAQs) 

  • Are Longitudinal Studies Quantitative or Qualitative?

Longitudinal studies can be qualitative, quantitative, or both. Observing and recording changes over an extended period lends itself to qualitative description, while repeated surveys and measurements produce quantitative data. Which one dominates depends on your research context.

  • What Is Most Likely the Biggest Problem with Longitudinal Research?

The biggest challenge with longitudinal research is panel attrition. Due to the length of the research process, some participants might be unable to complete the study for one reason or another. When this happens, it can distort your data and research outcomes.
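One simple way to keep an eye on attrition is to track what share of the baseline sample is still responding at each wave. The sketch below assumes you have a set of participant IDs per wave; the IDs and wave structure are hypothetical and only illustrate the calculation.

```python
# Hypothetical participant IDs observed at each wave of a study.
waves = [
    {"p01", "p02", "p03", "p04", "p05"},  # baseline
    {"p01", "p02", "p04", "p05"},         # first follow-up
    {"p01", "p04"},                       # second follow-up
]


def retention_by_wave(waves):
    """Share of the baseline sample still participating at each wave."""
    baseline = waves[0]
    if not baseline:
        raise ValueError("baseline wave must not be empty")
    # Intersect with the baseline so late joiners do not inflate retention.
    return [len(baseline & wave) / len(baseline) for wave in waves]


retention = retention_by_wave(waves)
attrition = [1 - share for share in retention]
print(retention)  # [1.0, 0.8, 0.4]
```

Attrition is simply one minus retention, so in this made-up example 60% of the baseline sample has been lost by the second follow-up.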

  • What is Longitudinal Data Collection?

Longitudinal data collection is the process of gathering information from the same sample population over a long period. Longitudinal data collection uses interviews, surveys, and observation to collect the required information from research sources. 

  • What is the Difference Between Longitudinal Data and a Time Series Analysis?

Because longitudinal studies collect data over a long period, they are often mistaken for time series analysis. So what’s the real difference between these two concepts? 

In a time series analysis, the researcher focuses on a single individual at multiple time intervals. Meanwhile, longitudinal data focuses on multiple individuals at various time intervals. 
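The distinction is easiest to see in how the data is laid out. The following Python sketch follows the definitions given above; all names and values are hypothetical. A time series holds repeated measurements of one unit, while longitudinal data holds repeated measurements of several individuals, here stored in "long" format with one row per participant and wave.

```python
# A time series, in the sense used above: one unit measured repeatedly.
# All values are hypothetical.
time_series = [("2021", 70.1), ("2022", 71.4), ("2023", 69.8)]

# Longitudinal (panel) data: several individuals, each measured repeatedly.
# Stored in "long" format: one row per (participant, wave).
panel_rows = [
    ("p01", "2021", 70.1), ("p01", "2022", 71.4),
    ("p02", "2021", 82.0), ("p02", "2022", 80.5),
]


def group_by_participant(rows):
    """Split long-format panel rows into one series per participant."""
    series = {}
    for participant, wave, value in rows:
        series.setdefault(participant, []).append((wave, value))
    return series


per_person = group_by_participant(panel_rows)
print(len(per_person))    # 2
print(per_person["p02"])  # [('2021', 82.0), ('2022', 80.5)]
```

Seen this way, a longitudinal dataset is a collection of short individual series, which is why it supports comparisons both within a person over time and between people.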


Longitudinal Study | Definition, Approaches & Examples

Published on 5 May 2022 by Lauren Thomas. Revised on 24 October 2022.

In a longitudinal study, researchers repeatedly examine the same individuals to detect any changes that might occur over a period of time.

Longitudinal studies are a type of correlational research in which researchers observe and collect data on a number of variables without trying to influence those variables.

While they are most commonly used in medicine, economics, and epidemiology, longitudinal studies can also be found in the other social or medical sciences.

Table of contents

  • How long is a longitudinal study?
  • Longitudinal vs cross-sectional studies
  • How to perform a longitudinal study
  • Advantages and disadvantages of longitudinal studies
  • Frequently asked questions about longitudinal studies

No set amount of time is required for a longitudinal study, so long as the participants are repeatedly observed. They can range from as short as a few weeks to as long as several decades. However, they usually last at least a year, oftentimes several.

One of the longest longitudinal studies, the Harvard Study of Adult Development, has been collecting data on the physical and mental health of a group of men in Boston, in the US, for over 80 years.

The opposite of a longitudinal study is a cross-sectional study. While longitudinal studies repeatedly observe the same participants over a period of time, cross-sectional studies examine different samples (or a ‘cross-section’) of the population at one point in time. They can be used to provide a snapshot of a group or society at a specific moment.

Cross-sectional vs longitudinal studies

Both types of study can prove useful in research. Because cross-sectional studies are shorter and therefore cheaper to carry out, they can be used to discover correlations that can then be investigated in a longitudinal study.

If you want to implement a longitudinal study, you have two choices: collecting your own data or using data already gathered by somebody else.

Using data from other sources

Many governments or research centres carry out longitudinal studies and make the data freely available to the general public. For example, anyone can access data from the 1970 British Cohort Study, which has followed the lives of 17,000 Brits since their births in a single week in 1970, through the UK Data Service website.

These statistics are generally very trustworthy and allow you to investigate changes over a long period of time. However, they are more restrictive than data you collect yourself. To preserve the anonymity of the participants, the data collected is often aggregated so that it can only be analysed on a regional level. You will also be restricted to whichever variables the original researchers decided to investigate.

If you choose to go down this route, you should carefully examine the source of the dataset as well as what data are available to you.

Collecting your own data

If you choose to collect your own data, the way you go about it will be determined by the type of longitudinal study you choose to perform. You can choose to conduct a retrospective or a prospective study.

  • In a retrospective study, you collect data on events that have already happened.
  • In a prospective study, you choose a group of subjects and follow them over time, collecting data in real time.

Retrospective studies are generally less expensive and take less time than prospective studies, but they are more prone to measurement error.

Like any other research design, longitudinal studies have their trade-offs: they provide a unique set of benefits, but also come with some downsides.

Advantages

Longitudinal studies allow researchers to follow their subjects in real time. This means you can better establish the real sequence of events, allowing you insight into cause-and-effect relationships.

Longitudinal studies also allow repeated observations of the same individual over time. This means any changes in the outcome variable cannot be attributed to differences between individuals.

Prospective longitudinal studies eliminate the risk of recall bias, or the inability to correctly recall past events.

Disadvantages

Longitudinal studies are time-consuming and often more expensive than other types of studies, so they require significant commitment and resources to be effective.

Since longitudinal studies repeatedly observe subjects over a period of time, any potential insights from the study can take a while to be discovered.

Attrition, which occurs when participants drop out of a study, is common in longitudinal studies and may result in invalid conclusions.

Longitudinal studies and cross-sectional studies are two different types of research design. In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

Longitudinal study | Cross-sectional study
Repeated observations | Observations at a single point in time
Observes the same sample multiple times | Observes different samples (a ‘cross-section’) in the population
Follows changes in participants over time | Provides a snapshot of society at a given point

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study, which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study.

Longitudinal studies are better suited to establishing the correct sequence of events, identifying changes over time, and providing insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Thomas, L. (2022, October 24). Longitudinal Study | Definition, Approaches & Examples. Scribbr. Retrieved 29 July 2024, from https://www.scribbr.co.uk/research-methods/longitudinal-study-design/

Statistics By Jim

Making statistics intuitive

Longitudinal Study: Overview, Examples & Benefits

By Jim Frost

What is a Longitudinal Study?

A longitudinal study is a research design that takes repeated measurements of the same subjects over time. These studies can span years or even decades. Unlike cross-sectional studies, which analyze data at a single point, longitudinal studies track changes and developments, producing a more dynamic assessment.

A cohort study is a specific type of longitudinal study focusing on a group of people sharing a common characteristic or experience within a defined period.

Imagine tracking a group of individuals over time. Researchers collect data regularly, analyzing how specific factors evolve or influence outcomes. This method offers a dynamic view of trends and changes.

Diagram that illustrates a longitudinal study.

Consider a study tracking 100 high school students’ academic performances annually for ten years. Researchers observe how various factors like teaching methods, family background, and personal habits impact their academic growth over time.

Researchers frequently use longitudinal studies in the following fields:

  • Psychology: Understanding behavioral changes.
  • Sociology: Observing societal trends.
  • Medicine: Tracking disease progression.
  • Education: Assessing long-term educational outcomes.

Duration of Longitudinal Studies

Typically, the objectives dictate how long researchers run a longitudinal study. Studies focusing on rapid developmental phases, like early childhood, might last a few years. On the other hand, exploring long-term trends, like aging, can span decades. The key is to align the duration with the research goals.

Implementing a Longitudinal Study: Your Options

When planning a longitudinal study, you face a crucial decision: gather new data or use existing datasets.

Option 1: Utilizing Existing Data

Governments and research centers often share data from their longitudinal studies. For instance, the U.S. National Longitudinal Surveys (NLS) program includes a cohort of thousands of Americans who have been tracked since 1979, offering a wealth of data accessible through the Bureau of Labor Statistics.

This type of data is usually reliable, offering insights over extended periods. However, it’s less flexible than the data that the researchers can collect themselves. Often, details are aggregated to protect privacy, limiting analysis to broader regions. Additionally, the original study’s variables restrict you, and you can’t tailor data collection to meet your study’s needs.

If you opt for existing data, scrutinize the dataset’s origin and the available information.

Option 2: Collecting Data Yourself

If you decide to gather your own data, your approach depends on the study type: retrospective or prospective.

A retrospective longitudinal study focuses on past events. This type is generally quicker and less costly but more prone to errors.

The prospective form of this study tracks a subject group over time, collecting data as events unfold. This approach allows the researchers to choose the variables they’ll measure and how they’ll measure them. Usually, these studies produce the best data but are more expensive.

While retrospective studies save time and money, prospective studies, though more resource-intensive, offer greater accuracy.

Advantages of a Longitudinal Study

Longitudinal studies can provide insight into developmental phases and long-term changes, which cross-sectional studies might miss.

These studies can help you determine the sequence of events. By taking multiple observations of the same individuals over time, you can attribute changes to other variables rather than to differences between subjects. This benefit of having the subjects serve as their own controls applies to all within-subjects studies, also known as repeated measures designs.

Consider a longitudinal study examining the influence of a consistent reading program on children’s literacy development. In a longitudinal framework, factors like innate linguistic ability, which typically don’t fluctuate significantly, are inherently accounted for by using the same group of students over time. This approach allows for a more precise assessment of the reading program’s direct impact over the study’s duration.
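
The reading program example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the children, their baseline abilities, and the gain per wave are all hypothetical numbers, and real scores would include noise. It shows that when each child is compared with their own earlier score, the stable baseline cancels out of the difference.

```python
from statistics import pvariance

# Hypothetical data: each child's stable baseline ability (assumed for illustration).
BASELINE_ABILITY = {"child_a": 40, "child_b": 55, "child_c": 70, "child_d": 85}
GAIN_PER_WAVE = 5  # assumed program effect per measurement wave


def reading_score(child, wave):
    """Score = stable baseline ability + accumulated program gain."""
    return BASELINE_ABILITY[child] + GAIN_PER_WAVE * wave


def within_child_change(first_wave, last_wave):
    """Change for each child between two waves; the baseline cancels out."""
    return {
        child: reading_score(child, last_wave) - reading_score(child, first_wave)
        for child in BASELINE_ABILITY
    }


scores_at_start = [reading_score(child, 0) for child in BASELINE_ABILITY]
changes = within_child_change(0, 2)

print("Variance between children at wave 0:", pvariance(scores_at_start))
print("Change per child from wave 0 to wave 2:", changes)
print("Variance of the changes:", pvariance(list(changes.values())))
```

The children differ widely at the first wave, yet every within-child change is identical, because the comparison never pits one child's ability against another's.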

Collectively, these benefits make longitudinal studies well suited to revealing how variables change over time and to identifying potential causal relationships.

Disadvantages of a Longitudinal Study

A longitudinal study can be time-consuming and expensive, given its extended duration.

For example, a 30-year study on the aging process may require substantial funding for decades and a long-term commitment from researchers and staff.

Over time, participants may selectively drop out, potentially skewing results and reducing the study’s effectiveness.

For instance, in a study examining the long-term effects of a new fitness regimen, more physically fit participants might be less likely to drop out than those finding the regimen challenging. This scenario potentially skews the results to exaggerate the program’s effectiveness.
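
A small simulation can make this kind of bias concrete. The sketch below is hypothetical throughout: the sample size, the distribution of fitness gains, and the dropout probabilities are assumptions chosen for illustration, not values from any real study. Participants who gain less than the average are assumed to drop out more often, and the average gain among those who finish is compared with the average across everyone who started.

```python
import random
from statistics import mean


def simulate_attrition(n_participants=5000, seed=0):
    """Simulate fitness gains where low gainers are more likely to drop out.

    All numbers here are illustrative assumptions.
    """
    rng = random.Random(seed)
    all_gains = []
    completer_gains = []
    for _ in range(n_participants):
        gain = rng.gauss(2.0, 3.0)  # assumed true gain: mean 2, standard deviation 3
        # Assumed dropout rule: struggling participants leave more often.
        dropout_probability = 0.6 if gain < 2.0 else 0.1
        all_gains.append(gain)
        if rng.random() >= dropout_probability:
            completer_gains.append(gain)
    return mean(all_gains), mean(completer_gains), len(completer_gains)


full_mean, completer_mean, n_completers = simulate_attrition()
print(f"Mean gain, everyone enrolled: {full_mean:.2f}")
print(f"Mean gain, completers only:   {completer_mean:.2f}")
print(f"Completers: {n_completers} of 5000")
```

Under these assumptions, an analysis restricted to completers reports a larger average gain than the full cohort actually experienced, which is the exaggeration described above.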

Maintaining consistent data collection methods and standards over a long period can be challenging.

For example, a longitudinal study that began using face-to-face interviews might face consistency issues if it later shifts to online surveys, potentially affecting the quality and comparability of the responses.

In conclusion, longitudinal studies are powerful tools for understanding changes over time. While they come with challenges, their ability to uncover trends and causal relationships makes them invaluable in many fields. As with any research method, understanding their strengths and limitations is critical to effectively utilizing their potential.

What Is a Longitudinal Study?

Tracking Variables Over Time

The Typical Longitudinal Study

A longitudinal study follows what happens to selected variables over an extended time. Psychologists use the longitudinal study design to explore possible relationships among variables in the same group of individuals over an extended period.

Once researchers have determined the study's scope, participants, and procedures, most longitudinal studies begin with baseline data collection. In the days, months, years, or even decades that follow, they continually gather more information so they can observe how variables change over time relative to the baseline.

For example, imagine that researchers are interested in the mental health benefits of exercise in middle age and how exercise affects cognitive health as people age. The researchers hypothesize that people who are more physically fit in their 40s and 50s will be less likely to experience cognitive declines in their 70s and 80s.

To test this hypothesis, the researchers recruit participants who are in their mid-40s to early 50s. They collect data related to current physical fitness, exercise habits, and performance on cognitive function tests. The researchers continue to track activity levels and test results for a certain number of years, look for trends in and relationships among the studied variables, and test the data against their hypothesis to form a conclusion.

Longitudinal vs. Cross-Sectional Studies

Longitudinal studies, a type of correlational research, are usually observational. They differ from cross-sectional research in timing: longitudinal research involves collecting data over an extended time, whereas cross-sectional research involves collecting data at a single point.

Examples of Early Longitudinal Study Design

Examples of longitudinal studies extend back to the 17th century, when King Louis XIV periodically gathered information from his Canadian subjects, including their ages, marital statuses, occupations, and assets such as livestock and land. He used the data to spot trends over the years and understand his colonies' health and economic viability.

In the 18th century, Count Philibert Gueneau de Montbeillard conducted the first recorded longitudinal study when he measured his son every six months and published the information in "Histoire Naturelle."

The Genetic Studies of Genius (also known as the Terman Study of the Gifted), which began in 1921, is one of the first studies to follow participants from childhood into adulthood. Psychologist Lewis Terman's goal was to examine the similarities among gifted children and disprove the common assumption at the time that gifted children were "socially inept."

Types of Longitudinal Studies

Longitudinal studies fall into three main categories.

  • Panel study: Sampling of a cross-section of individuals
  • Cohort study: Sampling of a group based on a specific event, such as birth, geographic location, or experience
  • Retrospective study: Review of historical information such as medical records

Benefits of Longitudinal Research

Longitudinal studies can provide valuable insight that other studies can't. They're particularly useful when studying developmental and lifespan issues because they allow glimpses into changes and possible reasons for them.

For example, some longitudinal studies have explored differences and similarities among identical twins, some reared together and some apart. In these types of studies, researchers tracked participants from childhood into adulthood to see how environment influences personality, achievement, and other areas.

Because the participants share the same genetics, researchers chalked up any differences to environmental factors. Researchers can then look at what the participants have in common and where they differ to see which characteristics are more strongly influenced by either genetics or experience. Note that adoption agencies no longer separate twins, so such studies are unlikely today. Longitudinal studies on twins have shifted to those within the same household.

Potential Pitfalls

As with other types of psychology research, researchers must take into account some common challenges when considering, designing, and performing a longitudinal study.

Longitudinal studies require time and are often quite expensive. Because of this, these studies often have only a small group of subjects, which makes it difficult to apply the results to a larger population.

Selective Attrition

Participants sometimes drop out of a study for any number of reasons, like moving away from the area, illness, or simply losing motivation. This tendency, known as attrition, shrinks the sample size and decreases the amount of data collected. It is called selective attrition when the participants who leave differ systematically from those who stay.

If the final group no longer reflects the original representative sample, attrition can threaten the validity of the study. Validity refers to whether or not a test or study accurately measures what it claims to measure. If the final group of participants doesn't represent the larger group accurately, generalizing the study's conclusions is difficult.

The World’s Longest-Running Longitudinal Study

Lewis Terman aimed to investigate how highly intelligent children develop into adulthood with his "Genetic Studies of Genius." Results from this study were still being compiled into the 2000s. Terman was a proponent of eugenics and has been accused of letting his own sexism, racism, and economic prejudice influence his study and of drawing major conclusions from weak evidence. Nevertheless, Terman's study remains influential in longitudinal research. For example, a recent study found new information on the original Terman sample, which indicated that men who skipped a grade as children went on to have higher incomes than those who didn't.

A Word From Verywell

Longitudinal studies can provide a wealth of valuable information that would be difficult to gather any other way. Despite the typical expense and time involved, longitudinal studies from the past continue to influence and inspire researchers and students today.

A longitudinal study follows up with the same sample (i.e., group of people) over time, whereas a cross-sectional study examines one sample at a single point in time, like a snapshot.

A longitudinal study can occur over any length of time, from a few weeks to a few decades or even longer.

The number of participants depends on what researchers are investigating. A researcher can collect data on just one participant or on thousands over time. The larger the sample size, of course, the more likely the study is to yield results that can be extrapolated.

By Kendra Cherry, MSEd

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

What (Exactly) Is A Longitudinal Study?

A plain-language explanation & definition (with examples).

By: Derek Jansen (MBA) | June 2020

If you’re new to the world of research, or it’s your first time writing a dissertation or thesis, you’re probably feeling a bit overwhelmed by all the technical lingo that’s hitting you. If you’ve landed here, chances are one of these terms is “longitudinal study”, “longitudinal survey” or “longitudinal research”.

Worry not – in this post, we’ll explain exactly:

  • What a longitudinal study is (and what the alternative is)
  • What the main advantages of a longitudinal study are
  • What the main disadvantages of a longitudinal study are
  • Whether to use a longitudinal or cross-sectional study for your research

What is a longitudinal study?

A longitudinal study or a longitudinal survey (both of which make up longitudinal research) is a study where the same data are collected more than once, at different points in time. The purpose of a longitudinal study is to assess not just what the data reveal at a fixed point in time, but to understand how (and why) things change over time.

Example: Longitudinal vs Cross-Sectional

Here are two examples – one of a longitudinal study and one of a cross-sectional study – to give you an idea of what these two approaches look like in the real world:

Longitudinal study: a study which assesses how a group of 13-year-old children’s attitudes and perspectives towards income inequality evolve over a period of 5 years, with the same group of children surveyed each year, from 2020 (when they are all 13) until 2025 (when they are all 18).

Cross-sectional study: a study which assesses a group of teenagers’ attitudes and perspectives towards income inequality at a single point in time. The teenagers are aged 13-18 years and the survey is undertaken in January 2020.

From this example, you can probably see that the topic of both studies is still broadly the same (teenagers’ views on income inequality), but the data produced could potentially be very different. This is because the longitudinal group’s views will be shaped by the events of the next five years, whereas the cross-sectional group all have a “2020 perspective”.

Additionally, in the cross-sectional group, the age groups (i.e. 13, 14, 15, 16, 17 and 18) are made up of different people (obviously!) with different life experiences – whereas, in the longitudinal group, the data at each age point are generated by the same group of people (for example, John Doe will complete a survey at age 13, 14, 15, and so on).
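
To see what this difference means for the data itself, here is a minimal sketch in Python. The records, respondent names, and attitude scores are all made up for illustration. Each record is a (person, age, score) tuple; in the longitudinal data the same people reappear at each age, so a change per person can be computed, while in the cross-sectional data every person appears only once.

```python
from collections import defaultdict

# Hypothetical survey records: (person_id, age, attitude_score).
longitudinal = [
    ("john", 13, 3), ("john", 14, 4), ("john", 15, 6),
    ("jane", 13, 5), ("jane", 14, 5), ("jane", 15, 4),
]
cross_sectional = [
    ("p1", 13, 3), ("p2", 14, 4), ("p3", 15, 6),
]


def trajectories(records):
    """Group observations by person and sort each person's observations by age."""
    by_person = defaultdict(list)
    for person_id, age, score in records:
        by_person[person_id].append((age, score))
    return {person_id: sorted(obs) for person_id, obs in by_person.items()}


def within_person_change(records):
    """Last score minus first score, for every person observed more than once."""
    changes = {}
    for person_id, obs in trajectories(records).items():
        if len(obs) > 1:
            changes[person_id] = obs[-1][1] - obs[0][1]
    return changes


print(within_person_change(longitudinal))
print(within_person_change(cross_sectional))
```

The cross-sectional records return an empty result because nobody is observed twice; any difference between the 13-year-olds and the 15-year-olds there is a difference between people, not a change within a person.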

There are, of course, many other factors at play here and many other ways in which these two approaches differ – but we won’t go down that rabbit hole in this post.

There are many differences between longitudinal and cross-sectional studies

What are the advantages of a longitudinal study?

Longitudinal studies and longitudinal surveys offer some major benefits over cross-sectional studies. Some of the main advantages are:

Patterns – because longitudinal studies involve collecting data at multiple points in time from the same respondents, they allow you to identify emergent patterns across time that you’d never see if you used a cross-sectional approach.

Order – longitudinal studies reveal the order in which things happened, which helps a lot when you’re trying to understand causation. For example, if you’re trying to understand whether X causes Y or Y causes X, it’s essential to understand which one comes first (which a cross-sectional study cannot tell you).

Bias – because longitudinal studies capture current data at multiple points in time, they are at lower risk of recall bias. In other words, there’s a lower chance that people will forget an event, or forget certain details about it, as they are only being asked to discuss current matters.

What are the disadvantages of a longitudinal study?

As you’ve seen, longitudinal studies have some major strengths over cross-sectional studies. So why don’t we just use longitudinal studies for everything? Well, there are (naturally) some disadvantages to longitudinal studies as well.

Cost – compared to cross-sectional studies, longitudinal studies are typically substantially more expensive to execute, as they require maintained effort over a long period of time.

Slow – given the nature of a longitudinal study, it takes a lot longer to pull off than a cross-sectional study. This can be months, years or even decades. This makes them impractical for many types of research, especially dissertations and theses at Honours and Masters levels (where students have a predetermined timeline for their research).

Drop out – because longitudinal studies often take place over many years, there is a very real risk that respondents drop out over the length of the study. This can happen for any number of reasons (for example, people relocating, starting a family, a new job, etc) and can have a very detrimental effect on the study.

Some disadvantages to longitudinal studies include higher cost, longer execution time and higher dropout rates.

Which one should you use?

Choosing whether to use a longitudinal or cross-sectional study for your dissertation, thesis or research project requires a few considerations. Ultimately, your decision needs to be informed by your overall research aims, objectives and research questions (in other words, the nature of the research determines which approach you should use). But you also need to consider the practicalities. You should ask yourself the following:

  • Do you really need a view of how data changes over time, or is a snapshot sufficient?
  • Is your university flexible in terms of the timeline for your research?
  • Do you have the budget and resources to undertake multiple surveys over time?
  • Are you certain you’ll be able to secure respondents over a long period of time?

If your answer to any of these is no, you need to think carefully about the viability of a longitudinal study in your situation. Depending on your research objectives, a cross-sectional design might do the trick. If you’re unsure, speak to your research supervisor.

BMC Med Res Methodol

Qualitative longitudinal research in health research: a method study

Åsa Audulv

1 Department of Nursing, Umeå University, Umeå, Sweden

Elisabeth O. C. Hall

2 Faculty of Health, Aarhus University, Aarhus, Denmark

3 Faculty of Health Sciences, University of Faroe Islands, Thorshavn, Faroe Islands, Denmark

Åsa Kneck

4 Department of Health Care Sciences, Ersta Sköndal Bräcke University College, Stockholm, Sweden

Thomas Westergren

5 Department of Health and Nursing Science, University of Agder, Kristiansand, Norway

6 Department of Public Health, University of Stavanger, Stavanger, Norway

Mona Kyndi Pedersen

7 Center for Clinical Research, North Denmark Regional Hospital, Hjørring, Denmark

8 Department of Clinical Medicine, Aalborg University, Aalborg, Denmark

Hanne Aagaard

9 Lovisenberg Diaconal University College, Oslo, Norway

Kristianna Lund Dam

Mette Spliid Ludvigsen

10 Department of Clinical Medicine-Randers Regional Hospital, Aarhus University, Aarhus, Denmark

11 Faculty of Nursing and Health Sciences, Nord University, Bodø, Norway

Associated Data

The datasets used and analyzed in this current study are available in supplementary file 6.

Background

Qualitative longitudinal research (QLR) comprises qualitative studies, with repeated data collection, that focus on the temporality (e.g., time and change) of a phenomenon. The use of QLR is increasing in health research since many topics within health involve change (e.g., progressive illness, rehabilitation). A method study can provide an insightful understanding of the use, trends and variations within this approach. The aim of this study was to map how QLR articles within the existing health research literature are designed to capture aspects of time and/or change.

Methods

This method study used an adapted scoping review design. Articles were eligible if they were written in English, published between 2017 and 2019, and reported results from qualitative data collected at different time points/time waves with the same sample or in the same setting. Articles were identified using EBSCOhost. Two independent reviewers performed the screening, selection and charting.

Results

A total of 299 articles were included. There was great variation among the articles in the use of methodological traditions, type of data, length of data collection, and components of longitudinal data collection. However, the majority of articles represented large studies and were based on individual interview data. Approximately half of the articles self-identified as QLR studies or as following a QLR design, although slightly less than 20% of them included QLR method literature in their method sections.

Conclusions

QLR is often used in large complex studies. Some articles were thoroughly designed to capture time/change throughout the methodology, aim and data collection, while other articles included few elements of QLR. Longitudinal data collection includes several components, such as what entities are followed across time, the tempo of data collection, and to what extent the data collection is preplanned or adapted across time. Therefore, there are several practices and possibilities researchers should consider before starting a QLR project.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12874-022-01732-4.

Health research is focused on areas and topics where time and change are relevant, for example, processes such as recovery or changes in health status. However, relating time and change can be complicated in research, as the representation of reality in research publications is often collected at one point in time and fixed in its presentation, although time and change are always present in human life and experiences. Qualitative longitudinal research (QLR; also called longitudinal qualitative research, LQR) has been developed to focus on subjective experiences of time or change using qualitative data materials (e.g., interviews, observations and/or text documents) collected across a time span with the same participants and/or in the same setting [1, 2]. QLR within health research may have many benefits. Firstly, human experiences are not fixed and consistent, but changing and diverse; therefore, people’s experiences in relation to a health phenomenon may be more comprehensively described by repeated interviews or observations over time. Secondly, experiences, behaviors, and social norms unfold over time. By using QLR, researchers can collect empirical data that represents not only recalled human conceptions but also serial and instant situations reflecting transitions, trajectories and changes in people’s health experiences, personal development or health care organizations [3–5].

Key features of QLR

Whether QLR is a methodological approach in its own right or a design element of a particular study within a traditional methodological approach (e.g., ethnography or grounded theory) is debated [ 1 , 6 ]. For example, Bennett et al. [ 7 ] describe QLR as untied to methodology, giving researchers the flexibility to develop a suitable design for each study. McCoy [ 6 ] suggests that epistemological and ontological standpoints from interpretative phenomenological analysis (IPA) align with QLR traditions, thus making longitudinal IPA a suitable methodology. Plano-Clark et al. [ 8 ] described how longitudinal qualitative elements can be used in mixed methods studies, thus creating longitudinal mixed methods. In contrast, several researchers have argued that QLR is an emerging methodology [ 1 , 5 , 9 , 10 ]. For example, Thomson et al. [ 9 ] have stated “What distinguishes longitudinal qualitative research is the deliberate way in which temporality is designed into the research process, making change a central focus of analytic attention” (p. 185). Tuthill et al. [ 5 ] concluded that some of the confusion might have arisen from the diversity of data collection methods and data materials used within QLR research. However, there are no investigations showing to what extent QLR studies use QLR as a distinct methodology versus using a longitudinal data collection as a more flexible design element in combination with other qualitative methodologies.

QLR research should focus on aspects of temporality, time and/or change [ 11 – 13 ]. The concepts of time and change are seen as inseparable since change is happening with the passing of time [ 13 ]. However, time can be conceptualized in different ways. Time is often understood from a chronological perspective, and is viewed as fixed, objective, continuous and measurable (e.g., clock time, duration of time). However, time can also be understood from within, as the experience of the passing of time and/or the perspective from the current moment into the constructed conception of a history or future. From this perspective, time is seen as fluid, meaning that events, contexts and understandings create a subjective experience of time and change. Both the chronological and fluid understanding of time influence QLR research [ 11 ]. Furthermore, there is a distinction between over-time, which constitutes a comparison of the difference between points in time, often with a focus on the latter point or destination, and through-time, which means following an aspect across time while trying to understand the change that occurs [ 11 ]. In this article, we will mostly use the concept of across time to include both perspectives.

Some authors assert that QLR studies should include a qualitative data collection with the same sample across time [ 11 , 13 ], whereas Thomson et al. [ 9 ] also suggest the possibility of returning to the same data collection site with the same or different participants. When a QLR study involves data collection in shorter engagements, such as serial interviews, these engagements are often referred to as data collection time points. Data collection in time waves relates to longer engagements, such as field work/observation periods. There is no clear-cut definition for the minimum time span of a QLR study; instead, the length of the data collection period must be decided based upon what processes or changes are the focus of the study [ 13 ].

Most literature describing QLR methods originates from the social sciences, where the approach has a long tradition [ 1 , 10 , 14 ]. In health research, one-time-data collection studies have been the norm within qualitative methods [ 15 ], although health research using QLR methods has increased in recent years [ 2 , 5 , 16 , 17 ]. However, collecting and managing longitudinal data has its own sets of challenges, especially regarding how to integrate perspectives of time and/or change in the data collection and subsequent analysis [ 1 ]. Therefore, a study of QLR articles from the health research literature can provide an insightful understanding of the use, trends and variations of how methods are used and how elements of time/change are integrated in QLR studies. This could, in turn, provide inspiration for using different possibilities of collecting data across time when using QLR in health research. The aim of this study was to map how QLR articles within the existing health research literature are designed to capture aspects of time and/or change.

More specifically, the research questions were:

  • What methodological approaches are described to inform QLR research?
  • What methodological references are used to inform QLR research?
  • How are longitudinal perspectives articulated in article aims?
  • How is longitudinal data collection conducted?

In this method study, we used an adapted scoping review method [ 18 – 20 ]. Method studies are research conducted on research studies to investigate how research design elements are applied across a field [ 21 ]. However, since there are no clear guidelines for method studies, they often use adapted versions of systematic reviews or scoping review methods [ 21 ]. The adaptations of the scoping review method consisted of 1) using a large subsample of studies (publications from a three-year period) instead of including all QLR articles published, and 2) not including grey literature. The reporting of this study was guided by the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist [ 20 , 22 ] (see Additional file 1 ). An unpublished protocol was developed by the research team during the spring of 2019.

Eligibility criteria

In line with method study recommendations [ 21 ], we decided to draw on a manageable subsample of published QLR research. Articles that were eligible for inclusion were health research primary studies written in English, published between 2017 and 2019, and with a longitudinal qualitative data collection. Our operating definition for qualitative longitudinal data collection was data collected at different time points (e.g., repeated interviews) or time waves (e.g., periods of field work) involving the same sample or conducted in the same setting(s). We intentionally selected a broad inclusion criterion for QLR since we wanted a wide variety of articles. The time period was chosen because the first QLR method article directed towards health research was published in 2013 [ 1 ], and during the following years the methodological resources for QLR increased [ 3 , 8 , 17 , 23 – 25 ]; thus, we could expect researchers publishing QLR in 2017–2019 to be well-grounded in QLR methods. Further, we found that from 2012 to 2019 the rate of published QLR articles was steady at around 100 publications per year, so including those from a three-year period would give a sufficient number of articles (~ 300 articles) for providing an overview of the field. Published conference abstracts, protocols, articles describing methodological issues, review articles, and non-research articles (e.g., editorials) were excluded.
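The eligibility criteria above can be read as a simple screening rule. The following is a minimal sketch only, not the authors' screening procedure (screening was done manually by reviewers); the `Record` fields, the type labels in `EXCLUDED_TYPES`, and the function name are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical labels for the excluded publication types listed in the text
EXCLUDED_TYPES = {
    "conference abstract",
    "protocol",
    "methodological article",
    "review",
    "non-research article",
}


@dataclass
class Record:
    """Hypothetical screening record; field names are assumptions."""
    year: int
    language: str
    article_type: str
    health_research: bool
    # True if qualitative data were collected at several time points or
    # time waves with the same sample or in the same setting(s)
    longitudinal_qualitative: bool


def is_eligible(record: Record) -> bool:
    """Apply the stated inclusion and exclusion criteria to one record."""
    return (
        record.health_research
        and record.language == "English"
        and 2017 <= record.year <= 2019
        and record.longitudinal_qualitative
        and record.article_type not in EXCLUDED_TYPES
    )


example = Record(2018, "English", "primary study", True, True)
print(is_eligible(example))
```

In practice, judging whether a record is "health research" or has a longitudinal qualitative data collection requires reading the abstract or full text, so such a rule could only support, not replace, reviewer decisions.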

Search strategy

Relevant articles were identified through systematic searches in EBSCOhost, including biomedical and life science research and nursing and allied health literature. A librarian who specialized in systematic review searches developed and performed the searches, in collaboration with the author team (LF, TW & ÅA). In the search, the term “longitudinal” was combined with terms for qualitative research (for the search strategy see Additional file 2 ). The searches were conducted in the autumn of 2019 (last search 2019-09-10).

Study selection

All identified citations were imported into EndNote X9 ( www.endnote.com ) and further imported into Rayyan QCRI online software [ 26 ], and duplicates were removed. All titles and abstracts were screened against the eligibility criteria by two independent reviewers (ÅA & EH), and conflicting decisions were discussed until resolved. After discussions by the team, we decided to include articles published between 2017 and 2019; that selection alone included 350 records with diverse methods and designs. The full texts of articles that were eligible for inclusion were retrieved. In the next stage, two independent reviewers reviewed each full text article to make final decisions regarding inclusion (ÅA, EH, Julia Andersson). In total, disagreements occurred in 8% of the decisions, and were resolved through discussion. Critical appraisal was not assessed since the study aimed to describe the range of how QLR is applied and not aggregate research findings [ 21 , 22 ].

Data charting and analysis

A standardized charting form was developed in Excel (Excel 2016). The charting form was reviewed by the research team and pretested in two stages. The tests were performed to increase internal consistency and reduce the risk of bias. First, four articles were reviewed by all the reviewers, and modifications were made to the form and charting instructions. In the next stage, all reviewers used the charting form on four other articles, and the convergence in ratings was 88%. Since the convergence was under 90%, charting was performed in duplicate to reduce errors in the data. At the end of the charting process, the convergence among the reviewers was 95%. The charting was examined by the first author, who revised the charting in cases of differences.
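The article does not state how convergence between reviewers was calculated. As a minimal sketch, assuming simple percent agreement between two reviewers on the same charted items, the 90% rule described above could be expressed as follows; the function names and the example ratings are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Share of charted items (in percent) on which two reviewers agree."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both reviewers must rate the same, non-empty set of items")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


def needs_duplicate_charting(agreement, threshold=90.0):
    """Chart in duplicate when agreement falls below the threshold."""
    return agreement < threshold


# Hypothetical pilot: 25 charted items, reviewers disagree on 3 of them
reviewer_1 = ["yes"] * 25
reviewer_2 = ["yes"] * 22 + ["no"] * 3

agreement = percent_agreement(reviewer_1, reviewer_2)
print(agreement, needs_duplicate_charting(agreement))
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic would be a different design choice from the one sketched here.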

Data items that were charted included 1) the article characteristics (e.g., authors, publication year, journal, country), 2) the aim and scope (e.g., phenomenon of interest, population, contexts), 3) the stated methodology and analysis method, 4) text describing the data collection (e.g., type of data material, number of participants, time frame of data collection, total amount of data material), and 5) the qualitative methodological references used in the methods section. Extracted text describing data collection could consist of a few sentences or several sections from the articles (and sometimes figures) concerning data collection practices, rationale for time periods and research engagement in the field. This text was later used to analyze how the longitudinal data collection was conducted and which elements of longitudinal design were present. To categorize the qualitative methodology approaches, a framework from Creswell [ 27 ] was used (including the categories for grounded theory, phenomenology, ethnography, case study and narrative research). Overall, data items needed to be explicitly stated in the articles in order to be charted. For example, an article was categorized as grounded theory if it explicitly stated “in this grounded theory study” but not if it referred to the literature by Glaser and Strauss without situating itself as a grounded theory study (see Additional file 3 for the full instructions for charting).

All charting forms were compiled into a single Microsoft Excel spreadsheet (see Supplementary files for an overview of the articles). Descriptive statistics with frequencies and percentages were calculated to summarize the data. Furthermore, an iterative coding process was used to group the articles and investigate patterns of, for example, research topics, words in the aims, or data collection practices. Alternative ways of grouping and presenting the data were discussed by the research team.
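As an illustration of the descriptive statistics mentioned above, the following minimal sketch computes frequencies and percentages for one charted variable. The helper name is hypothetical, and the input list is rebuilt from the tempo counts reported in Table 4 rather than taken from the authors' spreadsheet; the denominator of 299 is the number of included articles.

```python
from collections import Counter


def frequency_table(values, total=None):
    """Return (label, count, percent) rows, most frequent first."""
    counts = Counter(values)
    denominator = total if total is not None else len(values)
    if denominator <= 0:
        raise ValueError("total must be positive")
    return [
        (label, n, round(100.0 * n / denominator, 1))
        for label, n in counts.most_common()
    ]


# Hypothetical charted column, rebuilt from counts reported in the article
tempo = (
    ["serial time points"] * 154
    + ["baseline and follow up"] * 70
    + ["time waves"] * 50
    + ["continuous"] * 23
)

rows = frequency_table(tempo, total=299)
for label, n, pct in rows:
    print(f"{label}: {n} ({pct})")
```

Passing the total explicitly keeps the percentages tied to all included articles, even when a category is missing for some of them.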

Search and selection

A total of 2179 titles and abstracts were screened against the eligibility criteria (see Fig.  1 ). The full text of one article could not be found and the article was excluded [ 28 ]. Fifty full text articles were excluded. Finally, 299 articles, representing 271 individual studies, were included in this study (see additional files 4 and 5 respectively for tables of excluded and included articles).
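The reported numbers can be checked with simple arithmetic. This sketch assumes that the 350 records mentioned under Study selection are the ones assessed in full text; the variable names are introduced here for illustration only.

```python
# Assumption: the 350 records from 2017-2019 were the ones assessed in full text
records_for_full_text = 350
full_text_not_found = 1
full_text_excluded = 50

included_articles = records_for_full_text - full_text_not_found - full_text_excluded
print(included_articles)
```

Under that assumption, the result matches the 299 included articles reported above.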

Fig. 1 PRISMA diagram of study selection

General characteristics and research areas of the included articles

The articles were published in many journals (n = 193), and 138 of these journals were represented with one article each. BMJ Open was the most prevalent journal (n = 11), followed by the Journal of Clinical Nursing (n = 8). Similarly, the articles represented many countries (n = 41) and all the continents; however, a large part of the studies originated from the US or UK (n = 71, 23.7% and n = 70, 23.4%, respectively). The articles focused on the following types of populations: patients, families/caregivers, health care providers, students, community members, or policy makers. Approximately 20% (n = 63, 21.1%) of the articles collected data from two or more of these types of populations (see Table 1 ).

Table 1 Characteristics of the included QLR articles

Characteristic | n (%)
Continent
Europe | 141 (47.2)
North America | 85 (28.4)
Oceania | 33 (11.0)
Africa | 23 (7.7)
Asia | 10 (3.3)
South America | 3 (1.0)
Several continents | 3 (1.0)
Population (articles could include several types of populations)
Patients (individuals with a health condition) | 122 (40.8)
Family members/caregivers | 72 (24.1)
Community members (citizens, people in low income areas, volunteers) | 63 (21.1)
Health care providers | 61 (20.4)
Students or pupils (mostly health care education) | 26 (8.7)
Policy makers | 14 (4.7)
Managers | 15 (5.0)
Teachers | 7 (2.3)
US national news organizations | 1 (0.3)
Research area
Disease experience/beliefs | 52 (17.4)
Health care navigation and/or health care-patient relationships | 48 (16.1)
Experiences with health care trials/interventions or treatment | 43 (14.4)
Implementation of health care practices/routines | 32 (10.7)
Life transitions and development (pregnancy, breastfeeding, parenthood, adolescence, aging) | 23 (7.7)
Societal adversities (violence, housing, drug addiction, criminality) | 22 (7.4)
Health care providers’ professional development | 20 (6.7)
Education | 18 (6.0)
Family caregiving | 14 (4.7)
Health behaviors and sports (e.g., physical activity, smoking cessation, talent development) | 11 (3.7)
Policy development and social reform | 5 (1.7)
Experience of technology (assistive technology, aids and adaptations) | 4 (1.3)
Disaster experiences (flooding, earthquakes) | 3 (1.0)
Context (from which participants were recruited; articles could have several contexts)
Specialist care/Hospital | 84 (28.1)
Emergency/intensive/neonatal care | 15 (5.0)
Primary care | 12 (4.0)
Residential homes/nursing homes | 7 (2.3)
(row label missing in source) | 46 (15.8)
(row label missing in source) | 32 (10.7)
(row label missing in source) | 27 (9.0)
Rural | 11 (3.7)
Urban | 16 (5.4)
Socially vulnerable area | 25 (8.63)
Diversity of contexts (e.g., rural and urban area) | 14 (4.7)

Approximately half of the articles ( n  = 158, 52.8%) articulated being part of a larger research project. Of them, 95 described a project with both quantitative and qualitative methods. They represented either 1) a qualitative study embedded in an intervention, evaluation or implementation study ( n  = 66, 22.1%), 2) a longitudinal cohort study collecting both quantitative and qualitative material ( n  = 23, 7.7%), or 3) qualitative longitudinal material collected together with a cross sectional survey (n = 6, 2.0%). Forty-eight articles (16.1%) described belonging to a larger qualitative project presented in several research articles.

Methodological traditions

Approximately one-third (n = 109, 36.5%) of the included articles self-identified with one of the qualitative traditions recognized by Creswell [ 27 ] (case study: n = 36, 12.0%; phenomenology: n = 35, 11.7%; grounded theory: n = 22, 7.4%; ethnography: n = 13, 4.3%; narrative method: n = 3, 1.0%). In nine articles, the authors described using a mix of two or more of these qualitative traditions. In addition, 19 articles (6.4%) self-identified as mixed methods research.

Every second article self-identified as having a qualitative longitudinal design ( n  = 156, 52.2%); either they self-identified as “a longitudinal qualitative study” or “using a longitudinal qualitative research design”. However, in some articles, this was stated in the title and/or abstract and nowhere else in the article. Fifty-two articles (17.4%) self-identified both as having a QLR design and following one of the methodological approaches (case study: n  = 8; phenomenology: n  = 23; grounded theory: n  = 9; ethnography: n  = 6; narrative method: n  = 2; mixed methods: n  = 4).

The other 143 articles used various terms to situate themselves in relation to a longitudinal design. Twenty-seven articles (9.0%) described themselves as a longitudinal study, and 64 (21.4%) as a longitudinal study within a specific qualitative tradition (e.g., a longitudinal grounded theory study or a longitudinal mixed method study). Furthermore, 36 articles (12.0%) referred to using longitudinal data materials (e.g., longitudinal data or longitudinal interviews). Of the remaining articles, nine (3.0%) used the term longitudinal in relation to the data analysis or aim (e.g., the aim was to longitudinally describe), two (0.7%) used terms such as serial or repeated in relation to the data collection design, and five (1.7%) did not use any term to address the longitudinal nature of their design.

Use of methodological references

The mean number of qualitative method references in the methods sections was 3.7 (range 0 to 16), and 20 articles did not have any qualitative method reference in their methods sections. 1 Commonly used method references were generic books on qualitative methods, seminal works within qualitative traditions, and references specializing in qualitative analysis methods (see Table  2 ). It should be noted that some references were comprehensive books and thus could include sections about QLR without being focused on the QLR method. For example, Miles et al. [ 31 ] is all about analysis and coding and includes a chapter regarding analyzing change.

Table 2 Most frequently used method references (8 most used) and QLR method references (5 most used). Citations in Google Scholar were used as an indication of how widely used the references are; searches conducted in Google Scholar 2022-01-02

Reference | N (%) | Description
Braun & Clarke [ ] | 43 (14.4) | Early, widespread description of thematic analysis. 117,046 citations in Google Scholar.
Patton [ ] | 29 (9.7) | Early, comprehensive book about conducting research using qualitative methods. References included 2nd, 3rd and 4th editions, published between 1990 and 2015. 111,407 citations in Google Scholar.
Miles, Huberman & Saldaña [ ] | 22 (7.4) | Comprehensive book about analysis and coding. This edition was coauthored with Saldaña, who has previously written about QLR. 420 citations in Google Scholar. The book is a developed version of a first edition published in 1994 [ ] (144,063 citations in Google Scholar). This latter edition was used by 14 articles in the sample.
Smith, Flowers & Larkin [ ] | 20 (6.7) | Comprehensive book on Interpretative Phenomenological Analysis. 605 citations in Google Scholar.
Hsieh & Shannon [ ] | 19 (6.4) | Widespread early overview of content analysis. 36,554 citations in Google Scholar.
Glaser & Strauss [ ] | 17 (5.7) | First book describing grounded theory. 150,386 citations in Google Scholar.
Tong et al. [ ] | 16 (5.4) | First guidelines on the reporting of qualitative articles within health research. 14,302 citations in Google Scholar.
Calman, Brunton & Molassiotis [ ] | 15 (5.0) | One of the first articles describing the QLR method from a health research perspective. 211 citations in Google Scholar.
Saldaña [ ] | 15 (5.0) | Methodological book with influence on the further development of QLR, mainly drawing on ethnographical traditions and examples from theatre education. 880 citations in Google Scholar.
Murray [ ] | 11 (3.7) | Article giving practical advice on the use of serial interviewing. 301 citations in Google Scholar.
Grossoehme & Lipstein [ ] | 7 (2.3) | Article about QLR analysis, giving examples and advice regarding two different analysis approaches. 147 citations in Google Scholar.
Thomson & Holland [ ] | 5 (1.7) | One article of several that originated from an early report on how QLR was used in the UK. This article outlines several challenges and solutions when working with QLR. 424 citations in Google Scholar.

Only approximately 20% ( n  = 58) of the articles referred to the QLR method literature in their methods sections. 2 The mean number of QLR method references (counted for articles using such sources) was 1.7 (range 1 to 6). Most articles using the QLR method literature also used other qualitative methods literature (except two articles using one QLR literature reference each [ 39 , 40 ]). In total, 37 QLR method references were used, and 24 of the QLR method references were only referred to by one article each.

Longitudinal perspectives in article aims

In total, 231 (77.3%) articles had one or several terms related to time or change in their aims, whereas 68 articles (22.7%) had none. Over one hundred different words related to time or change were identified. Longitudinally oriented terms could focus on changes across time (process, trajectory, transition, pathway or journey), patterns of how something changed (maintenance, continuity, stability, shifts), or phenomena that by nature included change (learning or implementation). Other types of terms emphasized the data collection time period (e.g., over 6 months) or a specific changing situation (e.g., during pregnancy, through the intervention period, or moving into a nursing home). The most common terms used for the longitudinal perspective were change ( n  = 63), over time ( n  = 52), process ( n  = 36), transition ( n  = 24), implementation ( n  = 14), development ( n  = 13), and longitudinal (n = 13). 3

Furthermore, the articles varied in what ways their aims focused on time/change, e.g., the longitudinal perspectives in the aims (see Table  3 ). In 71 articles, the change across time was the phenomenon of interest of the article : for example, articles investigating the process of learning or trajectories of diseases. In contrast, 46 articles investigated change or factors impacting change in relation to a defined outcome : for example, articles investigating factors influencing participants continuing in a physical activity trial. The longitudinal perspective could also be embedded in an article’s context . In such cases, the focus of the article was on experiences that happened during a certain time frame or in a time-related context (e.g., described experiences of the patient-provider relationship during 6 months of rehabilitation).

Table 3 Different longitudinal perspectives in the articles’ aims and objectives

How time or change is articulated in the aim | Description | Example | Number of articles
Time/change as the phenomenon of interest | Focus is on how change occurs. Articles aimed to investigate phenomena such as process, trajectories or change. | Coombs, Parker and de Vries [ ] aimed “to describe how decision-making influences transitions in care when approaching the end of life.” (p. 618) Thus, the focus in the aim was how decision-making influences transitions. | n = 71, 23.7%
Time/change related to the outcome of the study | Focus is on the factors, reasons or explanations of why participants reach different outcomes. Articles aimed to investigate mechanisms or factors related to an outcome, often in relation to a trial or intervention. | Vaghefi et al. [ ] aimed to focus on “the continued use of mHealth apps and the factors underlying this behavior”. (p. 2) In this aim, the emphasis was on whether the participants maintained their use of mHealth apps and possible explanations for their use. | n = 46, 15.4%
Time/change as the context of the study | Focus is on the subjective experiences of a phenomenon that may change across time. The change is not the primary interest. Articles aimed to investigate experiences over a certain time period (such as during the first year of nursing school, through the intervention period, or over 6 months). | Andersen et al. [ ] aimed “to explore COPD patients’ and their family members’ experiences of both participation in care during hospitalization for an acute exacerbation in chronic obstructive pulmonary disease, and of the subsequent day-to-day care at home.” (p. 4879) Here the focus of the aim was on the experiences of participation, but in the context of hospitalization and subsequent homecomings. | n = 93, 31.1%
Time/change not in the aims | No terms connected to time or change in the aims. | Albrecht et al. [ ] (p. 68) aimed “to examine the experiences of younger adults diagnosed with acute leukemia who are actively receiving induction chemotherapy”. Their aim did not include any words showing that data were collected across time or that time/change were the focus. | n = 68, 22.7%
Time/change illuminated in several longitudinal perspectives | Articles combining several of the longitudinal perspectives in the aims and objectives. Articles could have one objective where time/change was the phenomenon of interest and another objective where time/change was the context. | Corepal et al. [ ] aimed “to explore the views and experiences of adolescents who participated in a gamified PA [physical activity] intervention based on Self-determination Theory (SDT), and the temporal changes of these views and experiences over the 1-year study period. Study objectives included: 1. To explore key aspects of a gamified PA intervention over a 1-year period using a qualitative longitudinal research (QLR) method. 2. To discuss key issues relating to the intervention, such as PA opportunities/barriers, the value of competition and types of rewards and so on. 3. To explore the key influences of PA and to determine who benefited from the intervention, how and why it worked for them. 4. To qualitatively chart changes in behaviours, opinions or views as a result of participating in the intervention.” (p. 2) In this example, Research question 1 uses a context approach to time/change; Research question 2 contains no description of time/change; Research question 3 uses an outcome perspective; and Research question 4 investigates changes in behavior as a phenomenon. | n = 21, 7.0%

Types of data and length of data collection

The QLR articles were often large and complex in their data collection methods. The median number of participants was 20 (range from one to 1366, the latter being an article with open-ended questions in questionnaires [ 46 ]). Most articles used individual interviews as the data material ( n  = 167, 55.9%) or a combination of data materials ( n  = 98, 32.8%) (e.g., interviews and observations, individual interviews and focus group interviews, or interviews and questionnaires). Forty-five articles (15.1%) presented quantitative and qualitative results. The median number of interviews was 46 (range three to 507), which is large in comparison to many qualitative studies. The observation materials were also comprehensive and could include several hundred hours of observations. Documents were often used as complementary material and included official documents, newspaper articles, diaries, and/or patient records.

The articles’ time spans 4 for data collection varied between a few days and over 20 years, with 60% of the articles’ time spans being 1 year or shorter ( n  = 180) (see Fig.  2 ). The variation in time spans might be explained by the different kinds of phenomena that were investigated. For example, Jensen et al. [ 47 ] investigated hospital care delivery and followed each participant, with observations lasting between four and 14 days. Smithbattle [ 48 ] described the housing trajectories of teen mothers, and collected data in seven waves over 28 years.

Fig. 2 Number of articles in relation to the time span of data collection. The time span of data collection is given in months

Three components of longitudinal data collection

In the articles, the data collection was conducted in relation to three different longitudinal data collection components (see Table  4 ).

Table 4 Components of longitudinal data collection

Component | Description | Example | Frequency n (%)
1) Entities followed across time
Individual | Data are collected from the same individuals across time in an individual mode, e.g., individual interviews, questionnaires, diaries. | Albrecht et al. [ ] investigated young adults’ experiences of chemotherapy treatment in the hospital. Seven young adults were interviewed twice, with interviews about one month apart. The young adults were also invited to keep a diary between the two interviews. | 170 (56.9)
Individual case or dyads | Data are collected from cases based upon individuals or dyads. An individual case included a primary participant (e.g., patient) and secondary participants (e.g., family, health care providers). Dyads were based on two connected individuals being equally important (e.g., parents or spouses). Data consisted of individual and/or joint interviews, observations, and/or documents, etc. | Denney-Koelsch [ ] investigated couples’ experiences of meeting health care providers when pregnant with a lethal fetal diagnosis. The couples took part in up to five interviews, both individually and jointly, during the pregnancy and after birth. | 64 (21.4)
Groups | Data are collected from one or several defined groups (e.g., classes of students or health care teams). The groups are followed across time but members of the group can change during the data collection period. Data were often collected with the group, e.g., focus group interviews and/or observations, and complemented with individual interviews, questionnaires or documents. | Pyörälä et al. [ ] followed two classes of students over a five-year period of education. Data were collected with focus groups and open-ended questions in surveys. Some students took part in several data collection rounds whereas others contributed once during the years of the data collection period. | 9 (3.0)
Settings (location/trial) | Data are collected at the same setting(s) across time. Settings can be locations (e.g., hospital wards, community centers) or trials (e.g., interventions). Articles often included several types of populations (e.g., patients, health care providers, family members). Over the data collection period, some participants contributed on several occasions, while some contributed once. Typical data collection methods included observations and/or recorded intervention sessions, combined with individual interviews, focus group interviews, questionnaires and/or documents. | Lindberg et al. [ ] investigated how new technology was learned and used at an operational unit. Data were collected over four years through observations of training sessions, observations of daily work and medical procedures, observations of meetings and seminars, individual interviews with nurses, doctors, hospital technicians, physicists and technology suppliers, and documents. Some key participants took part in several parts of the data collection period, while others took part once. Frost et al. [ ] investigated a home rehabilitation program for people with heart failure. Data consisted of interviews at two time points with the same patients and caregivers, as well as audio recordings of the intervention sessions and intervention fidelity scores. The timeline for the data collection followed the program, with the last interview 12 months after baseline. | 55 (18.4)
2) Tempo of data collection
Baseline and follow up | Data are collected at two points in time. Can be prospectively planned or followed up with previous data material. | Young et al. [ ] conducted interviews with 60 women with genetic mutations increasing the risk for breast cancer. Three years later, 12 of the women took part in a follow-up interview. The current article was built on data from both interviews with these 12 women. | 70 (23.4)
Serial time points | Data are collected at several shorter engagements. | Lewis et al. [ ] explored women’s experiences of trust in relation to their midwives during pregnancy. Semistructured interviews were conducted at three time points: in early pregnancy, late pregnancy and two months post-birth. | 154 (51.5)
Time waves | Data are collected during time periods with some time in between the data collection periods. | Mozaffar et al. [ ] explored challenges in relation to the integration of electronic prescribing systems. Semistructured interviews were complemented with observations of meetings and documents. Data were collected in two one-month periods with about two years in between the data collection periods. | 50 (16.7)
Continuous data collection | Data are collected continuously for a period of time, for example, with regular observations for several days in a row, observations of all events of a certain kind or including all documents that fulfill specific criteria. | Castro et al. [ ] investigated nurses’ work-life narratives by analyzing nurses’ blogs. The data material consisted of all blog entries by four bloggers over a one-year period, with a total of 520 entries. Jensen et al. [ , ] studied patients with Alzheimer’s disease who were receiving hospital care after a hip fracture. The three participants were observed for several day and evening shifts during their whole hospital stay. Observations for each participant ranged from 4 to 14 days. | 23 (7.7)
3) Preplanned or adapted data collection
Preplanned data collection | The data collection is planned by the research team based upon theory, previous research and project capacity. | Nash et al. [ ] investigated occupational therapy students’ changes in perspectives of frames of reference during their education. The students were interviewed on four occasions over 15 months; the interviews were scheduled at the end of each course where frames of reference were part of the curriculum. | 224 (74.9)
Theoretical or analysis driven data collection | Data collection is adapted to questions raised during analysis and theoretical ideas, often using several types of data material and/or different groups of participants or stakeholders. | Bright et al. [ ] investigated how health care providers engaged people with communication disabilities during rehabilitation. Data were collected in the form of observations and interviews with three patients and 28 providers. The patients were followed during the rehabilitation period for up to 12 weeks. In choosing what situations or events should be observed, the research team drew on insights from the ongoing data collection as well as previous research and theoretical notions of what situations would provide rich data. | 19 (6.4)
Participant-adapted data collectionData collection is partly preplanned but also adjusted to the individual trajectory of each participant or case to capture essential changes across time. Typically, some participants are followed more closely and for a longer period of time than other participants.Superdock et al. [ ] conducted a study about the influence of religion and spirituality on parental decision-making regarding children’s life-threatening conditions. The parents of 16 children were included as well as the children’s health care providers. The shortest individual case was followed for 6 days whereas the longest was followed for 531 days (median = 380 days). Interviews were held at the time of study enrollment and then on a monthly basis, but additional data collection was performed in the following situations: when a child had encountered a life-threatening event; when a child’s treatment had changed; when a child was discharged from the clinic; and, in some cases, a few weeks after a child’s death.44 (14.7)
Participant entries of dataData are independently entered by the participants. Data often consist of texts or pictures such as diary entries, think aloud methods, or answers to open-ended questions. Prompts can be sent, or participants can be encouraged to enter data in certain situations. Studies can include an entry and/or exit interview.Gordon et al., [ ] investigated experiences of the transition from trainee doctors to trained doctors. During the enrollment interview, the trainee doctors were instructed about how to provide audio diaries. Audio diaries were recorded on smartphones in order to capture thoughts and experiences in the moment. Participants received weekly reminders to provide audio diaries. In total, the audio diaries were collected over a period of 6 to 8 months and thereafter the participants took part in an exit interview.11 (6.7)

Entities followed across time

Four different types of entities were followed across time: 1) individuals, 2) individual cases or dyads, 3) groups, and 4) settings. More than half of the articles (n = 170, 56.9%) followed individuals across time, thus following the same participants through the whole data collection period. In contrast, when individual cases were followed across time, the data collection was centered on the primary participants (e.g., people with progressive neurological conditions) who were followed over time, and secondary participants (e.g., family caregivers) might provide complementary data at several time points or at only one time point. When settings were followed over time, the participating individuals were sometimes the same, and sometimes changed across the data collection period. Typical settings were hospital wards, hospitals, smaller communities or intervention trials. The type of collected data corresponded with what kind of entities were followed longitudinally. Individuals were often followed with serial interviews, whereas groups were commonly followed with focus group interviews complemented with individual interviews, observations and/or questionnaires. Overall, the lengths of data collection periods seemed to be chosen based upon expected changes in the chosen entities. For example, the articles following an intervention setting were structured around the intervention timeline, collecting data before, after and sometimes during the intervention.

Tempo of data collection

The data collection tempo differed among the articles (e.g., the frequency and mode of the data collection). Approximately half ( n  = 154, 51.5%) of the articles used serial time points, collecting data at several reoccurring but shorter sequences (e.g., through serial interviews or open-ended questions in questionnaires). When data were collected in time waves ( n  = 50, 16.7%), the periods of data collection were longer, usually including both interviews and observations; often, time waves included observations of a setting and/or interviews at the same location over several days or weeks.

When comparing the tempo with the type of entities, some patterns were detected (see Fig. 3). When individuals were followed, data were often collected at time points, mirroring the use of individual interviews and/or short observations. For research in settings, data were commonly collected in time waves (e.g., observation periods over a few weeks or months) that combined several types of data, particularly from interviews and observations. Groups were the least commonly studied entity (n = 9, 3.0%), so the numbers should be interpreted with caution, but continuous data collection was used in five of the nine studies. Continuous data collection consisted of, for example, electronic diaries [ 62 ] or minutes from committee meetings during a time period [ 63 ].

Fig. 3 Tempo of data collection in relation to entities followed over time

Preplanned or adapted data collection

A large majority (n = 224, 74.9%) of the articles used preplanned data collection (i.e., all participants were followed across time according to the same data collection plan). For example, all participants were interviewed one, six and twelve months post-diagnosis. In contrast to the preplanned data collection approach, 44 articles had a participant-adapted data collection (14.7%), and participants were followed at different frequencies and/or over various lengths of time depending on each participant’s situation. Participant-adapted data collection was more common among articles following individuals or individual cases (see Fig. 4). To adapt the data collection to the participants, the researchers created strategies to reach participants when crucial events were happening. Eleven articles used a participant entry approach to data collection (n = 11, 3.7%), and the whole or parts of the data were independently sent in by participants in the form of diaries, questionnaires, or blogs. Another approach to data collection was using theoretical or analysis-driven ideas to guide the data collection (n = 19, 6.4%). In these articles, the analysis and data collection were conducted simultaneously, and ideas arising in the analysis could be followed up, for example, by returning to some participants, recruiting participants with specific experiences, or collecting complementary types of data materials. This approach was most common in the articles following settings across time, which often included observations and interviews with different types of populations. Articles using theoretical or analysis driven data collection were not associated with grounded theory to a greater extent than the other articles in the sample (i.e., they did not self-identify as grounded theory or refer to methodological literature within grounded theory traditions in a greater proportion).
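As a quick arithmetic check, the shares reported for the four data collection approaches can be recomputed from the article counts. The sketch below is illustrative only: it assumes every percentage is taken against the full sample of 299 articles, and the dictionary keys and the helper name `share` are labels chosen here, not terms from the study.

```python
# Counts per data collection approach, as reported in the text.
# Assumption: all percentages use the full sample of 299 articles.
TOTAL_ARTICLES = 299

counts = {
    "preplanned": 224,
    "participant-adapted": 44,
    "theoretical or analysis driven": 19,
    "participant entries": 11,
}


def share(n, total=TOTAL_ARTICLES):
    """Return n as a percentage of total, rounded to one decimal."""
    return round(100 * n / total, 1)


shares = {approach: share(n) for approach, n in counts.items()}

for approach, pct in shares.items():
    print(f"{approach}: n = {counts[approach]}, {pct}%")
```

Recomputing the shares this way makes it easy to spot a percentage that does not match its count.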

Fig. 4 Preplanned or adapted data collection in relation to entities followed over time

According to our results, some researchers used QLR as a methodological approach and other researchers used a longitudinal qualitative data collection without aiming to investigate change. Adding to the debate on whether QLR is a methodological approach in its own right or a design element in a particular study, we suggest that the use of QLR can be described as layered (see Fig. 5). Namely, articles must fulfill several criteria in order to use QLR as a methodological approach, and some articles do so. In those articles, QLR method references were used, the aim was to investigate change of a phenomenon, and the longitudinal elements of the data collection were thoroughly integrated into the method section. On the other hand, some articles using a longitudinal qualitative data collection were just collecting data over time, without addressing time and/or change in the aim. These articles can still be interesting research studies with valuable results, but they are not using the full potential of QLR as a methodological approach. In all, around 40% of the articles had an aim that focused on describing or understanding change (either as phenomenon or outcome), but only about 24% of the articles set out to investigate change across time as their phenomenon of interest.

Fig. 5 The QLR onion. The use of QLR design can be described as layered, where researchers use more or fewer elements of a QLR design. The two innermost layers represent articles using QLR as a methodological approach

Regarding methodological influences, about one-third of the articles self-identified with one of the traditional qualitative methodologies. Using a longitudinal qualitative data collection as an element integrated with another methodological tradition can therefore be seen as one way of working with longitudinal qualitative materials. In our results, the articles referring to methodologies other than QLR most often used case study, phenomenology and grounded theory methodologies. This was surprising since Neale [ 10 ] identified ethnography, case studies and narrative methods as the main methodological influences on QLR. Our findings might mirror the profound impacts that phenomenology and grounded theory have had on the qualitative field of health research. Regarding phenomenology, the findings may also be influenced by more recent discussions of combining interpretative phenomenological analysis with QLR [ 6 ].

Half of the articles self-identified as QLR studies, but QLR method references were used in less than 20% of the identified articles. This is both surprising and troublesome since use of appropriate method literature might have supported researchers who were struggling with for example a large quantity of materials and complex analysis. A possible explanation for the lack of use of QLR method literature is that QLR as a methodological approach is not well known, and authors might not be aware that method literature exists. It is quite understandable that researchers can describe a qualitative project with longitudinal data collection as a qualitative longitudinal study, without being aware that QLR is a specific form of study. Balmer [ 64 ] described how their group conducted serial interviews with medical students over several years before they became aware of QLR as a method of study. Within our networks, we have met researchers with similar experiences. Likewise, peer reviewers and editorial boards might not be accustomed to evaluating QLR manuscripts. In our results, 138 journals published one article between 2017 and 2019, and that might not be enough for editorial boards and peer reviewers to develop knowledge to enable them to closely evaluate manuscripts with a QLR method.

In 2007, Holland and colleagues [ 65 ] mapped QLR in the UK and described the following four categories of QLR: 1) mixed methods approaches with a QLR component; 2) planned prospective longitudinal studies; 3) follow-up studies complementing a previous data collection with follow-up; and 4) evaluation studies. Examples of all these categories can be found among the articles in this method study; however, our results do paint a more complex picture. According to our results, Holland’s categories are not mutually exclusive. For example, studies with intentions to evaluate or implement practices often used a mixed methods design and were therefore eligible for both categories one and four described above. Additionally, regarding the follow-up studies, it was seldom clearly described if they were planned as a two-time-point study or if researchers had gained an opportunity to follow up on previous data collection. When we tried to categorize QLR articles according to the data collection design, we could not identify mutually exclusive categories. Instead, we identified the following three components of longitudinal data collection: 1) entities followed across time; 2) tempo; and 3) preplanned or adapted data collection approaches. However, the most common combination was preplanned studies that followed individuals longitudinally with three or more time points.

The use of QLR differs between disciplines [ 14 ]. Our results show some patterns for QLR within health research. Firstly, the QLR projects were large and complex; they often included several types of populations and various data materials, and were presented in several articles. Secondly, most studies focused upon the individual perspective, following individuals across time, and using individual interviews. Thirdly, the data collection periods varied, but 53% of the articles had a data collection period of 1 year or shorter. Finally, patients were the most prevalent population, even though topics varied greatly. Previously, two other reviews that focused on QLR in different parts of health research (e.g., nursing [ 4 ] and gerontology [ 66 ]) pointed in the same direction. For example, individual interviews or a combination of data materials were commonly used, and most studies were shorter than 1 year but a wide range existed [ 4 , 66 ].

Considerations when planning a QLR project

Based on our results, we argue that when health researchers plan a QLR study, they should reflect upon their perspective of time/change and decide what part change should play in their QLR study. If researchers decide that change should play the main role in their project, then they should aim to focus on change as the phenomenon of interest. However, in some research, change might be an important part of the plot, without having the main role, and change in relation to the outcomes might be a better perspective. In such studies, participants with change, no change or different kinds of change are compared to explore possible explanations for the change. In our results, change in relation to the outcomes was often used in relation to intervention studies where participants who reached a desired outcome were compared to individuals who did not. Furthermore, for some research studies, change is part of the context in which the research takes place. This can be the case when certain experiences happen during a period of change; for example, when the aim is to explore the experience of everyday life during rehabilitation after stroke. In such cases a longitudinal data collection could be advisable (e.g., repeated interviews often give a deep relationship between interviewer and participants as well as the possibility of gaining greater depth in interview answers during follow-up interviews [ 15 ]), but the study might not be called a QLR study since it does not focus upon change [ 13 ]. We suggest that researchers make informed decisions of what kind of longitudinal perspective they set out to investigate and are transparent with their sources of methodological inspiration.

We would argue that the length of the data collection period, type of entities, and data materials should be in accordance with the type of change/changing processes that a study focuses on. Individual change is important in health research, but researchers should also remember the possibility of investigating changes in families, working groups, organizations and wider communities. Using these types of entities was less common in our material and could probably grant new perspectives to many research topics within health. Similarly, using several types of data materials can complement the insights that individual interviews can give. A large majority of the articles in our results had a preplanned data collection. Participant-adapted data collection can be a way to work in alignment with a “time-as-fluid” conceptualization of time because the events of subjective importance to participants can be more in focus and participants’ (or other entities’) change processes can differ substantially across cases. In studies with lengthy and spaced-out data collection periods and/or uncertainty in trajectories, researchers should consider participant-adapted or participant entry data collection. For example, some participants can be followed for longer periods and/or with more frequency.
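The difference between a preplanned and a participant-adapted schedule can be illustrated with a small sketch. It is a hypothetical illustration, not a procedure taken from any of the reviewed articles: the follow-up offsets (roughly one, six and twelve months, expressed in days), the one-week delay after a crucial event, and the function names are all assumptions.

```python
from datetime import date, timedelta

# Assumed offsets: roughly one, six and twelve months after baseline.
DEFAULT_OFFSETS_DAYS = (30, 180, 365)


def preplanned(baseline, offsets_days=DEFAULT_OFFSETS_DAYS):
    """Same follow-up plan for every participant."""
    return [baseline + timedelta(days=d) for d in offsets_days]


def participant_adapted(baseline, events, delay_days=7):
    """Preplanned time points plus an extra contact after each crucial event.

    delay_days is a hypothetical waiting period before contacting the
    participant after an event.
    """
    extra = [event + timedelta(days=delay_days) for event in events]
    return sorted(set(preplanned(baseline) + extra))


baseline = date(2018, 1, 1)
# Hypothetical crucial event for one participant, e.g. a change in treatment.
schedule = participant_adapted(baseline, events=[date(2018, 3, 15)])
for contact in schedule:
    print(contact.isoformat())
```

A participant with no crucial events keeps the plain preplanned schedule, while a participant with many events is followed more closely.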

Finally, researchers should consider how to best publish and disseminate their results. Many QLR projects are large, and the results are divided across several articles when they are published. In our results, 21 papers self-identified as a mixed methods project or as part of a larger mixed methods project, but most of these did not include quantitative data in the article. This raises the question of how to best divide a large research project into suitable pieces for publication. There is an evident risk that the more interesting aspects of a mixed methods project are lost when the qualitative and quantitative parts are analyzed and published separately. Similar risks occur, for example, when data have been collected from several types of populations but are then presented per population type (e.g., one article with patient data and another with caregiver data). During the work with our study, we also came across studies where data were collected longitudinally, but the results were divided into publications per time point. We do not argue that these examples are always wrong; there are situations when these practices are appropriate. However, it often appears that data have been divided without much consideration. Instead, we suggest a thematic approach to dividing projects into publications, crafting the individual publications around certain ideas or themes and thus using the data that are most suitable for the particular research question. Combining several types of data and/or several populations in an analysis across time is in fact what makes QLR an interesting approach.

Strengths and limitations

This method study intended to paint a broad picture regarding how longitudinal qualitative methods are used within the health research field by investigating 299 published articles. Method research is an emerging field, currently with limited methodological guidelines [ 21 ]; therefore, we used a scoping review method to support this study. In accordance with the scoping review method, we did not use quality assessment as a criterion for inclusion [ 18 – 20 ]. This can be seen as a limitation because we made conclusions based upon a set of articles with varying quality. However, we believe that learning can be achieved by looking at both good and bad examples, and innovation may appear when looking beyond established knowledge, or assessing methods from different angles. It should also be noted that the results given in percentages say nothing about which procedures are better or more in accordance with QLR; the percentages simply state how common a particular procedure was among the articles.

As described, the included articles showed much variation in the method descriptions. As the basis for our results, we have only charted explicitly written text from the articles, which might have led to an underestimation of some results. The researchers might have had a clearer rationale than described in the reports. Issues, such as word restrictions or the journal’s scope, could also have influenced the amount of detail that was provided. Similarly, when charting how articles drew on a traditional methodology, only data from the articles that clearly stated the methodologies they used (e.g., phenomenology) were charted. In some articles, literature choices or particular research strategies could implicitly indicate that the researchers had been inspired by certain methodologies (e.g., referring to grounded theory literature and describing the use of simultaneous data collection and analysis could indicate that the researchers were influenced by grounded theory), but these were not charted as using a particular methodological tradition. We used the articles’ aims and objectives/research questions to investigate their longitudinal perspectives. However, as researchers have different writing styles, information regarding the longitudinal perspectives could have been described in surrounding text rather than in the aim, which might have led to an underestimation of the longitudinal perspectives.

The experience and diversity of the research team in our study was a strength. The nine authors on the team represent ten universities and three countries, and have extensive experience in different types of qualitative research, QLR and review methods. The different levels of experience with QLR within the team (some authors have worked with QLR in several projects and others have qualitative experience but no experience in QLR) resulted in interesting discussions that helped drive the project forward. These experiences have been useful for understanding the field.

Based on a method study of 299 articles, we can conclude that QLR in health research articles published between 2017 and 2019 often contain comprehensive complex studies with a large variation in topics. Some research was thoroughly designed to capture time/change throughout the methodology, focus and data collection, while other articles included a few elements of QLR. Longitudinal data collection included several components, such as what entities were followed across time, the tempo of data collection, and to what extent the data collection was preplanned or adapted across time. In sum, health researchers need to be considerate and make informed choices when designing QLR projects. Further research should delve deeper into what kind of research questions go well with QLR and investigate the best practice examples of presenting QLR findings.

Acknowledgments

The authors wish to acknowledge Ellen Sejersted, librarian at the University of Agder, Kristiansand, Norway, who conducted the literature searches and Julia Andersson, research assistant at the Department of Nursing, Umeå University, Sweden, who supported the data management and took part in the initial screening phases of the project.

Authors’ contributions

ÅA conceived the study. ÅA, EH, TW, LF, MKP, HA, and MSL designed the study. ÅA, TW, and LF were involved in literature searches together with the librarian. ÅA and EH performed the screening of the articles. All authors (ÅA, EH, TW, LF, ÅK, MKP, KLD, HA, MSL) took part in the data charting. ÅA performed the data analysis and discussed the preliminary results with the rest of the team. ÅA wrote the 1st manuscript draft, and ÅK, MSL and EH edited. All authors (ÅA, EH, TW, LF, ÅK, MKP, KLD, HA, MSL) contributed to editing the 2nd draft. MSL and LF provided overall supervision. All authors read and approved the final manuscript.

Authors’ information

All authors represent the nursing discipline, but their research topics differ. ÅA and ÅK have previously worked together with QLR method development. ÅA, EH, TW, LF, MKP, HA, KLD and MSL work together in the Nordic research group PRANSIT, focusing on nursing topics connected to transition theory using a systematic review method, preferably meta synthesis. All authors have extensive experience with qualitative research but various experience with QLR.

Open access funding provided by Umea University. This project was conducted within the authors’ positions and did not receive any specific funding.

Availability of data and materials

Declarations.

Not applicable.

The authors declare that they have no competing interests.

1 Qualitative method references were defined as a journal article or book with a title that indicated an aim to guide researchers in qualitative research methods and/or research theories. Primary studies, theoretical works related to the articles’ research topics, protocols, and quantitative method literature were excluded. References written in a language other than English were also excluded since the authors could not evaluate their content.

2 QLR method references were defined as a journal article or book that 1) focused on qualitative methodological questions, and 2) used terms such as ‘longitudinal’ or ‘time’ in the title so it was evident that the focus was on longitudinal qualitative research. Referring to another original QLR study was not counted as using QLR method literature.

3 Words were charted depending on their word stem, e.g., change, changes and changing were all charted as change.

4 It should be noted that here time span refers to the data collection related to each participant or case. Researchers could collect data for 2 years but follow each participant for 6 months.
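The word-stem charting described in footnote 3 can be sketched as a simple prefix match. This is only an illustration of the idea, not the authors' charting procedure: the stem table, the second stem ("transit"), the sample aim sentence, and the function name `chart_words` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stem table: word prefix -> label under which matches are charted.
# Only "change" comes from the footnote; "transition" is an invented example.
STEMS = {"chang": "change", "transit": "transition"}


def chart_words(text):
    """Count words in text by the stem they start with."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for prefix, label in STEMS.items():
            if word.startswith(prefix):
                counts[label] += 1
                break
    return counts


# Invented aim sentence for illustration.
aim = "To explore changes in coping and how changing roles change over time."
print(chart_words(aim))
```

Here "changes", "changing" and "change" are all charted under the single label "change".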

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

What is a Longitudinal Study? Definition, Types & Examples

Kate Williams

Last Updated: 16 February 2024

Table Of Contents

  • What is a Longitudinal Study?
  • Types of Longitudinal Studies
  • Pros and Cons of Longitudinal Research Design
  • Examples of Longitudinal Surveys

Sonia was conflicted. A few months ago, a survey from a grocery delivery app had asked her if she preferred normal eggs or the free-range ones.

She was financially stressed and couldn’t afford to pay more for free-range eggs, so she picked the normal ones.

But last night, she had watched a popular documentary on Netflix about how hens were treated in cages and now felt much more strongly about wanting to buy free-range eggs.

There was no way for Sonia to communicate this new preference to her grocery delivery app.

But that’s the thing about consumer trends. They are constantly shifting, and one survey taken years ago is not going to give you an accurate picture of the shifts in trends.

That’s why your business needs to understand what a longitudinal study is.

At times, a one-off survey simply isn’t enough to give you the data you need. If you need to observe certain trends, behaviors, or preferences over time, you can use a longitudinal study.

The simplest way to understand what a longitudinal study is, is to think of it as a survey taken over time. The passing of time could influence the responses of the same person to the same question. Sonia’s preference for eggs, for example, changed after she watched the documentary. That’s the kind of thing that longitudinal research design measures.

As for a formal definition, a longitudinal study is a research method that involves repeated observations of the same subjects (e.g., a set of people) over a period of time. The observations might be undertaken in the form of an online survey. It can be tremendously useful in a variety of fields to be able to observe behavior or trends over time.

Longitudinal studies are used in fields like:

  • Clinical psychology, to measure a patient’s thoughts over time
  • Market research, to observe consumer trends
  • Political polling and sociology, to observe life events and societal shifts over time
  • Medicine, to discover predictors of certain diseases

We are dealing with nuanced changes over time here, and surveys excel at capturing these shifts in attitudes, behaviors, and experiences. Unlike one-time snapshots, surveys repeated over time enable you to track trends and understand how variables evolve. Plus, they are cost-effective and flexible in terms of reach!

For instance, SurveySparrow’s Recurring Surveys let you schedule and automate the entire process.

With this feature, you can share periodic surveys at any frequency that you set. You can also give silent respondents a slight nudge with a friendly email reminder. The best part? The platform’s conversational surveys earn a higher response rate.

Types of Longitudinal Studies

When talking about what is a longitudinal study, we cannot go without also discussing the types of longitudinal research design. There are different studies based on your needs. When you understand all three types of longitudinal studies, you’ll be able to pick out the one that’s best suited to your needs.

Panel Study

When we want to find out trends in a larger population, we often survey a sample of it. A panel study simply observes that sample over time. By doing so, panel studies can identify cultural shifts and new trends in a larger population.

Panel studies are designed for quantitative analysis. Through the data from online surveys, you can identify common patterns in the responses from your sample (which remains the same over time). A comprehensive dashboard will help you make informed decisions.

But what’s the need to visualize?

In panel studies, the same set of people must be studied over time. If you pick a different sample, variations in individual preferences could skew your results.

Observing the same set of people can make sure that what you’re observing is a change over time. Visualizing the change over time will give you a clear idea of the trends and patterns, resulting in informed and effective decision-making.
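To make the point concrete, here is a minimal Python sketch of a two-wave panel. The respondent IDs, the 1-to-5 ratings, and the helper name `within_person_change` are all hypothetical and invented for illustration. Because the same people answer in both waves, each person’s change can be computed directly.

```python
# Hypothetical panel data: the same respondents rate a product (1-5) in two waves.
wave_1 = {"r01": 4, "r02": 2, "r03": 5, "r04": 3}
wave_2 = {"r01": 2, "r02": 4, "r03": 5, "r04": 3}


def within_person_change(first, second):
    """Return each respondent's change in score between two waves.

    Only respondents present in both waves are included, since a panel
    compares a person with their own earlier answer.
    """
    shared_ids = sorted(set(first) & set(second))
    return {rid: second[rid] - first[rid] for rid in shared_ids}


changes = within_person_change(wave_1, wave_2)
mean_wave_1 = sum(wave_1.values()) / len(wave_1)
mean_wave_2 = sum(wave_2.values()) / len(wave_2)

print(changes)
print(mean_wave_1, mean_wave_2)
```

In this made-up data the two wave averages are identical, yet two respondents changed their ratings in opposite directions. Only following the same people reveals that.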

Cohort Study

A longitudinal cohort study is one in which we study people who share a single characteristic over a period of time. Cohort studies are regularly conducted by medical researchers to ascertain the effects of a new drug or the symptoms of a disease.

In cohort studies, the behaviors of the selected group of people are observed over time to find patterns and trends. Often, these studies can go on for years. They can also be particularly useful for ascertaining consumer trends if you’re trying to research consumers with a specific common characteristic. An example of such a study would be observing the choice of cereal for kids who go to Sunshine Elementary School over time.

If you’re confused between panel studies and cohort studies, don’t worry. The one key difference between cohort studies and panel studies is that the same set of people has to be observed in the latter. In cohort studies, you can pick a different sample of the same demographic to study over time.

Retrospective Study

A retrospective longitudinal study is when you take pre-existing data from previous online surveys and other research. The objective here is to put your results in a larger timeline and observe the variation in results over time. What makes retrospective studies longitudinal is simply the fact that they’re aimed at revealing trends over time.

When understanding what is a longitudinal study, it’ll be well worth your while to look into retrospective studies. For your company, retrospective longitudinal studies can reveal crucial insights without you having to spend a single dime. Since these studies depend on existing data, they not only don’t cost much themselves but also improve the returns from your earlier research efforts.

How can retrospective longitudinal studies be useful to you? Let’s assume, for example, that you conduct an employee engagement survey every year. If your organization has done these surveys for the past 10 years, you now have more than enough material to conduct a retrospective study. You can then find out how employee engagement at your company has varied over time.
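As a rough sketch of what such a retrospective analysis might look like, the Python below takes ten years of annual engagement scores and computes the year-over-year change and an overall linear trend. The scores, the years, and the function names are hypothetical placeholders, not real survey results.

```python
# Hypothetical yearly engagement scores (0-100) from ten past annual surveys.
engagement_by_year = {
    2014: 62, 2015: 64, 2016: 63, 2017: 67, 2018: 70,
    2019: 69, 2020: 72, 2021: 74, 2022: 73, 2023: 76,
}


def year_over_year(scores):
    """Return the change from each year to the next, keyed by the later year."""
    years = sorted(scores)
    return {later: scores[later] - scores[earlier]
            for earlier, later in zip(years, years[1:])}


def linear_trend(scores):
    """Least-squares slope: the average change in score per year."""
    years = sorted(scores)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(scores[y] for y in years) / n
    covariance = sum((y - mean_x) * (scores[y] - mean_y) for y in years)
    variance = sum((y - mean_x) ** 2 for y in years)
    return covariance / variance


print(year_over_year(engagement_by_year))
print(round(linear_trend(engagement_by_year), 2))
```

The year-over-year figures show the short-term ups and downs, while the slope summarizes the direction across the whole decade. No new data collection is needed, which is the appeal of the retrospective approach.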

As with every research method, longitudinal studies have their advantages and disadvantages. To understand what a longitudinal study is, it is important to know the particular ways in which they’re useful and the situations in which they’re not. Let’s go over some of the major pros and cons of longitudinal surveys.

Advantages of Longitudinal Studies

  • Rigorous Insights : A one-off online survey, no matter how well designed, is only so rigorous. Even though the results are often useful, sometimes you need more rigor in your surveys. A longitudinal survey, by observing respondents over time, can offer more rigorous results.
  • Long-term Data : When thinking about what is a longitudinal study, it is crucial to understand that it is best used for a specific type of data collection. When you need to understand trends over the longer term, longitudinal studies are best suited to that task.
  • Discover Trends : Most companies, in one way or another, rely on trends they estimate will be relevant in the future. Longitudinal studies can be great at finding out those trends and capitalizing on them before the competition.
  • Open To Surprises : When designing an online survey, it is very tough to allow for surprises. Mostly, you get what you ask for. With longitudinal surveys, you’re allowing for the possibility that you might spot patterns you didn’t imagine could exist. Longitudinal studies are more flexible in that regard and allow us to discover the unexpected.

Disadvantages of Longitudinal Studies

  • Higher Costs : Because longitudinal research needs to be conducted over time, and in some cases with the same set of people, it ends up being costlier than one-off surveys. From conducting the observations to analyzing the data, it can add up financially. Using a cost-effective online survey tool like SurveySparrow can be one way to reduce costs.
  • More Demanding : One of the biggest challenges you can face while conducting a survey is to get enough respondents. Even for normal online surveys, it can be tough to get people to take your survey. Longitudinal surveys are far more demanding, so it is unlikely that anyone will participate without strong incentives.
  • Unpredictability : While unpredictability can sometimes be a good thing, at times it can also lead the whole exercise astray. The success of a longitudinal study depends not just on the resources you invest in it, but also on the respondents who have to participate in a long-term commitment. Things can go wrong when respondents are suddenly unavailable. That’s why there’s always an element of unpredictability with longitudinal surveys.
  • Time-Consuming : Unlike simple online surveys, you don’t get the results instantly with longitudinal surveys. They require a certain vision, and you have to be patient enough to see it through to get your desired results.

Longitudinal surveys have been used by researchers and businesses for a long time now, so there is no dearth of examples. Let’s walk through a few of them so you can better understand what is a longitudinal survey.

Australia’s ‘45 and Up’ Survey

There is no better example to understand what longitudinal research is than the 45 and Up study being conducted in Australia. It aims to understand healthy aging and has 250,000 participants who are aged 45 or older. The idea is to get a better idea of Australians’ health as they age.

Such a study needed to be a longitudinal survey since you can only understand the effects of aging en masse by considering the results over time. The results from this study are being used in areas like cardiovascular research and preventable hospitalizations.

Smoking and Lung Cancer

To understand the effects of smoking, you need to be able to assess its consequences over time. The British Doctors Study, which ran from 1951 to 2001, yielded results that strongly indicated the link between smoking and lung cancer. If not for longitudinal research methods, we might never have known.

Even though the research was first published in 1956, the study went on for almost half a century after that. When thinking about what is a longitudinal study, we must also consider that these studies give results while they’re ongoing. Conclusively proving the link between smoking and cancer required a robust, longitudinal survey.

Growing Up In Ireland

Started in 2006, Growing Up In Ireland is a longitudinal study conducted by the Irish government to understand what children’s lives look like in different age brackets. One cohort that the study started following at 9 years of age is now 23. The long-term study can yield interesting results by following a set of children throughout their childhood.

The thing to remember about longitudinal studies is that they can have broad objectives. You can go in without really knowing what you’re trying to find and what that might lead to. You can then use the surprises along the way to generate actionable insights.

Wrapping Up

If you started out wondering what is a longitudinal study, we hope that we’ve addressed that question and more in this article. If you want to create a longitudinal survey, don’t forget to first plan out your survey. A retrospective study, like we just talked about, can also be a great solution to your problems.

Here at SurveySparrow, we love surveys of all kinds. For certain types of questions, you need to conduct longitudinal surveys, and we’re here to support you through the process. With our online templates and intuitive UI, conducting a longitudinal survey will be much easier.

What we love about recurring surveys is the surprising results they can yield. That is really what drives us at SurveySparrow: you might find something in the results you didn’t expect, and it might change the course of your company for the better.


Product Marketing Manager at SurveySparrow

Excels in empowering visionary companies through storytelling and strategic go-to-market planning. With extensive experience in product marketing and customer experience management, she is an accomplished author, podcast host, and mentor, sharing her expertise across diverse platforms and audiences.


What is a longitudinal study?

Last updated: 20 February 2023

Longitudinal studies are common in epidemiology, economics, and medicine. They are also used in other medical and social sciences, and in business to study customer trends. Researchers periodically observe and collect data on the variables of interest without manipulating the study environment.

A company may conduct a tracking study, surveying a target audience to measure changes in attitudes and behaviors over time. The questions asked stay the same, and the time interval remains consistent. This longitudinal study can measure brand awareness, customer satisfaction, and consumer opinions, and analyze the impact of an advertising campaign.


  • Types of longitudinal studies

There are two types of longitudinal studies: cohort and panel studies.

Panel study

A panel study is a type of longitudinal study that involves collecting data on a fixed set of variables from the same participants at regular but widely spaced intervals. Researchers follow a group or groups of people over time. Panel studies are designed for quantitative analysis but are also usable for qualitative analysis.

A panel study may research the causes of age-related changes and their effects. Researchers may measure the health markers of a group over time, such as their blood pressure, blood cholesterol, and mental acuity. Then, they can compare the scores to understand how age positively or negatively correlates with these measures.

Cohort study

A cohort longitudinal study involves gathering information from a group of people with something in common, such as a specific trait or experience of the same event. The researchers observe behaviors and other details of the group over time. Unlike panel studies, cohort studies allow you to pick a different sample from the cohort each time you collect data.

An example of a cohort study could be a drug manufacturer studying the effects on a group of users taking a new drug over a period. A drinks company may want to research consumers with common characteristics, like regular purchasers of sugar-free sodas. This will help the company understand trends within its target market.

  • Benefits of longitudinal research

If you want to study the relationship between variables and causal factors responsible for certain outcomes, you should adopt a longitudinal approach to your investigation.

The benefits of longitudinal research over other research methods include the following:

Insights over time

It gives insights into how and why certain things change over time.

Better information

Researchers can better establish sequences of events and identify trends.

No recall bias

A prospective longitudinal study largely avoids recall bias because data is collected as events happen rather than from memory. Recall bias is an error that occurs in a study if respondents don't wholly or accurately recall the details of their actions, attitudes, or behaviors.

Because variables can change during the study, researchers can discover new relationships or data points worth further investigation.

Small groups

Longitudinal studies don't need a large group of participants.

  • Potential pitfalls

The challenges and potential pitfalls of longitudinal studies include the following:

A longitudinal survey takes a long time, involves multiple data collections, and requires complex processes, making it more expensive than other research methods.

Unpredictability

Because they take a long time, longitudinal studies are unpredictable. Unexpected events can cause changes in the variables, making earlier data potentially less valuable.

Slow insights

Researchers can take a long time to uncover insights from the study as it involves multiple observations.

Participants can drop out of the study, limiting the data set and making it harder to draw valid conclusions from the results.

Overly specific data

If you study a smaller group to reduce research costs, results will be less generalizable to larger populations versus a study with a larger group.

Despite these potential pitfalls, you can still derive significant value from a well-designed longitudinal study by uncovering long-term patterns and relationships.

  • Longitudinal study designs

Longitudinal studies can take three forms: Repeated cross-sectional, prospective, and retrospective.

Repeated cross-sectional studies

Repeated cross-sectional studies are a type of longitudinal study where participants change across sampling periods. For example, as part of a brand awareness survey, you ask different people from the same customer population about their brand preferences.
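A short Python sketch can illustrate what this design does and does not allow. The wave labels, customer IDs, and answers below are hypothetical. Because each wave samples different people, only wave-level summaries can be compared, not individual change.

```python
# Hypothetical repeated cross-sectional survey: each wave samples different customers.
# Each response is (customer_id, recognised_the_brand).
waves = {
    "2022-Q1": [("c101", True), ("c102", False), ("c103", False), ("c104", True)],
    "2022-Q3": [("c205", True), ("c206", True), ("c207", False), ("c208", True)],
}


def awareness_rate(responses):
    """Share of respondents in one wave who recognised the brand."""
    aware = sum(1 for _, is_aware in responses if is_aware)
    return aware / len(responses)


rates = {wave: awareness_rate(responses) for wave, responses in waves.items()}

# No customer appears in both waves, so per-person change cannot be computed.
ids_q1 = {cid for cid, _ in waves["2022-Q1"]}
ids_q3 = {cid for cid, _ in waves["2022-Q3"]}
overlap = ids_q1 & ids_q3

print(rates)
print(overlap)
```

The comparison of rates shows how the population as a whole shifted between waves, which is often enough for tracking brand awareness.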

Prospective studies

A prospective study is a longitudinal study that involves real-time data collection, and you follow the same participants over a period. Prospective longitudinal studies can be cohort, where participants have similar characteristics or experiences. They can also be panel studies, where you choose the population sample randomly.

Retrospective studies

Retrospective studies are longitudinal studies that involve collecting data on events that some participants have already experienced. Researchers examine historical information to identify patterns that led to an outcome they established at the start of the study. Retrospective studies are the most time and cost-efficient of the three.

  • How to perform a longitudinal study

When developing a longitudinal study plan, you must decide whether to collect your own data or use data from other sources. Each choice has its benefits and drawbacks.

Using data from other sources

You can freely access data from many previous longitudinal studies, especially studies conducted by governments and research institutes. For example, anyone can access data from the 1970 British Cohort Study on the UK Data Service website.

Using data from other sources saves the time and money you would have spent gathering data. However, the data is more restrictive than the data you collect yourself. You are limited to the variables the original researcher was investigating, and they may have aggregated the data, obscuring some details.

If you can't find data or longitudinal research that applies to your study, the only option is to collect it yourself.

Collecting your own data

Collecting your own data enhances its relevance, integrity, reliability, and verifiability. Your data collection methods depend on the type of longitudinal study you want to perform. For example, a retrospective longitudinal study collects historical data, while a prospective longitudinal study collects real-time data.

An effective and versatile data collection tool helps keep the data relevant and reliable. It can improve the speed and accuracy of the information you collect.

What is a longitudinal study in research?

A longitudinal study is a research design that involves studying the same variables over time by gathering data continuously or repeatedly at consistent intervals.

What is an example of a longitudinal study?

An excellent example of a longitudinal study is market research to identify market trends. The organization's researchers collect data on customers' likes and dislikes to assess market trends and conditions. An organization can also conduct longitudinal studies after launching a new product to understand customers' perceptions and how it is doing in the market.

Why is it called a longitudinal study?

It’s a longitudinal study because you collect data over an extended period. Longitudinal data tracks the same type of information on the same variables at multiple points in time. You collect the data over repeated observations.

What is a longitudinal study vs. a cross-sectional study?

A longitudinal study follows the same people over an extended period, while a cross-sectional study looks at the characteristics of different people or groups at a given time. Longitudinal studies provide insights over an extended period and can establish patterns among variables.

Cross-sectional studies provide insights about a point in time, so they cannot identify cause-and-effect relationships.


Longitudinal Research Design: Methods and Examples


A longitudinal study is a research method used to investigate changes in a group of subjects over an extended period of time. Unlike cross-sectional studies that capture data at a single point in time, longitudinal studies follow participants over a prolonged period. This allows researchers to examine how variables gradually evolve or affect individuals.

If your research revolves around observing the same group of participants, you need to know how to conduct a longitudinal study. Today we’ll focus on this type of research data collection and find out which scientific areas require it. Its distinctive features and differences from other research types will also be examined. This article can help with planning and organizing a research project over a long time period. Below you’ll find some tips on completing such work as well as a few helpful examples.

What Is a Longitudinal Study: Definition

Let’s define ‘longitudinal study’ to begin with. This is an approach in which data from the same group of respondents is gathered repeatedly over a period of time. The same individuals are observed continuously over an extended period in order to find changes and trends that can be analyzed. This approach is essentially observational, as you aren’t expected to influence the parameters of the group you are monitoring in any way. It is typically used in the scope of correlational research, which means collecting data about variables without manipulating them. Let’s find out more about its usage and how much time it could take.

How Long Are Longitudinal Studies?

How long is a longitudinal study? It depends on your topic and research goals. In case characteristics of the subject are changing fast, it might be enough to take just a few measurements one by one. Otherwise, one might have to wait for a long time before measuring again. So, such projects can take weeks or months but they also can extend over years or even decades. Studies like that are common in medicine, psychology and sociology, where it is important to observe how participants’ characteristics evolve.

How to Perform Longitudinal Research?

Before actively engaging in longitudinal research, it is important to understand well what your next steps should be. Let’s define study subtypes that can be used for such research. They are:

  • Collecting and analyzing your own data.
  • Finding data already collected by some other researcher and analyzing it.

Each of these subtypes has certain pros and cons. Gathering data yourself usually gives more confidence, but it might be hard to contact the right individuals. Let’s discuss each point in detail.

Longitudinal Study: Data From Other Sources

When doing longitudinal studies of a certain group over a long period, you might find available data about them left from other researchers. Make sure to carefully examine sources of each dataset you decide to reuse. Otherwise previous researchers’ mistakes or bias may influence your results after you’ve analyzed that data. However this approach could be very efficient in case the subject has already been investigated by different researchers. Their results could be compared and gaps or bias could be easier to eliminate. As a result, much time and effort could be saved.

Longitudinal Design: Own Data

When doing longitudinal studies without any significant predecessors’ works available, using your own data is the only reasonable way. This data is collected through surveys, measurements or observations. You can have more confidence in these results, but this approach requires more time and effort. You need proper research design methods prior to starting the collection process. If you choose such an approach, keep in mind that it has two major subtypes:

  • retrospective research: collecting data about past events.
  • prospective research: observing ongoing events, making measurements in more or less real time.

Longitudinal Study Types

A longitudinal study can be applied to a wide range of cases. You need to adjust your approach depending on the specific situation, the subject’s peculiarities and your research goals. There are three major research types you can use for continuous observation:

  • Longitudinal cohort study
  • Retrospective longitudinal study
  • Longitudinal panel study

Let’s take a closer look at each type’s definition, how data is collected and what impact that has on results.

Longitudinal Cohort Study

A cohort longitudinal study involves selecting a group based on some shared event which unifies its members. It can be their birth date, geographic location, or historical experience. This shared characteristic plays a significant role throughout the entire research process, so it is to be carefully selected when designing the study and planning its steps. Sometimes one unifying event may be more relevant or convenient than another.

Retrospective Longitudinal Study

This approach takes a special place among longitudinal studies as it involves conducting historical investigations. As we’ve already mentioned above, in a retrospective study researchers collect observations and measurements of past events. Collecting historical data and analyzing changes might be easier than tracking live data. However, the development of such a research design must include checking the credibility of the datasets used for it.

Longitudinal Panel Study

A panel study involves sampling a cross-section of individuals and following them over time. This approach is often used for collecting medical data. Such a study, when performed continuously, is considered more reliable than a regular cross-sectional study and allows using smaller sample sizes while still being representative. However, there are various problems that may occur during such studies, especially those which go on for decades. In particular, such samples can eventually be eroded by deaths, migration, fatigue, or even the development of response bias.

Longitudinal Research Design

Longitudinal study design requires some serious planning to complete it properly. Keep in mind that your purpose is to directly address individual change and variation. The target population should be chosen carefully so that the results achieved through this study are accurate enough. Another key element is deciding on proper timing. For example, you need intervals long enough for important changes to show up between measurements. At the same time, the intervals shouldn’t be too long. Otherwise, you might lose track of the actual trends within your target population.

Advantages and Disadvantages of Longitudinal Study

Let’s review longitudinal study advantages and disadvantages. Better wrap your head around this information if you are still choosing an optimal approach for your own project. Any study that involves complicated planning and extensive techniques can have some downsides. It is common for them to come together with benefits. So pay close attention to the information below before deciding what method to choose to observe your research subject.

Advantages of Longitudinal Study

These are the benefits of longitudinal study:

  • it can provide unique insight that might not be available any other way. In particular, it is the only way to investigate lifespan issues. It allows researchers to track changes across an entire generation. Let’s suppose the task is to track the percentage of farms which pass from parents to children in a certain location. Obtaining such information requires using historical records.
  • such an observational approach shows the dynamics in respondents’ data and thus makes it possible to model trends and understand their influence. Collecting data once provides only a snapshot of your group’s current state. Doing it continuously allows you to observe this group from some new angles. For example, you would get more information about your respondents’ habits if you observe them at least several times.

Longitudinal Study Disadvantages

These are the disadvantages of longitudinal study:

  • it can be quite expensive since numerous repeated measurements require enormous amounts of time and effort. Imagine you need to collect data about a certain group for 10 years. Processing this data alone would require a lot of resources.
  • such high costs may induce another problem: researchers might decide to use smaller samples in order to cut the expenditures. Consequently, results of such studies may not be representative enough.
  • its participants tend to drop out eventually. The reasons may vary: moving to another location, illness, death or just loss of motivation to participate further. As a result, the sample shrinks and the amount of data collected decreases. This process is called attrition; when those who drop out differ systematically from those who remain, it is called selective attrition. A typical example is observing the life of some neighborhood in a big city: numerous people would move in and out so it would be hard to find a single individual who is available for a long time.
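The drop-out problem can be quantified. The following Python sketch, which uses invented participant IDs and baseline ages, computes the retention rate at each wave and compares those who stayed with those who left. The comparison is one simple, assumed way of checking whether attrition looks selective; it is not a formal statistical test.

```python
# Hypothetical participant IDs present at each wave of a study.
participants_by_wave = [
    {"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"},  # baseline
    {"p1", "p2", "p3", "p4", "p5", "p6"},              # wave 2
    {"p1", "p2", "p4", "p5"},                          # wave 3
]
# Hypothetical baseline ages, used to compare stayers with dropouts.
baseline_age = {"p1": 34, "p2": 41, "p3": 67, "p4": 38,
                "p5": 45, "p6": 70, "p7": 72, "p8": 69}


def retention_rates(waves):
    """Fraction of the baseline sample still participating at each wave."""
    baseline = waves[0]
    return [len(wave & baseline) / len(baseline) for wave in waves]


def mean_age(ids):
    """Average baseline age of the given participants."""
    return sum(baseline_age[i] for i in ids) / len(ids)


stayers = participants_by_wave[-1]
dropouts = participants_by_wave[0] - stayers

print(retention_rates(participants_by_wave))
print(mean_age(stayers), mean_age(dropouts))
```

In this made-up sample, half of the participants are gone by the third wave, and the dropouts are much older on average than the stayers. A gap like that suggests the remaining sample no longer represents the original one.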

Longitudinal Study Examples

Let’s review a longitudinal study example that helps illustrate the above information.

Longitudinal research example

A famous longitudinal case is the Terman Study of the Gifted, previously known as Genetic Studies of Genius. Its founder and main researcher, Lewis Terman, aimed to investigate how highly intelligent children developed into adulthood. He also set out to disprove the then-prevalent belief that gifted children were typically physically delicate and socially inept. Initial observations began in 1921 at Stanford University. Eventually the study confirmed that gifted children were not significantly different from their peers in terms of physical development and social skills. The results of this study were still being compiled during the 2000s, which makes it the oldest and longest-running longitudinal study in the world. Such a long period of data collection made it possible to obtain some really unique knowledge, not only about children’s development but about the history of education as well.

Longitudinal: Final Thoughts

In this article we’ve explored the concept of longitudinal research and reviewed its main characteristics:

  • conducting observations and measurements continuously over a long period of time
  • some particular new insights which can be obtained by prolonged studies
  • prospective advantages and disadvantages for researchers.

Frequently Asked Questions About Longitudinal Studies

1. Is a longitudinal study quantitative or qualitative?

A longitudinal study can be quantitative, qualitative, or a mix of both; the definition does not tie it to either approach. What defines it is repeated observation of the same individuals over a long period, which makes it possible to analyze changes over time. Cohort and panel studies typically collect numerical measurements, while extended case studies rely on descriptions of trends, changes, and influences and are therefore qualitative.

2. Are longitudinal studies more reliable?

Longitudinal studies are not automatically more reliable: they carry their own problems and risks, just as other study types do. These include:

  • survey aging and period effects.
  • delayed results.
  • achieving continuity in funding and research direction.
  • cumulative attrition.

These factors can decrease the reliability of this study type and must be taken into account when selecting such an approach.

3. Is attrition a limitation of longitudinal studies?

Yes. The longer a longitudinal study runs, the more it tends to suffer from attrition. Attrition can weaken the generalizability of findings if participants who stay in a study differ significantly from those who drop out. When a study takes many years, researchers need to treat attrition as a serious problem and develop ways to counter its negative effects.

4. What is longitudinal data collection?

Longitudinal data collection means gathering data repeatedly from the same respondents over time. This is the core element of this study type. Repeated collection of data allows researchers to see temporal changes and understand which trends exist in the population. It also lets them view the population from new angles and thus obtain new insights about it. There are certain limitations to such data collection, particularly when the target group tends to change over time.

Joe Eckel is an expert on dissertation writing. He makes sure that each student gets valuable insights on composing A-grade academic writing.

Frequently asked questions

What is an example of a longitudinal study?

The 1970 British Cohort Study, which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study.

Frequently asked questions: Methodology

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group. As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased.
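As a rough illustration of the idea, the sketch below compares dropout rates between two groups. The group names, the counts, and the helper `attrition_rate` are all hypothetical and invented for this example; a gap in rates is only a first warning sign, and you would still need to compare the characteristics of dropouts and completers.

```python
def attrition_rate(enrolled: int, completed: int) -> float:
    """Share of enrolled participants who left before the study ended."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    return (enrolled - completed) / enrolled

# Hypothetical counts, invented purely for illustration.
groups = {
    "intervention": {"enrolled": 200, "completed": 150},
    "control": {"enrolled": 200, "completed": 180},
}

# Attrition rate per group, then the absolute gap between the two groups.
rates = {name: attrition_rate(**counts) for name, counts in groups.items()}
gap = abs(rates["intervention"] - rates["control"])

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} attrition")
print(f"difference between groups: {gap:.0%}")
```

A large gap between the groups suggests that dropout may not be random and that the remaining samples are no longer comparable.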

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research. It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations, you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity. In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity. You need to have face validity, content validity, and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity. Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity.

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.

Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method. Unlike probability sampling (which involves some form of random selection), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method, where there is not an equal chance for every member of the population to be included in the sample.

This means that you cannot use inferential statistics to make generalizations, which is often the goal of quantitative research. As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research.

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias.
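The referral mechanism can be sketched as a wave-by-wave walk through a network of contacts. Everything here is an assumption made for illustration: the `referrals` dictionary, the participant names, and the `snowball_sample` helper are hypothetical, and real recruitment is rarely this tidy.

```python
from collections import deque

# Hypothetical referral network: each participant names people they can recruit.
referrals = {
    "seed": ["ana", "ben"],
    "ana": ["cy", "ben"],
    "ben": ["dee"],
    "cy": [],
    "dee": ["eli"],
    "eli": [],
    "fay": ["gus"],  # never reached: nobody in the referral chain knows fay
    "gus": [],
}

def snowball_sample(network, seeds, max_size):
    """Recruit wave by wave through referrals until max_size is reached."""
    sample, queue = [], deque(seeds)
    seen = set(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in network.get(person, []):
            if contact not in seen:  # each person is recruited only once
                seen.add(contact)
                queue.append(contact)
    return sample

sample = snowball_sample(referrals, ["seed"], max_size=10)
print(sample)
```

In this toy network, "fay" and "gus" can never enter the sample because no recruited participant refers them, which is the kind of sampling bias described above.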

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating) the research entails reconducting the entire analysis, including the collection of new data.
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup (probability sampling). In quota sampling you select a predetermined number or proportion of units, in a non-random manner (non-probability sampling).
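A minimal sketch of this difference, assuming a made-up population of students split into two subgroups. The population, the quota sizes, and both helper functions are hypothetical; quota sampling is modeled here simply as "take whoever arrives first", which is only one of several non-random ways to fill a quota.

```python
import random

# Hypothetical population of (person_id, subgroup) pairs.
population = [(i, "undergrad" if i % 4 else "postgrad") for i in range(100)]

def stratified_sample(units, per_group, rng):
    """Probability sampling: draw randomly within each subgroup."""
    groups = {}
    for unit in units:
        groups.setdefault(unit[1], []).append(unit)
    return {g: rng.sample(members, per_group[g]) for g, members in groups.items()}

def quota_sample(arrivals, per_group):
    """Non-probability sampling: take whoever shows up first until each quota is full."""
    picked = {g: [] for g in per_group}
    for unit in arrivals:
        if len(picked[unit[1]]) < per_group[unit[1]]:
            picked[unit[1]].append(unit)
    return picked

quotas = {"undergrad": 6, "postgrad": 2}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
strat = stratified_sample(population, quotas, rng)
quota = quota_sample(population, quotas)

print("stratified:", strat)
print("quota:", quota)
```

Both samples have the same subgroup sizes, but only the stratified one gives every member of a subgroup a chance of selection; the quota sample always ends up with the earliest arrivals.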

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has a known, nonzero chance of being included in the sample; in simple random sampling, every unit has an equal chance.

On the other hand, convenience sampling involves approaching whoever happens to be available, for example by stopping passersby. Not everyone has an equal chance of being selected, because who you reach depends on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.

A sampling frame is a list of every member in the entire population. It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous, so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous, as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population.

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment, an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups.

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods, the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity.
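The correlation check for convergent and discriminant validity can be sketched as follows. The scores are invented for illustration, the scale names are hypothetical, and `pearson_r` is a hand-written helper rather than a library function; in practice you would use a statistics package and a much larger sample.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six participants, invented for illustration.
new_anxiety_scale = [12, 18, 25, 31, 36, 40]
established_anxiety_scale = [10, 20, 24, 33, 35, 42]  # same construct
shoe_size = [42, 38, 41, 39, 43, 40]                  # unrelated construct

convergent_r = pearson_r(new_anxiety_scale, established_anxiety_scale)
discriminant_r = pearson_r(new_anxiety_scale, shoe_size)

print(f"convergent r = {convergent_r:.2f}")      # expected to be strongly positive
print(f"discriminant r = {discriminant_r:.2f}")  # expected to be close to zero
```

A high correlation with the established measure of the same construct, together with a near-zero correlation with the unrelated measure, is the pattern that supports construct validity.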

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity, because it covers all of the other types. You need to have face validity, content validity, and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity; the others are face validity, content validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity: The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity: The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity, and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control, ethical considerations, and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments. It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions, which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. Structured interviews are often quantitative in nature. They are best used when:

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself; either way, you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews, unstructured interviews, and focus groups.

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys, but is most common in semi-structured interviews, unstructured interviews, and focus groups.

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews, but it can be mitigated by writing high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews: The questions are predetermined in both topic and order.
  • Semi-structured interviews: A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews: None of the questions are predetermined.
  • Focus group interviews: The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research.

In research, you might have come across something called the hypothetico-deductive method. It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning, where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization: You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research, you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation:

  • Data triangulation: Using data from different times, spaces, and people
  • Investigator triangulation: Involving multiple researchers in collecting or analyzing data
  • Theory triangulation: Using varying theoretical perspectives in your research
  • Methodological triangulation: Using different methodologies to approach the same topic

Many academic fields use peer review, largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to this stringent process they go through before publication.

In general, the peer review process includes the following steps:

  • First, the author submits the manuscript to the editor.
  • The editor can then either reject the manuscript and send it back to the author, or send it onward to the selected peer reviewer(s).
  • Next, the peer review process occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made.
  • Lastly, the edited manuscript is sent back to the author. They input the edits, and resubmit it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodological approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process, serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design, inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data, but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data points might be missing values, outliers, duplicate values, incorrectly formatted entries, or irrelevant records. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
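A small screening pass along these lines might look like the sketch below. The records, the plausible weight range of 30 to 300 kg, and the `clean` helper are assumptions chosen for illustration; real cleaning rules depend on your measures and should be documented before you apply them.

```python
from statistics import median

# Hypothetical raw records: (participant_id, recorded weight in kg as text).
raw = [
    ("p01", "70.5"), ("p02", "68"), ("p02", "68"),  # duplicate row
    ("p03", ""),                                    # missing value
    ("p04", " 72.0 "),                              # stray whitespace
    ("p05", "690"),                                 # likely typo for 69.0
    ("p06", "71"), ("p07", "66.5"),
]

def clean(records, low=30.0, high=300.0):
    """Standardize, de-duplicate, and flag missing or implausible values."""
    seen, kept, flagged = set(), [], []
    for pid, text in records:
        if pid in seen:        # keep only the first row per participant
            continue
        seen.add(pid)
        text = text.strip()    # standardize formatting
        if not text:
            flagged.append((pid, "missing"))
            continue
        value = float(text)
        if not low <= value <= high:  # assumed plausible range, not a standard
            flagged.append((pid, "out of plausible range"))
            continue
        kept.append((pid, value))
    return kept, flagged

kept, flagged = clean(raw)
print(kept)
print(flagged)
print("median weight:", median(v for _, v in kept))
```

Flagging suspicious values instead of silently deleting them keeps the decision to correct, accept, or remove each one visible and reviewable.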

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors, but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can be practically significant with important consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations.

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others.

These considerations protect the rights of research participants, enhance research validity, and maintain scientific integrity.

In multistage sampling, you can use probability or non-probability sampling methods.

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling, systematic sampling, or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples.

These are four of the most common mixed methods designs:

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research, but it’s also commonly applied in quantitative research. Mixed methods research always uses triangulation.

In multistage sampling, or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis.
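To make the difference between slope and correlation concrete, here is a minimal sketch in Python. The data values and the helper names `pearson_r` and `ols_slope` are made up for illustration. The second dataset is the first one with every y value multiplied by 10, so the regression slope is ten times steeper while the correlation coefficient stays the same.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ols_slope(xs, ys):
    """Slope of the least-squares regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


# Hypothetical data: same x values, y rescaled by a factor of 10.
x = [1, 2, 3, 4, 5]
y_shallow = [1.0, 2.5, 2.5, 4.5, 4.5]
y_steep = [10 * value for value in y_shallow]

r_shallow = pearson_r(x, y_shallow)
r_steep = pearson_r(x, y_steep)
slope_shallow = ols_slope(x, y_shallow)
slope_steep = ols_slope(x, y_steep)

print(f"r: {r_shallow:.3f} vs {r_steep:.3f}")
print(f"slope: {slope_shallow:.2f} vs {slope_steep:.2f}")
```

Rescaling a variable changes the slope of the fitted line but not how tightly the points cluster around it, which is all the correlation coefficient measures.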

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.
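As a small illustration of reading the sign and the absolute value separately, the hypothetical helper below splits a coefficient into its direction and its magnitude. The function name and the returned labels are assumptions made for this sketch, not standard terminology from any library.

```python
def describe_correlation(r):
    """Split a correlation coefficient into (direction, magnitude)."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # The magnitude ignores the sign: -0.8 is a stronger correlation than 0.3.
    return direction, abs(r)


print(describe_correlation(-0.8))
print(describe_correlation(0.3))
```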

These are the assumptions your data must meet if you want to use Pearson’s r:

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data are from a random or representative sample
  • You expect a linear relationship between the two variables

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships.

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study, ethnography, and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions, utilizing credible sources. This allows you to draw valid, trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative)
  • The type of design you’re using (e.g., a survey, experiment, or case study)
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires, observations)
  • Your data collection procedures (e.g., operationalization, timing and data management)
  • Your data analysis methods (e.g., statistical tests or thematic analysis)

A research design is a strategy for answering your research question. It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize the bias from order effects.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation.

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.
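The third variable problem can be illustrated with a short simulation. This is only a sketch under made-up assumptions: temperature drives both ice cream sales and sunburn cases, and neither of those affects the other. All variable names and numbers are hypothetical. The partial correlation formula used at the end removes the linear influence of the third variable.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(1)
n = 5000

# Hypothetical confounder: daily temperature affects both outcomes.
temperature = [random.gauss(25, 5) for _ in range(n)]
ice_cream_sales = [t + random.gauss(0, 3) for t in temperature]
sunburn_cases = [t + random.gauss(0, 3) for t in temperature]

# Raw correlation between the two outcomes looks substantial.
r_xy = pearson_r(ice_cream_sales, sunburn_cases)

# Partial correlation, controlling for temperature.
r_xz = pearson_r(ice_cream_sales, temperature)
r_yz = pearson_r(sunburn_cases, temperature)
r_xy_given_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"raw correlation: {r_xy:.2f}")
print(f"controlling for temperature: {r_xy_given_z:.2f}")
```

In this simulated setup the raw correlation is strong, but it largely disappears once temperature is taken into account, because the two outcomes were never causally linked in the first place.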

Correlation describes an association between variables: when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B—but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to a false cause fallacy.

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design, you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design, you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity.

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables.

You can avoid systematic error through careful design of your sampling, data collection, and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample, the errors in different directions will cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions (Type I and II errors) about the relationship between the variables you’re studying.

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).
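A short simulation can show why random error tends to cancel out while systematic error does not. This is a sketch with made-up numbers: a true weight of 70 kg, random noise with a standard deviation of 0.5 kg, and a hypothetical scale that reads 2 kg too high.

```python
import random
import statistics

random.seed(0)

TRUE_WEIGHT = 70.0   # hypothetical true value, in kg
SCALE_OFFSET = 2.0   # hypothetical miscalibration, in kg
N_READINGS = 10_000

# Random error only: noise is equally likely to fall above or below the truth.
random_readings = [TRUE_WEIGHT + random.gauss(0, 0.5) for _ in range(N_READINGS)]

# Systematic error: every reading is shifted upward, on top of the same noise.
biased_readings = [
    TRUE_WEIGHT + SCALE_OFFSET + random.gauss(0, 0.5) for _ in range(N_READINGS)
]

random_mean = statistics.mean(random_readings)
biased_mean = statistics.mean(biased_readings)

print(f"mean with random error only: {random_mean:.2f}")
print(f"mean with systematic error:  {biased_mean:.2f}")
```

Averaging many readings brings the first mean close to the true value, but no amount of averaging removes the 2 kg offset from the second.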

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If you have quantitative variables, use a scatterplot or a line graph.
  • If your response variable is categorical, use a scatterplot or a line graph.
  • If your explanatory variable is categorical, use a bar graph.

The term “explanatory variable” is sometimes preferred over “independent variable” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment, all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables.

There are 4 main types of extraneous variables:

  • Demand characteristics: environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects: unintentional actions by researchers that influence study outcomes.
  • Situational variables: environmental variables that alter participants’ behaviors.
  • Participant variables: any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.
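As a sketch of how levels combine into conditions, the example below crosses two hypothetical independent variables, caffeine dose with three levels and sleep status with two levels. The variable names and levels are invented for illustration.

```python
from itertools import product

# Hypothetical independent variables and their levels.
caffeine_levels = ["none", "low", "high"]
sleep_levels = ["rested", "deprived"]

# Each level of one variable is paired with each level of the other.
conditions = list(product(caffeine_levels, sleep_levels))

for caffeine, sleep in conditions:
    print(f"caffeine={caffeine}, sleep={sleep}")

print(f"{len(conditions)} conditions in a 3 x 2 design")
```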

Within-subjects designs have many potential threats to internal validity, but they are also very statistically powerful.

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity, it also requires more participants for high statistical power than a within-subjects design.

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment, assign a unique number to every member of your study’s sample.

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
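The procedure above can be sketched in a few lines of Python. The function name `randomly_assign`, the group labels, and the choice to shuffle and then alternate (which keeps group sizes equal) are assumptions made for this example. A coin flip per participant would also be random assignment but would not guarantee equal group sizes.

```python
import random


def randomly_assign(participant_ids, groups=("control", "experimental"), seed=None):
    """Shuffle participant numbers, then deal them out to groups in turn."""
    rng = random.Random(seed)  # seed only so the example is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    assignment = {}
    for index, participant_id in enumerate(shuffled):
        assignment[participant_id] = groups[index % len(groups)]
    return assignment


# Hypothetical sample of 20 participants numbered 1 to 20.
participants = range(1, 21)
assignment = randomly_assign(participants, seed=42)

for participant_id in sorted(assignment):
    print(participant_id, assignment[participant_id])
```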

Random selection, or random sampling, is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs. That way, you can isolate the control variable’s effects from the relationship between the variables of interest.

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity.

If you don’t control relevant extraneous variables, they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable.

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable:

  • It’s caused by the independent variable.
  • It influences the dependent variable.
  • When it’s taken into account, the statistical correlation between the independent and dependent variables is higher than when it isn’t considered.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling:

  • Define and list your population, ensuring that it is not ordered in a cyclical or periodic order.
  • Decide on your sample size and calculate your interval, k, by dividing your population size by your target sample size.
  • Choose every kth member of the population as your sample.
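The three steps above can be sketched as follows. The population list and the function name `systematic_sample` are hypothetical. Picking a random starting point within the first interval is a common convention that this sketch assumes; the steps above do not specify where to start.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Select every k-th member, where k = population size // sample size."""
    k = len(population) // sample_size      # sampling interval
    rng = random.Random(seed)               # seed only for reproducibility
    start = rng.randrange(k)                # assumed: random start in first interval
    return population[start::k][:sample_size]


# Hypothetical population list of 300 people, in no periodic order.
population = [f"person_{i}" for i in range(1, 301)]
sample = systematic_sample(population, sample_size=20, seed=1)

print(len(sample), "people selected, interval k =", len(population) // 20)
print(sample[:3])
```

With 300 people and a target sample of 20, the interval is k = 15, which matches the "every 15th person" style of selection.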

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling.

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.
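The same arithmetic can be checked by enumerating the combinations. This sketch uses the location and marital status subgroups from the example above and builds every pairing with `itertools.product`.

```python
from itertools import product

locations = ["urban", "rural", "suburban"]
marital_statuses = ["single", "divorced", "widowed", "married", "partnered"]

# Every participant falls into exactly one (location, marital status) pair.
strata = list(product(locations, marital_statuses))

print(len(strata))   # 3 x 5 combinations
print(strata[:3])
```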

You should use stratified sampling when your population can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling, researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.
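Here is a minimal sketch of that two-step process: group units into strata, then draw a simple random sample within each stratum. The data, the function name `stratified_sample`, and the choice of sampling the same fraction from every stratum (proportionate allocation) are assumptions for illustration. Other allocation schemes are possible.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Sample the same fraction at random from every stratum."""
    rng = random.Random(seed)  # seed only for reproducibility
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical population of 100 people with an education level each.
levels = ["high school"] * 60 + ["bachelor"] * 30 + ["graduate"] * 10
people = [{"id": i, "education": level} for i, level in enumerate(levels)]

sample = stratified_sample(people, lambda p: p["education"], fraction=0.1, seed=7)
counts = Counter(person["education"] for person in sample)
print(counts)
```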

Cluster sampling is more time- and cost-efficient than other probability sampling methods, particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling, because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling: single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling, you collect data from every unit within the selected clusters.
  • In double-stage sampling, you select a random sample of units from within the clusters.
  • In multi-stage sampling, you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.
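The difference between single-stage and double-stage cluster sampling can be sketched as below. The schools, student labels, and the function name `cluster_sample` are hypothetical. Multi-stage sampling would repeat the within-cluster step on nested groupings and is not shown here.

```python
import random


def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=None):
    """Randomly pick clusters, then take all units (single-stage) or a
    random subset of units from each chosen cluster (double-stage)."""
    rng = random.Random(seed)  # seed only for reproducibility
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None:
            sample.extend(units)                                 # single-stage
        else:
            sample.extend(rng.sample(units, units_per_cluster))  # double-stage
    return chosen, sample


# Hypothetical population: 10 schools with 50 students each.
schools = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(50)]
    for s in range(10)
}

chosen_single, single_stage = cluster_sample(schools, n_clusters=3, seed=3)
chosen_double, double_stage = cluster_sample(
    schools, n_clusters=3, units_per_cluster=10, seed=3
)

print(len(single_stage), "students from", chosen_single)
print(len(double_stage), "students from", chosen_double)
```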

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity. However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey is an example of simple random sampling. In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population. Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
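In code, a simple random sample is a draw without replacement in which every member has the same chance of selection. The sketch below assumes a hypothetical sampling frame of 1,000 numbered members and uses the standard library's `random.sample`.

```python
import random

# Hypothetical sampling frame: a complete list of 1,000 population members.
population = list(range(1, 1001))

rng = random.Random(11)               # seed only for reproducibility
sample = rng.sample(population, 50)   # 50 distinct members, equal chance each

print(sorted(sample)[:10])
```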

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment.

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Blinding is important to reduce research bias (e.g., observer bias, demand characteristics) and ensure a study’s internal validity.

If participants know whether they are in a control or treatment group, they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study, only the participants are blinded.
  • In a double-blind study, both participants and experimenters are blinded.
  • In a triple-blind study, the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment.

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data, because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey, you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
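As a sketch of how item responses are combined into an overall scale score, the hypothetical function below sums one respondent's answers on a 5-point scale. The function name, the validation rules, and the example responses are assumptions. Real instruments often also reverse-score some items, which is not shown here.

```python
def likert_scale_score(item_responses, points=5):
    """Sum one respondent's answers to the items of a Likert scale."""
    if len(item_responses) < 4:
        raise ValueError("a Likert scale combines 4 or more items")
    for response in item_responses:
        if not 1 <= response <= points:
            raise ValueError(f"responses must be between 1 and {points}")
    return sum(item_responses)


# Hypothetical answers to four items (1 = strongly disagree, 5 = strongly agree).
responses = [4, 5, 2, 4]
total = likert_scale_score(responses)

print("scale score:", total)
print("mean item score:", total / len(responses))
```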

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization.

There are various approaches to qualitative data analysis, but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis, thematic analysis, and discourse analysis.

There are five common approaches to qualitative research:

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods)

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction, you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching, you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable.

In statistical control, you include potential confounders as variables in your regression.

In randomization, you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables, or even find a causal relationship where none exists.

Yes, but including more than one of either type requires multiple research questions.

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable.

To ensure the internal validity of an experiment, you should only change one independent variable at a time.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment.

  • The type of soda – diet or regular – is the independent variable.
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

In non-probability sampling, the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling, voluntary response sampling, purposive sampling, snowball sampling, and quota sampling.

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.
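The four methods can be sketched in a few lines of Python. This is a minimal illustration under assumed data: the population of 1,000 student IDs, the two strata, and the ten "dorm" clusters are all hypothetical, and the function names are made up for this example.

```python
import random


def simple_random_sample(population, k, rng):
    """Every member has the same chance of selection."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Pick a random start, then take every step-th member."""
    step = len(population) // k
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(k)]


def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum (subgroup)."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and include every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


rng = random.Random(42)
students = list(range(1, 1001))  # hypothetical student IDs
strata = {"undergraduate": students[:800], "graduate": students[800:]}
clusters = {f"dorm_{d}": students[d * 100:(d + 1) * 100] for d in range(10)}

srs = simple_random_sample(students, 100, rng)
systematic = systematic_sample(students, 100, rng)
stratified = stratified_sample(strata, 0.1, rng)
clustered = cluster_sample(clusters, 2, rng)
```

In each case the chance of any member being selected is known in advance, which is what separates these designs from non-probability sampling.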

Using careful research design and sampling procedures can help you avoid sampling bias. Oversampling can be used to correct undercoverage bias.

Some common types of sampling bias include self-selection bias, nonresponse bias, undercoverage bias, survivorship bias, pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic.

A statistic refers to measures about the sample, while a parameter refers to measures about the population.
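As a rough illustration of these terms, the sketch below builds a hypothetical population of exam scores (the mean of 70, spread of 10, and all sample sizes are assumptions), computes the population mean as the parameter and a sample mean as the statistic, and takes their difference as the sampling error.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: exam scores for 5,000 students.
population = [rng.gauss(70, 10) for _ in range(5000)]
parameter = statistics.fmean(population)    # population mean (a parameter)

sample = rng.sample(population, 100)
statistic = statistics.fmean(sample)        # sample mean (a statistic)

sampling_error = statistic - parameter


def mean_abs_sampling_error(sample_size, repetitions=200):
    """Average absolute sampling error over repeated random samples."""
    errors = []
    for _ in range(repetitions):
        draw = rng.sample(population, sample_size)
        errors.append(abs(statistics.fmean(draw) - parameter))
    return statistics.fmean(errors)


small = mean_abs_sampling_error(25)     # small samples
large = mean_abs_sampling_error(400)    # larger samples

print(f"sampling error of one sample of 100: {sampling_error:+.2f}")
print(f"typical error, n=25: {small:.2f}; n=400: {large:.2f}")
```

With random sampling, the typical sampling error shrinks as the sample grows, which is why larger samples support more precise inferences about the population.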

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

There are seven threats to external validity: selection bias, history, experimenter effect, Hawthorne effect, testing effect, aptitude-treatment and situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study.

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design. In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

Longitudinal study | Cross-sectional study
Repeated observations | Observations at a single point in time
Observes the same sample multiple times | Observes different samples (a “cross-section”) in the population
Follows changes in participants over time | Provides a snapshot of society at a given point

There are eight threats to internal validity: history, maturation, instrumentation, testing, selection bias, regression to the mean, social interaction and attrition.

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

The research methods you use depend on the type of data you need to answer your research question.

  • If you want to measure something or test a hypothesis, use quantitative methods. If you want to explore ideas, thoughts and meanings, use qualitative methods.
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables, use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables:

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause, while a dependent variable is the effect.

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The independent variable is the amount of nutrients added to the crop field.
  • The dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design.

Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels
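The last decision, assigning subjects to treatment levels, is often handled by random assignment. The sketch below shows one simple way to do it, assuming 30 hypothetical crop plots and three made-up nutrient levels; the function name and labels are illustrative, not a prescribed procedure.

```python
import random


def assign_to_groups(subjects, treatment_levels, rng):
    """Shuffle subjects, then deal them out to treatment levels in turn."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {level: [] for level in treatment_levels}
    for index, subject in enumerate(shuffled):
        level = treatment_levels[index % len(treatment_levels)]
        groups[level].append(subject)
    return groups


rng = random.Random(2024)
subjects = [f"plot_{i:02d}" for i in range(1, 31)]   # hypothetical crop plots
levels = ["no nutrients", "low dose", "high dose"]   # hypothetical treatment levels
groups = assign_to_groups(subjects, levels, rng)

for level, members in groups.items():
    print(level, len(members))
```

Shuffling before dealing keeps group sizes equal while leaving the assignment of any given subject to chance, which helps spread potential confounders evenly across groups.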

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables.

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project. It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys, and statistical tests).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section.

In a longer or more complex research project, such as a thesis or dissertation, you will probably include a methodology section, where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.

Mixture model applications in depression phenotyping: practices, challenges, and recommendations

  • Published: 26 July 2024

  • Qimin Liu   ORCID: orcid.org/0000-0003-3840-1136 1 ,
  • Meng Qiu 2 ,
  • Bridget A. Nestor 3 ,
  • Violeta J. Rodriguez 4 &
  • David A. Cole 5  

Applications of mixture models are prevalent in studying psychopathology across development, particularly for identifying typical co-occurring symptom presentations (or phenotypes) in depression. Researchers have used both longitudinal and cross-sectional designs with varied statistical methods. The current study focused on studies that applied latent profile analysis, latent class growth analysis, and growth mixture models to phenotype continuously treated depressive symptoms. The current study aims to (a) provide a brief overview of common mixture models that are used in depression phenotyping, (b) review empirical applications of these methods in cross-sectional and longitudinal research of depression, and (c) discuss the methodological considerations and recommendations in identifying phenotypes of depression when continuously treated symptoms are used. In 72 studies, we found heterogeneity in mixture model specification, selection, and interpretation. We identified three challenges in current practices: a “garbage in, garbage out” problem, inconsistent use and reporting of model selection criteria, and diverse, incomparable, and incomplete phenotype characterizations. We recommend that researchers: (1) select and justify measures and models based on the research question during model specification; (2) report BIC and bootstrapped likelihood ratio tests of all compared models, grounding model selection on the philosophy of science during model comparison; (3) provide all parameter estimates and use R² measures for class characterization during model interpretation.

Data availability.

As a review, no new data was collected as part of this manuscript. Data reviewed are summarized in the supplemental material.

More formally, let \(\boldsymbol{y}=(y_{1},\ldots,y_{N})\) be independent and identically distributed observations from a population containing G groups. The mixture density can be expressed as \(f(\boldsymbol{y})=\sum_{g=1}^{G}\pi_{g}f_{g}(\boldsymbol{y};\boldsymbol{\theta}_{g}),\)

where \(f(\cdot)\) and \(f_{g}(\cdot)\) represent the joint density function and the density function of the g-th group, respectively, \(\pi_{g}\) is the mixing proportion of the g-th group, and \(\boldsymbol{\theta}_{g}\) collects that group's parameters.
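To make the mixture density concrete, the sketch below evaluates it for a single observation under simplifying assumptions that are not part of the note: univariate normal component densities, G = 2, and made-up mixing proportions, means, and standard deviations.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def mixture_pdf(y, weights, means, sds):
    """f(y) = sum over g of pi_g * f_g(y; theta_g), with normal components."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing proportions must sum to 1")
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))


# Hypothetical two-group example: a low-symptom and a high-symptom group.
weights = [0.7, 0.3]    # pi_g
means = [5.0, 20.0]     # component means (part of theta_g)
sds = [2.0, 4.0]        # component standard deviations (part of theta_g)

density_at_5 = mixture_pdf(5.0, weights, means, sds)

# Crude Riemann-sum check that the mixture density integrates to about 1.
step = 0.01
area = sum(mixture_pdf(-20 + i * step, weights, means, sds) * step
           for i in range(6000))
```

Because the mixing proportions sum to 1 and each component is a proper density, the weighted sum is itself a proper density, which the numerical check reflects.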

Other modern phenotyping methods for cross-sectional data include taxometrics (Meehl, 1995) and algorithmic clustering (e.g., K-means; Lloyd, 1982). Mixture models have outperformed taxometric methods in phenotype detection and assignment (Lubke & Tueller, 2010). Mixture models also subsume K-means clustering, a special case assuming equal proportions of normal component densities and a common spherical covariance matrix (McLachlan, 1982, 2011). This constraint can lead to poor phenotype identification under some conditions and can lack the flexibility needed for complex social and behavioral data (Vermunt, 2011). Unlike K-means and other distance-based clustering approaches, mixture modeling provides parameter estimates that enable statistical inferences and fit indices to quantify model fit (Vermunt, 2011). In addition, mixture modeling is highly flexible in that it can accommodate various data types (e.g., categorical, continuous, and mixed-type; Morgan, 2015) and easily incorporate auxiliary variables such as external covariates and outcomes (Asparouhov & Muthén, 2014). Another advantage of mixture modeling is that it can be readily extended to longitudinal data.

Denote sum scores of observed symptoms as \(S_{i}\). In semi-parametric group-based trajectory models, the sum score for an individual of phenotype \(\zeta_{i}\) at time \(T_{i}\) follows \(S_{i}\mid\zeta_{i}=z,T_{i}\sim\mathrm{Normal}(\alpha_{z}+\beta_{z}T_{i},\sigma_{z}^{2})\). Here \(\alpha_{z}\) and \(\beta_{z}\) are the phenotype-specific intercept and slope, respectively. We have the latent phenotype \(\zeta_{i}\sim\mathrm{Multinomial}(C,\boldsymbol{\pi})\), with C indicating the number of phenotypes and \(\boldsymbol{\pi}\) representing a C-length vector of class probabilities.
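One way to read this model is as a recipe for generating data. The sketch below simulates sum scores under it, assuming two phenotypes, four time points, and made-up intercepts, slopes, and standard deviations; none of these values come from the reviewed studies.

```python
import random


def simulate_trajectories(n_people, times, class_probs, intercepts, slopes, sds, rng):
    """Simulate sum scores under a group-based trajectory model.

    Each person gets one latent phenotype z; at every time point T the score
    is drawn from Normal(alpha_z + beta_z * T, sigma_z^2).
    """
    classes = list(range(len(class_probs)))
    data = []
    for _ in range(n_people):
        z = rng.choices(classes, weights=class_probs)[0]   # latent phenotype
        scores = [rng.gauss(intercepts[z] + slopes[z] * t, sds[z]) for t in times]
        data.append((z, scores))
    return data


rng = random.Random(1)
times = [0, 1, 2, 3]
data = simulate_trajectories(
    n_people=2000, times=times,
    class_probs=[0.6, 0.4],      # pi: hypothetical class probabilities
    intercepts=[5.0, 15.0],      # alpha_z: stable-low vs. elevated start
    slopes=[0.0, 2.0],           # beta_z: flat vs. increasing symptoms
    sds=[1.0, 1.0],              # sigma_z
    rng=rng,
)
```

Note that everyone within a phenotype shares the same intercept and slope here; only the residual noise differs between individuals.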

For growth curve mixture models, researchers can additionally specify random effects in the intercept and/or the slope parameter, \(\left[\begin{array}{c}\alpha_{z}\\ \beta_{z}\end{array}\right]\sim\mathrm{Normal}\left(\left[\begin{array}{c}\mu_{\alpha_{z}}\\ \mu_{\beta_{z}}\end{array}\right],\left[\begin{array}{cc}\sigma_{\alpha_{z}}^{2}&\rho_{z}\sigma_{\alpha_{z}}\sigma_{\beta_{z}}\\ \rho_{z}\sigma_{\alpha_{z}}\sigma_{\beta_{z}}&\sigma_{\beta_{z}}^{2}\end{array}\right]\right)\), to account for individual differences. Here \(\mu_{\alpha_{z}}\) and \(\mu_{\beta_{z}}\) are the population means for the phenotype-specific intercept and slope; \(\sigma_{\alpha_{z}}^{2}\) and \(\sigma_{\beta_{z}}^{2}\) are the population variances for the phenotype-specific intercept and slope; \(\rho_{z}\) denotes the correlation between the phenotype-specific slope and intercept. We have the latent phenotype \(\zeta_{i}\sim\mathrm{Multinomial}(C,\boldsymbol{\pi})\), with C indicating the number of phenotypes and \(\boldsymbol{\pi}\) representing a C-length vector of class probabilities.
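The random-effects step can be sketched as drawing each person's intercept and slope from a bivariate normal distribution. The example below does this for a single phenotype using the Cholesky factor of the 2×2 covariance matrix; the means, standard deviations, and correlation are hypothetical values chosen for illustration.

```python
import math
import random
import statistics


def draw_growth_factors(mu_alpha, mu_beta, sd_alpha, sd_beta, rho, rng):
    """Draw one person's (intercept, slope) from a bivariate normal.

    Uses the Cholesky factor of the 2x2 covariance matrix, so the two
    random effects have the requested standard deviations and correlation.
    """
    e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
    alpha = mu_alpha + sd_alpha * e1
    beta = mu_beta + sd_beta * (rho * e1 + math.sqrt(1 - rho ** 2) * e2)
    return alpha, beta


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(3)
# Hypothetical phenotype: mean intercept 10, mean slope 1.5, correlation -0.3.
draws = [draw_growth_factors(10.0, 1.5, 2.0, 0.5, -0.3, rng) for _ in range(20_000)]
alphas = [a for a, _ in draws]
betas = [b for _, b in draws]
observed_rho = pearson(alphas, betas)
```

Compared with the group-based trajectory model, individuals within the same phenotype now differ in their own intercepts and slopes, which is the "individual differences" the random effects are meant to capture.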

To limit the scope of the review, we focus on mixture models that are designed for continuously measured variables.

We wish to clarify that the limitations we discuss are not exclusive to mixture models but also apply to other modeling approaches, such as typical factor-analytic models. These models inherently assume a linear relationship between latent factors and observed indicators, which may prevent the identification of potentially complex (e.g., polynomial) relationships between them.

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19 (6), 716–723. https://doi.org/10.1109/TAC.1974.1100705

American Psychiatric Association. (2013). Diagnostic and statistical Manual of Mental disorders . American Psychiatric Association. https://doi.org/10.1176/appi.books.9780890425596

Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin , 103 (3), 411–423. https://doi.org/10.1037/0033-2909.103.3.411

Asparouhov, T., & Muthén, B. (2014). Auxiliary variables in mixture modeling: Three-step approaches using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 21 (3), 329–341. https://doi.org/10.1080/10705511.2014.915181

Bakk, Z., & Kuha, J. (2021). Relating latent class membership to external variables: An overview. British Journal of Mathematical and Statistical Psychology , 74 (2), 340–362. https://doi.org/10.1111/bmsp.12227

Baptista, M. N., Cunha, F., & Hauck, N. (2019). The latent structure of depression symptoms and suicidal thoughts in Brazilian youths. Journal of Affective Disorders , 254 (November 2018), 90–97. https://doi.org/10.1016/j.jad.2019.05.024

Barton, Y. A., Barkin, S. H., & Miller, L. (2017). Deconstructing depression: A latent profile analysis of potential depressive subtypes in emerging adults. Spirituality in Clinical Practice , 4 (1), 1–21. https://doi.org/10.1037/scp0000126

Bozdogan, H. (1987). Model selection and Akaike’s Information Criterion (AIC): The general theory and its analytical extensions. Psychometrika , 52 (3), 345–370. https://doi.org/10.1007/BF02294361

Burton, R. (1989). The anatomy of melancholy / Robert Burton; edited by Thomas C. Faulkner, Nicolas K. Kiessling, Rhonda L. Blair; with an introduction by J.B. Bamborough. In T. C. Faulkner, N. K. Kiessling, & R. L. Blair (Eds.), Robert Burton’s the anatomy of melancholy . Clarendon Press.

Celeux, G., & Soromenho, G. (1996a). An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification , 13 (2), 195–212. https://doi.org/10.1007/BF01246098

Celeux, G., & Soromenho, G. (1996b). An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification , 13 (2), 195–212. https://doi.org/10.1007/BF01246098

Chirinos, D. A., Murdock, K. W., LeRoy, A. S., & Fagundes, C. (2017). Depressive symptom profiles, cardio-metabolic risk and inflammation: Results from the MIDUS study. Psychoneuroendocrinology , 82 , 17–25. https://doi.org/10.1016/j.psyneuen.2017.04.011

Collins, L. M., & Lanza, S. T. (2010). Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences . Wiley.

Collins, L. M., Fidler, P. L., Wugalter, S. E., & Long, J. D. (1993). Goodness-of-fit testing for latent class models. Multivariate Behavioral Research , 28 (3), 375–389. https://doi.org/10.1207/s15327906mbr2803_4

Cuijpers, P., Reynolds, C. F., Donker, T., Li, J., Andersson, G., & Beekman, A. (2012). Personalized treatment of adult depression: medication, psychotherapy, or both? a systematic review: Research Article: Personalized Treatment of Adult Depression. Depression and Anxiety , 29 (10), 855–864. https://doi.org/10.1002/da.21985

Dziak, J. J., Coffman, D. L., Lanza, S. T., Li, R., & Jermiin, L. S. (2020). Sensitivity and specificity of information criteria. Briefings in Bioinformatics , 21 (2), 553–565. https://doi.org/10.1093/bib/bbz016

Enders, C. K., & Tofighi, D. (2008). The impact of Misspecifying Class-specific residual variances in growth mixture models. Structural Equation Modeling: A Multidisciplinary Journal , 15 (1), 75–95. https://doi.org/10.1080/10705510701758281

Finch, W. H., & French, B. F. (2015). Latent variable modeling with R . Routledge.

Flynt, A., & Dean, N. (2019). Growth mixture modeling with measurement selection. Journal of Classification , 36 (1), 3–25. https://doi.org/10.1007/s00357-018-9275-9

Galen (1952). On the natural faculties. Great Books of the Western World (Vol 10).

Gaston, S., Nugent, N., Peters, E. S., Ferguson, T. F., Trapido, E. J., Robinson, W. T., & Rung, A. L. (2016). Exploring heterogeneity and correlates of depressive symptoms in the women and their children’s Health (WaTCH) study. Journal of Affective Disorders , 205 , 190–199. https://doi.org/10.1016/j.jad.2016.03.067

Gibson, W. A. (1959). Three multivariate models: Factor analysis, latent structure analysis, and latent profile analysis. Psychometrika, 24 (3), 229–252.

Grimm, K. J., & Ram, N. (2009). A second-order growth mixture model for Developmental Research. Research in Human Development , 6 (2–3), 121–143. https://doi.org/10.1080/15427600902911221

Hagenaars, J. A., & McCutcheon, A. L. (2002). In J. A. Hagenaars, & A. L. McCutcheon (Eds.), Applied Latent Class Analysis . Cambridge University Press. https://doi.org/10.1017/CBO9780511499531

Halpin, P. F., Dolan, C. V., Grasman, R. P. P. P., & De Boeck, P. (2011). On the relation between the linear factor model and the latent profile model. Psychometrika, 76 (4), 564–583. https://doi.org/10.1007/s11336-011-9230-8

Herman, K. C., Cohen, D., Reinke, W. M., Ostrander, R., Burrell, L., McFarlane, E., & Duggan, A. K. (2018). Using latent profile and transition analyses to understand patterns of informant ratings of child depressive symptoms. Journal of School Psychology , 69 (February), 84–99. https://doi.org/10.1016/j.jsp.2018.05.004

Hippocrates (1997). Airs, Waters, and Places. In M. J. Dobson (Ed.), Contours of Death and Disease in Early Modern England (Issue Vol. 29). Cambridge University Press. https://doi.org/10.4159/DLCL.hippocrates_cos-airs_waters_places.1923

Jansson, Å. (2021). Statistics, Classification, and the Standardisation of Melancholia BT - From Melancholia to Depression: Disordered Mood in Nineteenth-Century Psychiatry (Å. Jansson, Ed.; pp. 123–171). Springer International Publishing. https://doi.org/10.1007/978-3-030-54802-5_5

Kass, R. E., & Wasserman, L. (1995). A reference bayesian test for nested hypotheses and its relationship to the Schwarz Criterion. Journal of the American Statistical Association , 90 (431), 928–934. https://doi.org/10.1080/01621459.1995.10476592

Killian, M. O., Sanchez, K., Eghaneyan, B. H., Cabassa, L. J., & Trivedi, M. H. (2020). Profiles of depression in a treatment-seeking hispanic population: Psychometric properties of the Patient Health Questionnaire-9. International Journal of Methods in Psychiatric Research . https://doi.org/10.1002/mpr.1851

Kim, E. S., & Wang, Y. (2017). Class Enumeration and Parameter Recovery of Growth Mixture Modeling and Second-Order Growth Mixture Modeling in the Presence of Measurement Noninvariance between Latent Classes. Frontiers in Psychology , 8 . https://doi.org/10.3389/fpsyg.2017.01499

Laird, N. M., & Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics , 38 (4), 963–974. https://doi.org/10.2307/2529876

Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28 (2), 129–137. https://doi.org/10.1109/TIT.1982.1056489

Lo, Y., Mendell, N. R., & Rubin, D. B. (2001). Testing the number of components in a normal mixture. Biometrika, 88 (3), 767–778. https://doi.org/10.1093/biomet/88.3.767

Lubke, G. H., & Luningham, J. (2017). Fitting latent variable mixture models. Behaviour Research and Therapy , 98 , 91–102. https://doi.org/10.1016/j.brat.2017.04.003

Lubke, G., & Neale, M. C. (2006). Distinguishing between latent classes and continuous factors: Resolution by maximum likelihood? Multivariate Behavioral Research , 41 (4), 499–532. https://doi.org/10.1207/s15327906mbr4104_4

Lubke, G., & Tueller, S. (2010). Latent class detection and class assignment: A comparison of the MAXEIG taxometric procedure and factor mixture modeling approaches. Structural Equation Modeling: A Multidisciplinary Journal, 17 (4), 605–628. https://doi.org/10.1080/10705511.2010.510050

Masyn, K. E. (2013). Latent class analysis and finite mixture modeling. The Oxford handbook of quantitative methods: Statistical analysis, Vol. 2 (pp. 551–611). Oxford University Press.

Maudsley, H. (1895). The pathology of mind . Macmillan.

McDonald, R. P. (1967). Factor interaction in nonlinear factor analysis. ETS Research Bulletin Series, 1967 (2). https://doi.org/10.1002/j.2333-8504.1967.tb00990.x

McLachlan, G. J. (1982). The classification and mixture maximum likelihood approaches to cluster analysis. In P. R. Krishnaiah & L. Kanal (Eds.), Handbook of statistics (Vol. 2, pp. 199–208). North-Holland.

McLachlan, G. J. (2011). Commentary on Steinley and Brusco (2011): Recommendations and cautions. Psychological Methods, 16 (1), 80–81. https://doi.org/10.1037/a0021141

McLachlan, G., & Peel, D. (2000). Finite Mixture Models (1st ed.). Wiley. https://doi.org/10.1002/0471721182

Meehl, P. E. (1995). Bootstraps taxometrics: Solving the classification problem in psychopathology. American Psychologist, 50 (4), 266–275. https://doi.org/10.1037/0003-066X.50.4.266

Molenaar, P. C. M., & von Eye, A. (1994). On the arbitrary nature of latent variables. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 226–242). Sage Publications Inc.

Mora, P. A., Beamon, T., Preuitt, L., DiBonaventura, M., Leventhal, E. A., & Leventhal, H. (2012). Heterogeneity in depression symptoms and health status among older adults. Journal of Aging and Health , 24 (5), 879–896. https://doi.org/10.1177/0898264312440323

Morgan, G. B. (2015). Mixed Mode Latent Class Analysis: An examination of Fit Index performance for classification. Structural Equation Modeling: A Multidisciplinary Journal , 22 (1), 76–86. https://doi.org/10.1080/10705511.2014.935751

Murphy, T. D. (1981). Medical knowledge and statistical methods in early nineteenth-century France. Medical History , 25 (3), 301–319. https://doi.org/10.1017/S0025727300034608

Muthen, B. (2006). Should substance use disorders be considered as categorical or dimensional? Addiction , 101 , 6–16. https://doi.org/10.1111/j.1360-0443.2006.01583.x

Muthén, B., & Muthén, L. K. (2000). Integrating person-centered and variable‐centered analyses: Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and Experimental Research , 24 (6), 882–891. https://doi.org/10.1111/j.1530-0277.2000.tb02070.x

Muthén, B., & Shedden, K. (1999). Finite Mixture modeling with mixture outcomes using the EM Algorithm. Biometrics , 55 (2), 463–469. https://doi.org/10.1111/j.0006-341X.1999.00463.x

Nagin, D. S. (1999). Analyzing developmental trajectories: A semiparametric, group-based approach. Psychological Methods , 4 (2), 139–157. https://doi.org/10.1037/1082-989X.4.2.139

Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo Simulation Study. Structural Equation Modeling: A Multidisciplinary Journal , 14 (4), 535–569. https://doi.org/10.1080/10705510701575396

Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science , 4 (4), 440–461. https://doi.org/10.1037/tps0000176

Oh, Y., Joung, Y. S., Baek, J., & Yoo, N. H. (2020). Maternal depression trajectories and child executive function over 9 years. Journal of Affective Disorders , 276 (May), 646–652. https://doi.org/10.1016/j.jad.2020.07.065

Peugh, J., & Fan, X. (2012). How well does growth mixture modeling identify heterogeneous growth trajectories? A Simulation Study examining GMM’s performance characteristics. Structural Equation Modeling: A Multidisciplinary Journal , 19 (2), 204–226. https://doi.org/10.1080/10705511.2012.659618

Peugh, J., & Fan, X. (2013). Modeling unobserved heterogeneity using Latent Profile Analysis: A Monte Carlo Simulation. Structural Equation Modeling: A Multidisciplinary Journal , 20 (4), 616–639. https://doi.org/10.1080/10705511.2013.824780

Raftery, A. (1996). Approximate Bayes factors and accounting for model uncertainty in generalised linear models. Biometrika , 83 (2), 251–266. https://doi.org/10.1093/biomet/83.2.251

Raftery, A. E., & Dean, N. (2006). Variable selection for model-based clustering. Journal of the American Statistical Association , 101 (473), 168–178. https://doi.org/10.1198/016214506000000113

Rusakov, D., & Geiger, D. (2005). Asymptotic model selection for naive Bayesian networks. Journal of Machine Learning Research , 6 (1), 1–35.

Savage, G. H. (1884). Insanity and allied neuroses: Practical and clinical . Cassell.

Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics , 6 (2). https://doi.org/10.1214/aos/1176344136

Sclove, S. L. (1987). Application of model-selection criteria to some problems in multivariate analysis. Psychometrika, 52 (3), 333–343. https://doi.org/10.1007/BF02294360

Sher, K. J., Jackson, K. M., & Steinley, D. (2011). Alcohol use trajectories and the ubiquitous cat’s cradle: Cause for concern? Journal of Abnormal Psychology , 120 (2), 322–335. https://doi.org/10.1037/a0021813

Shore, L., Toumbourou, J. W., Lewis, A. J., & Kremer, P. (2018). Review: Longitudinal trajectories of child and adolescent depressive symptoms and their predictors—A systematic review and meta-analysis. Child and Adolescent Mental Health , 23 (2), 107–120. https://doi.org/10.1111/camh.12220

Spurk, D., Hirschi, A., Wang, M., Valero, D., & Kauffeld, S. (2020). Latent profile analysis: A review and how to guide of its application within vocational behavior research. Journal of Vocational Behavior , 120 , 103445. https://doi.org/10.1016/j.jvb.2020.103445

Steinley, D., & McDonald, R. R. (2007). Examining factor score distributions to determine the nature of latent spaces. Multivariate Behavioral Research , 42 (1), 133–156. https://doi.org/10.1080/00273170701341217

Sterba, S. K., & Rights, J. D. (2022). R-squared measures for Multilevel Mixture models with Random effects. Structural Equation Modeling: A Multidisciplinary Journal , 29 (4), 489–506. https://doi.org/10.1080/10705511.2021.1962325

Tein, J. Y., Coxe, S., & Cham, H. (2013). Statistical power to detect the correct number of classes in Latent Profile Analysis. Structural Equation Modeling: A Multidisciplinary Journal , 20 (4), 640–657 https://doi.org/10.1080/10705511.2013.824781 .

Ulbricht, C. M., Chrysanthopoulou, S. A., Levin, L., & Lapane, K. L. (2018). The use of latent class analysis for identifying subtypes of depression: A systematic review. Psychiatry Research , 266 (3), 228–246. https://doi.org/10.1016/j.psychres.2018.03.003

Vermunt, J. K. (2011). K-means may perform as well as mixture model clustering but may also be much worse: Comment on Steinley and Brusco (2011). Psychological Methods, 16 (1), 82–88. https://doi.org/10.1037/a0020144

Wagenmakers, E. J. (2007). A practical solution to the pervasive problems ofp values. Psychonomic Bulletin & Review , 14 (5), 779–804. https://doi.org/10.3758/BF03194105

Wasserman, L. (2000). Bayesian model selection and Model Averaging. Journal of Mathematical Psychology , 44 (1), 92–107. https://doi.org/10.1006/jmps.1999.1278

Whittaker, T. A., & Miller, J. E. (2021). Exploring the Enumeration Accuracy of Cross-validation Indices in Latent Class Analysis. Structural Equation Modeling: A Multidisciplinary Journal , 28 (3), 376–390. https://doi.org/10.1080/10705511.2020.1802280

Yu, J., Goldstein, R. B., Haynie, D. L., Luk, J. W., Fairman, B. J., Patel, R. A., Vidal-Ribas, P., Maultsby, K., Gudal, M., & Gilman, S. E. (2021). Resilience factors in the Association between depressive symptoms and suicidality. Journal of Adolescent Health . https://doi.org/10.1016/j.jadohealth.2020.12.004

Download references

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sector.

Author information

Authors and affiliations.

Department of Psychological and Brain Sciences, Boston University, Boston, MA, 02215, USA

University of Notre Dame, Notre Dame, IN, USA

Boston Children’s Hospital, Boston, MA, USA

Bridget A. Nestor

University of Illinois Urbana Champaign, Champaign, IL, USA

Violeta J. Rodriguez

Vanderbilt University, Nashville, TN, USA

David A. Cole

Corresponding author

Correspondence to Qimin Liu .

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplemental material (xlsx 32 kb)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Liu, Q., Qiu, M., Nestor, B.A. et al. Mixture model applications in depression phenotyping: practices, challenges, and recommendations. Curr Psychol (2024). https://doi.org/10.1007/s12144-024-06309-6

Accepted : 23 June 2024

Published : 26 July 2024

DOI : https://doi.org/10.1007/s12144-024-06309-6

  • Mixture model
  • Latent profile analysis
  • Growth mixture analysis
  • Latent class growth analysis

Journal of the American Academy of Psychiatry and the Law

Mental Health and Social Correlates of Reincarceration of Youths as Adults

The rise in the U.S. prison population over the past 40 years has heightened scrutiny of the incarceration of children and adolescents. Correlates of later reincarceration in this group, especially correlates relating to psychiatric and substance use disorders, are understudied in the U.S. population. We aimed to establish the prevalence and correlates of the reincarceration as adults of people incarcerated before age 18. Data were derived from clinical interviews and from validated diagnostic and psychometric instruments. They were obtained as part of a cross-sectional representative survey of the civilian U.S. population, the National Epidemiological Survey on Alcohol and Related Conditions (NESARC-III). We identified 1,543 adults (4.3% of the NESARC sample) who had been incarcerated before they were 18. Of these, 55.9 percent had subsequently been incarcerated as adults. In addition to variables that have been repeatedly identified in criminological research (less education, past antisocial behavior, and parental imprisonment), substance use disorder, bipolar disorder, and longer childhood incarceration were independently associated with incarceration as an adult. The possibility that psychiatric treatment could reduce reincarceration in this group warrants longitudinal and experimental research.

  • incarceration
  • substance use disorder

Press coverage of the incarceration of children and adolescents 1 comes in the context of long-standing concerns over the personal and societal cost of the threefold increase in U.S. incarceration rates since the 1980s. 2 Although most of this increase took place prior to 2010, the further detention of many of these children and adolescents when they become adults raises questions concerning the effectiveness of deterrence and the unwanted effects of incarceration. From a psychiatric perspective, it also raises the question of the extent to which mental ill-health contributes to the risk of subsequent incarceration.

There are few data describing the psychiatric risk factors that are associated with the reincarceration of children as adults. Research on criminal recidivism suggests that parental criminality, 3 a genetic predisposition to antisocial behavior, neuropsychological deficits, and adverse family and neighborhood environments in childhood are likely to be important. 4 Those adverse environments include low socioeconomic status, parental conflict, harsh discipline, low levels of parental support, and high neighborhood levels of crime. 5

The roles of cause and effect can be difficult to distinguish in statistical associations, 6 , 7 but there are reasons to suspect that psychiatric factors are also important precursors of reincarceration. Children and adolescents in detention represent a vulnerable group with high rates of trauma, sexual abuse, suicidal ideation, and substance use. 8 Lifetime incarceration rates for people with any DSM-5 mental disorder are nearly five times higher than those of people without such a condition, 9 even controlling for substance use disorder. 10 People who offend frequently, and hence could be expected to be incarcerated more, have higher rates of psychological problems. 11 , 12

The role of depression, although seemingly limited, appears to be complicated. One study found no difference in mental health between high-rate adolescent offenders who recidivated as adults and those who did not, yet it also found higher levels of “depression/anxiety” to be associated with lower rates of violent recidivism. 13 The relationship between mental disorder and offending risk can vary with the temporal trajectory of that offending. Reising and colleagues found mental health problems to be associated with offending that persists into adulthood, but not with offending that ceases in adolescence. 14

Researchers continue to point to the benefits of identifying groups of offenders at high risk of reoffending in order to focus services on them. 15 In this study we included as variables both known social risk factors for reincarceration and factors related to mental health, such as reliably ascertained diagnosis and a history of mental health treatment. The sample consisted of participants in a large cross-sectional representative survey of the U.S. population who described being incarcerated before the age of 18. We identified those who described subsequently being reincarcerated as adults. We describe the rate and correlates of reincarceration both overall and separately for subjects with extended (1 week or more) and brief (less than 1 week) childhood incarceration.

Data Source and Study Sample

The data were collected in the third iteration of the National Epidemiological Survey on Alcohol and Related Conditions (NESARC-III), a cross-sectional representative survey of the civilian U.S. population sponsored by the National Institute on Alcohol Abuse and Alcoholism (NIAAA). 16 The NESARC-III conducted in-person interviews with U.S. adults, including the residents of group and rest homes, between April 2012 and June 2013. Ethnic minorities were oversampled to ensure adequate numbers for statistical analysis. Individuals who were institutionalized at the time of the study (in nursing homes, prisons, hospitals, or shelters) were excluded, as were active-duty military personnel.

The NESARC-III sample size was 36,309, and data were available on childhood incarceration for 36,293 subjects. Consent procedures were approved by the National Institutes of Health. All data were deidentified prior to their use in the present study. Details of the NESARC-III methodology have been published previously. 10 , 17

The NESARC-III interview questions are publicly available (Ref. 18 , Sections 1-18). The sample used here was generated using the response to the NESARC-III interview item, “Before you were 18, were you ever in jail, prison, or a juvenile detention center?” The dependent variable, incarceration as an adult, was generated using the response to the NESARC-III interview item, “Since you were 18, were you ever in jail, prison, or a correctional facility?”

The independent sociodemographic variables comprised age, sex, self-defined ethnicity, and marital status. Variables relating to a subject’s behavioral background comprised dichotomous questions concerning a parental history of alcohol or drug abuse, imprisonment, psychiatric hospitalization, suicide attempts, and completed suicide. The questions were “Before you were 18 years old, was a parent or other adult living in your home a problem drinker or alcoholic?”; “Before you were 18 years old, did a parent or other adult living in your home have some similar problems with drugs?”; “Before you were 18 years old, did a parent or other adult living in your home go to jail or prison?”; “Before you were 18 years old, was a parent or other adult living in your home treated or hospitalized for a mental illness?”; “Before you were 18 years old, did a parent or other adult living in your home attempt suicide?”; and “Before you were 18 years old, did a parent or other adult living in your home actually commit suicide?”

A history of childhood neglect or abuse was rated by adding the responses on separate five-point scales (endpoints “never” and “very often”) for each of the following questions, prefaced with “Before you were 18 years old”: “How often were you made to do chores that were too difficult or dangerous for someone your age?”; “How often were you left alone or unsupervised when you were too young to be alone, that is, before you were 10 years old?”; “How often did you go without things you needed like clothes, shoes or school supplies because a parent or other adult living in your home spent the money on themselves?”; “How often did a parent or other adult living in your home make you go hungry or not prepare regular meals?”; “How often did a parent or other adult living in your home ignore or fail to get you medical treatment when you were sick or hurt?”; “How often did a parent or other adult living in your home swear at you, insult you or say hurtful things?”; “How often did a parent or other adult living in your home threaten to hit you or throw something at you, but didn’t do it?”; “How often did a parent or other adult living in your home act in any other way that made you afraid that you would be physically hurt or injured?”; and “How often did a parent or other adult living in your home push, grab, shove, slap or hit you?”

The same five-point scale was used to rate sexual abuse, for which the items were each prefaced with “Before you were 18 years old”: “How often did an adult or other person touch or fondle you in a sexual way when you didn’t want them to or when you were too young to know what was happening?”; “How often did an adult or other person have you touch their body in a sexual way when you didn’t want to or were too young to know what was happening?”; “How often did an adult or other person attempt to have sexual intercourse with you when you didn’t want them to or you were too young to know what was happening?”; and “How often did an adult or other person actually have sexual intercourse with you when you didn’t want them to or you were too young to know what was happening?”

“Supportive family” was rated using the summed positive responses to five questions that concerned a subject’s life before age 18 and that included “[I] felt I was part of a close‐knit family” and “My family was a source of strength and support.” The subject interviews also provided data on educational attainment (less than high school diploma, high school diploma or “GED,” college attendance but did not complete, completed college), military service and combat exposure, a history of homelessness (lifetime and before the age of 15), and the length of any incarceration in jail, prison, or juvenile detention center before the age of 18 (using the NESARC item “About how long altogether were you in jail or a juvenile detention center before you were 18?”). To create the variable used here, we dichotomized pre-18 incarceration at the median (one week).

Mental health diagnoses for mood disorders, anxiety disorders, posttraumatic stress disorder (PTSD), eating disorders, substance use disorders, and personality disorders were generated using the Alcohol Use Disorder and Associated Disabilities Interview Schedule (AUDADIS)-5, 19 a structured interview that generates DSM-5 categories. Test-retest reliability for the diagnostic categories generated by the AUDADIS-5 ranges from fair to excellent and is similar to that for the DSM categories generated by other structured interviews. 19 We used DSM-5 categories covering a subject’s lifetime.

In this analysis, as has been done elsewhere, 20 we further used the response on two additional NESARC-III items: “In the last 12 months, did a doctor or other health professional tell you that you had schizophrenia or a psychotic illness or episode?” and “Did this happen before 12 months ago?” to generate a further variable, lifetime “schizophrenia/psychosis.” Subjects were asked whether they had ever attended treatment for mental health symptoms or substance abuse. They were asked whether they had sought help from members of clergy for drug-related problems; used self-help groups, such as Narcotics Anonymous, for drug treatment; or made a suicide attempt.

We used two variables to assess current supports and stressors. “Religion importance” was rated (from 1-4) from the question “How important is religion to you?” Higher scores corresponded to greater importance. Social support was assessed using the Interpersonal Support Evaluation List-12 (ISEL-12), a 12-item instrument with a potential range of 12-48. 21 Higher scores indicated greater social support. We assessed subjects’ functioning at the time of interview using current income and employment. Employment was evaluated using the response to a question addressing a variety of nonmutually exclusive work experiences in the last 12 months (work full time, disabled or unemployed, retired, employed part time, and unemployment).

Procedure and Statistical Analyses

In NESARC subjects who described having been incarcerated before the age of 18 ( n  = 1,543; henceforth, “the sample”), we examined the rates and bivariate correlations of being reincarcerated after the age of 18 and, using NESARC data weighted to adjust for nonresponse, generated effect sizes (risk ratios and Cohen’s d) for reincarceration. We focused on effect sizes because significance testing (generating P values) is less informative with large sample sizes where small, unimportant effects can be statistically significant. 22

We entered all variables with substantial effect sizes on bivariate analysis (criteria: risk ratio > 1.5 or <.67; Cohen’s d > .2 or <-.2) 23 along with length of childhood incarceration (less than one week, the median duration of incarceration, versus longer periods) into multivariate analyses to identify factors independently associated with adult reincarceration. We thus studied the role of length of childhood incarceration in three ways: bivariate analysis, multivariate analysis, and by examining the rates and correlates of reincarceration in brief and longer term childhood incarceration subgroups separately. All statistical tests were performed using the statistical software SAS version 9.4.
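The screening rule described above can be illustrated with a minimal sketch. It is not the authors' SAS code: the function names and the example counts are hypothetical, and the calculation is unweighted, whereas the study used weighted NESARC data.

```python
import math


def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


def passes_screen(rr=None, d=None):
    """Entry criteria from the text: RR > 1.5 or < .67; d > .2 or < -.2."""
    if rr is not None and (rr > 1.5 or rr < 0.67):
        return True
    if d is not None and abs(d) > 0.2:
        return True
    return False


# Hypothetical inputs, not taken from the paper's tables.
example_rr = risk_ratio(60, 100, 35, 100)          # dichotomous variable
example_d = cohens_d(10.0, 4.0, 50, 9.5, 4.0, 50)  # continuous variable

print(round(example_rr, 2), passes_screen(rr=example_rr))
print(round(example_d, 3), passes_screen(d=example_d))
```

In this sketch the dichotomous variable would be carried forward to the multivariate model and the continuous one would not.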

Description of Sample

People incarcerated before age 18 ( n  = 1,543) comprised 4.3 percent of NESARC subjects ( N = 36,293). Extrapolated to the 2020 U.S. census (population 18 or over = 258,343,281 24 ), they thus represent over 10 million adults living in the United States.
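The extrapolation can be checked with simple arithmetic. The sketch below uses only the figures quoted in the text; whether the reported 4.3 percent is a weighted or an unweighted estimate is not stated here, so both versions are shown as an assumption-laden illustration.

```python
sample_n = 1543            # adults reporting incarceration before age 18
nesarc_n = 36293           # NESARC-III subjects with childhood incarceration data
adults_2020 = 258_343_281  # 2020 census population aged 18 or over

# Unweighted share computed directly from the counts.
unweighted_share = sample_n / nesarc_n

# Share as reported in the text (4.3 percent).
reported_share = 0.043

implied_adults_reported = reported_share * adults_2020
implied_adults_unweighted = unweighted_share * adults_2020

print(f"{unweighted_share:.4f}")
print(f"{implied_adults_reported:,.0f}")
print(f"{implied_adults_unweighted:,.0f}")
```

Either way, the implied figure exceeds 10 million adults, in line with the statement above.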

The mean age at interview was 42.6 years; 72.9 percent were male, 64.2 percent were white, and 16.1 percent black; 48.7 percent were married; 27.8 percent had not graduated from high school and 33.6 percent had graduated from high school, but not attended college. For 19.2 percent, one or both parents had problems with drug use, and for 27.3 percent, one or both parents had spent time in prison. The sample had experienced more child neglect (16.9 versus 12.1) and child sexual abuse (4.98 versus 4.41) than nonincarcerated NESARC subjects.

Not including personality disorder or substance use, 48.6 percent met criteria for any lifetime psychiatric disorder and 64.3 percent met criteria for any lifetime substance use disorder diagnosis. A total of 16.4 percent had made a suicide attempt.

Analysis of Entire Sample

Of the 1,543 subjects, 863 (55.9%) were subsequently incarcerated as adults (see Table 1 , columns A and D). On bivariate analysis, subsequent incarceration as an adult was substantially associated both with social factors from subjects’ background (parental drug use and imprisonment, not completing college, not having been exposed to military combat, childhood and lifetime homelessness, and not being widowed or retired) and with both aggregated and specific variables related to their mental health (schizophrenia or psychosis, bipolar disorder, eating disorder, panic disorder, schizotypal and antisocial personality disorder, any drug use diagnosis and five specific drug use disorders (marijuana, opioid, cocaine, sedative, and stimulant)). Finally, on bivariate analysis, reincarceration was also substantially associated with reports of attending substance use treatment as well as having contacts with clergy and self-help groups to address one’s problems (see Table 1 , columns A, D, and the first of the columns labeled effect size). Shorter term childhood incarceration was associated with lower risk of reincarceration with a risk ratio of .71 (42.3% versus 57.6%).

Table 1. Correlates of Subsequent Incarceration (Bivariate; N = 1,543)

On multivariate analysis, reincarceration in the entire sample of 1,543 was independently associated with variables from the subjects’ background (a parental history of imprisonment, having been incarcerated for a week or more in childhood, and not having graduated from college) as well as with variables relating to psychopathology (antisocial personality disorder, bipolar disorder, and substance use disorder as evidenced by more than one substance use diagnosis or by reporting having attended substance use treatment; see Table 2 ).

Table 2. Correlates of Incarceration in Adulthood (Multivariate)

Analysis of Subgroups

Of the subjects who had been incarcerated as children for less than a week, 48.2 percent (365 of 757) went on to be incarcerated as adults. In contrast, 63.4 percent (498) of the 786 subjects who had been incarcerated as children for a week or more went on to be incarcerated as adults ( Table 1 , columns B, C, E, and F). On bivariate analysis, risk factors for later incarceration among those who had been incarcerated for less than a week included serious mental illness (schizophrenia or psychosis and bipolar disorder; see Table 1 columns B and E and the second effect size column), whereas risk factors for later incarceration among those who had been incarcerated as children for longer than a week included failing to complete college and lower income (see Table 1 , columns C and F and the third effect size column). For both subgroups, however, multivariate analysis pointed to the over-riding importance of substance use as a substantial correlate of future incarceration (see Table 2 ).
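The subgroup percentages follow directly from the counts given in the text, as the short check below shows. The variable names are illustrative, and the risk ratio computed here is unweighted; it need not match the weighted value of .71 reported earlier.

```python
# Counts quoted in the text.
brief = {"n": 757, "reincarcerated": 365}   # childhood incarceration < 1 week
longer = {"n": 786, "reincarcerated": 498}  # childhood incarceration >= 1 week


def rate(group):
    """Proportion of a subgroup later incarcerated as adults."""
    return group["reincarcerated"] / group["n"]


overall = (brief["reincarcerated"] + longer["reincarcerated"]) / (brief["n"] + longer["n"])

# Unweighted risk ratio for brief versus longer childhood incarceration.
unweighted_rr = rate(brief) / rate(longer)

print(f"brief: {rate(brief):.1%}, longer: {rate(longer):.1%}, overall: {overall:.1%}")
print(f"unweighted risk ratio: {unweighted_rr:.2f}")
```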

In a large and representative sample of the U.S. population, 1,543 adults (4.3%) reported having been incarcerated before they were 18. Of these, 55.9 percent had subsequently been reincarcerated as adults. The risk of adult reincarceration increased with time spent incarcerated as a child (48.2% for less than one week and 63.4% for incarceration that lasted longer than that). In addition to variables that have been repeatedly identified in criminological research (lack of education, past antisocial behavior, and parental imprisonment), substance use disorder, bipolar disorder, and longer childhood incarceration were independently associated with incarceration as an adult. Restricting the analysis to subjects whose childhood incarceration had lasted for more than one week did not identify additional independent correlates of incarceration in adulthood.

The population prevalence of adolescence-limited and life course persistent antisocial behavior in the United States has been estimated at 11.6 percent and 7.4 percent, respectively, for males and 11.4 percent and 6.9 percent, respectively, for females. 25 The 680 who were incarcerated as children, but not as adults, and the 863 who were incarcerated both as children and adults represent, respectively, 1.9 percent and 2.4 percent of NESARC subjects. That these proportions are so much smaller than those reported for antisocial behavior overall likely reflects the fact that most antisocial behavior does not result in incarceration. As suggested in the Methods section, even compared with other young people who engage in antisocial behavior without being incarcerated, subjects in this study sample represent a high-risk group.

The correlates of adult incarceration that we report here differ in some respects from those reported elsewhere. Although males and members of black and minority ethnic populations are over-represented in incarcerated populations, 26 sociodemographic factors were not associated with reincarceration in this sample. This may be because the sample excluded those currently incarcerated, who remain disproportionately male and black. Although some aspects of a disadvantaged upbringing, namely parental drug abuse and parental imprisonment, were associated with subsequent incarceration (and parental imprisonment independently so on multivariate analysis), others, including child neglect or abuse, child sexual abuse, and a subject’s perception of the lack of a supportive family in childhood, were not. It is unlikely that seeking treatment for substance use, including from clergy and self-help groups, is itself criminogenic; this association is likely explained by seeking treatment acting as an indicator of substance use as well as other problems that would lead a respondent to seek treatment.

Future Research and Policy Implications

Previous research has contrasted persistent antisocial behavior and reincarceration with desistance, a decline in antisocial behavior that has been consistently observed to be a feature of aging and maturation, particularly in males. 27 Although cross-sectional data such as these do not allow causal inferences, these results are consistent with different mental health diagnoses having different effects on desistance.

Schizophrenia or psychosis and bipolar disorder, the only diagnoses studied here that focus on psychotic symptoms, were associated with reduced desistance and with the largest effect sizes on bivariate analysis. In the case of bipolar disorder, the association with reduced desistance was also present on multivariate analysis. We are not aware of this association being demonstrated previously. It may be that impaired psychosocial function, a known correlate of psychosis in this sample, 20 disrupts family and social networks that would otherwise protect against legally significant outcomes. If that is the case, effective treatment of psychotic symptoms, in addition to improving health, might have prevented the criminal behavior that led to reincarceration. The possibility warrants investigation in studies with longitudinal designs.

Treatment of substance use disorder has already been shown to promote desistance. 28 , 29 Future research should focus on the best ways of making effective substance use treatment available to this high-risk group, perhaps most effectively in the community prior to childhood incarceration. 30 The point of release is already known to be a time of vulnerability, particularly with regard to opiate overdose, 31 and these results suggest that coordinating community substance use services when people reenter the community from prison may also have the longer term benefit of preventing further incarceration. 32

Limitations

Several aspects of the Methods warrant caution regarding these results. First, this is a cross-sectional view of individuals participating in processes, such as reincarceration and desistance, that are by their natures longitudinal. Although most cases of serious mental illness and substance use disorder are present by age 18, 33 any causal inferences should await the confirmation of these findings in studies with longitudinal designs. Second, the NESARC-III did not include people who were incarcerated at the time the subject interviews were conducted or those who were living in nursing homes, hospitals, and shelters. The exclusion of those in jails and prisons will have lowered the rates of reincarceration reported here and may have modified the observed correlates of that reincarceration. The exclusion of those in other residential settings may have lowered the prevalences that we found for medical illness, substance abuse, and other forms of mental disorder. Future research seeking to represent the U.S. population should seek to include people in all of these settings.

Third, although the interview items used here have been tested and found reliable, they are based on self-report and did not include collateral data. Some items might have been worded differently had they been written with the present study in mind. The NESARC inquiry regarding adult incarceration, for instance, “Since you were 18, were you ever in jail, prison, or a correctional facility?” does not preclude the possibility that the subject’s incarceration had commenced prior to their becoming 18 and, therefore, represented a continuous period of detention, not reincarceration. We would note, however, that no additional findings emerged when we restricted the analysis to subjects detained in childhood for less than a week, a restriction which will have excluded most of these “childhood into adulthood” cases.

Fourth, beyond looking separately at subjects who experienced brief and longer term childhood incarceration, we have not sought to repeat the analyses for subgroups, for instance, men and women. Fifth, although the NESARC database suited our focus on mental health variables, it was not designed to cover the full range of factors known to affect the risk of incarceration and reincarceration. For instance, the database does not include data on neuropsychological deficits or a subject’s neighborhood environment in childhood. 29 Also, although we were able both to study child sexual abuse and to include a measure of neglect that addressed physical abuse, detained youth and incarcerated adults have high rates of many kinds of trauma. These variables and others that have been linked to incarceration, such as attention deficit hyperactivity disorder (ADHD), should be included in future studies that might usefully also generate separate variables for neglect and physical abuse.

This is the first study of which we are aware using population survey methodology to describe the mental health correlates of reincarceration as adults of people who were incarcerated as children. The findings include a previously unrecognized statistical association between psychosis and subsequent incarceration as an adult. This association warrants testing in studies with longitudinal designs.

Disclosures of financial or other potential conflicts of interest: None.

  • © American Academy of Psychiatry and the Law
  • Armstrong K
  • 2. ↵ Prison Policy Initiative . United States profile [Internet]; 2021 . Available from: https://www.prisonpolicy.org/profiles/US.html#time . Accessed November 1, 2021
  • Mulvey EP ,
  • Steinberg L ,
  • Silberg JL ,
  • Roberson-Nay R ,
  • Kosterman R ,
  • Fergusson D ,
  • McPherson K
  • Buchanan A ,
  • Pittman B ,
  • Oberleitner LMS ,
  • Jennings W ,
  • Maldonado-Molina M ,
  • Reising K ,
  • Farrington D ,
  • Schubert CA ,
  • 18. ↵ National Institute on Alcohol Abuse and Alcoholism . NESARC-III questionnaire [Internet]; 2023 . Available from: https://www.niaaa.nih.gov/research/nesarc-iii/questionnaire . Accessed June 16, 2023
  • Goldstein RB ,
  • Sarason I ,
  • Mermelstein R ,
  • Ferguson CJ
  • 24. ↵ United States Census . U.S. adult population grew faster than nation’s total population from 2010 to 2020 [Internet]; 2021 . Available from: https://www.census.gov/library/stories/2021/08/united-states-adult-population-grew-faster-than-nations-total-population-from-2010-to-2020.html#:~:text=In%202020%2C%20the%20U.S.%20Census,from%20234.6%20million%20in%202010 . Accessed February 5, 2024
  • Van Hulle CA ,
  • Inciardi JA ,
  • Martin SS ,
  • Dawidoff N.
  • Heikkila HD ,
  • Stefanovics EA
  • Møller LF ,
  • van den Bergh BJ
  • 32. ↵ Substance Abuse and Mental Health Services Administration (US); Office of the Surgeon General (US). Facing Addiction in America: The Surgeon General's Report on Alcohol, Drugs, and Health [Internet]; 2016 . Washington, DC : U.S. Department of Health and Human Services . Available from: https://www.ncbi.nlm.nih.gov/books/NBK424857 . Accessed February 5, 2024
  • Kessler RC ,
  • Berglund P ,

  • Open access
  • Published: 24 July 2024

Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data

  • Eoin McElroy 1 ,
  • Thomas Wood 2 ,
  • Raymond Bond 3 ,
  • Maurice Mulvenna 3 ,
  • Mark Shevlin 1 ,
  • George B. Ploubidis 4 ,
  • Mauricio Scopel Hoffmann 5 , 6 , 7 , 8 , 9 &
  • Bettina Moltrecht 4  

BMC Psychiatry volume 24, Article number: 530 (2024)

Abstract

Background

Pooling data from different sources will advance mental health research by providing larger sample sizes and allowing cross-study comparisons; however, the heterogeneity in how variables are measured across studies poses a challenge to this process.

Methods

This study explored the potential of using natural language processing (NLP) to harmonise different mental health questionnaires by matching individual questions based on their semantic content. Using the Sentence-BERT model, we calculated the semantic similarity (cosine index) between 741 pairs of questions from five questionnaires. Drawing on data from a representative UK sample of adults ( N = 2,058), we calculated a Spearman rank correlation for each of the same pairs of items, and then estimated the correlation between the cosine values and Spearman coefficients. We also used network analysis to explore the model’s ability to uncover structures within the data and metadata.

Results

We found a moderate overall correlation ( r = .48, p < .001) between the two indices. In a holdout sample, the cosine scores predicted the real-world correlations with a small degree of error (MAE = 0.05, MedAE = 0.04, RMSE = 0.064), suggesting the utility of NLP in identifying similar items for cross-study data pooling. Our NLP model could detect more complex patterns in our data; however, it required manual rules to decide which edges to include in the network.

Conclusions

This research shows that it is possible to quantify the semantic similarity between pairs of questionnaire items from their meta-data, and these similarity indices correlate with how participants would answer the same two items. This highlights the potential of NLP to facilitate cross-study data pooling in mental health research. Nevertheless, researchers are cautioned to verify the psychometric equivalence of matched items.

Introduction

There is increased recognition that pooling data from different sources can help us to better understand and treat mental health problems [ 1 ]. Pooling data has statistical benefits (e.g. increased sample sizes), and it can also help uncover important contextual differences across cultures and time [ 2 , 3 , 4 ]. In the UK, initiatives such as the Catalogue of Mental Health Measures [ 5 ], CLOSER [ 6 ], Datamind [ 7 ], and the UK Longitudinal Linkage Collaboration (UK LLC) [ 8 ] have made it easier than ever for researchers to find and pool data from different sources. However, in practice most mental health research is conducted in measurement silos, and more often than not there are inconsistencies in how variables are measured across studies. Indeed, it has been estimated that close to 300 instruments have been developed to measure depression alone [ 9 ]. Self-report questionnaires, one of the most common approaches to measuring mental ill-health, can differ markedly on the types of symptoms they enquire about, even when supposedly measuring the same disorder [ 10 ]. Such heterogeneity in measurement can impede attempts to pool otherwise comparable datasets.

Retrospective harmonisation is an increasingly popular solution to this problem. This refers to the process by which data from different sources are transformed to make them directly comparable [ 11 , 12 ]. When dealing with mental health questionnaires, one approach is to harmonise at the question/item-level. Although questionnaires can differ considerably on the number and nature of questions asked, there is often overlap in the semantic content of certain questions (see Table 1 for an example of similar items from two different measures). By identifying conceptually similar item-pairs, researchers can construct bespoke harmonised subscales that can be used for cross-study analyses.

Recent attempts to harmonise mental health questionnaires have largely relied on expert opinion to match items from different questionnaires [ 15 , 16 ]. For instance, McElroy et al. [ 12 ] explored trends in child mental health using subsets of items from the Rutter Behaviour Scales [ 17 ] and the Strengths and Difficulties Questionnaire [ 18 ]. Two researchers screened the instruments independently of one another, and identified item-pairs they considered conceptually similar. Although inter-rater agreement was high (88%), a third independent rater served as the decision maker when the initial raters disagreed. This process produced a final harmonised subset consisting of seven questions that were consistent across the two scales (Table S1), and psychometric tests supported the equivalence of these items across four different studies.

Although the above results were promising, using expert opinion to match items has a number of inherent weaknesses. First, this approach relies on subjective ratings, and even when multiple raters are used, there will likely be some disagreement on which item-pairs should be included in the harmonised subscales. Second, manually harmonising questionnaires can become exponentially more challenging as the number of instruments increases. The rapid development and adaptation of natural language processing (NLP) technologies offers the chance to increase the speed, inter-rater reliability, and replicability with which questionnaires are retrospectively harmonised, and our research group has developed a free-to-use online tool, Harmony [ 19 ], for this purpose. Harmony (Fig. 1 ) is built in Python 3.10, and uses the Sentence-BERT model [ 20 ] to convert the text of each question within a scale into a numeric vector based on its semantic content. The similarity between two questions is then calculated from the angle between their respective vectors, expressed as the cosine similarity index (ranging from -1 to +1, with values closer to +1 reflecting a greater semantic match).
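To make the cosine similarity index concrete, the following is a minimal sketch in plain Python. It is not Harmony's implementation: the three-dimensional vectors are hypothetical toy values standing in for the much longer embeddings a sentence-embedding model would produce, and the function name `cosine_similarity` is our own.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy "embeddings" (NOT real Sentence-BERT output).
item_a = [0.20, 0.70, 0.10]
item_b = [0.25, 0.65, 0.05]    # points in nearly the same direction as item_a
item_c = [-0.20, -0.70, -0.10]  # points in exactly the opposite direction

print(round(cosine_similarity(item_a, item_b), 3))  # close to +1
print(round(cosine_similarity(item_a, item_c), 3))  # exactly opposite vectors give -1
```

Because the index depends only on the direction of the vectors and not on their length, two questions score close to +1 whenever their embeddings point the same way.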

Fig. 1 Screenshot of the Harmony web interface. Cosine similarity indices are presented in circles

While Harmony has the potential to be an important tool for the pooling of mental health data, it needs validation. If Harmony is producing valid matches (i.e. matching questions that describe conceptually similar experiences or behaviours), we would expect the strength of these matches to correspond with the degree to which subjects answer questions in the same way (i.e. the degree to which subject responses to items would correlate). Therefore, this exploratory study aims to quantify the association between the semantic item-matches produced by Harmony and item-pair correlations derived from real-world epidemiological data. We also explore Harmony’s ability to identify complex underlying structures (i.e. clusters of strongly related item-pairs) using a graph theory approach. Again, we compare the identified structure to that found in real-world item-wise correlational data.

For our correlation analyses, we drew on data from Wave 6 of the COVID-19 Psychological Research Consortium (C19PRC) study [ 21 ]. This study began in March 2020 with the aim of monitoring the psychological, social and economic impact of the COVID-19 pandemic in the UK. The study initially comprised a nationally representative sample of 2,025 adults, with ‘top-up’ participants added at later waves. The sixth wave of data collection occurred between August and September 2021. At this sweep, 1,643 participants from earlier waves were re-interviewed, and an additional 415 new respondents were surveyed ( N = 2,058). The final sample matched the original sample in terms of the quota-based sampling. All participants had complete data. The mean age of participants was 45.92 years (SD = 15.79), 51.9% of the sample were female, 87.7% of the sample were of white British/Irish ethnicity, 57.6% had post-secondary education, and 64.2% were in either full-time or part-time employment. Wave 6 of the C19PRC study was granted ethical approval by the University of Sheffield [Reference number 033759]. The data and meta-data used in this study can be found at https://osf.io/v2zur/ .

We drew on data from five self-report questionnaires. Two of these questionnaires assessed depression, two covered anxiety, and one measured symptoms of PTSD.

The Patient Health Questionnaire-9 (PHQ-9) [ 22 ] consists of nine questions that align with the DSM-IV criteria for major depressive disorder. Participants were asked about the frequency with which they experienced these depressive symptoms over the preceding two weeks. Response options were on a 4-point Likert scale ranging from 0 (not at all) to 3 (nearly every day). The psychometric properties of the PHQ-9 have been extensively documented [ 23 ].

Participants also completed the Generalized Anxiety Disorder Scale (GAD-7) [ 24 ]. Respondents were asked to indicate on a 4-point Likert scale ranging from 0 (not at all) to 3 (nearly every day), how frequently they were troubled by seven symptoms of anxiety over the preceding two weeks. The reliability and validity of the GAD-7 have been widely evidenced [ 25 ].

Two newly developed scales were also administered: the International Depression Questionnaire (IDQ) and the International Anxiety Questionnaire (IAQ) [ 26 ]. These scales were designed to align with the ICD‐11 descriptions of Depressive Episode and Generalized Anxiety Disorder. The IDQ consists of nine questions, and the IAQ has eight. For both questionnaires, responses are indicated on a 5-point Likert scale ranging from 0 (Never) to 4 (Every day). Initial psychometric work suggests these scales are reliable and valid [ 26 ].

The International Trauma Questionnaire (ITQ) [ 27 ] was used to screen for ICD-11 post-traumatic stress disorder (PTSD). The ITQ consists of six questions that can be grouped into two-item symptom clusters of Re-experiencing, Avoidance, and Sense of Threat. Participants were asked to complete the ITQ as follows: “…in relation to your experience of the COVID-19 pandemic, please read each item carefully, then select one of the answers to indicate how much you have been bothered by that problem in the past month”. Responses were indicated on a 5-point Likert scale, ranging from 0 (Not at all) to 4 (Extremely). Three additional questions measure functional impairment caused by the symptoms. The psychometric properties of the ITQ scores have been supported in both general population [ 28 ] and clinical and high-risk [ 29 ] samples.

All 39 questions from the five scales, which were used as input for our NLP analyses, are presented in the supplementary files (Table S2). All questions were scored in the same direction (i.e. higher scores reflected greater frequency/severity of symptomatology), therefore no reverse coding was required.

Pre-processing

First, using the data from the C19PRC study, we calculated a Spearman rank correlation for each pair of questions in the battery. Given there were 39 questions in total, this resulted in 741 correlation coefficients (39*38/2). Second, we imported the questionnaire content, in pdf format, into Harmony, which produced a semantic similarity score (cosine index) for each of the 741 item-pairs. We then merged the results from the above two steps, creating a simple data set where the rows corresponded to item-pairs, and columns corresponded to correlation and cosine values for each item-pair (available in Supplementary file 2).
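The pre-processing step can be sketched as follows. This is an illustration in plain Python rather than the authors' R code: it enumerates the 741 unordered item-pairs and computes a Spearman coefficient as the Pearson correlation of tie-averaged ranks. The response vectors `q1` and `q2` are hypothetical, and the helper names are our own.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Every unordered pair among 39 items: 39 * 38 / 2 pairs.
n_items = 39
item_pairs = list(combinations(range(n_items), 2))
print(len(item_pairs))  # 741

# Hypothetical 0-3 Likert responses from seven respondents to two items.
q1 = [0, 1, 1, 2, 3, 0, 2]
q2 = [0, 1, 2, 2, 3, 1, 3]
print(round(spearman(q1, q2), 3))
```

In a full pipeline, each pair in `item_pairs` would receive one Spearman coefficient from the survey data and one cosine score from the question text, giving the two columns of the merged data set described above.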

We explored the association between the correlations from the empirical data and NLP-derived similarity scores by doing the following:

First, we randomly split the dataset into training (80% of item-pairs) and testing samples (20% of item-pairs). Using the training sample, we produced a scatterplot to visualise the relationship between the cosine and correlation scores, and then calculated the Pearson correlation between the two indices. Next, we estimated a linear regression model, with cosine scores as the predictor and correlation coefficients as the outcome variable. We then tested this model in the holdout sample, and calculated the Mean Absolute Error (MAE), the Root Mean Squared Error (RMSE), and the Median Absolute Error (MedAE) between what was predicted by our model and the observed correlation coefficients in the holdout sample. These errors were visualised as a violin plot. All of the above analyses were conducted in R version 4.3.1 and visualisations were produced using the ggplot2 package [ 30 ].
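The split-fit-evaluate procedure can be sketched in plain Python as below. This is not the authors' R code, and the data are simulated purely for illustration (the generating values, the noise level and the seed are all assumptions); only the 80/20 split of 741 item-pairs and the three error metrics follow the description above.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def error_metrics(observed, predicted):
    """MAE, MedAE and RMSE between observed and predicted values."""
    abs_err = [abs(o - p) for o, p in zip(observed, predicted)]
    return {
        "MAE": statistics.fmean(abs_err),
        "MedAE": statistics.median(abs_err),
        "RMSE": statistics.fmean([e * e for e in abs_err]) ** 0.5,
    }


# SIMULATED item-pair data (assumption: a noisy linear relationship).
rng = random.Random(2024)
cosines = [rng.uniform(0.0, 0.9) for _ in range(741)]
correlations = [0.54 + 0.27 * c + rng.gauss(0, 0.06) for c in cosines]

# Random 80/20 split of the item-pairs.
indices = list(range(len(cosines)))
rng.shuffle(indices)
n_train = int(0.8 * len(indices))
train, test = indices[:n_train], indices[n_train:]

# Fit on the training pairs, then score the holdout pairs.
intercept, slope = fit_line([cosines[i] for i in train],
                            [correlations[i] for i in train])
predicted = [intercept + slope * cosines[i] for i in test]
observed = [correlations[i] for i in test]

print(len(train), len(test))
print(error_metrics(observed, predicted))
```

Note that the RMSE can never be smaller than the MAE computed on the same errors, so a gap between the two indicates that a few item-pairs are predicted much worse than the rest.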

Next, to examine the ability of NLP to uncover complex structures using questionnaire meta-data, we estimated and visualised matrices of the item-pair correlations and cosine scores as separate graphical networks using the full dataset ( N  = 741). In the network of cosine scores, nodes (points in space) represented questions, and edges (connections between nodes) reflected the cosine similarity scores between a given pair of questions, with thicker and more saturated lines indicating higher cosine values. We estimated two versions of the correlation network – a bivariate/pairwise correlation network, and a regularised partial correlation network. In the bivariate network, nodes represented questionnaire variables and edges reflected the strength of the correlations between nodes. In the regularised partial correlation network, edges can be interpreted as partial correlation coefficients, with line thickness and saturation reflecting the strength of association between two symptoms after controlling for all other symptoms in the network. In this network, a LASSO penalty was applied to the edges, which shrinks edges and sets very small connections to zero. This is a commonly employed approach in the estimation of networks of mental health data, as it produces a sparse network structure that balances parsimony with explanatory power [ 31 ]. The LASSO utilizes a tuning parameter to control the degree of regularization that is applied. This is selected by minimizing the Extended Bayesian Information Criterion (EBIC). The degree to which the EBIC prefers simpler models is determined by the hyperparameter γ (gamma) – this was set to the recommended default of 0.5 in the present study [ 31 ]. For further detailed information on the estimation of regularised partial correlation networks, we refer readers elsewhere [ 31 , 32 ]. The networks in the present study were estimated and visualised in R using the qgraph package [ 33 ].

After estimating the cosine and correlation networks, we used the walktrap community detection algorithm [ 34 ] to identify communities or clusters of nodes within the three networks. Walktrap is a bottom-up, hierarchical approach to uncovering structures within networks. The central idea of walktrap is to simulate random walks within a given network. Random walks start from a particular node and traverse the network by moving to a neighboring node at each step, following edges randomly. This process is repeated for multiple random walks initiated from each node in the network. Walktrap is based on the idea that nodes within the same community will have similar random walk patterns and thus be close to each other in the clustering. We ran the walktrap algorithm using the igraph package, taking the weighting of edges into account, with the default number of four steps per random walk. Research has shown that the walktrap method produces similar results to other methods for uncovering underlying structures in multivariate data (e.g. exploratory factor analysis, parallel analysis) [ 35 ]. However, the walktrap algorithm can produce a clustering outcome, even in scenarios involving entirely random networks. Consequently, we calculated the modularity index Q [ 36 ] to assess the clarity and coherence of the clustering solutions identified. In real-world data, Q typically ranges from 0.3 to 0.7, with values closer to 0.3 indicating loosely defined communities, while those around 0.7 indicate well-defined and robust community structures [ 36 ].
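For readers unfamiliar with the modularity index, the sketch below computes Q for a given partition of a small weighted, undirected network using the standard definition Q = (1/2m) Σ_ij [A_ij − k_i k_j / (2m)] δ(c_i, c_j). It does not implement the walktrap algorithm itself and is not the igraph code used in the study; the toy network, the function name and the assumption of non-negative edge weights are ours.

```python
def modularity(weights, membership):
    """Weighted modularity Q of a partition of an undirected network.

    weights: symmetric matrix (list of lists) of non-negative edge weights
             with a zero diagonal.
    membership: one community label per node.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # weighted degree k_i
    total = sum(strength)                     # equals 2m
    if total == 0:
        raise ValueError("network has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if membership[i] == membership[j]:
                # Observed weight minus the weight expected by chance.
                q += weights[i][j] - strength[i] * strength[j] / total
    return q / total


# Toy network: two triangles with no edges between them.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

print(modularity(two_triangles, [0, 0, 0, 1, 1, 1]))  # 0.5: each triangle is its own community
print(modularity(two_triangles, [0, 0, 0, 0, 0, 0]))  # 0.0: everything in one community
```

A densely connected network in which almost every pair of nodes shares an edge of similar weight leaves little room for any partition to beat chance, which is why Q can stay near zero even when a clustering algorithm returns several communities.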

While the LASSO networks offer a more conservative and interpretable structure than networks consisting of bivariate correlations, to our knowledge, there is no equivalent approach for networks of cosine scores. Furthermore, there are no guidelines for determining when two questions are considered ‘similar enough’ based on their cosine similarity score. To address this, we conducted sensitivity analyses in which we manually set small edges (cosine values) to zero, to produce increasingly sparse networks. We estimated five additional cosine networks, removing any connections with edge weights below certain thresholds. These thresholds ranged from 0.2 up to 0.6. For each of these networks we also tested for community structures and modularity.
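The thresholding used in these sensitivity analyses can be sketched as below, again in plain Python rather than the R code used in the study. The 4-item cosine matrix is hypothetical, the function names are our own, and the step of 0.1 between thresholds is an assumption based on five networks spanning 0.2 to 0.6.

```python
def threshold_edges(weights, cutoff):
    """Copy a weight matrix, setting off-diagonal entries below `cutoff` to zero."""
    n = len(weights)
    return [
        [weights[i][j] if i != j and weights[i][j] >= cutoff else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def isolated_nodes(weights):
    """Indices of nodes that have no remaining connections."""
    return [i for i, row in enumerate(weights) if all(w == 0 for w in row)]


# Hypothetical cosine scores between four questionnaire items.
cosine_matrix = [
    [0.00, 0.70, 0.45, 0.10],
    [0.70, 0.00, 0.50, 0.25],
    [0.45, 0.50, 0.00, 0.30],
    [0.10, 0.25, 0.30, 0.00],
]

# Assumed thresholds: 0.2 to 0.6 in steps of 0.1.
for cutoff in (0.2, 0.3, 0.4, 0.5, 0.6):
    sparse = threshold_edges(cosine_matrix, cutoff)
    n_edges = sum(1 for i in range(4) for j in range(i + 1, 4) if sparse[i][j] > 0)
    print(cutoff, n_edges, isolated_nodes(sparse))
```

The toy example shows the trade-off in miniature: raising the cutoff removes weak, uninformative edges, but beyond a certain point items start to drop out of the network altogether.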

The mean inter-item correlation coefficient in the training sample was r = 0.64 (SD = 0.07), and the mean cosine value was 0.39 (SD = 0.14). Distributions of these values are presented in Figure S1. The Spearman correlation coefficients appeared relatively normally distributed, whereas the cosine scores were slightly positively skewed. However, skewness and kurtosis values were in acceptable ranges (Figure S1). Figure 2 plots the cosine score and Spearman correlation coefficient of each of the question-pairs in the training sample. The correlation between the cosine and Spearman values was 0.48 ( p < 0.001; 95% CI = 0.42 to 0.54), indicating a moderate correlation between the two values.

Fig. 2 Scatterplot of cosine scores and Spearman correlation coefficients from item-pairs in the training sample ( N = 592). Each dot represents the value of the cosine score (x-axis) and correlation coefficient (y-axis) of a specific item-pairing

The Rainbow test for non-linearity was conducted, and confirmed that the linear regression model was appropriate for the data ( F = 0.75; p = 0.99). In the linear regression model, cosine scores were a significant predictor of item-pair correlations (slope b = 0.27, intercept a = 0.54, R² = 0.23, F[1, 590] = 179.8, p < 0.01). Next, using the 20% holdout sample, we calculated the mean absolute error (MAE) between what was predicted by our model and the observed correlation scores. The MAE was 0.05 (SD = 0.04), and the error values are visualised as a violin plot in Fig. 3 . This indicates that when using the semantic similarity between items to predict the actual correlation between participant answers, the model will on average have an error of ± 0.05, which can be considered a minor error. We also calculated the Median Absolute Error (MedAE), which is less sensitive to outliers than the MAE, and the Root Mean Squared Error (RMSE), which penalises large errors more heavily. Both MedAE (0.04) and RMSE (0.064) also suggested a low level of error in our predictive model.
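As a worked illustration of how the fitted model is read, the sketch below plugs a cosine score into the reported regression line (intercept 0.54, slope 0.27). The function name is hypothetical, and treating a as the intercept and b as the slope is our reading of the reported coefficients.

```python
def predict_correlation(cosine, intercept=0.54, slope=0.27):
    """Predicted item-pair correlation from a cosine score, using the
    coefficients reported for the training sample."""
    return intercept + slope * cosine


# At the mean cosine value of the training sample (0.39), the prediction
# lands close to the mean observed inter-item correlation (0.64).
print(round(predict_correlation(0.39), 3))

# A difference of 0.5 in cosine score shifts the prediction by 0.5 * 0.27.
print(round(predict_correlation(0.8) - predict_correlation(0.3), 3))
```

The large intercept relative to the slope reflects the fact that items in this battery correlate substantially even when their wording is semantically dissimilar, so the cosine score mainly adjusts the prediction around a high baseline.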

Fig. 3 Violin plot of absolute error values between predicted and observed correlation scores in holdout sample ( N = 149)

The cosine, bivariate and LASSO networks are presented in Fig.  4 . As could be expected, the LASSO network was considerably more sparse than the other two. A full breakdown of the clustering of items (including exact question wording) is presented in Supplementary Tables 3, 4 and 5. In the bivariate correlation network, only two clusters were identified – a cluster dominated by PTSD and self-harm items, and a cluster consisting of the remaining items. In the cosine network, four clusters were detected. The first cluster consisted of the four avoidance and re-experiencing items from the ITQ. The second cluster captured several items related to worry and anhedonia. The third cluster included items related to sleep disturbances, fatigue, difficulty relaxing and concentrating. The final cluster of nodes captured a broad array of negative affectivity and psychological distress.

Fig. 4 Cosine, correlation and LASSO networks. Cosine and LASSO networks based on full dataset ( N = 741)

The LASSO network produced a 6-cluster solution. Cluster 1 (PTSD) consisted of all six items of the ITQ. Cluster 2 was formed of four items capturing restlessness and thoughts of self-harm. The third cluster was formed of three items capturing concentration problems. Cluster 4 included items that broadly related to feelings of negative affect (e.g. low mood, guilt, worthlessness, hopelessness). The fifth cluster tapped difficulties with sleep, appetite and fatigue. The final cluster had six items from the GAD-7 and six items from the IAQ and broadly captured general anxiety.

The modularity ( Q ) values were extremely low for the bivariate correlation (0.01) and cosine (0.04) networks. By contrast, the Q index in the LASSO network was 0.49, indicating a moderately well-defined and robust community structure. Networks, community structures and Q indices for the cosine networks with different thresholds applied to the edges are presented in Figure S2. The interpretability of clusters, along with modularity, increased considerably as smaller edges (i.e. cosine values) were set to zero in the networks. When cosine values of less than 0.5 were set to zero, the Q index was pushed above the 0.3 value, suggesting a non-random clustering of nodes. When the threshold was set at a cosine value of 0.6, Q rose to 0.52, suggesting a well-defined community structure. However, when the threshold was set this high, many nodes did not have any connections to the broader network.

The present study aimed to test the degree to which NLP measures of semantic similarity were associated with correlations in real-world mental health questionnaire data. We found a moderate correlation between cosine similarity indices (produced by the Sentence-BERT model) and Spearman coefficients for the same item-pairs in a battery of 39 mental health questions. In our holdout sample, we found that the cosine score of an item-pair could predict the real-world correlation coefficient with a mean error of ± 0.05. These findings suggest that cosine scores can, with reasonable accuracy, predict bivariate correlations between pairs of mental health questionnaire items.

Our second aim was to explore whether NLP could uncover more complex structures underlying mental health questionnaire data. Low modularity in the cosine network coupled with the general inconsistency/vagueness of the clustering solution suggested that our NLP tool performed poorly in this regard. This was due to the high level of connectivity that was observed when all cosine scores were included in the network. This was in stark contrast to the LASSO network, which imposed a penalty on the smallest edges, and therefore had high modularity and produced interpretable and meaningful clusters. Indeed, when thresholds were applied to the edges included in the cosine networks, more clearly defined communities of nodes emerged, and modularity indices improved to the 0.3 to 0.7 range that is deemed acceptable in real-world data [ 36 ]. While further research (e.g. simulation work) is required to determine how best to apply thresholds to cosine similarity indices in this context, our findings suggest that NLP methods offer promise in identifying clusters of related variables based solely on meta-data. As such, NLP may become a useful means of approximating correlations between mental health items and scales prior to expensive data collection.

Overall, our findings provide initial support for using NLP as a means of identifying candidate questionnaire items for retrospective harmonisation. However, it is important to note that simply identifying questions based on their semantic similarity does not guarantee the psychometric equivalence of measures. There are many sources of bias that can threaten the comparability of results across different data sources. For instance, methods of administration (e.g. online vs pen and paper) or differences in response options can influence how participants answer questionnaires [ 37 ]. Furthermore, different groups or populations may interpret questions differently or respond in systematically different ways [ 38 ]. Therefore, we recommend that researchers explicitly test the equivalence of conceptually similar items in their data before they are used for cross-study research. There are various methods commonly used for such purposes, such as item response theory (IRT) and multiple group confirmatory factor analysis (CFA). Broadly, these approaches estimate latent variable measurement models of a given construct (e.g. depression) in two or more groups (e.g. samples from two different studies). Increasingly stringent equality constraints are then placed on the measurement parameters in the two groups, which are used to test the plausibility that the items are equivalent and therefore meaningful comparisons can be made across groups [ 38 ]. However, it is also worth noting that NLP models are developing rapidly; therefore, the accuracy with which they can be used to mirror real-world correlations could be expected to increase.

Our findings suggest there is immense potential for NLP to influence various areas of questionnaire based research. As demonstrated here, NLP could be used to identify candidate items for retrospective harmonisation. NLP tools such as Harmony could also be integrated with data discoverability tools to help researchers find and pool data from different sources. In addition, it would be relatively straightforward to adapt NLP models to facilitate scale development and validation; e.g. by identifying and comparing the semantic overlap of different pools of items.

Strengths and limitations

The present study had a number of strengths. Our empirical data were drawn from a representative UK sample of adults. Our questionnaire battery contained overlapping measures (i.e. two measures of depression, two measures of anxiety), and these were completed by the same participants, meaning our data were well-suited for testing semantic similarity and inter-item correlations. In terms of limitations, first, the present findings, including our predictive model, generalise only to the present battery of items within a community sample. The use of Harmony in other areas of research would require further validation using a broader range of questionnaires and topics (e.g. wellness, personality). Similarly, further validation in clinical samples may be required if Harmony is to be used to pool data from clinical studies. Second, our study relied on commonly used measures that were developed in Western contexts; therefore, it is unclear whether similar results would be produced across different languages and culturally sensitive questionnaires. Third, our findings are based solely on the Sentence-BERT model [ 20 ], and it is possible that alternative NLP models could produce different results. Fourth, although Harmony is sensitive to antonyms (i.e. sentences that convey opposite meanings are coded with negative cosines), further validation work is required to explicitly compare the tool's ability to match antonyms and synonyms. However, recent research suggests that the BERT model is accurate at identifying antonyms [ 39 ].

To the best of our knowledge, this is the first study to explore whether NLP methods can be used to match pairs of items from mental health questionnaires based on their semantic content, and whether these matches align with real-world inter-item correlations. Our findings indicate that NLP matches, expressed as cosine indices of similarity, can predict bivariate correlations with a reasonable degree of accuracy. Our NLP model was also able to identify more complex underlying structures within our data; however, this required manual constraints to be placed on the edges that were included in the network, and therefore further research is required to establish best practices in this regard. Overall, these findings suggest that NLP can be a useful tool for researchers who wish to identify similar items for cross-study pooling of data. However, it remains important to explore the psychometric equivalence of candidate items.

Availability of data and materials

The data and meta-data from the C19PRC study can be found at https://osf.io/v2zur/ . The correlation and cosine values used in the present analyses are available in Supplementary file 2.

Abbreviations

C19PRC: COVID-19 Psychological Research Consortium

CFA: Confirmatory factor analysis

EBIC: Extended Bayesian Information Criteria

GAD-7: Generalised Anxiety Disorder Assessment

IAQ: International Anxiety Questionnaire

IDQ: International Depression Questionnaire

IRT: Item response theory

ITQ: International Trauma Questionnaire

LASSO: Least absolute shrinkage and selection operator

MAE: Mean absolute error

NLP: Natural language processing

PHQ-9: Patient Health Questionnaire-9

Sentence-BERT: Sentence Bidirectional Encoder Representations from Transformers

References

Curran PJ, McGinley JS, Bauer DJ, Hussong AM, Burns A, Chassin L, et al. A moderated nonlinear factor model for the development of commensurate measures in integrative data analysis. Multivar Behav Res. 2014;49(3):214–31.

Campbell OLK, Bann D, Patalay P. The gender gap in adolescent mental health: a cross-national investigation of 566,829 adolescents across 73 countries. SSM - Popul Health. 2021;13:100742.

Gondek D, Bann D, Patalay P, Goodman A, McElroy E, Richards M, et al. Psychological distress from early adulthood to early old age: evidence from the 1946, 1958 and 1970 British birth cohorts. Psychol Med. 2022;52(8):1471–80.

McElroy E, Tibber M, Fearon P, Patalay P, Ploubidis G. Socioeconomic and sex inequalities in parent-reported adolescent mental ill-health: Time trends in four British birth cohorts. Open Science Framework; 2022. Available from: https://osf.io/3zn2h . Cited 2022 Dec 12.

Catalogue of Mental Health Measures team. Catalogue of mental health measures. 2023. Available from: https://www.cataloguementalhealth.ac.uk/ .

O’Neill D, Benzeval M, Boyd A, Calderwood L, Cooper C, Corti L, et al. Data resource profile: Cohort and Longitudinal Studies Enhancement Resources (CLOSER). Int J Epidemiol. 2019;48(3):675–676i.

Datamind team. Datamind. 2023. Available from: https://datamind.org.uk/ .

Boyd A, Flaig R, Oakley J, Campbell K, Evans K, McLachlan S, et al. The UK Longitudinal Linkage Collaboration: a trusted research environment for the longitudinal research community. Int J Popul Data Sci. 2022;7(3). Available from: https://ijpds.org/article/view/2046 . Cited 2023 Dec 5.

Santor DA, Gregus M, Welch A. FOCUS ARTICLE: eight decades of measurement in depression. Meas Interdiscip Res Perspect. 2006;4(3):135–55.

Fried EI. The 52 symptoms of major depression: lack of content overlap among seven common depression scales. J Affect Disord. 2017;208:191–7.

Fortier I, Raina P, Van Den Heuvel ER, Griffith LE, Craig C, Saliba M, et al. Maelstrom Research guidelines for rigorous retrospective data harmonization. Int J Epidemiol. 2016;46:dyw075.

McElroy E, Villadsen A, Patalay P, Goodman A, Richards M, Northstone K, et al. Harmonisation and measurement properties of mental health measures in six British cohorts. London: CLOSER; 2020.

Costello EJ, Angold A. Scales to assess child and adolescent depression: checklists, screens, and nets. J Am Acad Child Adolesc Psychiatry. 1988;27(6):726–37.

Radloff LS. The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Meas. 1977;1(3):385–401.

Hoffmann MS, Moore TM, Axelrud LK, Tottenham N, Pan PM, Miguel EC, et al. An evaluation of item harmonization strategies between assessment tools of psychopathology in children and adolescents. Assessment. 2023;12:107319112311631.

Hoffmann MS, Moore TM, Axelrud LK, Tottenham N, Rohde LA, Milham MP, et al. Harmonizing bifactor models of psychopathology between distinct assessment instruments: reliability, measurement invariance, and authenticity. Int J Methods Psychiatr Res. 2023;32(3):e1959.

Rutter M, Tizard J, Whitmore K. Education, health and behaviour. London: Longman; 1970.

Goodman R. The Strengths and difficulties questionnaire: a research note. J Child Psychol Psychiatry. 1997;38(5):581–6.

McElroy E, Moltrecht B, Scopel Hoffmann M, Wood TA, Ploubidis GB. Harmony – A global platform for contextual harmonisation, translation and cooperation in mental health research. Open Science Framework; 2023. Available from: https://osf.io/bct6k/ .

Reimers N, Gurevych I. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. 2019.

McBride O, Butter S, Martinez AP, Shevlin M, Murphy J, Hartman TK, et al. An 18-month follow-up of the Covid-19 psychology research consortium study panel: Survey design and fieldwork procedures for Wave 6. Int J Methods Psychiatr Res. 2023;32(2):e1949.

Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: Validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–13.

Kroenke K, Spitzer RL, Williams JB, Löwe B. The patient health questionnaire somatic, anxiety, and depressive symptom scales: a systematic review. Gen Hosp Psychiatry. 2010;32(4):345–59.

Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. 2006;166(10):1092–7.

Johnson SU, Ulvenes PG, Øktedalen T, Hoffart A. Psychometric properties of the general anxiety disorder 7-Item (GAD-7) scale in a heterogeneous psychiatric sample. Front Psychol. 2019;6(10):1713.

Shevlin M, Hyland P, Butter S, McBride O, Hartman TK, Karatzias T, et al. The development and initial validation of self-report measures of ICD-11 depressive episode and generalized anxiety disorder: the International Depression Questionnaire (IDQ) and the International Anxiety Questionnaire (IAQ). J Clin Psychol. 2023;79(3):854–70.

Cloitre M, Shevlin M, Brewin CR, Bisson JI, Roberts NP, Maercker A, et al. The International Trauma Questionnaire: development of a self-report measure of ICD-11 PTSD and complex PTSD. Acta Psychiatr Scand. 2018;138(6):536–46.

Redican E, Nolan E, Hyland P, Cloitre M, McBride O, Karatzias T, et al. A systematic literature review of factor analytic and mixture models of ICD-11 PTSD and CPTSD using the international trauma questionnaire. J Anxiety Disord. 2021;79:102381.

Sele P, Hoffart A, Bækkelund H, Øktedalen T. Psychometric properties of the International Trauma Questionnaire (ITQ) examined in a Norwegian trauma-exposed clinical sample. Eur J Psychotraumatology. 2020;11(1):1796187.

Wickham H. ggplot2. Wiley Interdiscip Rev Comput Stat. 2011;3(2):180–5.

Epskamp S, Fried EI. A tutorial on regularized partial correlation networks. Psychol Methods. 2018;23(4):617.

Epskamp S, Borsboom D, Fried EI. Estimating psychological networks and their accuracy: a tutorial paper. Behav Res Methods. 2018;50:195–212.

Epskamp S, Cramer AO, Waldorp LJ, Schmittmann VD, Borsboom D. qgraph: Network visualizations of relationships in psychometric data. J Stat Softw. 2012;48:1–18.

Pons P, Latapy M. Computing communities in large networks using random walks. In: Computer and Information Sciences-ISCIS 2005: 20th International Symposium. October 26-28, 2005. Proceedings 20. Istanbul: Springer; 2005. pp. 284–93.

Golino HF, Epskamp S. Exploratory graph analysis: a new approach for estimating the number of dimensions in psychological research. PLoS ONE. 2017;12(6):e0174035.

Newman ME, Girvan M. Finding and evaluating community structure in networks. Phys Rev E. 2004;69(2):026113.

Patalay P, Hayes D, Deighton J, Wolpert M. A comparison of paper and computer administered strengths and difficulties questionnaire. J Psychopathol Behav Assess. 2016;38:242–50.

Putnick DL, Bornstein MH. Measurement invariance conventions and reporting: the state of the art and future directions for psychological research. Dev Rev. 2016;41:71–90.

Niwa A, Nishiguchi K, Okazaki N. Predicting Antonyms in Context using BERT. In: Proceedings of the 14th International Conference on Natural Language Generation. Aberdeen, Scotland, UK: Association for Computational Linguistics; 2021. p. 48–54. Available from: https://aclanthology.org/2021.inlg-1.6 . Cited 2024 Apr 15.

Acknowledgements

We are grateful to all participants of the C19PRC study.

This work was supported by the Wellcome Trust (grant number 226697/Z/22/Z). Dr. Mauricio Scopel Hoffmann is supported by the United States National Institutes of Health grant R01MH120482 under his post-doctoral fellowship at UFRGS.

Author information

Authors and affiliations

School of Psychology, Ulster University, Coleraine, UK

Eoin McElroy & Mark Shevlin

Fast Data Science, London, UK

Thomas Wood

School of Computing, Ulster University, Belfast, UK

Raymond Bond & Maurice Mulvenna

Centre for Longitudinal Studies, University College London, London, UK

George B. Ploubidis & Bettina Moltrecht

Department of Neuropsychiatry, Universidade Federal de Santa Maria (UFSM), Avenida Roraima 1000, Building 26, office 1353, Santa Maria, 97105-900, Brazil

Mauricio Scopel Hoffmann

Graduate Program in Psychiatry and Behavioral Sciences, Universidade Federal do Rio Grande do Sul, Rua Ramiro Barcelos 2350, Porto Alegre, 90035-003, Brazil

Mental Health Epidemiology Group (MHEG), UFSM, Santa Maria, RS, Brazil

Care Policy and Evaluation Centre, London School of Economics and Political Science, London, UK

National Center for Innovation and Research in Mental Health, São Paulo, Brazil

Contributions

All authors were responsible for the study concept and design. EM, MS, and TW undertook the data management. EM undertook the statistical analyses. All authors interpreted the results. EM drafted the initial manuscript. All authors critically reviewed the manuscript and approved the submitted version.

Corresponding author

Correspondence to Eoin McElroy.

Ethics declarations

Ethics approval and consent to participate

Wave 6 of the COVID-19 Psychological Research Consortium (C19PRC) study was granted ethical approval by the University of Sheffield [Reference number 033759]. Each participant provided written informed consent through the online interface before commencing the survey.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Supplementary Material 2.

Supplementary Material 3.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

McElroy, E., Wood, T., Bond, R. et al. Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data. BMC Psychiatry 24, 530 (2024). https://doi.org/10.1186/s12888-024-05954-2

Received: 15 January 2024

Accepted: 08 July 2024

Published: 24 July 2024

DOI: https://doi.org/10.1186/s12888-024-05954-2

Keywords

  • Retrospective data harmonisation
  • Harmonisation
  • Meta-analysis
  • Data pooling

BMC Psychiatry

ISSN: 1471-244X

  • View or download the most recent YRBSS questionnaires and documentation.
  • Questionnaires by year are also available.
  • See FAQs about YRBSS questionnaires.

2025 questionnaires

Previous questionnaires

  • Standard High School
  • National High School
  • Middle School
  • Spanish Version (National High School)
  • YRBS Questionnaire Content 1991-2023
  • YRBS Item Rationale

Frequently asked questions

How are questions selected for inclusion on the YRBS questionnaire?

Before each YRBS cycle begins, CDC seeks input from subject matter experts (both inside and outside of CDC) regarding what questions should be changed, added, or deleted. This input is compiled for review by December of the odd-numbered year preceding the survey cycle (for example, December 1, 2019, for the 2021 YRBS).

Proposed changes, additions, and deletions are then placed on a ballot, which is sent to the YRBS coordinators at all sites (states, territories, and local school districts). Each site votes for or against each proposed change, addition, and deletion. CDC considers the results of this balloting process when finalizing the standard questionnaire, which includes about 89 questions. A majority of sites must approve each change, addition, or deletion before it can be implemented.

For the national YRBS, about 10 questions are added to the standard questionnaire each cycle. These questions typically reflect emerging areas of interest for CDC and stakeholders.

Additional questions of interest are included on an Optional Question List, from which sites can select questions for their questionnaire. Final wording for questionnaires and the Optional Question List is based on the results of cognitive testing and input from subject matter experts.

What is the process for getting a question added to the YRBS questionnaire or getting an existing question changed?

All suggested additions and changes should be submitted using the YRBSS Data Request Form . These suggestions must be received by December 1 of the odd-numbered year preceding the survey cycle (for example, December 1, 2019, for the 2021 YRBS). All suggestions are then compiled and reviewed by CDC before they are added to the ballot process described above.
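
As a rough illustration of the deadline rule described above, the following Python sketch computes the submission deadline for a given survey cycle. The function name `suggestion_deadline` is hypothetical, and the sketch assumes that cycles fall in odd-numbered years, so that the preceding odd-numbered year is two years before the cycle. Only the 2019/2021 pairing is confirmed by the text.

```python
from datetime import date


def suggestion_deadline(cycle_year: int) -> date:
    """Return the assumed deadline for submitting question suggestions."""
    # Assumption: survey cycles run in odd-numbered years (e.g. 2021).
    if cycle_year % 2 == 0:
        raise ValueError("expected an odd-numbered survey cycle year")
    # December 1 of the odd-numbered year preceding the cycle.
    return date(cycle_year - 2, 12, 1)


print(suggestion_deadline(2021))  # 2019-12-01
```

Rejecting even-numbered years is a design choice for this sketch only; it keeps the helper from silently returning a date for a year that the rule as stated does not cover.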

Are the YRBSS questionnaires available in languages other than English?

Yes. Beginning with the 2021 cycle, the national YRBS questionnaire is available in Spanish. Translation of state and local YRBSS questionnaires is left to the discretion of state and local agencies.

YRBS questionnaires are designed to be administered in a school setting. It is important to consider the language used in regular classrooms and common second languages, if any, spoken by the student population. Check with school officials before deciding whether or not translation is needed.

YRBSS questionnaires in English and Spanish are in the public domain. Questionnaires may be translated to any language. No specific permission is required.

Will asking questions about certain topics actually encourage certain behaviors?

There is no evidence that simply asking students about health behaviors will encourage them to try that behavior.

What is the suggested citation for a YRBSS questionnaire or YRBSS data in a publication?

The YRBSS questionnaire should be cited as follows:

Centers for Disease Control and Prevention. [survey year] Youth Risk Behavior Survey. Available at: www.cdc.gov/YRBS. Accessed on [date].

YRBSS data in a publication should be cited as follows:

Centers for Disease Control and Prevention. [survey year] Youth Risk Behavior Survey Data. Available at: www.cdc.gov/yrbs. Accessed on [date].
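
The two citation templates above can be filled in programmatically. The helper below is a minimal sketch: the function name `cite_yrbs` and its parameters are hypothetical, and because the source does not specify a format for the access date, the date is passed through as a plain string.

```python
def cite_yrbs(survey_year: int, accessed: str, data: bool = False) -> str:
    """Fill in the suggested citation template for a questionnaire or for data."""
    # The two templates differ in the title and in the URL casing as printed above.
    title = "Youth Risk Behavior Survey Data" if data else "Youth Risk Behavior Survey"
    url = "www.cdc.gov/yrbs" if data else "www.cdc.gov/YRBS"
    return (
        f"Centers for Disease Control and Prevention. {survey_year} {title}. "
        f"Available at: {url}. Accessed on {accessed}."
    )


print(cite_yrbs(2021, "January 5, 2024"))
print(cite_yrbs(2021, "January 5, 2024", data=True))
```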

What behaviors are assessed by the YRBSS?

The YRBSS assesses six categories of priority health behaviors. These categories are behaviors that contribute to unintentional injuries and violence; sexual behaviors related to unintended pregnancy and sexually transmitted diseases, including HIV infection; alcohol and other drug use; tobacco use; unhealthy dietary behaviors; and inadequate physical activity. In addition, the YRBSS assesses obesity, overweight, and other important health issues.

Can state and local agencies that conduct a YRBSS modify the standard questionnaire?

Yes. State and local agencies that conduct a YRBS can add or delete questions to meet their policy or programmatic needs. Specific guidance on the parameters that must be followed during questionnaire modification is provided to those agencies funded by CDC to conduct a YRBS.

Are transgender students included in the national, state, and local school district YRBSs?

Yes. All students in sampled classrooms are included as long as they are able to respond to the questionnaire in a private and anonymous manner.

Does the YRBS identify transgender students?

CDC worked with partners and researchers for several cycles to develop a credible question to identify transgender students. A question recommended by CDC was successfully piloted by 11 states and 10 local school districts during the 2017 YRBS cycle.

Cognitive interviews conducted in March 2018 also indicated that the question functioned well. This question is now included in the YRBS Optional Question List for any interested site to use.

Youth Risk Behavior Surveillance System (YRBSS)

YRBSS is the largest public health surveillance system in the U.S., monitoring multiple health-related behaviors among high school students.

COMMENTS

  1. Longitudinal Study

    Revised on June 22, 2023. In a longitudinal study, researchers repeatedly examine the same individuals to detect any changes that might occur over a period of time. Longitudinal studies are a type of correlational research in which researchers observe and collect data on a number of variables without trying to influence those variables.

  2. Longitudinal Study Design: Definition & Examples

    Panel Study. A panel study is a type of longitudinal study design in which the same set of participants are measured repeatedly over time. Data is gathered on the same variables of interest at each time point using consistent methods. This allows studying continuity and changes within individuals over time on the key measured constructs.

  3. 10 Famous Examples of Longitudinal Studies (2024)

    10 Famous Examples of Longitudinal Studies. A longitudinal study is a study that observes a subject or subjects over an extended period of time. They may run into several weeks, months, or years. An example is the Up Series, which has been going since 1963. Longitudinal studies are deployed most commonly in psychology and sociology, where the ...

  4. What is a Longitudinal Study?

    A longitudinal study requires an investigator to observe the participants at different time intervals. A cross-sectional study is conducted over a specified period of time. Longitudinal studies can offer researchers a cause and effect relationship. Cross-sectional studies cannot offer researchers a cause-and-effect relationship.

  5. What's a Longitudinal Study? Types, Uses & Examples

    2. Observational: As we mentioned earlier, longitudinal studies involve observing the research participants throughout the study and recording any changes in traits that you notice. 3. Timeline: A longitudinal study can span weeks, months, years, or even decades. This dramatically contrasts what is obtainable in cross-sectional studies that ...

  6. Longitudinal Study

    Longitudinal studies also allow repeated observations of the same individual over time. This means any changes in the outcome variable cannot be attributed to differences between individuals. Example: Individual differences. You decide to study how a particular weight-training program affects athletic performance.

  7. Longitudinal Study: Overview, Examples & Benefits

    A longitudinal study is an experimental design that takes repeated measurements of the same subjects over time. These studies can span years or even decades. Unlike cross-sectional studies, which analyze data at a single point, longitudinal studies track changes and developments, producing a more dynamic assessment.

  8. Longitudinal study: design, measures, classic example

    After analyzing these three prominent examples of longitudinal studies in the literature, the components required to successfully perform this type of study become quite clear. While the actual design of the study will depend on the research question being addressed, a general overview of the necessary steps to take will be discussed.

  9. Longitudinal study: design, measures, and classic example

    A longitudinal study is observational and involves the continuous and repeated measurements of selected individuals followed over a period of time. Quantitative and qualitative data is gathered on "any combination of exposures and outcome." For instance, longitudinal studies are useful for observing relationships between the risk factors, development, and treatment outcomes of disease for ...

  10. Longitudinal study

    A longitudinal study (or longitudinal survey, or panel study) is a research design that involves repeated observations of the same variables (e.g., people) over long periods of time (i.e., uses longitudinal data). It is often a type of observational study, although it can also be structured as a longitudinal randomized experiment. Longitudinal studies are often used in social-personality and ...

  11. What Is a Longitudinal Study?

    Longitudinal studies, a type of correlational research, are usually observational, in contrast with cross-sectional research. Longitudinal research involves collecting data over an extended time, whereas cross-sectional research involves collecting data at a single point. To test this hypothesis, the researchers recruit participants who are in ...

  12. PDF Handbook for Conducting Longitudinal Studies: How We Designed and

    important longitudinal research features long-term studies of many different aspects of development in well-defined populations (e.g., studies of large birth cohorts in Norway, Finland, Sweden, New Zealand, the UK). Other longitudinal studies are focused on particular study questions in community or high-risk samples.

  13. Longitudinal studies

    Longitudinal studies employ continuous or repeated measures to follow particular individuals over prolonged periods of time, often years or decades. They are generally observational in nature, with quantitative and/or qualitative data being collected on any combination of exposures and outcomes, without any external influence being applied.

  14. What Is A Longitudinal Study? A Simple Definition

    A longitudinal study or a longitudinal survey (both of which make up longitudinal research) is a study where the same data are collected more than once, at different points in time. The purpose of a longitudinal study is to assess not just what the data reveal at a fixed point in time, but to understand how (and why) things change over time.

  15. Longitudinal Design

    A longitudinal design is a research study where a sample of the population is studied at intervals to examine the effects of development. In a longitudinal design, you have a group of people and ...

  16. Longitudinal Study

    Longitudinal research is a method used to track participant changes by using multiple measurements with the same sample over time. It is also known as a repeated measures design.

  17. Qualitative longitudinal research in health research: a method study

    Qualitative longitudinal research (QLR) comprises qualitative studies, with repeated data collection, that focus on the temporality (e.g., time and change) of a phenomenon. ... In this example, Research question 1 uses a context approach to time/change; Research question 2 contains no description of time/change; Research question 3 used an ...

  18. Longitudinal Research

    Major Developments in Longitudinal Family Research Methods and Longitudinal Datasets. Family scientists' interest in within-individual change has led to a heavy focus over the past half century on panel surveys (Menaghan & Godwin, 1993 ). Over the past 30 years, panel studies have exploded. Today, many large, often publicly available ...

  19. What is a Longitudinal Study? Definition, Types & Examples

    That's the kind of thing that longitudinal research design measures. As for a formal definition, a longitudinal study is a research method that involves repeated observations of the same variable (e.g. a set of people) over some time. The observations over a period of time might be undertaken in the form of an online survey.

  20. Longitudinal Study: Definition, Pros, and Cons

    A longitudinal study is a type of correlational research that involves regular observation of the same variables within the same subjects over a long or short period. These studies can last from a few weeks to several decades. Longitudinal studies are common in epidemiology, economics, and medicine. People also use them in other medical and ...

  21. Longitudinal Research Design: Methods and Examples

    Longitudinal Study Examples. Let's review some longitudinal study examples that help illustrate the above information. Longitudinal research example. A famous longitudinal case is the Terman Study of the Gifted, previously known as Genetic Studies of Genius.

  22. What is an example of a longitudinal study?

    Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research. Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group. As a result, the characteristics of the participants who drop out differ from the characteristics of those who ...

  23. 154 questions with answers in LONGITUDINAL STUDIES

    1. Sample repeated measures data based on subject IDs for cross validation of a GENMOD model, 2. Bootstrap a dataset because of a highly skewed outcome measure recorded in an unbalanced long ...

  24. Air pollution metabolomic signatures and chronic respiratory ...

    During a median follow-up of 12.51 years, 8,951 and 5,980 incident COPD and asthma cases were recorded. In multivariable Cox regressions, air pollution was positively associated with CRD risk (for example, hazard ratios [HR] per interquartile range [IQR] increment in PM 2.5: 1.09; 95% confidence interval [CI]: 1.06-1.13). We identified 103, 86, 85, and 90 metabolites in response to PM 2.5, PM ...

  25. Mixture model applications in depression phenotyping: practices

    Applications of mixture models are prevalent in studying psychopathology across development, particularly for identifying typical co-occurring symptom presentations (or phenotypes) in depression. Researchers have used both longitudinal and cross-sectional designs with varied statistical methods. The current study focused on studies that applied latent profile analysis, latent class growth ...

  26. Mental Health and Social Correlates of Reincarceration of Youths as

    Data Source and Study Sample. The data were collected in the third iteration of the National Epidemiological Survey on Alcohol and Related Conditions (NESARC-III), a cross-sectional representative survey of the civilian U.S. population sponsored by the National Institute on Alcohol Abuse and Alcoholism (NIAAA). 16 The NESARC-III conducted in-person interviews with U.S. adults, including the ...

  27. Using natural language processing to facilitate the harmonisation of

    Background Pooling data from different sources will advance mental health research by providing larger sample sizes and allowing cross-study comparisons; however, the heterogeneity in how variables are measured across studies poses a challenge to this process. Methods This study explored the potential of using natural language processing (NLP) to harmonise different mental health ...

  28. YRBSS Questionnaires

    These questions typically reflect emerging areas of interest for CDC and stakeholders. ... These suggestions must be received by December 1 of the odd-numbered year preceding the survey cycle (for example, December 1, 2019, for the 2021 YRBS). All suggestions are then compiled and reviewed by CDC before they are added to the ballot process ...