Systematic Review | Definition, Example & Guide

Published on June 15, 2022 by Shaun Turney . Revised on November 20, 2023.

A systematic review is a type of review that uses repeatable methods to find, select, and synthesize all available evidence. It answers a clearly formulated research question and explicitly states the methods used to arrive at the answer.

In the example used throughout this guide, Boyle and colleagues answered the question “What is the effectiveness of probiotics in reducing eczema symptoms and improving quality of life in patients with eczema?”

In this context, a probiotic is a health product that contains live microorganisms and is taken by mouth. Eczema is a common skin condition that causes red, itchy skin.

What is a systematic review?

A review is an overview of the research that’s already been completed on a topic.

What makes a systematic review different from other types of reviews is that the research methods are designed to reduce bias. The methods are repeatable, and the approach is formal and systematic:

  • Formulate a research question
  • Develop a protocol
  • Search for all relevant studies
  • Apply the selection criteria
  • Extract the data
  • Synthesize the data
  • Write and publish a report

Although multiple sets of guidelines exist, the Cochrane Handbook for Systematic Reviews is among the most widely used. It provides detailed guidelines on how to complete each step of the systematic review process.

Systematic reviews are most commonly used in medical and public health research, but they can also be found in other disciplines.

Systematic reviews typically answer their research question by synthesizing all available evidence and evaluating the quality of the evidence. Synthesizing means bringing together different information to tell a single, cohesive story. The synthesis can be narrative (qualitative), quantitative, or both.

Systematic review vs. meta-analysis

Systematic reviews often quantitatively synthesize the evidence using a meta-analysis. A meta-analysis is a statistical analysis, not a type of review.

A meta-analysis is a technique to synthesize results from multiple studies. It’s a statistical analysis that combines the results of two or more studies, usually to estimate an effect size.
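
As a rough illustration of how a meta-analysis can combine study results, the sketch below pools effect sizes with fixed-effect inverse-variance weighting. This is one common approach, not necessarily the one a given review uses; the function name and all numbers are hypothetical.

```python
import math


def pooled_effect(effects, std_errors):
    """Pool study effect sizes with fixed-effect inverse-variance weights.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Hypothetical effect sizes and standard errors from three studies.
effects = [0.2, 0.5, 0.3]
std_errors = [0.1, 0.2, 0.1]

estimate, standard_error = pooled_effect(effects, std_errors)
# Approximate 95% confidence interval, assuming a normal distribution.
ci_low = estimate - 1.96 * standard_error
ci_high = estimate + 1.96 * standard_error
print(f"pooled effect = {estimate:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

The pooled estimate sits closest to the two studies with the smallest standard errors, which is the intended effect of the weighting.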

Systematic review vs. literature review

A literature review is a type of review that uses a less systematic and formal approach than a systematic review. Typically, an expert in a topic will qualitatively summarize and evaluate previous work, without using a formal, explicit method.

Although literature reviews are often less time-consuming and can be insightful or helpful, they have a higher risk of bias and are less transparent than systematic reviews.

Systematic review vs. scoping review

Similar to a systematic review, a scoping review is a type of review that tries to minimize bias by using transparent and repeatable methods.

However, a scoping review isn’t a type of systematic review. The most important difference is the goal: rather than answering a specific question, a scoping review explores a topic. The researcher tries to identify the main concepts, theories, and evidence, as well as gaps in the current research.

Sometimes scoping reviews are an exploratory preparation step for a systematic review, and sometimes they are a standalone project.

When to conduct a systematic review

A systematic review is a good choice of review if you want to answer a question about the effectiveness of an intervention, such as a medical treatment.

To conduct a systematic review, you’ll need the following:

  • A precise question, usually about the effectiveness of an intervention. The question needs to be about a topic that’s previously been studied by multiple researchers. If there’s no previous research, there’s nothing to review.
  • If you’re doing a systematic review on your own (e.g., for a research paper or thesis), you should take appropriate measures to ensure the validity and reliability of your research.
  • Access to databases and journal archives. Often, your educational institution provides you with access.
  • Time. A professional systematic review is a time-consuming process: it will take the lead author about six months of full-time work. If you’re a student, you should narrow the scope of your systematic review and stick to a tight schedule.
  • Bibliographic, word-processing, spreadsheet, and statistical software. For example, you could use EndNote, Microsoft Word, Excel, and SPSS.

Pros and cons of systematic reviews

Systematic reviews have many pros.

  • They minimize research bias by considering all available evidence and evaluating each study for bias.
  • Their methods are transparent, so they can be scrutinized by others.
  • They’re thorough: they summarize all available evidence.
  • They can be replicated and updated by others.

Systematic reviews also have a few cons.

  • They’re time-consuming.
  • They’re narrow in scope: they only answer the precise research question.

Step-by-step example of a systematic review

The 7 steps for conducting a systematic review are explained with an example.

Step 1: Formulate a research question

Formulating the research question is probably the most important step of a systematic review. A clear research question will:

  • Allow you to more effectively communicate your research to other researchers and practitioners
  • Guide your decisions as you plan and conduct your systematic review

A good research question for a systematic review has four components, which you can remember with the acronym PICO:

  • Population(s) or problem(s)
  • Intervention(s)
  • Comparison(s)
  • Outcome(s)

You can rearrange these four components to write your research question:

  • What is the effectiveness of I versus C for O in P?

Sometimes, you may want to include a fifth component, the type of study design. In this case, the acronym is PICOT.

  • Type of study design(s)

In the running example, Boyle and colleagues’ question had the following components:

  • The population of patients with eczema
  • The intervention of probiotics
  • In comparison to no treatment, placebo, or non-probiotic treatment
  • The outcome of changes in participant-, parent-, and doctor-rated symptoms of eczema and quality of life
  • Randomized control trials, a type of study design

Their research question was:

  • What is the effectiveness of probiotics versus no treatment, a placebo, or a non-probiotic treatment for reducing eczema symptoms and improving quality of life in patients with eczema?
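
The PICO template above can be treated as a simple fill-in-the-blanks pattern. The sketch below is illustrative only: the class and method names are hypothetical, and the field values are taken from the example question.

```python
from dataclasses import dataclass


@dataclass
class PicoQuestion:
    """Holds the four PICO components of a review question."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def as_question(self) -> str:
        # Follows the template "What is the effectiveness of I versus C for O in P?"
        return (
            f"What is the effectiveness of {self.intervention} versus "
            f"{self.comparison} for {self.outcome} in {self.population}?"
        )


question = PicoQuestion(
    population="patients with eczema",
    intervention="probiotics",
    comparison="no treatment, a placebo, or a non-probiotic treatment",
    outcome="reducing eczema symptoms and improving quality of life",
)
print(question.as_question())
```

Writing the components down separately before composing the sentence makes it easier to reuse them later, for example as concept groups in a database search.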

Step 2: Develop a protocol

A protocol is a document that contains your research plan for the systematic review. This is an important step because having a plan allows you to work more efficiently and reduces bias.

Your protocol should include the following components:

  • Background information: Provide the context of the research question, including why it’s important.
  • Research objective(s): Rephrase your research question as an objective.
  • Selection criteria: State how you’ll decide which studies to include or exclude from your review.
  • Search strategy: Discuss your plan for finding studies.
  • Analysis: Explain what information you’ll collect from the studies and how you’ll synthesize the data.

If you’re a professional seeking to publish your review, it’s a good idea to bring together an advisory committee. This is a group of about six people who have experience in the topic you’re researching. They can help you make decisions about your protocol.

It’s highly recommended to register your protocol. Registering your protocol means submitting it to a database such as PROSPERO or ClinicalTrials.gov.

Step 3: Search for all relevant studies

Searching for relevant studies is the most time-consuming step of a systematic review.

To reduce bias, it’s important to search for relevant studies very thoroughly. Your strategy will depend on your field and your research question, but sources generally fall into these four categories:

  • Databases: Search multiple databases of peer-reviewed literature, such as PubMed or Scopus. Think carefully about how to phrase your search terms and include multiple synonyms of each word. Use Boolean operators if relevant.
  • Handsearching: In addition to searching the primary sources using databases, you’ll also need to search manually. One strategy is to scan relevant journals or conference proceedings. Another strategy is to scan the reference lists of relevant studies.
  • Gray literature: Gray literature includes documents produced by governments, universities, and other institutions that aren’t published by traditional publishers. Graduate student theses are an important type of gray literature, which you can search using the Networked Digital Library of Theses and Dissertations (NDLTD). In medicine, clinical trial registries are another important type of gray literature.
  • Experts: Contact experts in the field to ask if they have unpublished studies that should be included in your review.
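
For the database searches, one way to combine synonyms with Boolean operators is to join the synonyms for each concept with OR and then join the concepts with AND. The sketch below assumes that convention; the function name and the synonym lists are hypothetical, and each database has its own exact search syntax, so the output is a starting point rather than a ready-made query.

```python
def build_query(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND.

    Multi-word terms are wrapped in double quotes so they are searched
    as phrases.
    """
    groups = []
    for synonyms in concepts:
        quoted = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical synonym lists for two concepts in the example question.
concepts = [
    ["probiotic", "probiotics", "lactobacillus"],
    ["eczema", "atopic dermatitis"],
]
query = build_query(concepts)
print(query)
```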

At this stage of your review, you won’t read the articles yet. Simply save any potentially relevant citations using bibliographic software, such as Scribbr’s APA or MLA Generator.

In the running example, Boyle and colleagues searched the following sources:

  • Databases: EMBASE, PsycINFO, AMED, LILACS, and ISI Web of Science
  • Handsearch: Conference proceedings and reference lists of articles
  • Gray literature: The Cochrane Library, the metaRegister of Controlled Trials, and the Ongoing Skin Trials Register
  • Experts: Authors of unpublished registered trials, pharmaceutical companies, and manufacturers of probiotics

Step 4: Apply the selection criteria

Applying the selection criteria is a three-person job. Two of you will independently read the studies and decide which to include in your review based on the selection criteria you established in your protocol. The third person’s job is to break any ties.

To increase inter-rater reliability, ensure that everyone thoroughly understands the selection criteria before you begin.
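
If you want to put a number on inter-rater reliability, one commonly used statistic is Cohen's kappa, which compares the observed agreement between two raters with the agreement expected by chance. The text does not prescribe a particular statistic, so treat this as a minimal sketch under that assumption; the function name and the screening decisions are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Compute Cohen's kappa for two raters' decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two raters made the same decision.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical include/exclude decisions for ten abstracts.
reviewer_1 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "include",
              "exclude", "include", "exclude", "include", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so a low value on a pilot batch suggests the selection criteria need clarifying before screening continues.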

If you’re writing a systematic review as a student for an assignment, you might not have a team. In this case, you’ll have to apply the selection criteria on your own; you can mention this as a limitation in your paper’s discussion.

You should apply the selection criteria in two phases:

  • Based on the titles and abstracts: Decide whether each article potentially meets the selection criteria based on the information provided in the abstracts.
  • Based on the full texts: Download the articles that weren’t excluded during the first phase. If an article isn’t available online or through your library, you may need to contact the authors to ask for a copy. Read the articles and decide which articles meet the selection criteria.

It’s very important to keep a meticulous record of why you included or excluded each article. When the selection process is complete, you can summarize what you did using a PRISMA flow diagram.
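
One simple way to keep such a record is a log with one entry per article, from which the counts needed for a flow diagram can be tallied. The sketch below is only an illustration: the field names, study IDs, and exclusion reasons are all hypothetical.

```python
from collections import Counter

# Hypothetical screening log: one entry per article found by the search.
records = [
    {"id": "s1", "abstract_ok": False, "full_text_ok": None, "reason": "not about eczema"},
    {"id": "s2", "abstract_ok": True, "full_text_ok": False, "reason": "not a randomized trial"},
    {"id": "s3", "abstract_ok": True, "full_text_ok": True, "reason": None},
    {"id": "s4", "abstract_ok": True, "full_text_ok": False, "reason": "no relevant outcome"},
    {"id": "s5", "abstract_ok": False, "full_text_ok": None, "reason": "not about eczema"},
]


def flow_counts(records):
    """Tally how many articles passed or failed each screening phase."""
    screened = len(records)
    full_text = [r for r in records if r["abstract_ok"]]
    included = [r for r in full_text if r["full_text_ok"]]
    # Reasons are counted only for articles excluded at the full-text phase.
    reasons = Counter(r["reason"] for r in full_text if not r["full_text_ok"])
    return {
        "screened": screened,
        "excluded_at_abstract": screened - len(full_text),
        "full_text_assessed": len(full_text),
        "excluded_at_full_text": dict(reasons),
        "included": len(included),
    }


counts = flow_counts(records)
for stage, value in counts.items():
    print(stage, value)
```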

Next, Boyle and colleagues found the full texts for each of the remaining studies. Boyle and Tang read through the articles to decide if any more studies needed to be excluded based on the selection criteria.

When Boyle and Tang disagreed about whether a study should be excluded, they discussed it with Varigos until the three researchers came to an agreement.

Step 5: Extract the data

Extracting the data means collecting information from the selected studies in a systematic way. There are two types of information you need to collect from each study:

  • Information about the study’s methods and results. The exact information will depend on your research question, but it might include the year, study design, sample size, context, research findings, and conclusions. If any data are missing, you’ll need to contact the study’s authors.
  • Your judgment of the quality of the evidence, including risk of bias.

You should collect this information using forms. You can find sample forms in The Registry of Methods and Tools for Evidence-Informed Decision Making and the Grading of Recommendations, Assessment, Development and Evaluations Working Group.

Extracting the data is also a three-person job. Two people should do this step independently, and the third person will resolve any disagreements.
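
The two-extractors-plus-arbiter workflow can be supported by a structured form and a field-by-field comparison. The sketch below assumes a deliberately small form; the field names and values are hypothetical, and a real form would follow one of the sample forms mentioned above.

```python
from dataclasses import asdict, dataclass


@dataclass
class ExtractionForm:
    """A minimal, hypothetical data extraction form for one study."""
    study_id: str
    year: int
    study_design: str
    sample_size: int
    risk_of_bias: str


def disagreements(form_a, form_b):
    """Return the fields on which two independently completed forms differ."""
    a, b = asdict(form_a), asdict(form_b)
    return {field: (a[field], b[field]) for field in a if a[field] != b[field]}


# Two reviewers extract data from the same hypothetical study.
first = ExtractionForm("trial-01", 2005, "randomized controlled trial", 120, "low")
second = ExtractionForm("trial-01", 2005, "randomized controlled trial", 102, "unclear")

to_resolve = disagreements(first, second)
print(to_resolve)
```

Only the fields listed in `to_resolve` need to go to the third person, which keeps the arbitration step focused.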

Boyle and colleagues also collected data about possible sources of bias, such as how the study participants were randomized into the control and treatment groups.

Step 6: Synthesize the data

Synthesizing the data means bringing together the information you collected into a single, cohesive story. There are two main approaches to synthesizing the data:

  • Narrative (qualitative): Summarize the information in words. You’ll need to discuss the studies and assess their overall quality.
  • Quantitative: Use statistical methods to summarize and compare data from different studies. The most common quantitative approach is a meta-analysis, which allows you to combine results from multiple studies into a summary result.

Generally, you should use both approaches together whenever possible. If you don’t have enough data, or the data from different studies aren’t comparable, then you can take just a narrative approach. However, you should justify why a quantitative approach wasn’t possible.

Boyle and colleagues also divided the studies into subgroups, such as studies about babies, children, and adults, and analyzed the effect sizes within each group.

Step 7: Write and publish a report

The purpose of writing a systematic review article is to share the answer to your research question and explain how you arrived at this answer.

Your article should include the following sections:

  • Abstract: A summary of the review
  • Introduction: Including the rationale and objectives
  • Methods: Including the selection criteria, search method, data extraction method, and synthesis method
  • Results: Including results of the search and selection process, study characteristics, risk of bias in the studies, and synthesis results
  • Discussion: Including interpretation of the results and limitations of the review
  • Conclusion: The answer to your research question and implications for practice, policy, or research

To verify that your report includes everything it needs, you can use the PRISMA checklist.

Once your report is written, you can publish it in a systematic review database, such as the Cochrane Database of Systematic Reviews, and/or in a peer-reviewed journal.

In their report, Boyle and colleagues concluded that probiotics cannot be recommended for reducing eczema symptoms or improving quality of life in patients with eczema.

Note: Generative AI tools like ChatGPT can be useful at various stages of the writing and research process and can help you to write your systematic review. However, we strongly advise against trying to pass AI-generated text off as your own work.

Frequently asked questions about systematic reviews

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a thesis, dissertation, or research paper, in order to situate your work in relation to existing knowledge.

A literature review is a survey of credible sources on a topic, often used in dissertations, theses, and research papers. Literature reviews give an overview of knowledge on a subject, helping you identify relevant theories and methods, as well as gaps in existing research. Literature reviews are set up similarly to other academic texts, with an introduction, a main body, and a conclusion.

An annotated bibliography is a list of source references that has a short description (called an annotation) for each of the sources. It is often assigned as part of the research process for a paper.

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

Cite this Scribbr article

Turney, S. (2023, November 20). Systematic Review | Definition, Example & Guide. Scribbr. Retrieved June 20, 2024, from https://www.scribbr.com/methodology/systematic-review/

How to Write a Systematic Review of the Literature

Affiliations

  • 1 Texas Tech University, Lubbock, TX, USA.
  • 2 University of Florida, Gainesville, FL, USA.
  • PMID: 29283007
  • DOI: 10.1177/1937586717747384

This article provides a step-by-step approach to conducting and reporting systematic literature reviews (SLRs) in the domain of healthcare design and discusses some of the key quality issues associated with SLRs. SLR, as the name implies, is a systematic way of collecting, critically evaluating, integrating, and presenting findings from across multiple research studies on a research question or topic of interest. SLR provides a way to assess the quality level and magnitude of existing evidence on a question or topic of interest. It offers a broader and more accurate level of understanding than a traditional literature review. A systematic review adheres to standardized methodologies/guidelines in systematic searching, filtering, reviewing, critiquing, interpreting, synthesizing, and reporting of findings from multiple publications on a topic/domain of interest. The Cochrane Collaboration is the most well-known and widely respected global organization producing SLRs within the healthcare field and a standard to follow for any researcher seeking to write a transparent and methodologically sound SLR. Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA), like the Cochrane Collaboration, was created by an international network of health-based collaborators and provides the framework for SLR to ensure methodological rigor and quality. The PRISMA statement is an evidence-based guide consisting of a checklist and flowchart intended to be used as tools for authors seeking to write SLR and meta-analyses.

Keywords: evidence based design; healthcare design; systematic literature review.

What are systematic reviews?

Systematic reviews are a type of literature review that requires the same standards of rigour as primary research. They have a clear, logical rationale that is reported to the reader of the review. They are used in research and policymaking to inform evidence-based decisions and practice. They differ from traditional literature reviews particularly in the following elements of conduct and reporting.

Systematic reviews: 

  • use explicit and transparent methods
  • are a piece of research following a standard set of stages
  • are accountable, replicable and updateable
  • involve users to ensure a review is relevant and useful.

For example, systematic reviews (like all research) should have a clear research question and should describe the authors’ perspective in their approach to addressing it. There are clearly described methods on how each study in a review was identified, how that study was appraised for quality and relevance, and how it is combined with other studies in order to address the review question. A systematic review usually involves more than one person in order to increase the objectivity and trustworthiness of the review’s methods and findings.

Research protocols for systematic reviews may be peer-reviewed and published or registered in a suitable repository. This helps avoid duplication of reviews and allows the final review to be compared with the planned review.

  • History of systematic reviews to inform policy (EPPI-Centre)
  • Six reasons why it is important to be systematic (EPPI-Centre)
  • Evidence Synthesis International (ESI): Position Statement. Describes the issues, principles and goals in synthesising research evidence to inform policy, practice and decisions

Should all literature reviews be 'systematic reviews'?

Literature reviews provide a more complete picture of research knowledge than is possible from individual pieces of research. This can be used to: clarify what is known from research, provide new perspectives, build theory, test theory, identify research gaps or inform research agendas.

A systematic review requires a considerable amount of time and resources, and is one type of literature review.

If the purpose of a review is to make justifiable evidence claims, then it should be systematic, as a systematic review uses rigorous explicit methods. The methods used can depend on the purpose of the review, and the time and resources available.

A 'non-systematic review' might use some of the same methods as systematic reviews, such as systematic approaches to identify studies or quality appraise the literature. There may be times when this approach can be useful. In a student dissertation, for example, there may not be the time to be fully systematic in a review of the literature if this is only one small part of the thesis. In other types of research, there may also be a need to obtain a quick and not necessarily thorough overview of a literature to inform some other work (including a systematic review). Another example is where policymakers, or other people using research findings, want to make quick decisions and there is no systematic review available to help them. They have a choice of gaining a rapid overview of the research literature or not having any research evidence to help their decision-making.

Just like any other piece of research, the methods used to undertake any literature review should be carefully planned to justify the conclusions made. 

Finding out about different types of systematic reviews and the methods used for systematic reviews, and reading both systematic and other types of review will help to understand some of the differences. 

Different methods for systematic reviews

Typically, a systematic review addresses a focussed, structured research question in order to inform understanding and decisions on an area (see the Formulating a research question section for examples).

Sometimes systematic reviews ask a broad research question, and one strategy to achieve this is the use of several focussed sub-questions each addressed by sub-components of the review.  

Another strategy is to develop a map to describe the type of research that has been undertaken in relation to a research question. Some maps even describe over 2,000 papers, while others are much smaller. One purpose of a map is to help choose a sub-set of studies to explore more fully in a synthesis. There are also other purposes of maps: see the box on  systematic evidence maps  for further information. 

Reporting standards for systematic reviews

Reporting standards specify minimum elements that need to go into the reporting of a review. The reporting standards refer mainly to methodological issues but they are not as detailed or specific as critical appraisal for the methodological standards of conduct of a review.

A number of organisations have developed specific guidelines and standards for both conducting and reporting systematic reviews in different topic areas.

  • PRISMA: PRISMA is a reporting standard and is an acronym for Preferred Reporting Items for Systematic Reviews and Meta-Analyses. The Key Documents section of the PRISMA website links to a checklist, flow diagram and explanatory notes. PRISMA is less useful for certain types of reviews, including those that are iterative.
  • eMERGe: eMERGe is a reporting standard that has been developed for meta-ethnographies, a qualitative synthesis method.
  • ROSES (RepOrting standards for Systematic Evidence Syntheses): Reporting standards, including forms and flow diagram, designed specifically for systematic reviews and maps in the field of conservation and environmental management.

Useful books about systematic reviews

Systematic approaches to a successful literature review

An introduction to systematic reviews

Cochrane handbook for systematic reviews of interventions

Systematic reviews: CRD's guidance for undertaking reviews in health care

Finding what works in health care: Standards for systematic reviews

Systematic Reviews in the Social Sciences

Meta-analysis and research synthesis

Research Synthesis and Meta-Analysis

Doing a Systematic Review

Literature reviews

  • What is a literature review?
  • Why are literature reviews important?
  • Last Updated: May 30, 2024 4:38 PM
  • URL: https://library-guides.ucl.ac.uk/systematic-reviews

Systematic Reviews

What Makes a Systematic Review Different from Other Types of Reviews?

Reproduced from Grant, M. J. and Booth, A. (2009), A typology of reviews: an analysis of 14 review types and associated methodologies. Health Information & Libraries Journal, 26: 91–108. doi:10.1111/j.1471-1842.2009.00848.x

Critical review
  • Description: Aims to demonstrate writer has extensively researched literature and critically evaluated its quality. Goes beyond mere description to include degree of analysis and conceptual innovation. Typically results in hypothesis or model
  • Search: Seeks to identify most significant items in the field
  • Appraisal: No formal quality assessment. Attempts to evaluate according to contribution
  • Synthesis: Typically narrative, perhaps conceptual or chronological
  • Analysis: Significant component: seeks to identify conceptual contribution to embody existing or derive new theory

Literature review
  • Description: Generic term: published materials that provide examination of recent or current literature. Can cover wide range of subjects at various levels of completeness and comprehensiveness. May include research findings
  • Search: May or may not include comprehensive searching
  • Appraisal: May or may not include quality assessment
  • Synthesis: Typically narrative
  • Analysis: Analysis may be chronological, conceptual, thematic, etc.

Mapping review/systematic map
  • Description: Map out and categorize existing literature from which to commission further reviews and/or primary research by identifying gaps in research literature
  • Search: Completeness of searching determined by time/scope constraints
  • Appraisal: No formal quality assessment
  • Synthesis: May be graphical and tabular
  • Analysis: Characterizes quantity and quality of literature, perhaps by study design and other key features. May identify need for primary or secondary research

Meta-analysis
  • Description: Technique that statistically combines the results of quantitative studies to provide a more precise effect of the results
  • Search: Aims for exhaustive, comprehensive searching. May use funnel plot to assess completeness
  • Appraisal: Quality assessment may determine inclusion/exclusion and/or sensitivity analyses
  • Synthesis: Graphical and tabular with narrative commentary
  • Analysis: Numerical analysis of measures of effect assuming absence of heterogeneity

Mixed studies review/mixed methods review
  • Description: Refers to any combination of methods where one significant component is a literature review (usually systematic). Within a review context it refers to a combination of review approaches, for example combining quantitative with qualitative research or outcome with process studies
  • Search: Requires either very sensitive search to retrieve all studies or separately conceived quantitative and qualitative strategies
  • Appraisal: Requires either a generic appraisal instrument or separate appraisal processes with corresponding checklists
  • Synthesis: Typically both components will be presented as narrative and in tables. May also employ graphical means of integrating quantitative and qualitative studies
  • Analysis: Analysis may characterise both literatures and look for correlations between characteristics or use gap analysis to identify aspects absent in one literature but missing in the other

Overview
  • Description: Generic term: summary of the [medical] literature that attempts to survey the literature and describe its characteristics
  • Search: May or may not include comprehensive searching (depends whether systematic overview or not)
  • Appraisal: May or may not include quality assessment (depends whether systematic overview or not)
  • Synthesis: Synthesis depends on whether systematic or not. Typically narrative but may include tabular features
  • Analysis: Analysis may be chronological, conceptual, thematic, etc.

Qualitative systematic review/qualitative evidence synthesis
  • Description: Method for integrating or comparing the findings from qualitative studies. It looks for ‘themes’ or ‘constructs’ that lie in or across individual qualitative studies
  • Search: May employ selective or purposive sampling
  • Appraisal: Quality assessment typically used to mediate messages not for inclusion/exclusion
  • Synthesis: Qualitative, narrative synthesis
  • Analysis: Thematic analysis, may include conceptual models

Rapid review
  • Description: Assessment of what is already known about a policy or practice issue, by using systematic review methods to search and critically appraise existing research
  • Search: Completeness of searching determined by time constraints
  • Appraisal: Time-limited formal quality assessment
  • Synthesis: Typically narrative and tabular
  • Analysis: Quantities of literature and overall quality/direction of effect of literature

Scoping review
  • Description: Preliminary assessment of potential size and scope of available research literature. Aims to identify nature and extent of research evidence (usually including ongoing research)
  • Search: Completeness of searching determined by time/scope constraints. May include research in progress
  • Appraisal: No formal quality assessment
  • Synthesis: Typically tabular with some narrative commentary
  • Analysis: Characterizes quantity and quality of literature, perhaps by study design and other key features. Attempts to specify a viable review

State-of-the-art review
  • Description: Tend to address more current matters in contrast to other combined retrospective and current approaches. May offer new perspectives
  • Search: Aims for comprehensive searching of current literature
  • Appraisal: No formal quality assessment
  • Synthesis: Typically narrative, may have tabular accompaniment
  • Analysis: Current state of knowledge and priorities for future investigation and research

Systematic review
  • Description: Seeks to systematically search for, appraise and synthesize research evidence, often adhering to guidelines on the conduct of a review
  • Search: Aims for exhaustive, comprehensive searching
  • Appraisal: Quality assessment may determine inclusion/exclusion
  • Synthesis: Typically narrative with tabular accompaniment
  • Analysis: What is known; recommendations for practice. What remains unknown; uncertainty around findings, recommendations for future research

Systematic search and review
  • Description: Combines strengths of critical review with a comprehensive search process. Typically addresses broad questions to produce ‘best evidence synthesis’
  • Search: Aims for exhaustive, comprehensive searching
  • Appraisal: May or may not include quality assessment
  • Synthesis: Minimal narrative, tabular summary of studies
  • Analysis: What is known; recommendations for practice. Limitations

Systematized review
  • Description: Attempt to include elements of systematic review process while stopping short of systematic review. Typically conducted as postgraduate student assignment
  • Search: May or may not include comprehensive searching
  • Appraisal: May or may not include quality assessment
  • Synthesis: Typically narrative with tabular accompaniment
  • Analysis: What is known; uncertainty around findings; limitations of methodology

Umbrella review
  • Description: Specifically refers to review compiling evidence from multiple reviews into one accessible and usable document. Focuses on broad condition or problem for which there are competing interventions and highlights reviews that address these interventions and their results
  • Search: Identification of component reviews, but no search for primary studies
  • Appraisal: Quality assessment of studies within component reviews and/or of reviews themselves
  • Synthesis: Graphical and tabular with narrative commentary
  • Analysis: What is known; recommendations for practice. What remains unknown; recommendations for future research

  • Last Updated: Apr 17, 2024 2:02 PM
  • URL: https://guides.library.ucla.edu/systematicreviews

What is a Systematic Literature Review?

A systematic literature review (SLR) is an independent academic method that aims to identify and evaluate all relevant literature on a topic in order to derive conclusions about the question under consideration. "Systematic reviews are undertaken to clarify the state of existing research and the implications that should be drawn from this." (Feak & Swales, 2009, p. 3) An SLR can demonstrate the current state of research on a topic, while identifying gaps and areas requiring further research with regard to a given research question. A formal methodological approach is pursued in order to reduce distortions caused by an overly restrictive selection of the available literature and to increase the reliability of the literature selected (Tranfield, Denyer & Smart, 2003). A special aspect in this regard is the fact that a research objective is defined for the search itself and the criteria for determining what is to be included and excluded are defined prior to conducting the search. The search is mainly performed in electronic literature databases (such as Business Source Complete or Web of Science), but also includes manual searches (reviews of reference lists in relevant sources) and the identification of literature not yet published in order to obtain a comprehensive overview of a research topic.

An SLR protocol documents all the information gathered and the steps taken as part of an SLR in order to make the selection process transparent and reproducible. The PRISMA flow diagram helps you make the selection process visible.

In an ideal scenario, experts from the respective research discipline, as well as experts working in the relevant field and in libraries, should be involved in setting the search terms. As a rule, the literature is selected by two or more reviewers working independently of one another. Both measures serve the purpose of increasing the objectivity of the literature selection. An SLR must, then, be more than merely a summary of a topic (Briner & Denyer, 2012). As such, it also distinguishes itself from “ordinary” surveys of the available literature. The following table shows the differences between an SLR and an “ordinary” literature review.

  • Charts of BSWL workshop (pdf, 2.88 MB)
  • Listen to the interview (mp4, 12.35 MB)

Differences to "common" literature reviews

Characteristic | SLR | Common literature overview
Independent research method | yes | no
Explicit formulation of the search objectives | yes | no
Identification of all publications on a topic | yes | no
Defined criteria for inclusion and exclusion of publications | yes | no
Description of search procedure | yes | no
Literature selection and information extraction by several persons | yes | no
Transparent quality evaluation of publications | yes | no

What are the objectives of SLRs?

  • Avoidance of research redundancies despite a growing number of publications
  • Identification of research areas, gaps and methods
  • Input for evidence-based management, which allows management decisions to be based on scientific methods and findings
  • Identification of links between different areas of research

Process steps of an SLR

An SLR has several process steps, which are defined differently in the literature (Fink 2014, p. 4; Guba 2008; Tranfield et al. 2003). We distinguish the following steps, which are adapted to the economics and management research area:

1. Defining research questions

Briner & Denyer (2009, p. 347ff.) have developed the CIMO scheme to establish clearly formulated and answerable research questions in the field of economic sciences:

C – Context:  Which individuals, relationships, institutional frameworks and systems are being investigated?

I – Intervention:  The effects of which event, action or activity are being investigated?

M – Mechanisms:  Which mechanisms can explain the relationship between interventions and results? Under what conditions do these mechanisms take effect?

O – Outcomes:  What are the effects of the intervention? How are the results measured? What are intended and unintended effects?

The objective of the systematic literature review is used to formulate research questions such as “How can a project team be led effectively?”. Since there are numerous interpretations and constructs for “effective”, “leadership” and “project team”, these terms must be specified more precisely.

With the aid of the scheme, the following concrete research questions can be derived with regard to this example:

Under what conditions (C) does leadership style (I) influence the performance of project teams (O)?

Which constructs have an effect upon the influence of leadership style (I) on a project team’s performance (O)?          

Research questions do not necessarily need to follow the CIMO scheme, but they should:

  • ... be formulated in a clear, focused and comprehensible manner and be answerable;
  • ... have been determined prior to carrying out the SLR;
  • ... consist of general and specific questions.

The criteria for inclusion and exclusion are also defined at this early stage. The selection of the criteria must be well-grounded. This may include conceptual factors such as geographical or temporal restrictions, congruent definitions of constructs, as well as quality criteria (journal impact factor > x).

2. Selecting databases and other research sources

The selection of sources must be described and explained in detail. The aim is to find a balance between the relevance of the sources (content-related fit) and the scope of the sources.

In the field of economic sciences, there are a number of literature databases that can be searched as part of an SLR. Some examples in this regard are:

  • Business Source Complete
  • ProQuest One Business
  • EconBiz        

Our video "Selecting the right databases" explains how to find relevant databases for your topic.

Literature databases are an important source of research for SLRs, as they can minimize distortions caused by an individual literature selection (selection bias), while offering advantages for a systematic search due to their data structure. The aim is to find all database entries on a topic and thus keep the retrieval bias low (tutorial on retrieval bias). Besides articles from scientific journals, it is important to include working papers, conference proceedings, etc. to reduce publication bias (tutorial on publication bias).

Our online self-study course "Searching economic databases" explains steps 2 and 3.

3. Defining search terms

Once the literature databases and other research sources have been selected, search terms are defined. For this purpose, the research topic/questions is/are divided into blocks of terms of equal ranking. This approach is called the block-building method (Guba 2008, p. 63). The so-called document-term matrix, which lists topic blocks and search terms according to a scheme, is helpful in this regard. The aim is to identify as many different synonyms as possible for the partial terms. A precisely formulated research question facilitates the identification of relevant search terms. In addition, keywords from particularly relevant articles support the formulation of search terms.

A document-term matrix for the topic “The influence of management style on the performance of project teams” is shown in this example .

Identification of subject headings and keywords

When setting search terms, a distinction must be made between subject headings and keywords, both of which are described below:

Keywords

  • appear in the title, abstract and/or text
  • sometimes specified by the author, but in most cases automatically generated
  • non-standardized
  • different spellings and forms (singular/plural) must be searched separately

Subject headings

  • describe the content
  • are generated by an editorial team
  • are listed in a standardized list (thesaurus)
  • may comprise various keywords
  • include different spellings
  • database-specific

Subject headings are a standardized list of words that are generated by the specialists in charge of some databases. This so-called index of subject headings (thesaurus) helps searchers find relevant articles, since the subject headings indicate the content of a publication. By contrast, an ordinary keyword search does not necessarily result in a content-related fit, since the database also displays articles in which, for example, a word appears once in the abstract, even though the article’s content does not cover the topic.

Nevertheless, searches using both subject headings and keywords should be conducted, since some articles may not yet have been assigned subject headings, or errors may have occurred during their assignment.

To add subject headings to your search in the Business Source Complete database, select the Thesaurus tab at the top. Here you can find subject headings in a new search field and integrate them into your search query. In the search history, subject headings are marked with the addition DE (descriptor).

The EconBiz database of the German National Library of Economics (ZBW – Leibniz Information Centre for Economics), which also contains German-language literature, has created its own index of subject headings with the STW Thesaurus for Economics. Subject headings are integrated into the search by being used in the search query.

Since the indexes of subject headings divide terms into synonyms, generic terms and sub-aspects, they facilitate the creation of a document-term matrix. For this purpose it is advisable to specify in the document-term matrix the origin of the search terms (STW Thesaurus for Economics, Business Source Complete, etc.).

Searching in literature databases

Once the document-term matrix has been defined, the search in literature databases begins. It is recommended to first enter each word of the document-term matrix individually into the database in order to obtain a good overview of the number of hits per word. Next, all the words contained in a block of terms are linked with the Boolean operator OR, which forms the union of their hits. The blocks are then linked with each other using the Boolean operator AND. In doing so, each block should be added individually in order to see to what degree the number of hits decreases.
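
The OR-within-blocks, AND-between-blocks logic can be sketched in a few lines of Python. This is only an illustration: the function name `build_query` and the example terms are assumptions based on the leadership example, and the exact quoting and operator syntax differs between databases.

```python
def build_query(blocks):
    """Join the terms of each block with OR, then join the blocks with AND."""
    parts = []
    for terms in blocks:
        # Quote multi-word phrases so they are searched as a unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        parts.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(parts)


# Hypothetical blocks of synonyms taken from a document-term matrix
blocks = [
    ["leadership style", "management style"],
    ["project team", "project group"],
    ["performance", "effectiveness"],
]
query = build_query(blocks)
print(query)
```

Passing only the first block, then the first two, and so on, yields the step-by-step queries needed to watch how the number of hits decreases as each block is added.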

Since the search query must be set up separately for each database, tools such as  LitSonar  have been developed to enable a systematic search across different databases. LitSonar was created by  Professor Dr. Ali Sunyaev (Institute of Applied Informatics and Formal Description Methods – AIFB) at the Karlsruhe Institute of Technology.

Advanced search

Certain database-specific commands can be used to refine a search, for example, by taking variable word endings into account (*) or specifying the distance between two words, etc. Our overview shows the most important search commands for our top databases.

Additional searches in sources other than literature databases

In addition to literature databases, other sources should also be searched. Fink (2014, p. 27) lists the following reasons for this:

  • the topic is new and not yet included in indexes of subject headings;
  • search terms are not used congruently in articles because uniform definitions do not exist;
  • some studies are still in the process of being published, or have been completed, but not published.

Therefore, further search strategies are manual search, bibliographic analysis, personal contacts and academic networks (Briner & Denyer, p. 349). Manual search means that you go through the source information of relevant articles and supplement your hit list accordingly. In addition, you should conduct a targeted search for so-called gray literature, that is, literature not distributed via the book trade, such as working papers from specialist areas and conference reports. By including different types of publications, the so-called publication bias (DBWM video “Understanding publication bias” ) – that is, distortions due to exclusive use of articles from peer-reviewed journals – should be kept to a minimum.

The PRESS checklist can help you check the correctness of your search terms.

4. Merging hits from different databases

In principle, large amounts of data can be easily collected, structured and sorted with data processing programs such as Excel. Another option is to use reference management programs such as EndNote, Citavi or Zotero. The Saxon State and University Library Dresden (SLUB Dresden) provides an  overview of current reference management programs  . Software for qualitative data analysis such as NVivo is equally suited for data processing. A comprehensive overview of the features of different tools that support the SLR process can be found in Bandara et al. (2015).

Our online self-study course "Managing literature with Citavi" shows you how to use the reference management software Citavi.

When conducting an SLR, you should specify for each hit the database from which it originates and the date on which the query was made. In addition, you should always indicate how many hits you have identified in the various databases or, for example, by manual search.

Exporting data from literature databases

Exporting from literature databases is very easy. In  Business Source Complete  , you must first click on the “Share” button in the hit list, then “Email a link to download exported results” at the very bottom and then select the appropriate format for the respective literature program.

Exporting data from the literature database  EconBiz  is somewhat more complex. Here you must first create a marked list and then select each hit individually and add it to the marked list. Afterwards, articles on the list can be exported.

After merging all hits from the various databases, duplicate entries (duplicates) are deleted.
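
Reference managers usually handle merging and deduplication, but the idea can be sketched in Python. This is a minimal sketch under stated assumptions: the field names (`title`, `doi`, `source`, `queried`), the DOIs and the matching rule (same DOI or same normalized title) are all illustrative, and real tools use more careful matching.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_hits(hits):
    """Keep the first record for each DOI or normalized title; drop later duplicates."""
    seen = set()
    unique = []
    for hit in hits:
        keys = {("title", normalize_title(hit["title"]))}
        if hit.get("doi"):
            keys.add(("doi", hit["doi"].lower()))
        if seen.isdisjoint(keys):
            unique.append(hit)
        seen |= keys
    return unique


# Hypothetical hits; each one records its source database and query date
hits = [
    {"title": "Leadership in Project Teams", "doi": "10.1000/example.1",
     "source": "Business Source Complete", "queried": "2024-05-01"},
    {"title": "Leadership in project teams.", "doi": None,
     "source": "EconBiz", "queried": "2024-05-02"},
    {"title": "Team Performance Revisited", "doi": "10.1000/example.2",
     "source": "EconBiz", "queried": "2024-05-02"},
]
unique = merge_hits(hits)
print(len(hits), "hits,", len(unique), "after removing duplicates")
```

Keeping `source` and `queried` on every record preserves the per-database hit counts and query dates that the SLR has to report.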

5. Applying inclusion and exclusion criteria

All publications are evaluated in the literature management program by applying the previously defined criteria for inclusion and exclusion. Only those sources that pass this selection process will subsequently be analyzed. The review process and inclusion criteria should be tested with a small sample, and adjusted if necessary, before they are applied to all articles. Ideally, this selection is also carried out by more than one person, each working independently of the others. It needs to be made clear how discrepancies between reviewers are dealt with.
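
When two reviewers screen independently, their decisions can be compared to list the records that need discussion. A minimal sketch, assuming each reviewer's decisions are stored as a mapping from a record ID to "include" or "exclude"; the IDs and the function name `find_disagreements` are hypothetical.

```python
def find_disagreements(reviewer_a, reviewer_b):
    """Return the record IDs on which two independent decisions differ."""
    shared = reviewer_a.keys() & reviewer_b.keys()
    return sorted(rid for rid in shared if reviewer_a[rid] != reviewer_b[rid])


# Hypothetical screening decisions from two reviewers
decisions_a = {"rec1": "include", "rec2": "exclude", "rec3": "include"}
decisions_b = {"rec1": "include", "rec2": "include", "rec3": "include"}
conflicts = find_disagreements(decisions_a, decisions_b)
print(conflicts)
```

How the listed conflicts are then resolved, for example by discussion or by a third reviewer, is a procedural choice that should be documented in the protocol.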

In the first step, screening against the inclusion and exclusion criteria is based primarily on the title, abstract and subject headings in the databases, as well as on the keywords provided by the authors of a publication. In the second step, the whole article or source is read.

You can create tag words for the inclusion and exclusion in your literature management tool to keep an overview.

In addition to the common literature management tools, you can also use software tools that have been developed to support SLRs. The central library of the university in Zurich has published an overview and evaluation of different tools based on a survey among researchers. --> View SLR tools

The selection process needs to be made transparent. The PRISMA flow diagram supports the visualization of the number of included / excluded studies.
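
The numbers shown in such a flow diagram follow from simple bookkeeping. The sketch below derives them from the counts recorded during selection; the stage names are a simplification rather than the official PRISMA wording, and all figures are invented for illustration.

```python
def flow_counts(identified, duplicates, excluded_title_abstract, excluded_full_text):
    """Derive the number of records remaining after each selection stage."""
    screened = identified - duplicates
    full_text_assessed = screened - excluded_title_abstract
    included = full_text_assessed - excluded_full_text
    if included < 0:
        raise ValueError("exclusions exceed the number of records identified")
    return {
        "identified": identified,
        "duplicates_removed": duplicates,
        "screened": screened,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


# Hypothetical counts from a small review
counts = flow_counts(identified=480, duplicates=120,
                     excluded_title_abstract=290, excluded_full_text=45)
for stage, n in counts.items():
    print(f"{stage}: {n}")
```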

Forward and backward search

Should it become apparent that the number of sources found is relatively small, or if you wish to proceed with particular thoroughness, a forward and backward search based on the sources found is recommended (Webster & Watson 2002, p. xvi). A backward search means going through the bibliographies of the sources found. A forward search, by contrast, identifies articles that have cited the relevant publications. The Web of Science and Scopus databases can be used to perform citation analyses.
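
The difference between the two directions can be illustrated with a toy citation map. In practice the citation data comes from a database such as Web of Science or Scopus; here a hypothetical dictionary mapping each paper to the papers it cites stands in for that data.

```python
def backward_search(paper, references):
    """Papers cited by `paper` (its bibliography)."""
    return set(references.get(paper, []))


def forward_search(paper, references):
    """Papers whose bibliographies cite `paper`."""
    return {citing for citing, cited in references.items() if paper in cited}


# Hypothetical citation data: paper -> papers it cites
references = {
    "A": ["B", "C"],
    "D": ["A"],
    "E": ["A", "B"],
}
print(sorted(backward_search("A", references)))
print(sorted(forward_search("A", references)))
```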

6. Performing the review

As the next step, the remaining titles are analyzed as to their content by reading them several times in full. Information is extracted according to defined criteria and the quality of the publications is evaluated. If the data extraction is carried out by more than one person, training should ensure that the reviewers extract information consistently.

Depending on the research questions, there are different methods for data abstraction (content analysis, concept matrix, etc.). A so-called concept matrix can be used to structure the content of information (Webster & Watson 2002, p. xvii). The table below gives an example of a concept matrix according to Becker (2014).

Particularly in the field of economic sciences, the evaluation of a study’s quality cannot be performed according to a generally valid scheme, such as those existing in the field of medicine, for instance. Quality assessment therefore depends largely on the research questions.

Based on the findings of individual studies, a meta-level is then applied to try to understand what similarities and differences exist between the publications, what research gaps exist, etc. This may also result in the development of a theoretical model or reference framework.

Example concept matrix (Becker 2013) on the topic Business Process Management

Article | Pattern | Configuration | Similarities
Thom (2008) | x | |
Yang (2009) | x | | x
Rosa (2009) | | x | x
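
A concept matrix of this kind can be generated from coded articles. A minimal sketch, assuming each article has been manually coded with the concepts it covers; the function name `concept_matrix` is illustrative and the coding mirrors the example above.

```python
def concept_matrix(coding, concepts):
    """Return a header row plus one row per article, marking covered concepts with 'x'."""
    rows = [["Article"] + list(concepts)]
    for article, covered in coding.items():
        rows.append([article] + ["x" if c in covered else "" for c in concepts])
    return rows


concepts = ["Pattern", "Configuration", "Similarities"]
coding = {
    "Thom (2008)": {"Pattern"},
    "Yang (2009)": {"Pattern", "Similarities"},
    "Rosa (2009)": {"Configuration", "Similarities"},
}
matrix = concept_matrix(coding, concepts)
for row in matrix:
    print(" | ".join(cell or "-" for cell in row))
```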

7. Synthesizing results

Once the review has been conducted, the results must be compiled and, on the basis of these, conclusions derived with regard to the research question (Fink 2014, p. 199ff.). This includes, for example, the following aspects:

  • historical development of topics (histogram, time series: when, and how frequently, did publications on the research topic appear?);
  • overview of journals, authors or specialist disciplines dealing with the topic;
  • comparison of applied statistical methods;
  • topics covered by research;
  • identifying research gaps;
  • developing a reference framework;
  • developing constructs;
  • performing a meta-analysis: comparison of the correlations of the results of different empirical studies (see for example Fink 2014, p. 203 on conducting meta-analyses)
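
As an illustration of the first aspect in this list, the number of publications per year can be counted directly from the included records. The records below are invented, and the field name `year` is an assumption about how the reference data is stored.

```python
from collections import Counter


def publications_per_year(records):
    """Count included publications per publication year, sorted by year."""
    counts = Counter(record["year"] for record in records)
    return dict(sorted(counts.items()))


# Hypothetical included records
records = [
    {"title": "Study 1", "year": 2012},
    {"title": "Study 2", "year": 2015},
    {"title": "Study 3", "year": 2015},
    {"title": "Study 4", "year": 2013},
]
per_year = publications_per_year(records)
for year, n in per_year.items():
    print(year, "#" * n)
```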

Publications about the method

Bandara, W., Furtmueller, E., Miskon, S., Gorbacheva, E., & Beekhuyzen, J. (2015). Achieving Rigor in Literature Reviews: Insights from Qualitative Data Analysis and Tool-Support.  Communications of the Association for Information Systems . 34(8), 154-204.

Booth, A., Papaioannou, D., and Sutton, A. (2012)  Systematic approaches to a successful literature review.  London: Sage.

Briner, R. B., & Denyer, D. (2012). Systematic Review and Evidence Synthesis as a Practice and Scholarship Tool. In Rousseau, D. M. (Ed.),  The Oxford Handbook of Evidence-Based Management  (pp. 112-129). Oxford: Oxford University Press.

Durach, C. F., Wieland, A., & Machuca, Jose A. D. (2015). Antecedents and dimensions of supply chain robustness: a systematic literature review.  International Journal of Physical Distribution & Logistics Management , 45(1/2), 118-137. doi:  https://doi.org/10.1108/IJPDLM-05-2013-0133

Feak, C. B., & Swales, J. M. (2009). Telling a Research Story: Writing a Literature Review.  English in Today's Research World 2.  Ann Arbor: University of Michigan Press. doi:  10.3998/mpub.309338

Fink, A. (2014).  Conducting Research Literature Reviews: From the Internet to Paper  (4th ed.). Los Angeles, London, New Delhi, Singapore, Washington DC: Sage Publication.

Fisch, C., & Block, J. (2018). Six tips for your (systematic) literature review in business and management research.  Management Review Quarterly,  68, 103–106.  doi.org/10.1007/s11301-018-0142-x

Guba, B. (2008). Systematische Literaturrecherche.  Wiener Medizinische Wochenschrift , 158 (1-2), pp. 62-69. doi:  doi.org/10.1007/s10354-007-0500-0

Hart, C.  Doing a literature review: releasing the social science research imagination.  London: Sage.

Jesson, J. K., Matheson, L. & Lacey, F. (2011).  Doing your Literature Review - traditional and Systematic Techniques . Los Angeles, London, New Delhi, Singapore, Washington DC: Sage Publication.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71. doi: 10.1136/bmj.n71.

Petticrew, M. and Roberts, H. (2006).  Systematic Reviews in the Social Sciences: A Practical Guide . Oxford: Blackwell.

Ridley, D. (2012).  The literature review: A step-by-step guide . 2nd edn. London: Sage.

Chang, W. and Taylor, S.A. (2016), The Effectiveness of Customer Participation in New Product Development: A Meta-Analysis,  Journal of Marketing , American Marketing Association, Los Angeles, CA, Vol. 80 No. 1, pp. 47–64.

Tranfield, D., Denyer, D. & Smart, P. (2003). Towards a methodology for developing evidence-informed management knowledge by means of systematic review.  British Journal of Management , 14 (3), pp. 207-222. doi:  https://doi.org/10.1111/1467-8551.00375

Webster, J., & Watson, R. T. (2002). Analyzing the Past to Prepare for the Future: Writing a Literature Review.  Management Information Systems Quarterly , 26(2), xiii-xxiii.  http://www.jstor.org/stable/4132319

Durach, C. F., Wieland, A. & Machuca, Jose. A. D. (2015). Antecedents and dimensions of supply chain robustness: a systematic literature review. International Journal of Physical Distribution & Logistics Management, 45(1/2), 118 – 137.

What is particularly good about this example is that search terms were defined by a number of experts and the review was conducted by three researchers working independently of one another. Furthermore, the search terms used have been very well extracted and the procedure of the literature selection very well described.

On the downside, the restriction to English-language literature brings the language bias into play, even though the authors consider it to be insignificant for the subject area.

Bos-Nehles, A., Renkema, M. & Janssen, M. (2017). HRM and innovative work behaviour: a systematic literature review. Personnel Review, 46(7), pp. 1228-1253

  • Only very specific keywords used
  • No precise information on how the review process was carried out (who reviewed articles?)
  • Only journals with impact factor (publication bias)

Jia, F., Orzes, G., Sartor, M. & Nassimbeni, G. (2017). Global sourcing strategy and structure: towards a conceptual framework. International Journal of Operations & Production Management, 37(7), 840-864

  • Research questions are explicitly presented
  • Search string very detailed
  • Exact description of the review process
  • 2 persons conducted the review independently of each other

Franziska Klatt

+49 30 314-29778

Systematic Reviews: Home

Created by health science librarians.

What is a Systematic Review?

  • Step 1: Complete Pre-Review Tasks
  • Step 2: Develop a Protocol
  • Step 3: Conduct Literature Searches
  • Step 4: Manage Citations
  • Step 5: Screen Citations
  • Step 6: Assess Quality of Included Studies
  • Step 7: Extract Data from Included Studies
  • Step 8: Write the Review

A systematic review is a literature review that gathers all of the available evidence matching pre-specified eligibility criteria to answer a specific research question. It uses explicit, systematic methods, documented in a protocol, to minimize bias, provide reliable findings, and inform decision-making. ¹

There are many types of literature reviews.

Before beginning a systematic review, consider whether it is the best type of review for your question, goals, and resources. The table below compares a few different types of reviews to help you decide which is best for you. 

Comparing Systematic, Scoping, and Systematized Reviews
Systematic Review | Scoping Review | Systematized Review
Conducted for Publication | Conducted for Publication | Conducted for Assignment, Thesis, or (Possibly) Publication
Protocol Required | Protocol Required | No Protocol Required
Focused Research Question | Broad Research Question | Either
Focused Inclusion & Exclusion Criteria | Broad Inclusion & Exclusion Criteria | Either
Requires Large Team | Requires Small Team | Usually 1-2 People
  • Scoping Review Guide For more information about scoping reviews, refer to the UNC HSL Scoping Review Guide.

Systematic Reviews: A Simplified, Step-by-Step Process Map

  • UNC HSL's Simplified, Step-by-Step Process Map A PDF file of the HSL's Systematic Review Process Map.
  • Text-Only: UNC HSL's Systematic Reviews - A Simplified, Step-by-Step Process A text-only PDF file of HSL's Systematic Review Process Map.

Creative commons license applied to systematic reviews image requires that reusers give credit to the creator. It allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, for noncommercial purposes only.

The average systematic review takes 1,168 hours to complete. ¹ A librarian can help you speed up the process.

Systematic reviews follow established guidelines and best practices to produce high-quality research. Librarian involvement in systematic reviews is based on two levels. In Tier 1, your research team can consult with the librarian as needed. The librarian will answer questions and give you recommendations for tools to use. In Tier 2, the librarian will be an active member of your research team and co-author on your review. Roles and expectations of librarians vary based on the level of involvement desired. Examples of these differences are outlined in the table below.

Roles and expectations of librarians based on level of involvement desired.

Tasks | Tier 1: Consultative | Tier 2: Research Partner / Co-author
Guidance on process and steps | Yes | Yes
Background searching for past and upcoming reviews | Yes | Yes
Development and/or refinement of review topic | Yes | Yes
Assistance with refinement of PICO (population, intervention(s), comparator(s), outcome(s)) and key questions | Yes | Yes
Guidance on study types to include | Yes | Yes
Guidance on protocol registration | Yes | Yes
Identification of databases for searches | Yes | Yes
Instruction in search techniques and methods | Yes | Yes
Training in citation management software use for managing and sharing results | Yes | Yes
Development and execution of searches | No | Yes
Downloading search results to citation management software and removing duplicates | No | Yes
Documentation of search strategies | No | Yes
Management of search results | No | Yes
Guidance on methods | Yes | Yes
Guidance on data extraction and management techniques and software | Yes | Yes
Suggestions of journals to target for publication | Yes | Yes
Drafting of literature search description in "Methods" section | No | Yes
Creation of PRISMA diagram | No | Yes
Drafting of literature search appendix | No | Yes
Review of other manuscript sections and final draft | No | Yes
Librarian contributions warrant co-authorship | No | Yes


Researchers conduct systematic reviews in a variety of disciplines.  If your focus is on a topic outside of the health sciences, you may want to also consult the resources below to learn how systematic reviews may vary in your field.  You can also contact a librarian for your discipline with questions.

  • EPPI-Centre methods for conducting systematic reviews: The EPPI-Centre develops methods and tools for conducting systematic reviews, including reviews for education, public and social policy.

Environmental Topics

  • Collaboration for Environmental Evidence (CEE): CEE seeks to promote and deliver evidence syntheses on issues of greatest concern to environmental policy and practice as a public service.

Social Sciences

  • Siddaway AP, Wood AM, Hedges LV. How to Do a Systematic Review: A Best Practice Guide for Conducting and Reporting Narrative Reviews, Meta-Analyses, and Meta-Syntheses. Annu Rev Psychol. 2019 Jan 4;70:747-770. doi: 10.1146/annurev-psych-010418-102803. A resource for psychology systematic reviews, which also covers qualitative meta-syntheses or meta-ethnographies
  • The Campbell Collaboration

Social Work

Software engineering

  • Guidelines for Performing Systematic Literature Reviews in Software Engineering: The objective of this report is to propose comprehensive guidelines for systematic literature reviews appropriate for software engineering researchers, including PhD students.

Sport, Exercise, & Nutrition

  • Application of systematic review methodology to the field of nutrition, by Tufts Evidence-based Practice Center. Publication Date: 2009
  • Systematic Reviews and Meta-Analysis — Open & Free (Open Learning Initiative) The course follows guidelines and standards developed by the Campbell Collaboration, based on empirical evidence about how to produce the most comprehensive and accurate reviews of research

  • Systematic Reviews, by David Gough, Sandy Oliver & James Thomas. Publication Date: 2020

Updating reviews

  • Updating systematic reviews, by University of Ottawa Evidence-based Practice Center. Publication Date: 2007
  • Last Updated: May 16, 2024 3:24 PM
  • URL: https://guides.lib.unc.edu/systematic-reviews

How to write a systematic literature review [9 steps]

What is a systematic literature review?

Where are systematic literature reviews used, what types of systematic literature reviews are there, how to write a systematic literature review, 1. decide on your team, 2. formulate your question, 3. plan your research protocol, 4. search for the literature, 5. screen the literature, 6. assess the quality of the studies, 7. extract the data, 8. analyze the results, 9. interpret and present the results, registering your systematic literature review, frequently asked questions about writing a systematic literature review, related articles.

A systematic literature review is a summary, analysis, and evaluation of all the existing research on a well-formulated and specific question.

Put simply, a systematic review is a study of studies that is popular in medical and healthcare research. In this guide, we will cover:

  • the definition of a systematic literature review
  • the purpose of a systematic literature review
  • the different types of systematic reviews
  • how to write a systematic literature review

➡️ Visit our guide to the best research databases for medicine and health to find resources for your systematic review.

Where are systematic literature reviews used?

Systematic literature reviews can be used in various contexts, but they’re most often relied on in clinical or healthcare settings.

Medical professionals read systematic literature reviews to stay up-to-date in their field, and granting agencies sometimes need them to make sure there’s justification for further research in an area. They can even be used as the starting point for developing clinical practice guidelines.

What types of systematic literature reviews are there?

A classic systematic literature review can take different approaches:

  • Effectiveness reviews assess the extent to which a medical intervention or therapy achieves its intended effect. They’re the most common type of systematic literature review.
  • Diagnostic test accuracy reviews produce a summary of diagnostic test performance so that their accuracy can be determined before use by healthcare professionals.
  • Experiential (qualitative) reviews analyze human experiences in a cultural or social context. They can be used to assess the effectiveness of an intervention from a person-centric perspective.
  • Costs/economics evaluation reviews look at the cost implications of an intervention or procedure, to assess the resources needed to implement it.
  • Etiology/risk reviews usually try to determine to what degree a relationship exists between an exposure and a health outcome. This can be used to better inform healthcare planning and resource allocation.
  • Psychometric reviews assess the quality of health measurement tools so that the best instrument can be selected for use.
  • Prevalence/incidence reviews measure both the proportion of a population who have a disease, and how often the disease occurs.
  • Prognostic reviews examine the course of a disease and its potential outcomes.
  • Expert opinion/policy reviews are based around expert narrative or policy. They’re often used to complement, or in the absence of, quantitative data.
  • Methodology systematic reviews can be carried out to analyze any methodological issues in the design, conduct, or review of research studies.

How to write a systematic literature review

Writing a systematic literature review can feel like an overwhelming undertaking. After all, one can often take 6 to 18 months to complete. Below we’ve prepared a step-by-step guide on how to write a systematic literature review.

  • Decide on your team.
  • Formulate your question.
  • Plan your research protocol.
  • Search for the literature.
  • Screen the literature.
  • Assess the quality of the studies.
  • Extract the data.
  • Analyze the results.
  • Interpret and present the results.

1. Decide on your team

When carrying out a systematic literature review, you should employ multiple reviewers in order to minimize bias and strengthen analysis. A minimum of two is a good rule of thumb, with a third to serve as a tiebreaker if needed.

You may also need to team up with a librarian to help with the search, literature screeners, a statistician to analyze the data, and the relevant subject experts.

2. Formulate your question

Define your answerable question. Then ask yourself, “Has someone written a systematic literature review on my question already?” If so, yours may not be needed. A librarian can help you answer this.

You should formulate a “well-built clinical question.” This is the process of generating a good search question. To do this, run through PICO:

  • Patient or Population or Problem/Disease: who or what is the question about? Are there factors about them (e.g. age, race) that could be relevant to the question you’re trying to answer?
  • Intervention: which main intervention or treatment are you considering for assessment?
  • Comparison(s) or Control: is there an alternative intervention or treatment you’re considering? Your systematic literature review doesn’t have to contain a comparison, but you’ll want to stipulate at this stage, either way.
  • Outcome(s): what are you trying to measure or achieve? What’s the wider goal for the work you’ll be doing?
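
Once the PICO elements are written down, they can be turned into a first-draft search string. The sketch below is only an illustration: the function name and the example terms are made up, and real databases each have their own syntax (field tags, subject headings, truncation), so treat the output as a starting point to refine with your librarian.

```python
def build_search_string(concepts: dict[str, list[str]]) -> str:
    """Join synonyms with OR inside each PICO element, then join elements with AND."""
    groups = []
    for terms in concepts.values():
        if not terms:  # e.g. the review has no comparison
            continue
        # Quote multi-word terms so they are searched as phrases
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical terms for a question about probiotics and eczema
pico = {
    "population": ["eczema", "atopic dermatitis"],
    "intervention": ["probiotics", "lactobacillus"],
    "comparison": [],
    "outcome": ["quality of life", "symptom severity"],
}

query = build_search_string(pico)
print(query)
```

Keeping the terms in a dictionary makes it easy to record exactly which synonyms were used for each element when you document the search later.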

3. Plan your research protocol

Now you need a detailed strategy for how you’re going to search for and evaluate the studies relating to your question.

The protocol for your systematic literature review should include:

  • the objectives of your project
  • the specific methods and processes that you’ll use
  • the eligibility criteria of the individual studies
  • how you plan to extract data from individual studies
  • which analyses you’re going to carry out

For a full guide on how to systematically develop your protocol, take a look at the PRISMA checklist. PRISMA has been designed primarily to improve the reporting of systematic literature reviews and meta-analyses.

4. Search for the literature

When writing a systematic literature review, your goal is to find all of the relevant studies relating to your question, so you need to search thoroughly.

This is where your librarian will come in handy again. They should be able to help you formulate a detailed search strategy, and point you to all of the best databases for your topic.

➡️ Read more on how to efficiently search research databases.

The places to consider in your search are electronic scientific databases (the most popular are PubMed, MEDLINE, and Embase), controlled clinical trial registers, non-English literature, raw data from published trials, references listed in primary sources, and unpublished sources known to experts in the field.

➡️ Take a look at our list of the top academic research databases .

Don’t miss out on “gray literature” sources: those sources outside of the usual academic publishing environment. They include:

  • non-peer-reviewed journals
  • pharmaceutical industry files
  • conference proceedings
  • pharmaceutical company websites
  • internal reports

Gray literature sources are more likely to contain negative conclusions, so you’ll improve the reliability of your findings by including them. Whichever sources you search, you should document details such as:

  • The databases you search and which years they cover
  • The dates you first run the searches, and when they’re updated
  • Which strategies you use, including search terms
  • The numbers of results obtained
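
One way to keep these details consistent is a small structured search log. The following is a minimal sketch, not a required format: the class, field names, and example entries are all invented for illustration.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import date


@dataclass
class SearchLogEntry:
    """One row of a search log (all field names are illustrative)."""
    database: str        # which database was searched
    coverage_years: str  # years the database covers
    date_run: date       # when the search was first run or updated
    strategy: str        # the search terms exactly as entered
    results: int         # number of records returned


def write_search_log(entries, handle):
    """Write the log entries to an open text handle as CSV."""
    columns = [f.name for f in fields(SearchLogEntry)]
    writer = csv.DictWriter(handle, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


# Made-up entries purely for illustration
log = [
    SearchLogEntry("MEDLINE", "1946-2024", date(2024, 1, 15),
                   "eczema AND probiotics", 412),
    SearchLogEntry("Embase", "1974-2024", date(2024, 1, 15),
                   "eczema AND probiotics", 538),
]

# Swap the buffer for open("search_log.csv", "w", newline="") to save a file
buffer = io.StringIO()
write_search_log(log, buffer)
print(buffer.getvalue())
```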

➡️ Read more about gray literature.

5. Screen the literature

Screening should be performed by your two reviewers, using the criteria documented in your research protocol. The screening is done in two phases:

  • Pre-screening of all titles and abstracts, and selecting those appropriate
  • Screening of the full-text articles of the selected studies

Make sure reviewers keep a log of which studies they exclude, with reasons why.
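
With two reviewers screening independently, it can help to check how often they agree and to list the studies that need the tiebreaker. The sketch below uses Cohen's kappa, a common agreement statistic that this guide does not itself prescribe. The decisions shown are invented.

```python
def cohens_kappa(first: list[str], second: list[str]) -> float:
    """Agreement between two raters, corrected for agreement expected by chance."""
    if len(first) != len(second) or not first:
        raise ValueError("both reviewers must rate the same non-empty set of studies")
    n = len(first)
    observed = sum(x == y for x, y in zip(first, second)) / n
    labels = set(first) | set(second)
    # Chance agreement: product of each reviewer's share of every label
    expected = sum((first.count(lab) / n) * (second.count(lab) / n) for lab in labels)
    if expected == 1.0:  # both reviewers used a single identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical title/abstract decisions for six studies
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "include", "include", "exclude", "exclude"]

# Indices of studies the reviewers disagree on, to pass to the third reviewer
conflicts = [i for i, (x, y) in enumerate(zip(reviewer_1, reviewer_2)) if x != y]
kappa = cohens_kappa(reviewer_1, reviewer_2)

print(f"kappa = {kappa:.2f}, conflicts at positions {conflicts}")
```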

➡️ Visit our guide on what an abstract is.

6. Assess the quality of the studies

Your reviewers should evaluate the methodological quality of your chosen full-text articles. Make an assessment checklist that closely aligns with your research protocol, including a consistent scoring system, calculations of the quality of each study, and sensitivity analysis.

The kinds of questions you'll come up with are:

  • Were the participants really randomly allocated to their groups?
  • Were the groups similar in terms of prognostic factors?
  • Could the conclusions of the study have been influenced by bias?

7. Extract the data

Every step of the data extraction must be documented for transparency and replicability. Create a data extraction form and set your reviewers to work extracting data from the qualified studies.

Here’s a free detailed template for recording data extraction, from Dalhousie University. It should be adapted to your specific question.

8. Analyze the results

Establish a standard measure of outcome which can be applied to each study on the basis of its effect size.

Measures of outcome for studies with:

  • Binary outcomes (e.g. cured/not cured) are odds ratio and risk ratio
  • Continuous outcomes (e.g. blood pressure) are means, difference in means, and standardized difference in means
  • Survival or time-to-event data are hazard ratios
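
For binary outcomes, the two measures above can be computed from a study's 2x2 table. The sketch below is illustrative only: the counts are invented, and the confidence interval uses the standard large-sample formula on the log odds ratio, which assumes no cell is zero.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """a, b: events and non-events in the treated group; c, d: the same for control."""
    return (a * d) / (b * c)


def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk of the event in the treated group divided by risk in the control group."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Approximate 95% confidence interval, computed on the log scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * std_error), math.exp(log_or + z * std_error)


# Hypothetical study: 15 of 100 treated patients and 30 of 100 controls had the event
a, b, c, d = 15, 85, 30, 70
study_or = odds_ratio(a, b, c, d)
study_rr = risk_ratio(a, b, c, d)
ci_low, ci_high = odds_ratio_ci(a, b, c, d)

print(f"OR = {study_or:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), RR = {study_rr:.2f}")
```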

Design a table and populate it with your data results. Draw this out into a forest plot, which provides a simple visual representation of variation between the studies.

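Then analyze the data for issues. These can include heterogeneity, which is variation between the studies’ results beyond what chance alone would explain. In a forest plot, it often shows up as studies whose lines (confidence intervals) don’t overlap with those of the other studies. Again, record any excluded studies here for reference.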
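
Heterogeneity can also be put into numbers. The sketch below pools study effects with inverse-variance (fixed-effect) weights and reports Cochran's Q and the I² statistic. These are common choices that this guide does not itself specify, and the effect sizes are invented. A statistician may prefer a random-effects model for your data.

```python
def pool_fixed_effect(effects: list[float], std_errors: list[float]):
    """Return the inverse-variance pooled effect, Cochran's Q, and I² (in percent)."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching effects and standard errors for 2+ studies")
    weights = [1 / se ** 2 for se in std_errors]  # more precise studies weigh more
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I²: share of total variation attributed to heterogeneity rather than chance
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical log odds ratios and standard errors from three studies
log_odds_ratios = [-0.9, -0.1, -0.5]
standard_errors = [0.2, 0.2, 0.2]

pooled, q, i_squared = pool_fixed_effect(log_odds_ratios, standard_errors)
print(f"pooled = {pooled:.2f}, Q = {q:.1f}, I² = {i_squared:.0f}%")
```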

9. Interpret and present the results

Consider different factors when interpreting your results. These include limitations, strength of evidence, biases, applicability, economic effects, and implications for future practice or research.

Apply appropriate grading of your evidence and consider the strength of your recommendations.

It’s best to formulate a detailed plan for how you’ll present your systematic review results. Take a look at these guidelines for interpreting results from Cochrane.

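Registering your systematic literature review

Before writing your systematic literature review, you can register it with OSF for additional guidance along the way. You could also register your protocol with PROSPERO, which is designed for registering reviews prospectively rather than once they are complete.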

Frequently asked questions about writing a systematic literature review

Systematic literature reviews are often found in clinical or healthcare settings. Medical professionals read systematic literature reviews to stay up-to-date in their field, and granting agencies sometimes need them to make sure there’s justification for further research in an area.

The first stage in carrying out a systematic literature review is to put together your team. You should employ multiple reviewers in order to minimize bias and strengthen analysis. A minimum of two is a good rule of thumb, with a third to serve as a tiebreaker if needed.

Your systematic review should include the following details:

A literature review simply provides a summary of the literature available on a topic. A systematic review, on the other hand, is more than just a summary. It also includes an analysis and evaluation of existing research. Put simply, it's a study of studies.

The final stage of conducting a systematic literature review is interpreting and presenting the results. It’s best to formulate a detailed plan for how you’ll present your systematic review results; guidelines can be found, for example, from Cochrane.

University Libraries      University of Nevada, Reno

Systematic, Scoping, and Other Literature Reviews: Overview

What Is a Systematic Review?

Regular literature reviews are simply summaries of the literature on a particular topic. A systematic review, however, is a comprehensive literature review conducted to answer a specific research question. Authors of a systematic review aim to find, code, appraise, and synthesize all of the previous research on their question in an unbiased and well-documented manner. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) outline the minimum amount of information that needs to be reported at the conclusion of a systematic review project. 

Other types of what are known as "evidence syntheses," such as scoping, rapid, and integrative reviews, have varying methodologies. While systematic reviews originated with and continue to be a popular publication type in medicine and other health sciences fields, more and more researchers in other disciplines are choosing to conduct evidence syntheses. 

This guide will walk you through the major steps of a systematic review and point you to key resources including Covidence, a systematic review project management tool. For help with systematic reviews and other major literature review projects, please send us an email at  [email protected] .

Getting Help with Reviews

Organizations such as the Institute of Medicine recommend that you consult a librarian when conducting a systematic review. Librarians at the University of Nevada, Reno can help you:

  • Understand best practices for conducting systematic reviews and other evidence syntheses in your discipline
  • Choose and formulate a research question
  • Decide which review type (e.g., systematic, scoping, rapid, etc.) is the best fit for your project
  • Determine what to include and where to register a systematic review protocol
  • Select search terms and develop a search strategy
  • Identify databases and platforms to search
  • Find the full text of articles and other sources
  • Become familiar with citation management tools (e.g., EndNote, Zotero)
  • Get access to, and help using, Covidence, a systematic review project management tool

Doing a Systematic Review

  • Plan - This is the project planning stage. You and your team will need to develop a good research question, determine the type of review you will conduct (systematic, scoping, rapid, etc.), and establish the inclusion and exclusion criteria (e.g., you're only going to look at studies that use a certain methodology). All of this information needs to be included in your protocol. You'll also need to ensure that the project is viable - has someone already done a systematic review on this topic? Do some searches and check the various protocol registries to find out. 
  • Identify - Next, a comprehensive search of the literature is undertaken to ensure all studies that meet the predetermined criteria are identified. Each research question is different, so the number and types of databases you'll search - as well as other online publication venues - will vary. Some standards and guidelines specify that certain databases (e.g., MEDLINE, EMBASE) should be searched regardless. Your subject librarian can help you select appropriate databases to search and develop search strings for each of those databases.  
  • Evaluate - In this step, retrieved articles are screened and sorted using the predetermined inclusion and exclusion criteria. The risk of bias for each included study is also assessed around this time. It's best if you import search results into a citation management tool (see below) to clean up the citations and remove any duplicates. You can then use a tool like Rayyan (see below) to screen the results. You should begin by screening titles and abstracts only, and then you'll examine the full text of any remaining articles. Each study should be reviewed by a minimum of two people on the project team. 
  • Collect - Each included study is coded and the quantitative or qualitative data contained in these studies is then synthesized. You'll have to either find or develop a coding strategy or form that meets your needs. 
  • Explain - The synthesized results are articulated and contextualized. What do the results mean? How have they answered your research question?
  • Summarize - The final report provides a complete description of the methods and results in a clear, transparent fashion. 
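
The duplicate removal mentioned in the Evaluate step is normally handled by a citation management tool, but the idea can be sketched in a few lines. The example below is an assumption-laden illustration: the record fields and the database labels are made up, and matching on DOI or normalized title plus year is only one possible rule.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase the title and collapse punctuation and spacing differences."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first copy of each record, matching on DOI or on title plus year."""
    seen = set()
    unique = []
    for record in records:
        keys = [("title", normalize_title(record.get("title", "")), record.get("year"))]
        doi = (record.get("doi") or "").strip().lower()
        if doi:
            keys.append(("doi", doi))
        if any(key in seen for key in keys):
            continue  # already imported from another database
        seen.update(keys)
        unique.append(record)
    return unique


# Two exports of the same article plus one different article; "source" labels are hypothetical
records = [
    {"title": "A typology of reviews: an analysis of 14 review types and associated methodologies",
     "year": 2009, "doi": "10.1111/j.1471-1842.2009.00848.x", "source": "MEDLINE"},
    {"title": "A Typology of Reviews: An Analysis of 14 Review Types and Associated Methodologies.",
     "year": 2009, "doi": None, "source": "Embase"},
    {"title": "How to Do a Systematic Review: A Best Practice Guide",
     "year": 2019, "doi": "10.1146/annurev-psych-010418-102803", "source": "Web of Science"},
]

unique = deduplicate(records)
print(f"{len(records)} records imported, {len(unique)} left after removing duplicates")
```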

Adapted from

Types of Reviews

Systematic Review

These types of studies employ a systematic method to analyze and synthesize the results of numerous studies. "Systematic" in this case means following a strict set of steps - as outlined by entities like PRISMA and the Institute of Medicine - so as to make the review more reproducible and less biased. Consistent, thorough documentation is also key. Reviews of this type are not meant to be conducted by an individual but rather a (small) team of researchers. Systematic reviews are widely used in the health sciences, often to find a generalized conclusion from multiple evidence-based studies. 

Meta-Analysis

A systematic method that uses statistics to analyze the data from numerous studies. The researchers combine the data from studies with similar data types and analyze them as a single, expanded dataset. Meta-analyses are a type of systematic review.

Scoping Review

A scoping review employs the systematic review methodology to explore a broader topic or question rather than a specific and answerable one, as is generally the case with a systematic review. Authors of these types of reviews seek to collect and categorize the existing literature so as to identify any gaps.

Rapid Review

Rapid reviews are systematic reviews conducted under a time constraint. Researchers make use of workarounds to complete the review quickly (e.g., only looking at English-language publications), which can lead to a less thorough and more biased review. 

Narrative Review

A traditional literature review that summarizes and synthesizes the findings of numerous original research articles. The purpose and scope of narrative literature reviews vary widely and do not follow a set protocol. Most literature reviews are narrative reviews. 

Umbrella Review

Umbrella reviews are, essentially, systematic reviews of systematic reviews. These compile evidence from multiple review studies into one usable document. 

Grant, Maria J., and Andrew Booth. “A Typology of Reviews: An Analysis of 14 Review Types and Associated Methodologies.” Health Information & Libraries Journal, vol. 26, no. 2, 2009, pp. 91-108. doi: 10.1111/j.1471-1842.2009.00848.x.

Language on the Move

Systematic Literature Review: Easy Guide

WRONG. It turns out that typing “what is a systematic literature review” into Google will only overwhelm a new researcher! I came across plenty of journal articles that claimed to be explaining what an SLR was (and how that somehow differed from another term I was learning – a scoping review), but for the life of me I could not find a clear-cut set of instructions. All of the information seemed to be pitched at a level far above the one I was operating at, and I began to feel frustrated that I could not find a source that was putting this methodology into terms that the average person could understand. But I knew I needed to figure it out, so over the course of the next few weeks I read what felt like dozens of explainers and guides.

Eventually, my reading and furious note-taking paid off, because by the end of 2023 I had successfully completed my research, entitled “How are language barriers bridged in hospitals?: a systematic review”. But in the process, I had spoken to so many academics who also voiced their frustration that they couldn’t find explanations on how to conduct an SLR in clear lay terms, and so I knew I hadn’t been alone.

Something I feel VERY passionate about is that, as academics, we must be able to talk to people outside of academia, and that means that we need to be able to communicate complex ideas in easily digestible ways. Higher knowledge shouldn’t be reserved for people who have weeks to teach themselves a new research methodology, and I wanted to be able to explain an SLR to everyone, not just other researchers.

And so, I created this “SLR: An Easy Guide” explainer for anyone and everyone who would like to conduct an SLR but has no idea where to start. If that’s you, please feel free to use this resource – and know that you aren’t alone as an early researcher who is learning things for the first time. We’ve all got to start somewhere, and we can make it easier on others by sharing what we’ve figured out the hard way!

What exactly is a systematic literature review (SLR)?

Ok, so you know how you need to do a literature review before you write a research paper? In that literature review, you are basically summarising what other researchers have said about your research topic so that you can show how your research is building on prior knowledge.

An SLR is different to that. An SLR is your research (your “experiment”, if you will). In an SLR, you read and analyse lots of different published journal articles in order to see patterns in already-published data. There’s an actual methodology that you have to use (which I detail in SLR: An Easy Guide) in order to select these journal articles.

I haven’t heard of an SLR, but I’ve heard of a meta-analysis. What’s the difference?

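Not quite, although the two are often mentioned together. A meta-analysis is a statistical technique for combining the numerical results of several studies, and it’s usually carried out as one part of a systematic review. An SLR doesn’t have to include a meta-analysis, though: you can synthesise your findings narratively or thematically instead. Academia is fun and not at all confusing.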

I’ve also heard of a scoping review. Is that the same as a systematic literature review?

In this case, there actually is a difference, albeit a relatively small one. The methodology for both types of reviews will be the same (whew!), but the reason for conducting one versus the other will be a bit different. Let me give you an example based on my own research. When I began looking into how hospitals manage linguistic diversity between patients and staff, I knew that there was already a lot of literature out there about the subject (generally having to do with the work of professional interpreters). I had four very specific research questions that I wanted to answer based on that literature. This is why I conducted a systematic review – because I already knew that I would be able to find existing research that could answer my questions.

HOWEVER, you might not know how much literature already exists on a given topic. Maybe your topic is fairly niche, so you haven’t seen much about it in publications. This is where a scoping review comes in. In conducting a scoping review, you’ll find out exactly how much literature on the topic already exists. In doing so, you’ll be able to make an argument for why a particular area of research should be looked into more.

If this still sounds confusing (totally understandable!), be sure to talk to a fabulous university librarian. They are really good at knowing the difference between the two!

Is there any kind of SLR “authority” that I should know about?

There sure is! There is a reporting guideline called PRISMA (which stands for Preferred Reporting Items for Systematic Reviews and Meta-Analyses). You can go to the PRISMA website for two very crucial items that you will need for your SLR: a checklist and a flow chart.

The PRISMA checklist is great because it tells you exactly what you need to include in your SLR. The PRISMA flow chart is what you include in your SLR to show why/how you included and excluded studies during your screening process (which you can see in steps 3 and 4 of my SLR: An Easy Guide resource). But don’t worry, you don’t need to create the flow chart from scratch. If you use Covidence, the platform will create it for you. And speaking of Covidence…

This feels overwhelming! Is there one place I can go to manage all my SLR data easily?

Absolutely. I used Covidence, an online platform that essentially walks you through the SLR process. I would HIGHLY recommend using Covidence or a similar service to help you manage all your data in one place. Covidence will also automatically create your flow chart for you as you go through your screening process. What I especially liked about Covidence was that I was able to custom-create my data collection template based on my specific research questions. This made my data analysis much easier than it would have been without it!
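
If you're curious about the arithmetic behind a flow chart like that, here is a rough sketch of how the numbers at each stage relate to one another. This is not how Covidence works internally (I have no idea!). The function, the stage names, and every number are made up for illustration.

```python
from collections import Counter


def prisma_counts(identified, duplicates, excluded_at_abstract, full_text_exclusion_reasons):
    """Work out how many records remain at each screening stage."""
    screened = identified - duplicates
    full_text_assessed = screened - excluded_at_abstract
    reasons = Counter(full_text_exclusion_reasons)  # tally of exclusions per reason
    included = full_text_assessed - sum(reasons.values())
    if min(screened, full_text_assessed, included) < 0:
        raise ValueError("the counts do not add up; check your screening log")
    return {
        "identified": identified,
        "screened": screened,
        "full_text_assessed": full_text_assessed,
        "excluded_full_text": dict(reasons),
        "included": included,
    }


# Hypothetical numbers: 500 records found, 120 duplicates, 300 excluded on title/abstract
exclusion_reasons = (["wrong population"] * 30
                     + ["wrong outcome"] * 25
                     + ["not peer reviewed"] * 5)
flow = prisma_counts(500, 120, 300, exclusion_reasons)

for stage, count in flow.items():
    print(stage, count)
```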

What do I do if I’m still confused or feel like I don’t know how to do this?

Remember that every single one of us who goes on to do higher degree research feels like this. We don’t know what we don’t know! I’ve now completed two Masters degrees and am currently working on my PhD, and let me tell you, the learning curve is steep! But you know what? You can do it. Don’t be afraid to ask questions. Tell your supervisors and colleagues when you feel lost. Remind yourself that learning these research skills is just as important as the research itself. And when you get super stressed, grab a cup of coffee, stand in the sunshine and take a 10-minute break. You’ve got this!

Download and cite my free “SLR: An Easy Guide” resource

“SLR: An Easy Guide” is a free cheat sheet for your systematic literature review. You can download it here.

If you find it useful, please cite as:

Quick, B. (2024). Systematic Literature Review: An Easy Guide. Language on the Move. Retrieved from https://www.languageonthemove.com/systematic-literature-review-easy-guide

Author Brynn Quick

Brynn Quick holds a Master of Applied Linguistics and a Master of Research from Macquarie University. For her PhD, also at Macquarie University, she is investigating how language barriers are bridged between patients and staff in Australian hospitals. Her linguistic interests are many and varied, and include sociolinguistics, anthropological linguistics, sociophonetics, and historical linguistics, particularly the history of English.


Between-hospital variation in indicators of quality of care: a systematic review

Volume 33, Issue 7

  • Margrietha van der Linde 1 (ORCID: http://orcid.org/0000-0001-6034-0064),
  • Nèwel Salet 2,
  • Nikki van Leeuwen 1,
  • Hester F Lingsma 1,
  • Frank Eijkenaar 2
  • 1 Department of Public Health, Erasmus MC, Rotterdam, The Netherlands
  • 2 Erasmus Universiteit Rotterdam, Erasmus School of Health Policy and Management, Rotterdam, The Netherlands
  • Correspondence to Margrietha van der Linde, Department of Public Health, Erasmus MC, Rotterdam, Netherlands; m.vanderlinde{at}erasmusmc.nl

Background Efforts to mitigate unwarranted variation in the quality of care require insight into the ‘level’ (eg, patient, physician, ward, hospital) at which observed variation exists. This systematic literature review aims to synthesise the results of studies that quantify the extent to which hospitals contribute to variation in quality indicator scores.

Methods Embase, Medline, Web of Science, Cochrane and Google Scholar were systematically searched from 2010 to November 2023. We included studies that reported a measure of between-hospital variation in quality indicator scores relative to total variation, typically expressed as a variance partition coefficient (VPC). The results were analysed by disease category and quality indicator type.

Results In total, 8373 studies were reviewed, of which 44 met the inclusion criteria. Casemix adjusted variation was studied for multiple disease categories using 144 indicators, divided over 5 types: intermediate clinical outcomes (n=81), final clinical outcomes (n=35), processes (n=10), patient-reported experiences (n=15) and patient-reported outcomes (n=3). In addition to an analysis of between-hospital variation, eight studies also reported physician-level variation (n=54 estimates). In general, variation that could be attributed to hospitals was limited (median VPC=3%, IQR=1%–9%). Between-hospital variation was highest for process indicators (17.4%, 10.8%–33.5%) and lowest for final clinical outcomes (1.4%, 0.6%–4.2%) and patient-reported outcomes (1.0%, 0.9%–1.5%). No clear pattern could be identified in the degree of between-hospital variation by disease category. Furthermore, the studies exhibited limited attention to the reliability of observed differences in indicator scores.

Conclusion Hospital-level variation in quality indicator scores is generally small relative to residual variation. However, meaningful variation between hospitals does exist for multiple indicators, especially for care processes which can be directly influenced by hospital policy. Quality improvement strategies are likely to generate more impact if preceded by level-specific and indicator-specific analyses of variation, and when absolute variation is also considered.

PROSPERO registration number CRD42022315850.
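
The variance partition coefficient mentioned in the Methods can be illustrated with a minimal sketch. It assumes a two-level random-intercept model in which the VPC is the hospital-level variance divided by the total variance. The function names are hypothetical, and the logistic variant uses the common latent-variable convention (level-1 residual variance fixed at π²/3); the abstract does not state how the included studies computed their VPCs, so this is an assumption rather than the authors' method.

```python
import math


def vpc(between_var: float, residual_var: float) -> float:
    """Share of total variance attributable to the higher level (e.g. hospitals).

    between_var  : variance of the hospital-level random intercepts
    residual_var : remaining (e.g. patient-level) variance
    """
    if between_var < 0 or residual_var <= 0:
        raise ValueError("between_var must be >= 0 and residual_var must be > 0")
    return between_var / (between_var + residual_var)


# Assumption: latent-variable convention for multilevel logistic models,
# where the level-1 residual variance is fixed at pi^2 / 3.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def vpc_logistic(between_var: float) -> float:
    """VPC for a random-intercept logistic model under the latent-variable convention."""
    return vpc(between_var, LOGISTIC_RESIDUAL_VAR)


if __name__ == "__main__":
    # Illustrative variance components only, not values from the review.
    print(f"linear model VPC:   {vpc(1.0, 3.0):.1%}")
    print(f"logistic model VPC: {vpc_logistic(0.2):.1%}")
```

Read this way, a VPC of 3% means that about 3% of the total variation in an indicator score sits at the hospital level, with the remainder at lower levels or in residual noise.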

  • Healthcare quality improvement
  • Quality improvement methodologies
  • Health policy
  • Performance measures

Data availability statement

All data relevant to the study are included in the article or uploaded as online supplemental information.

https://doi.org/10.1136/bmjqs-2023-016726


Contributors All authors contributed to the manuscript. All authors contributed to the overall conceptualisation and study design of the systematic review. MvdL and NS conducted the screening, data extraction and data analysis. All authors contributed to writing the manuscript, and all authors have agreed on the final version of the manuscript. FE is the guarantor of the study.

Funding This work was funded by Erasmus Initiative Smarter Choices for Better Health (no award/grant number).

Competing interests None declared.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Linked Articles

  • Editorial Variation in quality of care between hospitals: how to identify learning opportunities Alex Bottle Pia Kjær Kristensen BMJ Quality & Safety 2024; 33 413-415 Published Online First: 08 Mar 2024. doi: 10.1136/bmjqs-2024-017071


The organisational impact of agility: a systematic literature review

  • Open access
  • Published: 21 June 2024


  • Tien Nguyen   ORCID: orcid.org/0000-0002-8046-060X 1 ,
  • Cat Vi Le   ORCID: orcid.org/0000-0002-9272-2216 2 , 3 ,
  • Minh Nguyen   ORCID: orcid.org/0000-0002-3842-2749 2 , 3 ,
  • Gam Nguyen   ORCID: orcid.org/0000-0002-9123-2063 2 , 3 ,
  • Tran Thi Hong Lien   ORCID: orcid.org/0000-0001-7755-638X 2 , 3 &
  • Oanh Nguyen   ORCID: orcid.org/0000-0002-0559-4923 2 , 3  


This paper reviews the literature on agility and its relationship with organisational performance using a sample of 249 recent empirical studies from 1998 to February 2024. We find support for a relatively strong and consistent contribution of different aspects of agility to organisational performance. Our analysis highlights numerous salient issues in this literature in terms of the theoretical background, research design, and contextual factors in agility-performance research. On this basis, we propose relevant recommendations for future research to address these issues, specifically focusing on the role of the board of directors and their leadership in fostering organisational agility.


1 Introduction

Since the “Manifesto for Agile Software Development” was declared in 2001 (Highsmith 2011), the concept and methodologies of agility have migrated from a narrow area of the IT industry to a wide range of organisational applications. Agility has often been associated with startups and small and medium-sized companies but has recently been extended to large corporations. Due to the volatile, uncertain, complex, and ambiguous (VUCA) business environment, combined with intense competition and threats from the radical growth of new startups, large firms are forced to change their status quo and their heavy and inflexible business and management models to adapt quickly to the rapidly changing environment. As such, embracing agility and leading with agility have become new norms and are essential for business survival (Rigby et al. 2016).

In recent years, the world economy has gone through unprecedented crises due to the COVID-19 pandemic, the tech and trade war between the US and China, the Ukraine-Russia war, and the most recent Gaza Strip conflict triggering the Red Sea marine crisis; this has intensified the need for organisations to develop more agile business models to weather environmental turbulence and economic downturns (McKinsey & Company 2020). Complexity and unpredictability now dominate, challenging traditional management methods that rely on well-ordered planning. As such, in today's business world, being agile is no longer optional; it is essential for a company to stay alive (Harraf et al. 2015).

The recent focus on organisational agility in both research and practice can also be tied to some common practices applied in both small and large companies. One example is the use of cross-functional teams with procedures such as SCRUM to work in harmony with customers and deliver what they expect in a timely and cost-efficient manner (Handscomb et al. 2019). Teams with members from different functions and disciplines work together to put customers first and respond swiftly to their requests, reducing the waiting time typical of hierarchical organisations. However, further research evidence is needed to examine whether, and to what extent, such a practice is sufficient to build organisational agility. This highlights the need for comprehensive literature reviews with scientific research insights to guide industry practitioners in the application of agile practices.

Organisational agility is often defined as the dynamic capability of an organisation to act and react to uncertainties and the ability to explore and exploit opportunities in the business environment (Overby et al. 2006 ; Roberts and Grover 2012 ). Since 2019, the number of publications on organisational agility in literature has increased notably. However, as this literature evolves, agility is conceptualised inconsistently. This is particularly problematic given that agility is a multidimensional concept that includes but is not limited to various aspects, such as manufacturing agility, strategic agility, supply chain agility, IT agility, marketing agility, and workforce agility (Walter 2021 ). Such disagreement among researchers regarding how agility should be defined and constructed has posed significant challenges for researchers and practitioners in this area moving forwards, making it difficult to build the literature upon previous findings, to generalise those findings in different contexts and to apply this concept in reality (Walter 2021 ). Thus, a comprehensive understanding of agility as an overarching concept, its antecedents, and its effects on organisational outcomes is needed (Walter 2021 ).

Agility is often considered beneficial to organisational performance. With a dynamic ability to weather rapid changes and turbulence, an agile organisation is believed to be in a better position to produce outcomes. However, some evidence shows that the organisational benefits of agility are dependent on a range of factors, including the types of agility and outcomes as the focus of interest and the conditions for agility to contribute to organisational outcomes (Wieland and Wallenburg 2012). For instance, agility is found to increase firm financial performance (Rafi et al. 2021) and to boost innovation (Del Giudice et al. 2021). However, in their study, Chakravarty et al. (2013) found that only entrepreneurial agility, the proactive ability to anticipate and exploit market opportunities and challenges, can help achieve better financial performance, while such effects from reactive types of agility are not significant. Additionally, while researchers have devoted much attention to some aspects of agility, such as supply chain agility or strategic agility, other aspects of agility, such as workforce agility and marketing agility, are still underresearched (Ajgaonkar et al. 2022; Gomes et al. 2020). This demonstrates the need for a comprehensive and systematic review of whether, how, and what agility can contribute to organisational outcomes.

Recent literature reviews in this area have elucidated how agility is measured, what contributes to agility and the impact of agility on organisational outcomes. However, these reviews either adopted a narrow focus on one aspect of agility, such as marketing agility, supply chain agility, or IT agility (Kalaignanam et al. 2021 ; Patel and Sambasivan 2021 ; Tallon et al. 2019 ), or failed to provide an in-depth analysis that focused exclusively on the contribution of agility to business outcomes (Walter 2021 ).

The lack of a consensus on the concept, measurements and association of organisational agility with critical business outcomes indicates the need for a systematic review of the literature to, first, bring together all different types of agility and examine their impact on different organisational outcomes; second, identify the intervening factors that affect this relationship; and third, provide implications for future research in this area. This paper addresses the abovementioned objectives with an overarching research question: What is the current status of the literature on organisational agility and organisational outcomes? Then, this question is broken down into five broad subquestions as follows:

How are organisational agility and organisational performance defined and measured?

What is the relationship between organisational agility and organisational outcomes?

Which theories are used to examine the relationship between organisational agility and organisational outcomes?

What are some possible mediators or moderators that affect the relationship between organisational agility and organisational outcomes?

What are the implications for future research on this topic?

To comprehensively review the literature on agility and organisational performance, this paper adopts the strategy of a systematic literature review to examine 249 empirical studies in this area from 1998 to February 2024. This paper makes two significant contributions to the literature in this field. First, it seeks to provide a comprehensive summary and a conceptual map of whether and how organisational agility affects organisational performance based on 26 years of empirical evidence on this topic. Second, it aims to identify the gaps in knowledge and propose possible directions for future research and practices in this area. The paper starts with an introduction of the research design, followed by a description of the research findings, and ends with a discussion and recommendations for future research.

2 Research design

This paper adopts the widely used systematic review methodology in literature review studies to collect and analyse data because it is comprehensive, transparent, evidence-based, and unbiased (Khan et al. 2003 ; Snyder 2019 ; Tranfield et al. 2003 ). Figure  1 explains the strategy and steps taken to conduct this literature review.

Fig. 1 Systematic literature review strategy and procedure

Following Xiao and Watson (2019), Diaz Tautiva et al. (2024), and Tranfield et al. (2003), the paper utilises a systematic strategy and review steps through the three main phases of (i) planning, (ii) data collection, and (iii) data extraction, synthesis, and reporting to ensure the replicability and transparency of the methodology and findings. In the planning phase, we formed the review framework by carefully crafting the research objectives and referring to existing systematic review frameworks. Through this process, we were able to determine the search criteria and the framework for data extraction and classification, as indicated in Fig. 1.

The review framework is based on dimensions of agility, variable measurement, theoretical background, methodology, findings, and intervening factors, followed by a synthesis of a conceptual map (Walter 2021 ; Bhattacharjee and Sarkar 2022 ; Patel and Sambasivan 2021 ). This framework is well aligned with our research questions and objectives and is often used in other literature review papers (Walter 2021 ; Bhattacharjee and Sarkar 2022 ; Patel and Sambasivan 2021 ). By using this framework, we can then move to the next step, which involves identifying the knowledge gaps in the literature and proposing some directions for future research in the field.

Using the predetermined search criteria identified in the planning phase, we first conducted a general search on Web of Science, one of the databases with the broadest coverage, and obtained a sample of 8107 papers. We then used the filter function to retain 1165 peer-reviewed articles that had full texts available, were written in English, and were published in the fields of business, economics, and management. Next, we screened the titles and abstracts and applied further exclusion criteria, as shown in Fig. 1. The final sample consists of 249 English peer-reviewed empirical articles on agility and organisational outcomes, with agility being one of the main variables of interest in studies that test the firm-level impact of agility in the business, economics, and management fields.
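
The screening funnel described above can be sketched in a few lines. The three counts come from the text; the stage labels and the `retention` helper are hypothetical names introduced for illustration, and the intermediate exclusion steps shown only in Fig. 1 are not reproduced here.

```python
# Counts reported in the text; the labels are illustrative.
STAGES = [
    ("Initial Web of Science search", 8107),
    ("After database filters", 1165),
    ("Final sample after screening", 249),
]


def retention(stages):
    """Return (label, count, share of previous stage, share of initial search)."""
    rows = []
    first = stages[0][1]
    previous = first
    for label, count in stages:
        rows.append((label, count, count / previous, count / first))
        previous = count
    return rows


if __name__ == "__main__":
    for label, count, of_previous, of_initial in retention(STAGES):
        print(f"{label:32s} {count:5d}  {of_previous:6.1%} of previous  {of_initial:6.1%} of initial")
```

Reporting both ratios makes it easy to see how much of the reduction comes from the automated database filters and how much from manual title and abstract screening.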

Three groups of coders performed the data extraction and grouping based on the predetermined criteria mentioned above. Discussion and moderation were conducted before each group carried out their tasks. The data were extracted into an Excel file and categorised into the following columns: article title, authors, year, journal, theories, sample size, sample type (cross-sectional or panel), independent variables, moderators and contextual variables, mediators, dependent variables, control variables, analytical approach, and findings.
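
As a rough illustration of the extraction sheet, the columns listed above can be expressed as a typed record. This is a sketch under stated assumptions: the authors used an Excel file, whereas this example writes CSV text with the Python standard library, and the class name, field names, and validation rule are hypothetical.

```python
import csv
import io
from dataclasses import dataclass, field, fields
from typing import List, Optional

# The text names two sample types: cross-sectional or panel.
ALLOWED_SAMPLE_TYPES = {"cross-sectional", "panel"}


@dataclass
class ExtractionRecord:
    """One row of the extraction sheet; fields mirror the columns named in the text."""
    article_title: str
    authors: str
    year: int
    journal: str
    theories: List[str] = field(default_factory=list)
    sample_size: Optional[int] = None
    sample_type: str = "cross-sectional"
    independent_variables: List[str] = field(default_factory=list)
    moderators_and_contextual_variables: List[str] = field(default_factory=list)
    mediators: List[str] = field(default_factory=list)
    dependent_variables: List[str] = field(default_factory=list)
    control_variables: List[str] = field(default_factory=list)
    analytical_approach: str = ""
    findings: str = ""

    def __post_init__(self):
        if self.sample_type not in ALLOWED_SAMPLE_TYPES:
            raise ValueError(f"unknown sample_type: {self.sample_type!r}")


COLUMNS = [f.name for f in fields(ExtractionRecord)]


def to_row(record: ExtractionRecord) -> dict:
    """Flatten a record into spreadsheet cells; list fields are joined with '; '."""
    row = {}
    for name in COLUMNS:
        value = getattr(record, name)
        if isinstance(value, list):
            value = "; ".join(value)
        row[name] = "" if value is None else value
    return row


def to_csv(records) -> str:
    """Serialise records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        writer.writerow(to_row(record))
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values only; this is not a study from the sample.
    example = ExtractionRecord(
        article_title="<title>", authors="<authors>", year=2020, journal="<journal>",
        theories=["dynamic capabilities", "resource-based view"], sample_type="panel",
    )
    print(to_csv([example]))
```

Fixing the column set and the allowed sample types in code is one way to keep several coder groups consistent, which is the purpose the moderation step serves in the text.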

3 Research findings

3.1 Descriptive analysis

Table 1 summarises some key features of our data. In this dataset, agility is either the primary independent variable or a mediator that links inputs to outcomes. We also included other recent literature reviews and conceptual papers in this field to support our data analysis. Thus, our final data consist of 249 empirical studies, 39 literature reviews and conceptual studies, and seven other relevant studies in this area.

Figure  2 presents the distribution of 249 empirical studies on agility and outcomes from 1998 to February 2024, with a sharp increase in the number of publications in recent years since 2017. This indicates researchers’ growing interest in this area and reflects a timely research response to recent environmental and societal changes (Joyce 2021 ).

Fig. 2 Publications by year from 1998 to February 2024

Table 2 provides an overview of different subtopics in agility and organisational outcomes research and shows that supply chain agility, organisational agility, and strategic agility are the most researched topics in this area. Other aspects of agility run from manufacturing/operational to marketing, business process, customer, workforce, IT and digital, market capitalising, project management, leadership, intellectual, R&D, social media, and value creation.

3.2 Measuring agility

Table 3 elucidates how different types of agility are measured in the literature. There is no consensus on how agility should be defined and measured. As the most researched type of agility, supply chain agility has been captured based on one or multiple dimensions, such as customers, products, delivery, responsiveness to the environment, competitors, and partners (Mandal 2018 ; Charles et al. 2010 ), collaborative planning (Braunscheidel and Suresh 2009 ; Chiang et al. 2012 ), procurement/sourcing and distribution/logistics (Swafford et al. 2006 ). Other approaches to measuring agility focus more on organisational capabilities such as alertness, accessibility, decisiveness, swiftness, and flexibility (Gligor and Holcomb 2012 ) or internal processes such as network collaboration, information integration, process integration, customer demand responsiveness (Mirghafoori et al. 2017 ) or information sharing (Whitten et al. 2012 ).

Organisational agility has also been measured in different ways. While some pioneering studies equate organisational agility with flexibility (Sharifi and Zhang 1999), others argue that organisational agility should be a broader concept (Vokurka and Fliedner 1998). Such a concept can be similar to organisational ambidexterity (Overby et al. 2006; Roberts and Grover 2012), can feature dynamic capability (Teece et al. 1997), or can represent an overall organisational framework (Doz and Kosonen 2008; Dyer and Shafer 1998). The three most popular dimensions of organisational agility (customers, operation and partnership) are drawn from the work of Tallon and Pinsonneault (2011). Other approaches capture the sensing capability and response capability of organisations (Overby et al. 2006) or have different focuses, including but not limited to internal capabilities (Sharifi and Zhang 1999), people (Pramono et al. 2021), business processes (Vaculík et al. 2018), or products and costs (Zheng et al. 2023).

Strategic agility is commonly measured based on strategic sensitivity, resource fluidity, leadership unity, or a combination of technology capability, collaborative innovation, organisational learning, and internal alignment (Clauss et al. 2021 ; Doz and Kosonen 2008 ). Another approach involves adopting the three key dimensions of agility from Tallon and Pinsonneault ( 2011 ) from a strategic perspective. Some other measurement approaches are presented in Table  3 .

Manufacturing agility has been examined as a system leveraged by a range of capabilities, including responsiveness, competency, flexibility and speed (Cao and Dowlatshahi 2005 ; Sharifi and Zhang 1999 ), or as an organisational competency (Jacobs et al. 2011 ). Some of the less popular types of agility, such as customer agility, are measured as customers’ sensing capabilities and customers’ response capabilities (Clauss et al. 2021 ; Doz and Kosonen 2008 ). Intellectual agility is captured as the level of business-related skills, the frequency of skills and knowledge updates, the perception of work tasks as a challenge or an opportunity to practice skills, and the willingness to apply alternative solutions when solving problems (Chen and Chiang 2011 ; Felipe et al. 2016 ; Sambamurthy et al. 2003 ).

Overall, the literature on agility offers a wide range of approaches to measuring organisational agility and other dimensions of agility. While traditional approaches such as those of Sharifi and Zhang (1999), Overby et al. (2006), or Tallon and Pinsonneault (2011) are widely used, the literature continues to evolve with newer and more innovative approaches to measure agility and its dimensions. On the one hand, this diversity motivates researchers in this field to develop better and more comprehensive ways to capture agility. On the other hand, the lack of consistency in measuring agility makes it difficult for researchers to synthesise how agility and its dimensions are constructed and what organisations should focus on to be more agile. Thus, there is a lack of informed guidance for practitioners to build agility in their organisations.

3.3 Measuring organisational outcomes

Table 4 provides an overview of the various aspects of organisational outcomes and the ways in which they are measured. The literature indicates a wide range of organisational outcomes examined in the context of agility. Some popular approaches to measuring organisational outcomes include the use of a self-reported overall organisational performance indicator, the construction of a composite variable with multiple dimensions, or the use of multiple separate indicators to capture different aspects of performance, including but not limited to financial performance (accounting and market indicators), nonfinancial performance, environmental performance, operational performance and beyond (Kurniawan et al. 2021a , b ). Other aspects of organisational outcomes examined in the agility and organisational outcomes literature include supply chain performance, innovation, competitiveness, customer service performance, digital and technology performance, manufacturing and operation, sustainability, international performance, employees, marketing, and organisational capabilities.

The literature offers a diverse set of organisational outcomes in conjunction with agility. This allows researchers and practitioners to look at how agility affects organisations from different angles and at different levels, from financial performance to organisational survival, operation, sustainability, capabilities, employee performance and well-being. However, methodologically, the literature reveals some flaws in measuring and constructing organisational performance. While the predominant use of composite variables helps capture an overall indicator of organisational performance, which eases the analysis process (Panda 2021), this approach does not consider the separate impact of each aspect of performance, making it challenging to interpret the results and apply the findings to practice.

The literature also reveals that organisational outcomes are often measured as construct variables through reflective/self-report survey questions (Altay et al. 2018 ; Goncalves et al. 2020 ), which raises some concerns about data reliability and validity. Some other studies use quantitative variables based on secondary data (Gligor and Bozkurt 2021 ; Pereira et al. 2021 ) or examine both qualitative and quantitative performance variables. However, further tests should be adopted to ensure the consistency and congruence of these methods (Feizabadi et al. 2019 ; Gligor et al. 2020a , b ).

3.4 The use of theories in agility and organisational outcome research

Table 5 provides a summary of relevant theories in this area of research. Despite the wide range of theories available in this domain, the use of theories in empirical research in this sample is still inadequate. Out of 249 empirical studies, 141 (56.6%) adopt single or multiple theoretical approaches to build their argument of the contribution of agility to organisational outcomes. However, 109 (43.8%) studies in the dataset did not explicitly utilise relevant theories to support their hypothesis development. Given this lack of solid theoretical frameworks, these studies cannot develop a logical and established view of how or why agility improves organisational outcomes, which might threaten the rigour of their research design and the strength of their argument.

Furthermore, Table  5 highlights a wide range of theories incorporated in this research domain, with the dynamic capabilities perspective and the resource-based view being the most widely used theoretical background. These two theoretical frameworks are often combined to provide a comprehensive understanding of the relationship between agility and outcomes (Jabarzadeh et al. 2022 ; Mikalef and Pateli 2017 ). The dynamic capabilities perspective emphasises the importance of perceiving and seizing valuable growth opportunities and the ability to transform the organisation to fit with these opportunities (Teece et al. 1997 ). However, the dynamic capabilities perspective is criticised for its limited explanation of how and to what extent organisations should achieve the abovementioned purposes (Ambler and Wilson 2006). The resource-based view focuses on analysing the internal resources of the enterprise as well as linking internal resources with the external environment to foster innovation and create competitive advantage (Sambamurthy et al. 2003 ). However, similar to dynamic capabilities theory, the resource-based view still has limited practicality (El Shafeey and Trott 2014 ). Therefore, future studies on firm performance and agility should be based on a multitheoretical approach to obtain a more comprehensive view of this relationship (Doz and Kosonen 2008 ; Dyer and Shafer 1998 ).

3.5 The relationship between agility and organisational performance

Figure  3 summarises the findings of the relationship between agility and performance. Evidence from the current literature elucidates the positive impact of agility on organisational performance, with 219 (87.9%) studies confirming the positive impact of different forms of agility on organisational outcomes. Twenty-seven studies reported mixed effects between agility and organisational outcomes, 2 studies found no significant relationship between agility and organisational outcomes, and 1 study showed a negative impact of organisational agility on the continuity of innovation projects in organisations.

Fig. 3 The impact of agility on organisational performance
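
The tally behind this summary can be reproduced with a short sketch. The four counts are those reported in the text; the category labels and the `shares` helper are hypothetical names used only for illustration.

```python
# Study counts by reported direction of effect, as stated in the text.
COUNTS = {
    "positive": 219,
    "mixed": 27,
    "not significant": 2,
    "negative": 1,
}


def shares(counts):
    """Return each category's share of the total number of studies."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    total = sum(COUNTS.values())
    print(f"total studies: {total}")
    for label, share in shares(COUNTS).items():
        print(f"{label:16s} {COUNTS[label]:4d}  {share:.2%}")
```

The four categories add up to the 249 empirical studies in the sample, so every study is counted in exactly one category.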

Overall, relatively strong and consistent results support the contribution of organisational agility to organisational outcomes, including overall organisational performance (Stei et al. 2024), financial and nonfinancial performance (e.g., Rafi et al. 2021), innovation (e.g., Goncalves et al. 2020), sustainability (e.g., Lopez-Gamero et al. 2023), competitiveness (e.g., Mikalef and Pateli 2017), digital and technology transformation (e.g., Ly 2023), international performance (e.g., Nemkova 2017), and employee job performance (e.g., Chung et al. 2014).

However, some studies still report mixed effects of organisational agility on organisational outcomes. Several factors contribute to this mixed effect. First, it depends on the type of inputs and outcomes in the models where organisational agility serves as a mediator or a main independent variable. For example, even though organisational agility is found to enhance radical innovation, it does not help incremental innovation, even under technological turbulence, according to a study conducted by Puriwat and Hoonsopon (2021). Organisational agility has been shown to translate firm knowledge management into competitive advantage. However, by taking a closer look at different forms of knowledge management, Corte-Real et al. (2017) found that organisational agility serves as a mediator only for the relationship between exogenous knowledge management and firm competitiveness but not for the relationships of endogenous knowledge management or knowledge-sharing partners with firm competitiveness. Another study confirmed that knowledge management improves organisational agility, which in turn strengthens firm competitive advantage, but a similar positive mediating effect is not found between knowledge management and firm innovation (Salimi and Nazarian 2022).

Second, the impact of organisational agility on organisational outcomes is dependent on its dimensions. For instance, between the two types of organisational agility, entrepreneurial agility improves firm financial performance, while adaptive agility does not (Chakravarty et al. 2013). Additionally, El Idrissi et al. (2023) found that among the three dimensions of organisational agility (customer agility, operational agility, and partnering agility), only the first two help organisations to be more prepared for crises.

Third, the mixed effect of organisational agility on organisational outcomes is found under different contextual factors. For instance, the dynamics of the business environment facilitate the positive effect of organisational agility on firm financial performance but not on environmental performance or social performance (Khan 2023). Under a low to moderate level of industry competition, organisational agility positively mediates the impact of operational cooperation on the mass customisation of products and services. However, when competition is too intense, this mediating effect becomes negative (Sheng et al. 2021). Vaculík et al. (2018) found that under disruptive organisational changes, firms need to trade off short-term benefits for long-term performance. In such a situation, being more agile causes firms to abandon their current innovation projects and leads to greater possibilities of innovation project termination.

Supply chain agility has been found to improve organisational financial performance (DeGroote and Marx 2013; Wamba and Akter 2019; Zhu and Gao 2021), competitive advantage (Alfalla-Luque et al. 2018; Chen 2019), commercial performance (Sturm et al. 2021), customer service (Avelar 2018), customer satisfaction (Gligor et al. 2020a, b), supply chain performance (Baah et al. 2021; Wang and Ali 2021), and supply chain resilience (Naimi et al. 2020). However, in some specific situations, such as uncertain environmental conditions and supply chain disruptions, only supply chain flexibility, one of the three dimensions of supply chain agility, increases organisational performance, while the impacts of the other two dimensions (velocity and visibility) are not statistically significant (Juan et al. 2021). Another study showed that supply chain agility has no significant impact on performance (Wieland and Wallenburg 2012).

Strategic agility has been found to directly improve overall performance (Chan and Muthuveloo 2021; Kurniawan et al. 2020), project performance (Haider and Kayani 2021), technological performance (Pereira et al. 2021), competitive advantage (Hemmati et al. 2016), and innovation (Clauss et al. 2021). However, Reed (2021) shows that under environmental turbulence, firms that are more strategically agile experience lower financial performance.

Manufacturing agility and operational agility have been found to increase competitiveness (Vázquez‐Bustelo et al. 2007), manufacturing performance (Awan et al. 2021), and market share (Ettlie 1998). However, Jacobs et al. (2011) found that the relationship between manufacturing agility and firm financial performance is not significant.

Strong evidence supports the contribution of other forms of agility to organisational outcomes (Abrishamkar et al. 2021; Asseraf et al. 2019b; Gupta et al. 2019; Ju et al. 2020; Roberts and Grover 2012). However, the positive contributions of these forms vary under certain conditions. Onngam and Charoensukmongkol (2023) highlighted that firms benefit more from social media agility when the organisational size is smaller and the dynamism of the business environment is lower. Sharif et al. (2022) found that market capitalising agility only mediates the relationship between knowledge coupling and firm innovation during business downsizing. Khan (2020) and Zhou et al. (2019) noted that marketing agility improves firm financial performance. However, when the market is turbulent, this positive effect becomes nonsignificant; when the complexity of marketing is heightened, higher marketing agility reduces marketing adaptation ability. Ngo and Vu (2020, 2021) examined two dimensions of customer agility and found that while sensing capability helps organisations achieve superior financial performance, response capability does not.

Overall, the literature provides strong evidence that agility has a positive and significant effect on organisational outcomes. However, in some cases, whether and how agility leads to better outcomes depends notably on (i) certain environmental factors, (ii) the different dimensions of agility and (iii) the types of organisational outcomes.

3.6 Intervening factors in the organisational agility and outcomes relationship

Table 6 presents the use of intervening factors in agility and performance research. Agility is often treated as an important mediator linking organisational inputs to outcomes: 61.8% of the studies in the dataset incorporate agility as a mediator in their models. For instance, organisational agility is considered a positive explanatory factor for the impact of technological capability and IT (Govuzela and Mafini 2019), corporate network management (Kurniawan et al. 2021a, b), knowledge and intellectual resources management (Cegarra-Navarro et al. 2016), leadership capability (Oliveira et al. 2012a, b), risk management culture (Liu et al. 2018), organisational learning culture (Pantouvakis and Bouranta 2017), strategic alignment (Hazen et al. 2017), promotion information analysis capability (Shuradze et al. 2018), organisational ambidexterity (Del Giudice et al. 2021), and dispute management (Yaseen et al. 2021) on organisational performance. This indicates the importance of conducting agility-performance research in an organisation's internal and external context to understand how agility interacts with other factors to predict organisational outcomes.

Table 7 presents the types of intervening factors examined in the literature on agility and organisational outcomes. The literature highlights that the organisational impact of agility is subject to a wide range of moderating factors. As noted above, organisational agility tends to exert its strengths under adverse environmental conditions, such as volatile and complex environments (Clauss et al. 2021), high competitive pressure (Ahammad et al. 2021), and high demand for major technological change in the industry (Ashrafi et al. 2019). Additionally, the impact of agility on organisational outcomes depends on external factors such as customer loyalty (Gligor et al. 2020a, b) and industry type (Lee et al. 2016) or internal factors such as firm age (Reed 2021), the adaptability of products and marketing (Asseraf et al. 2019a), the nature of work (Chung et al. 2014), information technology systems agility (Tallon and Pinsonneault 2011), and startup innovation sensitivity (Tsou and Cheng 2018).

Third, the literature also elucidates the mediators through which agility contributes to organisational outcomes. These include but are not limited to the following: new technology acceptance  (Chung et al. 2014 ), business model innovation (Mihardjo and Rukmana 2019 ), entrepreneurship and innovative behaviour development (Pramono et al. 2021 ), networking structure (Yang and Liu 2012 ) and market and social media analytics capability (Yang and Liu 2012 ). Similarly, supply chain agility is said to improve organisational performance through competitiveness (Sheel and Nath 2019 ), risk management (Okoumba et al. 2020 ), collaboration and re-engineering capabilities (Abeysekara et al. 2019 ), effectiveness, cost reduction (Gligor et al. 2015 ), and customer value and customer service (Um 2017 ).

The above analysis and the aspects mentioned in Sect. 3.5 stress the importance of studying the relationship between agility and firm performance with attention to both contextual factors and mediators. This highlights the need for future research to continue searching for factors that affect the contribution of agility to firm performance. Such comprehensive models will enhance our understanding of the relationship between agility and organisational performance and, as such, will contribute significantly to developing this research area.

3.7 Research methodologies in the agility and firm performance literature

Table 8 presents a summary of popular research methodologies used in agility–organisational outcome research, with several notable findings as follows:

First, most studies in the sample use quantitative methods to examine the effect of agility on firm performance. Qualitative and mixed methods, although considered insightful and comprehensive (Truscott et al. 2010), have not been adequately utilised in this literature. Overall, the quantitative approach is appropriate for testing the causal effect of agility (X) on organisational performance (OP; Y) in one or multiple regression models. However, the over-emphasis on causality testing without a proper investigation of the underlying reasons and insights using qualitative techniques might lead to imprecise findings and conclusions, which may create confusion and misunderstanding when applied to practice (Heyvaert et al. 2013).
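As an illustrative sketch of the kind of regression specification described above (not a model taken from any of the reviewed studies), the direct effect of agility on performance, and its extension with a moderator, could be written as follows. The moderator $Z$ (for example, environmental turbulence), the coefficients $\beta_k$ and the error term $\varepsilon_i$ are assumptions introduced here for illustration.

```latex
% X_i: agility of firm i; Y_i: organisational performance (OP) of firm i
% Z_i: hypothetical moderator, assumed for illustration only
\begin{align}
Y_i &= \beta_0 + \beta_1 X_i + \varepsilon_i, \\
Y_i &= \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + \varepsilon_i .
\end{align}
```

In the first equation, $\beta_1$ captures the association between agility and performance. In the second, a nonzero $\beta_3$ would indicate that the effect of agility varies with the level of $Z$. Neither coefficient shows why the effect arises, which is the kind of question qualitative work is better placed to answer.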

Second, the research on agility and organisational performance mainly uses primary data from surveys and questionnaires administered to individuals and organisations at a single point in time. This approach is appropriate because, given the complexity of measuring agility, it is challenging and impractical for researchers to use proxy and secondary data for measurement. However, using a one-time survey has disadvantages in terms of reliability and generalisability, as the information collected only reflects the impact of agility on organisational performance at a specific time point. This reduces the generalisability of research findings to other contexts at different time points (Bartram 2019; Wooldridge 2010).

Third, the most popular analytical tool used in this literature is structural equation modelling (SEM)/PLS-SEM (Mikalef and Pateli 2017; Ramos et al. 2021), which includes bootstrapping techniques (Felipe et al. 2020; Gligor et al. 2019), followed by multiple regression approaches for cross-sectional or panel data (Chen et al. 2014; Pereira et al. 2021). It is appropriate to use SEM for complex models with multilevel causal relationships. This method facilitates the examination of models with different pathways, including models with mediators and moderators, and provides suitable treatments for latent variables (Bollen 2014; Kline 2015).
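To make the idea of a bootstrapped mediation pathway concrete, the following is a minimal sketch on simulated data. It uses plain ordinary least squares on observed variables rather than SEM or PLS-SEM, so it illustrates only the product-of-coefficients logic and the percentile bootstrap, not the latent-variable treatment those methods provide. The variable roles (x as an organisational input, m as agility, y as performance), the function names and the simulated effect sizes are all assumptions made for illustration.

```python
import random
import statistics


def slope(x, y):
    """OLS slope of y regressed on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def partial_slope(x, m, y):
    """OLS coefficient of m in the regression y ~ x + m (with intercept)."""
    mx, mm, my = statistics.fmean(x), statistics.fmean(m), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    # Solve the two normal equations for the coefficient of m.
    return (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for the path x -> m -> y."""
    return slope(x, m) * partial_slope(x, m, y)


def bootstrap_ci(x, m, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample whole cases (firms) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Simulated (not real) data: x = organisational input, m = agility, y = performance.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(500)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + 0.1 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

print("indirect effect:", round(indirect_effect(x, m, y), 3))
print("95% bootstrap CI:", bootstrap_ci(x, m, y))
```

If the bootstrap interval excludes zero, the indirect pathway through the mediator is conventionally treated as statistically supported. Resampling whole cases, rather than each variable separately, preserves the relationships among x, m and y within each firm.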

Notably, there are two widely used methods in SEM: covariance-based SEM (CB-SEM) and partial least squares-based SEM (PLS-SEM). CB-SEM is often used in confirmatory research and factor-based models, while PLS-SEM is used in exploratory research and composite-based models (Dash and Paul 2021; Rigdon et al. 2017). However, the use of PLS-SEM is still debated in the literature. PLS-SEM is criticised for its limited ability to examine complex and multidirectional causal relationships in SEM and for its unproven assumptions (Antonakis et al. 2010). This leads to inconsistent analytical findings and limits the ability to appraise model fit, especially for models based on small sample sizes (McIntosh et al. 2014; Rönkkö et al. 2016). Recent research in this area has emphasised that researchers must prioritise understanding their research question, the nature of the variables used, and the purpose of their research when choosing the appropriate analytical method (Sarstedt et al. 2016).

4 Discussion and implications for future research

Using a dataset of 249 empirical studies from 1998 to 2024, this literature review paper has highlighted that agility is an essential predictor of organisational outcomes. Details about agility, firm performance, and the intervening factors of this causal relationship are summarised in Fig. 4. The findings of this paper support our understanding of the relationship between agility and organisational performance and provide valuable implications for future research in this field, as indicated below.

Fig. 4 A summary concept map of the agility and organisational outcomes relationship

4.1 Measuring agility

The literature shows that organisational agility is a matter of becoming rather than being (Alzoubi et al. 2011; Harraf et al. 2015). As analysed earlier, the literature on agility and firm performance has not provided a solid answer as to how and to what extent agility and its dimensions should be measured. For instance, Table 2 indicates that organisational agility can be measured with multiple instruments, including a firm’s internal capability, external partnership management, its proactiveness in sensing new opportunities, and its responsiveness to changes in the environment. This provides opportunities for future research to explore more extensive approaches to measuring agility based on the literature and to explore how organisational agility and its dimensions could be improved (e.g., Ajgaonkar et al. 2022).

4.2 Theoretical background

Our analysis indicates that there is a wide range of theories available in the literature that provide explanations and justifications for the contribution of agility to organisational performance, with dynamic capability theory and resource-based theory being the two most widely used theories. The literature also highlights the growing use of multitheoretical approaches for a more extensive understanding of this relationship. Future research could explore new theories and simultaneously continue to incorporate multiple theories to examine the relationship between agility and firm performance.

4.3 Agility dimensions and their impacts on organisational performance

Our analysis indicates that organisational agility, supply chain agility, strategic agility, and manufacturing/operational agility are the most popular topics in the agility-firm performance literature, while the organisational impact of other types of agility, for instance, workforce agility, intellectual agility, leadership agility, and project management agility, has not been thoroughly examined. This provides opportunities for future research to investigate these dimensions and their impact on organisational outcomes.

Another promising pathway moving forwards is leadership agility. While top managers and corporate boards are considered crucial for creating and promoting organisational agility, research on this topic is still scarce in terms of both quantity and quality (Lehn 2018). The existing corporate governance literature has emphasised the unparalleled contribution of boards of directors to organisational survival through their ability to link firms to external resources during economic uncertainties, crises, or bankruptcy (Haleblian and Finkelstein 1993; Hillman et al. 2009). To do so, boards need to build their dynamic capabilities to create, strengthen, and adjust their internal resources to adapt to the external environment (Barreto 2010; Helfat et al. 2009). However, except for the work of Desai (2016), which examines the impact of board size and ownership structure on organisational flexibility, and the work of Hoppmann et al. (2019) on the influence of the board on strategic flexibility, this area of research is still in its infancy. This gap in knowledge encourages future research to examine (i) the processes that allow boards to fulfil their role of facilitating changes and building agility capability in their organisation, (ii) the attributes and characteristics of boards that allow them to be more agile, and (iii) whether such agility can contribute to organisational agility, which translates to organisational outcomes.

Our literature review also indicates that agility can contribute to a wide range of organisational outcomes. However, there is still a lack of evidence on how the effects of agility carry through from individual-level to group-level to organisational-level outcomes, and on whether and how the impact of agility on organisational outcomes differs in the short, medium, and long term. Thus, it is strongly recommended that future research explore these possibilities to provide a more comprehensive and structured view of agility and outcome relationships.

4.4 Interactions and intervening factors

Our review indicates that many aspects of organisational performance benefit from agility. However, these benefits are likely to depend on a wide range of factors. This encourages future research to continue searching for intervening factors that have meaningful impacts on the agility–performance relationship. For instance, whether and how agility impacts organisational outcomes might depend on various factors: the type of organisation (small and medium-sized enterprises, public sector organisations, multinational enterprises, nonprofit organisations, or domestic vs. international organisations); different stages of the organisational life cycle; and different types of organisational structure and culture (Harraf et al. 2015).

Additionally, different types of agility may interact, and such interactions might affect organisational outcomes in different ways. This warrants further investigation to examine the effects of different types of agility on firm performance both separately and interactively (Gunasekaran et al. 2019), for instance, the interactive effects of workforce agility and manufacturing agility on organisational performance.

4.5 Methodology

Our review shows that quantitative research is the primary approach in agility-firm performance research. However, the overreliance on causality testing might prevent researchers from understanding the underlying reasons why agility can translate to organisational outcomes and the dynamics behind this causal relationship. As such, future research should use mixed methods, combining qualitative and quantitative approaches, to first understand the organisational impact of agility at the surface level and, second, reveal the processes, dynamics, blockages, enablers and other organisational factors that explain the relationship between agility and organisational outcomes.

Additionally, our review indicates that there is still a lack of comparative research in this area. This provides some pathways for future research to investigate the effect of agility on firm performance in comparative settings. For instance, is the impact of agility on organisational outcomes different across different national cultures and institutional contexts?

Finally, our review highlighted the need for panel and time series data to examine the short-term, medium-term, and long-term effects of agility on organisational performance. We strongly recommend that future research develop more extensive datasets covering multiple periods to ensure that robust and rigorous studies are added to this literature.

4.6 Implications

The resulting concept model of this paper with antecedents, mediators, moderators, organisational outcomes and types of agility has multiple implications for industry practitioners.

First, organisational agility is constructed from several subcomponents corresponding to multiple business functions, such as the supply chain, strategy, manufacturing, marketing, workforce, IT and leadership. For an entire organisation to be agile, each and every function should be agile.

Organisations can utilise different avenues and practices to build capabilities that contribute to agility.

Second, agility promotes corporate outcomes through its impact on mediating actions. To realise the potential of agility, organisations should account for those mediating steps and outcomes in their implementation.

Finally, a strong finding of this literature review is the way in which the relationship between agility and outcomes is contextualised. As such, organisations should pay attention to both internal and external environments as contingency factors affecting agility and outcomes. For instance, agility seems to have the greatest impact in complex and volatile environments, so organisations should carefully consider the implementation of agility if they operate in relatively stable industries. Additionally, while startups in high-tech industries are initially agile, established businesses in stable industries are generally not agile. For such businesses to achieve agility, they should consider factors such as firm size, IT infrastructure and their customer base.

5 Research contribution, limitations and conclusion

By answering the research question “What is the current status of the literature on organisational agility and organisational outcomes?” in the above analysis, this study has provided a comprehensive picture of the current literature on the relationship between several aspects of agility and firm performance, with the former as either an independent or a mediator variable. The review covers theories, measurements, relationship structure, methodology, and concepts of agility. Following Walter’s (2021) systematic review of agility, our study has extended the scope of investigation and focused specifically on the relationship between two central concepts, agility and performance, which play a minor role in Walter’s OA conceptual map. Additionally, the paper has mapped out the organisational agility–performance relationship with antecedents, mediators and moderators, each with a specific list of dimensions for measurement, as sketched out in the subresearch questions. This conceptual map can guide future studies in establishing well-rooted research models.

With a limited number of empirical studies (249), a sharp increase in publications since 2017, few studies using archival data (the majority rely on data from questionnaires and interviews), and a significant proportion of research lacking a theoretical background, agility–performance research appears to be an emerging field in an immature phase. This point is strengthened by the fact that the reviewed articles are not in top theoretical management journals such as the Journal of Management and the Academy of Management Journal. Furthermore, theories of this relationship have not been explicitly developed to support quantitative studies for hypothesis testing. By highlighting this gap, this study opens a new road for researchers to establish theories for the agility–performance relation beyond what is currently borrowed from the strategic management field.

Our paper has several limitations. Our attempt to provide a comprehensive overview of agility and performance prevents us from examining this relationship in a specific country or industry context. In addition, although our dataset covers a long time frame from 1998 to February 2024, some of the most recent research may not be included in our review. Nevertheless, we believe that our findings underline both the importance of organisational agility and the value of viewing it in conjunction with other organisational aspects in predicting organisational performance. Furthermore, we hope that this study will inspire future investigations to advance this literature.

In conclusion, organisational agility and its association with organisational performance have emerged as attractive research topics since 2017. Even though quantitative empirical studies account for most publications, a significant number of them lack a background theory and a consensus on measuring agility and its subcategories. This is detrimental to the value of the findings and intensifies the need for future studies to develop this immature field.

Data availability statement

The data that support the findings of this study are available from the Web of Science database for account holders. The data are available from the authors upon request.

Abdelilah B, El Korchi A, Balambo M (2021) Agility as a combination of lean and supply chain integration: how to achieve a better performance. Int J of Logistics-Res Appl. https://doi.org/10.1080/13675567.2021.1972949

Abeysekara N, Wang H, Kuruppuarachchi D (2019) Effect of supply-chain resilience on firm performance and competitive advantage: A study of the Sri Lankan apparel industry. Bus Process Manag J 25(7):1673–1695. https://doi.org/10.1108/BPMJ-09-2018-0241

Abrishamkar MM, Abubakar YA, Mitra J (2021) The influence of workforce agility on high-growth firms: the mediating role of innovation. Int J Entrepren Innov 22(3):146–160. https://doi.org/10.1177/1465750320973896

Jahed AM, Quaddus M, Suresh NC, Salam MA, Khan EA (2022) Direct and indirect influences of supply chain management practices on competitive advantage in fast fashion manufacturing industry. J Manuf Technol Manag 33(3):598–617. https://doi.org/10.1108/jmtm-04-2021-0150

Adhiatma A, Hakim A, Fachrunnisa O, Hussain FK (2024) The role of social media business and organizational resources for successful digital transformation. J Media Bus Stud 21(1):23–50. https://doi.org/10.1080/16522354.2023.2203641

Agag G, Shehawy YM, Almoraish A, Eid AR, Lababdi HC, Labben TG, Abdo SS (2024) Understanding the relationship between marketing analytics, customer agility, and customer satisfaction: A longitudinal perspective. J Retail Consumer Serv. https://doi.org/10.1016/j.jretconser.2023.103663

Agarwal A, Shankar R, Tiwari MK (2007) Modeling agility of supply. Ind Mark Manag 36(4):443–457. https://doi.org/10.1016/j.indmarman.2005.12.004

Ahammad MF, Basu S, Munjal S et al (2021) Strategic agility, environmental uncertainties and international performance: The perspective of Indian firms. J World Bus 56(4):101218. https://doi.org/10.1016/j.jwb.2021.101218

Ahmed W, Najmi A, Mustafa Y, Khan A (2019) Developing model to analyze factors affecting firms’ agility and competitive capability: A case of a volatile market. J Model Manag 14(2):476–491. https://doi.org/10.1108/JM2-07-2018-0092

Ajgaonkar S, Neelam NG, Wiemann J (2022) Drivers of workforce agility: a dynamic capability perspective. Int J Org Anal 30(4):951–982

Akhtar P, Ghouri AM, Saha M, Khan MR, Shamim A, Nallaluthan K (2022) Industrial digitization, the use of real-time information, and operational agility: digital and information perspectives for supply chain resilience. IEEE Trans Eng Manag. https://doi.org/10.1109/tem.2022.3182479

Akter S, Hani U, Dwivedi YK, Sharma A (2022) The future of marketing analytics in the sharing economy. Ind Marketing Manag 104:85–100. https://doi.org/10.1016/j.indmarman.2022.04.008

Al Humdan E, Shi YY, Behina M, Chowdhury M, Mahmud A (2023) The role of innovativeness and supply chain agility in the Australian service industry: a dynamic capability perspective. Int J Phys Dist & Logist Manag 53(11):1–25. https://doi.org/10.1108/ijpdlm-03-2022-0062

Aldhaheri RT, Ahmad SZ (2023) Factors affecting organisations’ supply chain agility and competitive capability. Bus Process Manag J 29(2):505–527. https://doi.org/10.1108/bpmj-11-2022-0579

Alfalla-Luque R, Machuca JA, Marin-Garcia JA (2018) Triple-A and competitive advantage in supply chains: Empirical research in developed countries. Int J Prod Econ 203:48–61. https://doi.org/10.1016/j.ijpe.2018.05.020

Alghamdi O, Agag G (2024) Competitive advantage: A longitudinal analysis of the roles of data-driven innovation capabilities, marketing agility, and market turbulence. J Retail Consumer Serv. https://doi.org/10.1016/j.jretconser.2023.103547

Alhassani AA, Al-Somali S (2022) The impact of dynamic innovation capabilities on organizational agility and performance in Saudi Public Hospitals. Risus-J Innov Sustain 13(1):44–59. https://doi.org/10.23925/2179-3565.2022v13i1p44-59

Ali A, Rafiq A, Hussien M, Sarwat S, Raziq A (2023) Exploring big data usage to predict supply chain effectiveness: a moderated and mediated model linkage. Glob Bus Rev. https://doi.org/10.1177/09721509231183767

Alkhatib SF, Momani RA (2023) Supply chain resilience and operational performance: the role of digital technologies in Jordanian manufacturing firms. Admin Sci 13(2):40. https://doi.org/10.3390/admsci13020040

Al-Qaralleh RE, Atan T (2021) Impact of knowledge-based HRM, business analytics and agility on innovative performance: linear and FsQCA findings from the hotel industry. Kybernetes 51(1):423–441. https://doi.org/10.1108/K-10-2020-0684

Al-Shboul MA (2017) Infrastructure framework and manufacturing supply chain agility: the role of delivery dependability and time to market. Int J Supply Chain Manag 22(2):172–185. https://doi.org/10.1108/scm-09-2016-0335

Al-Shboul MA, Alsmairat MAK (2023) Enabling supply chain efficacy through SC risk mitigation and absorptive capacity: an empirical investigation in manufacturing firms in the Middle East region - a moderated-mediated model. Supply Chain Manag-an Int J 28(5):909–922. https://doi.org/10.1108/scm-09-2022-0382

Alsmairat MAK, Al-Shboul MA (2023) Enabling supply chain efficacy through supply chain absorptive capacity and ambidexterity: empirical study from Middle East region-a moderated-mediation model. J Manu Techn Manag 34(6):917–936. https://doi.org/10.1108/jmtm-10-2022-0373

AlTaweel IR, Al-Hawary SI (2021) The mediating role of innovation capability on the relationship between strategic agility and organizational performance. Sustainability 13(14):7564. https://doi.org/10.3390/su13147564

Altay N, Gunasekaran A, Dubey R, Childe SJ (2018) Agility and resilience as antecedents of supply chain performance under moderating effects of organizational culture within the humanitarian setting: a dynamic capability view. Prod Plan Control 29(14):1158–1174. https://doi.org/10.1080/09537287.2018.1542174

Alzahrani SS (2023) Balanced agile project management impact on firm performance through business process agility as mediator in IT sector of Kingdom of Saudi Arabia. Int J Bus Perform Manag 24(3–4):409–428. https://doi.org/10.1504/ijbpm.2023.132326

Alzoubi AEH, Al-otoum FJ, Albatainh AKF (2011) Factors associated affecting organization agility on product development. Int J Res Rev in Applied Sci 9(3):503–515

Ameen N, Tarba S, Cheah JH, Xia SM, Sharma GD (2024) Coupling artificial intelligence capability and strategic agility for enhanced product and service creativity. British J Manag. https://doi.org/10.1111/1467-8551.12797

Antonakis J, Bendahan S, Jacquart P, Lalive R (2010) On making causal claims: a review and recommendations. Leadersh Q 21(6):1086–1120. https://doi.org/10.1016/j.leaqua.2010.10.010

Arslan AS, Kamara S, Tian AY, Rodgers P, Kontkanen M (2024) Marketing agility in underdog entrepreneurship: A qualitative assessment in post-conflict Sub-Saharan African context. J Bus Res. https://doi.org/10.1016/j.jbusres.2023.114488

Aryanto VDW, Mulyo BS (2018) Mediating effect of value creation in the relationship between relational capabilities on business performance. Contaduría y admin 63(1):0–0

Ashrafi A, Ravasan AZ, Trkman P, Afshari S (2019) The role of business analytics capabilities in bolstering firms’ agility and performance. Int J Inf Manag 47:1–15. https://doi.org/10.1016/j.ijinfomgt.2018.12.005

Aslam H, Khan AQ, Rashid K, Rehman S-u (2020) Achieving supply chain resilience: the role of supply chain ambidexterity and supply chain agility. J Manuf Technol Manag 31(6):1185–1204. https://doi.org/10.1108/jmtm-07-2019-0263

Asseraf Y, Lages LF, Shoham A (2019) Assessing the drivers and impact of international marketing agility. Int Mark Rev 36(2):289–315. https://doi.org/10.1108/imr-12-2017-0267

Avelar L (2018) Application of structural equation modelling to analyse the impacts of logistics services on risk perception, agility and customer service level. Adv Prod Eng Manag 13(2):179–192. https://doi.org/10.14743/apem2018.2.283

Awan U, Bhatti SH, Shamim S et al (2021) The role of big data analytics in manufacturing agility and performance: moderation-mediation analysis of organizational creativity and of the involvement of customers as data analysts. Br J Manag 33(3):1200–1220. https://doi.org/10.1111/1467-8551.12549

Baah C, Agyeman DO, Acquah ISK et al (2021) Effect of information sharing in supply chains: understanding the roles of supply chain visibility, agility, collaboration on supply chain performance. Benchmarking Int J. https://doi.org/10.1108/BIJ-08-2020-0453

Babber G, Mittal (2023) Achieving sustainability through the integration of lean, agile, and innovative systems: implications for Indian micro small medium enterprises (MSMEs). J Sci Technol Polic Manag. https://doi.org/10.1108/jstpm-05-2023-0087

Barreto I (2010) Dynamic capabilities: A review of past research and an agenda for the future. J Manag 36(1):256–280. https://doi.org/10.1177/0149206309350776

Bartram B (2019) Using questionnaires. In: Lambert M (ed) Practical research methods in education. Routledge, p 1–11. https://doi.org/10.4324/9781351188395

Bhattacharjee A, Sarkar A (2022) Abusive supervision: a systematic literature review. Manag Rev Q 74(1):1–34

Bhatti SH, Santoro G, Khan J, Rizzato F (2021) Antecedents and consequences of business model innovation in the IT industry. J Bus Res 123:389–400. https://doi.org/10.1016/j.jbusres.2020.10.003

Bollen KA (2014) Structural equations with latent variables, vol 210. John Wiley & Sons, New Jersey

Bouguerra A, Gölgeci İ, Gligor DM, Tatoglu E (2021) How do agile organizations contribute to environmental collaboration? Evidence from MNEs in Turkey. J Int Manag 27(1):100711. https://doi.org/10.1016/j.intman.2019.100711

Braunscheidel MJ, Suresh NC (2009) The organizational antecedents of a firm’s supply chain agility for risk mitigation and response. J Oper Manag 27(2):119–140. https://doi.org/10.1016/j.jom.2008.09.006

Bughin J (2023) Are you resilient? Machine learning prediction of corporate rebound out of the Covid-19 pandemic. Manag Decis Econ 44(3):1547–1564. https://doi.org/10.1002/mde.3764

Cadden T, McIvor R, Cao GM, Treacy R, Yang Y, Gupta M, Onofrei G (2022) Unlocking supply chain agility and supply chain performance through the development of intangible supply chain analytical capabilities. Int J Operations & Prod Manag 42(9):1329–1355. https://doi.org/10.1108/ijopm-06-2021-0383

Cao Q, Dowlatshahi S (2005) The impact of alignment between virtual enterprise and information technology on business performance in an agile manufacturing environment. J Oper Manag 23(5):531–550. https://doi.org/10.1016/j.jom.2004.10.010

Castro-Lopez A, Iglesias V, Santos-Vijande ML (2023) Organizational capabilities and institutional pressures in the adoption of circular economy. J Bus Res. https://doi.org/10.1016/j.jbusres.2023.113823

Cegarra-Navarro J-G, Soto-Acosta P, Wensley AKP (2016) Structured knowledge processes and firm performance: The role of organizational agility. J Bus Res 69(5):1544–1549. https://doi.org/10.1016/j.jbusres.2015.10.014

Çetindas A, Akben I, Özcan C, Kanusagi I, Öztürk O (2023) The effect of supply chain agility on firm performance during COVID-19 pandemic: the mediating and moderating role of demand stability. Supply Chain Forum 24(3):307–318. https://doi.org/10.1080/16258312.2023.2167465

Cha H, Park SM (2023) Organizational agility and communicative actions for responsible innovation: evidence from manufacturing firms in South Korea. Asia Pac J Manag. https://doi.org/10.1007/s10490-023-09883-8

Chakravarty A, Grewal R, Sambamurthy V (2013) Information technology competencies, organizational agility, and firm performance: Enabling and facilitating roles. Inf Syst Res 24(4):976–997

Chan JIL, Muthuveloo R (2021) Antecedents and influence of strategic agility on organizational performance of private higher education institutions in Malaysia. Stud High Educ 46(8):1726–1739. https://doi.org/10.1080/03075079.2019.1703131

Chan JIL, Muthuveloo R (2022) Strategic agility: linking people and organisational performance of private higher learning institutions in Malaysia. Int J Bus Soc 23(1):342–358. https://doi.org/10.33736/ijbs.4616.2022

Chan JLL, Muthuveloo R (2019) Antecedents and influence of strategic agility on organizational performance of private higher education institutions in Malaysia. Stud High Educ. https://doi.org/10.1080/03075079.2019.1703131

Charles A, Lauras M, Van Wassenhove L (2010) A model to define and assess the agility of supply chains: building on humanitarian experience. Int J Phys Dist Logist Manag 40(8/9):722–741. https://doi.org/10.1108/09600031011079355

Chatterjee S, Chaudhuri R, Vrontis D (2022) Examining the impact of adoption of emerging technology and supply chain resilience on firm performance: moderating role of absorptive capacity and leadership support. IEEE Trans Eng Manag. https://doi.org/10.1109/tem.2021.3134188

Chen C-J (2019) Developing a model for supply chain agility and innovativeness to enhance firms’ competitive advantage. Manag Decis 57(7):1511–1534. https://doi.org/10.1108/MD-12-2017-1236

Chen W-H, Chiang A-H (2011) Network agility as a trigger for enhancing firm performance: A case study of a high-tech firm implementing the mixed channel strategy. Ind Mark Manag 40(4):643–651. https://doi.org/10.1016/j.indmarman.2011.01.001

Chen Y, Wang Y, Nevo S et al (2014) IT capability and organizational performance: the roles of business process agility and environmental factors. Eur J Inf Syst 23(3):326–342. https://doi.org/10.1057/ejis.2013.4

Cherian TM, Arun CJ (2023) COVID-19 impact in supply chain performance: a study on the construction industry. Int J Prod Perform Manag 72(10):2882–2897. https://doi.org/10.1108/ijppm-04-2021-0220

Chiang C-Y, Kocabasoglu-Hillmer C, Suresh N (2012) An empirical investigation of the impact of strategic sourcing and flexibility on firm’s supply chain agility. Int J Oper Prod Manag 32(1–2):49–78. https://doi.org/10.1108/01443571211195736

Cho HE, Jeong I, Kim E, Cho J (2023) Achieving superior performance in international markets: the roles of organizational agility and absorptive capacity. J Bus Ind Mark 38(4):736–750. https://doi.org/10.1108/jbim-09-2021-0425

Christopher M, Peck H (2004) Building the resilient supply chain. Int J Logist Manag 15(2):1–14. https://doi.org/10.1108/09574090410700275

Chuang SH (2020) Co-creating social media agility to build strong customer-firm relationships. Ind Mark Manag 84:202–211. https://doi.org/10.1016/j.indmarman.2019.06.012

Chung S, Lee KY, Kim K (2014) Job performance through mobile enterprise systems: The role of organizational agility, location independence, and task characteristics. Inf Manag 51(6):605–617. https://doi.org/10.1016/j.im.2014.05.007

Clauss T, Abebe M, Tangpong C, Hock M (2021) Strategic agility, business model innovation, and firm performance: An empirical investigation. IEEE Trans Eng Manag 68(3):767–784. https://doi.org/10.1109/tem.2019.2910381

Corte-Real N, Oliveira T, Ruivo P (2017) Assessing business value of big data analytics in European firms. J Bus Res 70:379–390. https://doi.org/10.1016/j.jbusres.2016.08.011

Dabić M, Stojčić N, Simić M, Potocan V, Slavković M, Nedelko Z (2021) Intellectual agility and innovation in micro and small businesses: The mediating role of entrepreneurial leadership. J Bus Res 123:683–695

Dahms S, Cabrilo S, Kingkaew (2023) Configurations of innovation performance in foreign owned subsidiaries: focusing on organizational agility and digitalization. Manag Decis. https://doi.org/10.1108/md-05-2022-0600

Das KP, Mukhopadhyay S, Suar D (2023) Enablers of workforce agility, firm performance, and corporate reputation. Asia Pac Manag Rev 28(1):33–44. https://doi.org/10.1016/j.apmrv.2022.01.006

Dash G, Paul J (2021) CB-SEM vs PLS-SEM methods for research in social sciences and technology forecasting. Technol Forecast Soc Change 173:121092. https://doi.org/10.1016/j.techfore.2021.121092

de Oliveira MA, Oliveira Dalla Valentina LV, Possamai O (2012a) Forecasting project performance considering the influence of leadership style on organizational agility. Int J Product Perform Manag 61(6):653–671. https://doi.org/10.1108/17410401211249201

DeGroote SE, Marx TG (2013) The impact of IT on supply chain agility and firm performance: An empirical investigation. Int J Inf Manag 33(6):909–916

Del Giudice M, Scuotto V, Papa A et al (2021) A self-tuning model for smart manufacturing SMEs: effects on digital innovation. J Prod Innov Manag 38(1):68–89. https://doi.org/10.1111/jpim.12560

Desai VM (2016) The behavioral theory of the (governed) firm: Corporate board influences on organizations’ responses to performance shortfalls. Acad Manag J 59(3):860–879. https://doi.org/10.5465/amj.2013.0948

Dhaigude A, Kapoor R (2017) The mediation role of supply chain agility on supply chain orientation-supply chain performance link. J Decis Syst 26(3):275–293. https://doi.org/10.1080/12460125.2017.1351862

Diaz T, Andrés J, Rivera FIR, Celume SAB, Rivera SAR (2024) Mapping the research about organisations in the Latin American context: a bibliometric analysis. Manag Rev Q 74(1):121–169

Doz Y, Kosonen M (2008) The dynamics of strategic agility: Nokia’s rollercoaster experience. Calif Manag Rev 50(3):95–118. https://doi.org/10.2307/41166447

Drury-Grogan ML (2014) Performance on agile teams: Relating iteration objectives and critical decisions to project management success factors. Inf Softw Technol 56(5):506–515. https://doi.org/10.1016/j.infsof.2013.11.003

Dubey R, Singh T, Gupta OK (2015) Impact of agility, adaptability and alignment on humanitarian logistics performance: mediating effect of leadership. Glob Bus Rev 16(5):812–831. https://doi.org/10.1177/0972150915591463

Dyer L, Shafer RA (1998) From human resource strategy to organizational effectiveness: lessons from research on organizational agility. https://ecommons.cornell.edu/server/api/core/bitstreams/ff207b12-5936-48df-899e-3ea1fb07a2f6/content

Eckstein D, Goellner M, Blome C, Henke M (2015) The performance impact of supply chain agility and supply chain adaptability: the moderating effect of product complexity. Int J Prod Res 53(10):3028–3046. https://doi.org/10.1080/00207543.2014.970707

Eisele S, Greven A, Grimm M, Fischer-Kreer D, Brettel M (2022) Understanding the drivers of radical and incremental innovation performance: The role of a firm’s knowledge-based capital and organisational agility. Int J Innov Manag. https://doi.org/10.1142/s1363919622500207

El Idrissi M, El Manzani E, Maatalah WA, Lissaneddine Z (2023) Organizational crisis preparedness during the COVID-19 pandemic: an investigation of dynamic capabilities and organizational agility roles. Int J Org Anal 31(1):27–49. https://doi.org/10.1108/ijoa-09-2021-2973

El Shafeey T, Trott P (2014) Resource-based competition: three schools of thought and thirteen criticisms. Eur Bus Rev 26(2):122–148. https://doi.org/10.1108/EBR-07-2013-0096

Engström TE, Westnes P, Westnes SF (2003) Evaluating intellectual capital in the hotel industry. J Intellect Cap 4(3):287–303. https://doi.org/10.1108/14691930310487761

Ettlie JE (1998) R&D and global manufacturing performance. Manag Sci 44(1):1–11. https://doi.org/10.1287/mnsc.44.1.1

Fadaki M, Rahman S, Chan C (2020) Leagile supply chain: design drivers and business performance implications. Int J Prod Res 58(18):5601–5623. https://doi.org/10.1080/00207543.2019.1693660

Fang MJ, Liu F, Xiao SF, Park K (2023) Hedging the bet on digital transformation in strategic supply chain management: a theoretical integration and an empirical test. Int J Phys Dist Logist Manag 53(4):512–531. https://doi.org/10.1108/ijpdlm-12-2021-0545

Feizabadi J, Gligor D, Motlagh SA (2019) The triple-As supply chain competitive advantage. Benchmarking Int J 26(7):2286–2317. https://doi.org/10.1108/BIJ-10-2018-0317

Felipe CM, Leidner DE, Roldan JL, Leal-Rodriguez AL (2020) Impact of IS capabilities on firm performance: the roles of organizational agility and industry technology intensity. Decis Sci 51(3):575–619. https://doi.org/10.1111/deci.12379

Felipe CM, Roldán JL, Leal-Rodríguez AL (2016) An explanatory and predictive model for organizational agility. J Bus Res 69(10):4624–4631. https://doi.org/10.1016/j.jbusres.2016.04.014

Franco C, Landini F (2022) Organizational drivers of innovation: The role of workforce agility. Res Pol 51(2):104423. https://doi.org/10.1016/j.respol.2021.104423

Ganguly A, Talukdar A, Kumar C (2024) Absorptive capacity and disruptive innovation: the mediating role of organizational agility. IEEE Trans Eng Manag 71:3117–3128. https://doi.org/10.1109/tem.2022.3205922

Gligor D, Bozkurt S (2021) The role of perceived social media agility in customer engagement. J Res Interact Mark 15(1):125–146

Gligor DM, Holcomb MC (2012) Antecedents and consequences of supply chain agility: establishing the link to firm performance. J Bus Logist 33(4):295–308. https://doi.org/10.1111/jbl.12003

Gligor D, Bozkurt S, Gölgeci I, Maloni MJ (2020a) Does supply chain agility create customer value and satisfaction for loyal B2B business and B2C end-customers? Int J Phys Distrib Logist Manag 50(7/8):721–743. https://doi.org/10.1108/IJPDLM-01-2020-0004

Gligor D, Esmark CL, Holcomb MC (2015) Performance outcomes of supply chain agility: When should you be agile? J Oper Manag 33–34:71–82. https://doi.org/10.1016/j.jom.2014.10.008

Gligor D, Feizabadi J, Russo I et al (2020b) The triple-a supply chain and strategic resources: developing competitive advantage. Int J Phys Distrib Logist Manag 50(2):159–190. https://doi.org/10.1108/IJPDLM-08-2019-0258

Gligor D, Holcomb M, Maloni MJ, Davis-Sramek E (2019) Achieving financial performance in uncertain times: leveraging supply chain agility. Transp J 58(4):247–279. https://doi.org/10.5325/transportationj.58.4.0247

Gomes E, Sousa CMP, Vendrell-Herrero F (2020) International marketing agility: conceptualization and research agenda. Int Mark Rev 37(2):261–272

Goncalves D, Bergquist M, Bunk R, Alänge S (2020) Cultural aspects of organizational agility affecting digital innovation. Int J Entrepren Innov 16(4):13–46. Retrieved from https://www.ceeol.com/search/article-detail?id=909079

Govuzela S, Mafini C (2019) Organisational agility, business best practices and the performance of small to medium enterprises in South Africa. S Afr J Bus Manag 50(1):1–3. https://doi.org/10.4102/sajbm.v50i1.1417

Gunasekaran A, Yusuf YY, Adeleye EO et al (2019) Agile manufacturing: an evolutionary review of practices. Int J Prod Res 57(15–16):5154–5174

Guo RP, Yin HB, Liu X (2023) Coopetition, organizational agility, and innovation performance in digital new ventures. Ind Mark Manag 111:143–157. https://doi.org/10.1016/j.indmarman.2023.04.003

Gupta S, Kumar S, Kamboj S et al (2019) Impact of IS agility and HR systems on job satisfaction: an organizational information processing theory perspective. J Knowl Manag 23(9):1782–1805. https://doi.org/10.1108/JKM-07-2018-0466

Hadjielias E, Christofi M, Christou P, Drotarova MH (2022) Digitalization, agility, and customer value in tourism. Technol Forecast Soc Change. https://doi.org/10.1016/j.techfore.2021.121334

Haider SA, Kayani UN (2021) The impact of customer knowledge management capability on project performance-mediating role of strategic agility. J Knowl Manag 25(2):298–312. https://doi.org/10.1108/JKM-01-2020-0026

Haleblian J, Finkelstein S (1993) Top management team size, CEO dominance, and firm performance: The moderating roles of environmental turbulence and discretion. Acad Manag J 36(4):844–863. https://doi.org/10.5465/256761

Hallgren M, Olhager J (2009) Lean and agile manufacturing: external and internal drivers and performance outcomes. Int J Oper Prod Manag 29(10):976–999

Handscomb C, Heyning C, Woxholth J (2019) Giants can dance: Agile organizations in asset-heavy industries. https://www.mckinsey.com/industries/oil-and-gas/our-insights/giants-can-dance-agile-organizations-in-asset-heavy-industries

Harraf A, Wanasika I, Tate K, Talbott K (2015) Organizational agility. J Appl Bus Res 31(2):675–686

Hazen BT, Bradley RV, Bell JE et al (2017) Enterprise architecture: a competence-based approach to achieving agility and firm performance. Int J Prod Econ 193:566–577. https://doi.org/10.1016/j.ijpe.2017.08.022

Helfat CE, Finkelstein S, Mitchell W et al (2009) Dynamic capabilities: Understanding strategic change in organizations. John Wiley & Sons, New Jersey

Hemmati M, Feiz D, Jalilvand MR, Kholghi I (2016) Development of fuzzy two-stage DEA model for competitive advantage based on RBV and strategic agility as a dynamic capability. J Model Manag 11(1):288–308. https://doi.org/10.1108/JM2-12-2013-0067

Heyvaert M, Hannes K, Maes B, Onghena P (2013) Critical appraisal of mixed methods studies. J Mix Methods Res 7(4):302–327. https://doi.org/10.1177/1558689813479449

Highsmith J (2011) History: The Agile Manifesto. https://agilemanifesto.org/

Hillman AJ, Withers MC, Collins BJ (2009) Resource dependence theory: a review. J Manag 35(6):1404–1427. https://doi.org/10.1177/0149206309343469

Hoppmann J, Naegele F, Girod B (2019) Boards as a source of inertia: examining the internal challenges and dynamics of boards of directors in times of environmental discontinuities. Acad Manag J 62(2):437–468. https://doi.org/10.5465/amj.2016.1091

Huma S, Ahmed W (2022) Understanding influence of supply chain competencies when developing Triple-A. Benchmarking Int J 29(9):2757–2779. https://doi.org/10.1108/bij-06-2021-0337

Hwang T, Kim ST (2019) Balancing in-house and outsourced logistics services: effects on supply chain agility and firm performance. Serv Bus 13(3):531–556

Ilmudeen A (2021) Information technology (IT) governance and IT capability to realize firm performance: enabling role of agility and innovative capability. Benchmarking Int J. https://doi.org/10.1108/bij-02-2021-0069

Ismail H, Sharifi H (2006) A balanced approach to building agile supply chains. Int J Phys Distrib Logist Manag 36(6):431–444. https://doi.org/10.1108/09600030610677384

Jabarzadeh Y, Khangah MH, Cemberci M, Cerchione R, Sanoubar N (2022) Effect of absorptive capacity on strategic flexibility and supply chain agility: implications for performance in fast-moving consumer goods. Oper Supply Chain Manag Int J 15(3):407–423

Jacobs M, Droge C, Vickery SK, Calantone R (2011) Product and process modularity’s effects on manufacturing agility and firm growth performance. J Prod Innov Manag 28(1):123–137. https://doi.org/10.1111/j.1540-5885.2010.00785.x

Jing ZC, Zheng Y, Guo HL (2023) A study of the impact of digital competence and organizational agility on green innovation performance of manufacturing firms-the moderating effect based on knowledge inertia. Admin Sci 13(12):250. https://doi.org/10.3390/admsci13120250

Joiner B (2019) Leadership agility for organizational agility. J Creat Value 5(2):139–149. https://doi.org/10.1177/2394964319868321

Joyce P (2021) Public governance, agility and pandemics: a case study of the UK response to COVID-19. Int Rev Adm Sci 87(3):536–555. https://doi.org/10.1177/0020852320983406

Ju X, Ferreira FA, Wang M (2020) Innovation, agile project management and firm performance in a public sector-dominated economy: Empirical evidence from high-tech small and medium-sized enterprises in China. Socio-Econ Plan Sci 72:100779. https://doi.org/10.1016/j.seps.2019.100779

Juan S-J, Li EY, Hung W-H (2021) An integrated model of supply chain resilience and its impact on supply chain performance under disruption. Int J Logist Manag 33(3):339–364. https://doi.org/10.1108/IJLM-03-2021-0174

Kalaignanam K, Tuli KR, Kushwaha T et al (2021) Marketing agility: The concept, antecedents, and a research agenda. J Mark 85(1):35–58. https://doi.org/10.1177/0022242920952760

Kareem MA, Kummitha HVR (2020) The impact of supply chain dynamic capabilities on operational performance. Organizacija. https://doi.org/10.2478/orga-2020-0021

Khan KS, Kunz R, Kleijnen J, Antes G (2003) Five steps to conducting a systematic review. J R Soc Med 96(3):118–121. https://doi.org/10.1177/014107680309600304

Khan AN (2023) Artificial intelligence and sustainable performance: role of organisational agility and environmental dynamism. Technol Anal Strateg Manag. https://doi.org/10.1080/09537325.2023.2290171

Khan A, Talukder MS, Islam QT, Islam A (2022) The impact of business analytics capabilities on innovation, information quality, agility and firm performance: the moderating role of industry dynamism. VINE J Inf Knowl Manag Syst. https://doi.org/10.1108/vjikms-01-2022-0027

Khan H (2020) Is marketing agility important for emerging market firms in advanced markets? Int Bus Rev 29(5):101733. https://doi.org/10.1016/j.ibusrev.2020.101733

Khan NA, Ahmed W, Waseem M (2023) Factors influencing supply chain agility to enhance export performance: case of export-oriented textile sector. Rev Int Bus Strateg 33(2):301–316. https://doi.org/10.1108/ribs-05-2021-0068

Kline RB (2015) Principles and practice of structural equation modeling. Guilford Publications, New York

Ko A, Mitev A, Kovács T, Fehér P, Szabo Z (2022) Digital agility, digital competitiveness, and innovative performance of SMEs. J Compet 14(4):78–96. https://doi.org/10.7441/joc.2022.04.05

Kocoglu I, Keskin H, Cemberci M, Civelek ME (2022) Effect of supply chain coordination on performance: a serial mediation model of trust, agility, and collaboration. Int J Inf Sys Supply Chain Manag. https://doi.org/10.4018/ijisscm.287130

Ku ECS, Chen CD (2023) Increasing the organizational performance of online sellers: the powerful back-end management systems. Bus Process Manag J 29(3):838–857. https://doi.org/10.1108/bpmj-11-2022-0562

Kurniawan R, Budiastuti D, Hamsal M, Kosasih W (2020) The impact of balanced agile project management on firm performance: the mediating role of market orientation and strategic agility. Rev Int Bus Strategy 30(4):457–490. https://doi.org/10.1108/RIBS-03-2020-0022

Kurniawan R, Budiastuti D, Hamsal M, Kosasih W (2021a) Networking capability and firm performance: the mediating role of market orientation and business process agility. J Bus Ind Mark 36(9):1646–1664. https://doi.org/10.1108/jbim-01-2020-0023

Kurniawan R, Manurung AH, Hamsal M, Kosasih W (2021b) Orchestrating internal and external resources to achieve agility and performance: the centrality of market orientation. Benchmarking Int J 28(2):517–555. https://doi.org/10.1108/bij-05-2020-0229

Kustyadji G, Windijarto W, Wijayani A (2021) Ambidexterity and leadership agility in micro, small and medium enterprises (MSME)’s performance: an empirical study in Indonesia. J Asian Fin Econ Bus 8(7):303–311. https://doi.org/10.13106/jafeb.2021.vol8.no7.0303

Lago NC, Marcon A, Ribeiro JLD, Olteanu Y, Fichter K (2023) The role of cooperation and technological orientation on startups’ innovativeness: An analysis based on the microfoundations of innovation. Technol Forecast Soc Change. https://doi.org/10.1016/j.techfore.2023.122604

Latan H, Jabbour A, Sarkis J, Jabbour CJC, Ali M (2024) The nexus of supply chain performance and blockchain technology in the digitalization era: Insights from a fast-growing economy. J Bus Res. https://doi.org/10.1016/j.jbusres.2023.114398

Lee O-KD, Xu P, Kuilboer J-P, Ashrafi N (2016) Idiosyncratic values of IT-enabled agility at the operation and strategic levels. Commun Assoc Inf Syst 39:242–266. https://doi.org/10.17705/1cais.03913

Lehn K (2018) Corporate governance, agility, and survival. Int J Econ Bus 25(1):65–72. https://doi.org/10.1080/13571516.2017.1396661

Li X, Goldsby TJ, Holsapple CW (2009) Supply chain agility: scale development. Int J Logist Manag 20(3):408–424. https://doi.org/10.1108/09574090911002841

Li LX, Tong Y, Wei L, Yang SL (2022) Digital technology-enabled dynamic capabilities and their impacts on firm performance: Evidence from the COVID-19 pandemic. Inf Manag 59(8):103689. https://doi.org/10.1016/j.im.2022.103689

Li L, Lin J, Turel O, Liu P, Luo X (2020) The impact of e-commerce capabilities on agricultural firms’ performance gains: the mediating role of organizational agility. Ind Manag Data Syst 120(7):1265–1286. https://doi.org/10.1108/imds-08-2019-0421

Liang XN, Li GX, Zhang H, Nolan E, Chen FD (2022) Firm performance and marketing analytics in the Chinese context: A contingency model. J Bus Res 141:589–599. https://doi.org/10.1016/j.jbusres.2021.11.061

Lin C-T, Chiu H, Chu P-Y (2006) Agility index in the supply chain. Int J Prod Econ 100(2):285–299. https://doi.org/10.1016/j.ijpe.2004.11.013

Liu C-L, Shang K-C, Lirn T-C et al (2018) Supply chain resilience, firm performance, and management policies in the liner shipping industry. Transp Res Part A Policy Pract 110:202–219. https://doi.org/10.1016/j.tra.2017.02.004

Liu H, Ke W, Wei KK, Hua Z (2013) The impact of IT capabilities on firm performance: The mediating roles of absorptive capacity and supply chain agility. Decis Support Syst 54(3):1452–1462. https://doi.org/10.1016/j.dss.2012.12.016

Lopez-Gamero MD, Molina-Azorín JF, Pereira-Moliner J, Pertusa-Ortega EM (2023) Agility, innovation, environmental management and competitiveness in the hotel industry. Corp Soc Responsib Environ Manag 30(2):548–562. https://doi.org/10.1002/csr.2373

Lu Y, Ramamurthy K (2011) Understanding the link between information technology capability and organizational agility: An empirical examination. MIS Quart 35(4):931–954. https://doi.org/10.2307/41409967

Lucia-Palacios L, Bordonaba-Juste V, Polo-Redondo Y, Grünhagen M (2014) E-business implementation and performance: analysis of mediating factors. Internet Res 24(2):223–245. https://doi.org/10.1108/IntR-09-2012-0195

Ly B (2023) The interplay of digital transformational leadership, organizational agility, and digital transformation. J Knowl Econ. https://doi.org/10.1007/s13132-023-01377-8

Mandal S, Dubey RK (2020) Role of tourism IT adoption and risk management orientation on tourism agility and resilience: Impact on sustainable tourism supply chain performance. Int J Tour Res 22(6):800–813. https://doi.org/10.1002/jtr.2381

Mandal S (2018) Influence of human capital on healthcare agility and healthcare supply chain performance. J Bus Ind Mark 33(7):1012–1026. https://doi.org/10.1108/jbim-06-2017-0141

Marrucci A, Rialti R, Balzano M (2023) Exploring paths underlying Industry 4.0 implementation in manufacturing SMEs: a fuzzy-set qualitative comparative analysis. Manag Decis. https://doi.org/10.1108/md-05-2022-0644

Martinez-Sanchez A, Perez-Perez M, Vicente-Oliva S (2019) Absorptive capacity and technology: influences on innovative firms. Manag Res 17(3):250–265. https://doi.org/10.1108/MRJIAM-02-2018-0817

Mataveli M, Calvo JCA, Gil AJ (2022) The role of intellectual capital and service agility in banking service provision: the perspective of Brazilian export companies. Int J Emerging Markets. https://doi.org/10.1108/ijoem-02-2020-0190

McIntosh CN, Edwards JR, Antonakis J (2014) Reflections on partial least squares path modeling. Organ Res Methods 17(2):210–251. https://doi.org/10.1177/1094428114529165

McKinsey & Company (2020) Agility in the time of Covid 19: Changing your operating model in an age of turbulence. Retrieved from https://www.mckinsey.com/business-functions/people-and-organizational-performance/our-insights/agility-in-the-time-of-covid-19-changing-your-operating-model-in-an-age-of-turbulence

Meier A, Kock A (2023) Agile R&D units’ organisation and its relationship with innovation performance. R&D Manag. https://doi.org/10.1111/radm.12655

Mihardjo LW, Rukmana RA (2019) Customer experience and organizational agility driven business model innovation to shape sustainable development. Pol J Manag Stud 20(1):293–304. https://doi.org/10.17512/pjms.2019.20.1.26

Mikalef P, Pateli A (2017) Information technology-enabled dynamic capabilities and their indirect effect on competitive performance: Findings from PLS-SEM and fsQCA. J Bus Res 70:1–16. https://doi.org/10.1016/j.jbusres.2016.09.004

Mirghafoori SH, Andalib D, Keshavarz P (2017) Developing green performance through supply chain agility in manufacturing industry: a case study approach. Corp Soc Responsib Environ Manag 24(5):368–381. https://doi.org/10.1002/csr.1411

Mourtzis D, Doukas M (2014) Design and planning of manufacturing networks for mass customisation and personalisation: challenges and outlook. Procedia CIRP 19:1–13. https://doi.org/10.1016/j.procir.2014.05.004

Mukhsin M (2023) Supply chain performance as a mediating factor in the effect of supply agility on company performance. Qual Access to Success 24(193):306–313. https://doi.org/10.47750/qas/24.193.34

Mulyana M, Nurhayati T, Putri ERP (2023) Value creation agility on business performance: an empirical study in retail fashion SMEs. J Creating Val. https://doi.org/10.1177/23949643231205839

Naimi MA, Faisal MN, Sobh R, Uddin SF (2020) Antecedents and consequences of supply chain resilience and reconfiguration: an empirical study in an emerging economy. J Enterp Inf Manag 34(6):1722–1745. https://doi.org/10.1108/JEIM-04-2020-0166

Naoui K, Boubker O, El Abdellaoui M (2023) Exploring the influence of IS on collaboration, agility, and performance: The case of the automotive supply chain. Logforum 19(1):15–32. https://doi.org/10.17270/j.Log.2023.779

Narasimhan R, Swink M, Kim SW (2006) Disentangling leanness and agility: an empirical investigation. J Oper Manag 24(5):440–457. https://doi.org/10.1016/j.jom.2005.11.011

Nayal K, Raut RD, Queiroz MM, Priyadarshinee P (2023) Digital supply chain capabilities: mitigating disruptions and leveraging competitive advantage under COVID-19. IEEE Trans Eng Manag. https://doi.org/10.1109/tem.2023.3266151

Nemkova E (2017) The impact of agility on the market performance of born-global firms: an exploratory study of the “Tech City” innovation cluster. J Bus Res 80:257–265. https://doi.org/10.1016/j.jbusres.2017.04.017

Ngo VM, Vu HM (2021) Can customer relationship management create customer agility and superior firms’ performance? Int J Bus Soc 22(1):175–193. https://doi.org/10.33736/ijbs.3169.2021

Ngo VM, Vu HM (2020) Customer agility and firm performance in the tourism industry. Tourism Int Interdiscip J 68(1):68–82

Nold H, Michel L (2016) The performance triangle: a model for corporate agility. Leadersh Organ Dev J 37(3):341–356. https://doi.org/10.1108/LODJ-07-2014-0123

Ojha D (2008) Impact of strategic agility on competitive capabilities and financial performance. Clemson University, Clemson

Okoumba WVL, Mafini C, Bhadury J (2020) Supply chain management and organizational performance: Evidence from SMEs in South Africa. Afr J Manag 6(4):295–326. https://doi.org/10.1080/23322373.2020.1830689

Oliveira MA, Dalla Valentina LVO, Possamai O (2012b) Forecasting project performance considering the influence of leadership style on organizational agility. Int J Product Perform Manag 61(6):653–671. https://doi.org/10.1108/17410401211249201

Onngam W, Charoensukmongkol P (2023) Effect of social media agility on performance of small and medium enterprises: moderating roles of firm size and environmental dynamism. J Entrepren Emerging Econ. https://doi.org/10.1108/jeee-11-2022-0331

Overby E, Bharadwaj A, Sambamurthy V (2006) Enterprise agility and the enabling role of information technology. Eur J Inf Syst 15(2):120–131. https://doi.org/10.1057/palgrave.ejis.3000600

Panda S (2021) Strategic IT-business alignment capability and organizational performance: roles of organizational agility and environmental factors. J Asia Bus Stud. https://doi.org/10.1108/jabs-09-2020-0371

Panigrahi RR, Jena D, Meher JR, Shrivastava AK (2023) Assessing the impact of supply chain agility on operational performances-a PLS-SEM approach. Measuring Bus Excel 27(1):1–24. https://doi.org/10.1108/mbe-06-2021-0073

Pantouvakis A, Bouranta N (2017) Agility, organisational learning culture and relationship quality in the port sector. Total Qual Manag Bus 28(3–4):366–378. https://doi.org/10.1080/14783363.2015.1084871

Park S, Braunscheidel MJ, Suresh NC (2023) The performance effects of supply chain agility with sensing and responding as formative capabilities. J Manuf Technol Manag 34(5):713–734. https://doi.org/10.1108/jmtm-09-2022-0328

Patel BS, Sambasivan M (2021) A systematic review of the literature on supply chain agility. Manag Res Rev 45(2):236–260. https://doi.org/10.1108/MRR-09-2020-0574

Pereira V, Budhwar P, Temouri Y et al (2021) Investigating investments in agility strategies in overcoming the global financial crisis-The case of Indian IT/BPO offshoring firms. J Int Manag 27(1):100738. https://doi.org/10.1016/j.intman.2020.100738

Pereira V, Mellahi K, Temouri Y et al (2019) Investigating dynamic capabilities, agility and knowledge management within EMNEs-longitudinal evidence from Europe. J Knowl Manag 23(9):1708–1728. https://doi.org/10.1108/JKM-06-2018-0391

Perera S, Soosay C, Sandhu S (2019) Investigating the strategies for supply chain agility and competitiveness. Asian J Bus Account 12(1):279–312

Pramono CA, Manurung AH, Heriyati P, Kosasih W (2021) Factors affecting start-up behavior and start-up performance during the COVID-19 pandemic in Indonesia. J Asian Finance Econ Bus 8(4):809–817. https://doi.org/10.13106/jafeb.2021.vol8.no4.0809

Puriwat W, Hoonsopon D (2021) Cultivating product innovation performance through creativity: the impact of organizational agility and flexibility under technological turbulence. J Manuf Technol Manag 33(4):741–762. https://doi.org/10.1108/JMTM-10-2020-0420

Queiroz MM, Wamba SF, Raut RD, Pappas IO (2023) Does resilience matter for supply chain performance in disruptive crises with scarce resources? British J Manag. https://doi.org/10.1111/1467-8551.12748

Queiroz MM, Tallon PP, Sharma R, Coltman T (2018) The role of IT application orchestration capability in improving agility and performance. J Strateg Inf Syst 27(1):4–21. https://doi.org/10.1016/j.jsis.2017.10.002

Qureshi F, Ellahi A, Javed Y, Rehman M, Rehman HM (2023) Empirical investigation into impact of IT adoption on supply chain agility in fast food sector in Pakistan. Cogent Bus Manag. https://doi.org/10.1080/23311975.2023.2170516

Rafi N, Ahmed A, Shafique I, Kalyar MN (2021) Knowledge management capabilities and organizational agility as liaisons of business performance. South Asian J Bus Stud 11(4):397–417. https://doi.org/10.1108/SAJBS-05-2020-0145

Rahman AU, Wen FH, Amjad F, Ullah R (2023) Exploring the impact of crowdfunding and collaborations on firm survival through crisis management in the context of Pakistan. Technol Anal Strateg Manag. https://doi.org/10.1080/09537325.2023.2299350

Ramos E, Patrucco AS, Chavez M (2021) Dynamic capabilities in the “new normal”: a study of organizational flexibility, integration and agility in the Peruvian coffee supply chain. Supply Chain Manag J. https://doi.org/10.1108/scm-12-2020-0620

Reed JH (2021) Strategic agility and the effects of firm age and environmental turbulence. J Strateg Manag 14(2):129–149. https://doi.org/10.1108/jsma-07-2020-0178

Rigby D, Sutherland J, Takeuchi H (2016) Embracing Agile – How to master the process that’s transforming management. Harvard Bus Rev 50(May):40–48

Rigdon EE, Sarstedt M, Ringle CM (2017) On comparing results from CB-SEM and PLS-SEM: Five perspectives and five recommendations. Marketing: ZFP–J Res Manag 39(3):4–16. https://doi.org/10.15358/0344-1369-2017-3-4

Riquelme-Medina M, Stevenson M, Barrales-Molina M, Llorens-Montes FJ (2022) Coopetition in business Ecosystems: The key role of absorptive capacity and supply chain agility. J Bus Res 146:464–476. https://doi.org/10.1016/j.jbusres.2022.03.071

Roberts N, Grover V (2012) Investigating firm’s customer agility and firm performance: The importance of aligning sensing and responding capabilities. J Bus Res 65(5):579–585. https://doi.org/10.1016/j.jbusres.2011.02.009

Rönkkö M, McIntosh C, Antonakis J, Edwards J (2016) Partial least squares path modeling: Time for some serious second thoughts. J Oper Manag 47–48:9–27. https://doi.org/10.1016/j.jom.2016.05.002

Roth AV (1996) Achieving strategic agility through economies of knowledge. Plan Rev 24(2):30–36. https://doi.org/10.1108/eb054550

Saeed KA, Malhotra MK, Abdinnour S (2019) How supply chain architecture and product architecture impact firm performance: An empirical examination. J Purchasing Supply Manag 25(1):40–52. https://doi.org/10.1016/j.pursup.2018.02.003

Salimi M, Nazarian A (2022) The effect of organisational agility as mediator in the relationship between knowledge management, and competitive advantage and innovation in sport organisations. Int J Knowl Manag Stud 13(3):231–256. https://doi.org/10.1504/ijkms.2022.123712

Sambamurthy V, Bharadwaj A, Grover V (2003) Shaping agility through digital options: Reconceptualizing the role of information technology in contemporary firms. MIS Quart 27(2):237–263. https://doi.org/10.2307/30036530

Sameer SK (2022) The Interplay of digitalization, organizational support, workforce agility and task performance in a blended working environment: evidence from Indian public sector organizations. Asian Bus Manag. https://doi.org/10.1057/s41291-022-00205-2

Sarstedt M, Hair JF, Ringle CM et al (2016) Estimation issues with PLS and CBSEM: Where the bias lies! J Bus Res 69(10):3998–4010. https://doi.org/10.1016/j.jbusres.2016.06.007

Setia P, Patel PC (2013) How information systems help create OM capabilities: Consequents and antecedents of operational absorptive capacity. J Oper Manag 31(6):409–431. https://doi.org/10.1016/j.jom.2013.07.013

Sharif SMF, Yang ND, Rehman AU, Alghamdi O (2022) Sustaining innovation during downsizing strategy through knowledge coupling, business process digitization, and market capitalizing agility. Australian J Manag. https://doi.org/10.1177/03128962221142435

Sharifi H, Zhang Z (1999) A methodology for achieving agility in manufacturing organisations: An introduction. Int J Prod Econ 62(1–2):7–22. https://doi.org/10.1016/s0925-5273(98)00217-5

Sheel A, Nath V (2019) Effect of blockchain technology adoption on supply chain adaptability, agility, alignment and performance. Manag Res Rev 42(12):1353–1374. https://doi.org/10.1108/mrr-12-2018-0490

Sheng H, Feng T, Chen L, Chu D (2021) Operational coordination and mass customization capability: the double-edged sword effect of customer need diversity. Int J Logist Manag 33(1):289–310. https://doi.org/10.1108/IJLM-11-2020-0417

Sherehiy B, Karwowski W, Layer J (2007) A Review of Enterprise Agility: Concepts, Frameworks, and Attributes. Int J Ind Ergon 37:445–460. https://doi.org/10.1016/j.ergon.2007.01.007

Shin H, Lee J-N, Kim D, Rhim H (2015) Strategic agility of Korean small and medium enterprises and its influence on operational and firm performance. Int J Prod Econ 168:181–196. https://doi.org/10.1016/j.ijpe.2015.06.015

Shuradze G, Bogodistov Y, Wagner H-T (2018) The role of marketing-enabled data analytics capability and organisational agility for innovation: empirical evidence from German firms. Int J Innov Manag 22(04):1850037. https://doi.org/10.1142/S1363919618500378

Snyder H (2019) Literature review as a research methodology: An overview and guidelines. J Bus Res 104:333–339. https://doi.org/10.1016/j.jbusres.2019.07.039

Stei G, Rossmann A, Szász L (2024) Leveraging organizational knowledge to develop agility and improve performance: the role of ambidexterity. Int J Oper Prod Manag. https://doi.org/10.1108/ijopm-04-2023-0274

Sturm S, Hohenstein N-O, Birkel H et al (2021) Empirical research on the relationships between demand-and supply-side risk management practices and their impact on business performance. Supply Chain Manag Int L ahead print. https://doi.org/10.1108/SCM-08-2020-0403

Swafford PM, Ghosh S, Murthy NN (2006) A framework for assessing value chain agility. Int J Oper Prod Manag 26(1–2):118–140. https://doi.org/10.1108/01443570610641639

Tallon PP, Pinsonneault A (2011) Competing perspectives on the link between strategic information technology alignment and organisational agility: insights from a mediation model. MIS Quart 35(2):463–486. https://doi.org/10.2307/23044052

Tallon PP, Queiroz M, Coltman T, Sharma R (2019) Information technology and the search for organizational agility: A systematic review with future research possibilities. J Strated Inf Syst 28(2):218–237. https://doi.org/10.1016/j.jsis.2018.12.002

Tallon PP (2008) Inside the adaptive enterprise: an information technology capabilities perspective on business process agility. Inf Technol Manag 9(1):21–36. https://doi.org/10.1007/s10799-007-0024-8

Teece DJ, Pisano G, Shuen A (1997) Dynamic capabilities and strategic management. Strateg Manag J 18(7):509–533. https://doi.org/10.1002/(SICI)1097-0266199708

Teo TS, Pian Y (2003) A contingency perspective on Internet adoption and competitive advantage. Eur J Inf Syst 12(2):78–92. https://doi.org/10.1057/palgrave.ejis.3000448

Tranfield D, Denyer D, Smart P (2003) Towards a methodology for developing evidence-informed management knowledge by means of systematic review. Br J Manag 14(3):207–222. https://doi.org/10.1111/1467-8551.00375

Truscott DM, Swars S, Smith S et al (2010) A cross-disciplinary examination of the prevalence of mixed methods in educational research: 1995–2005. Int J Soc Res Methods 13(4):317–328. https://doi.org/10.1080/13645570903097950

Tse YK, Zhang M, Akhtar P, MacBryde J (2016) Embracing supply chain agility: an investigation in the electronics industry. Supply Chain Manag Int J 21(1):140–156. https://doi.org/10.1108/scm-06-2015-0237

Tsou H-T, Cheng CC (2018) How to enhance IT B2B service innovation? An integrated view of organizational mechanisms. J Bus Ind Mark 33(7):984–1000. https://doi.org/10.1108/JBIM-07-2017-0175

Turi JA, Khwaja MG, Tariq F, Hameed A (2023) The role of big data analytics and organizational agility in improving organizational performance of business processing organizations. Bus Process Manag J 29(7):2081–2106. https://doi.org/10.1108/bpmj-01-2023-0058

Um J (2017) The impact of supply chain agility on business performance in a high level customization environment. Oper Manag Res 10(1–2):10–19. https://doi.org/10.1007/s12063-016-0120-1

Vaculík M, Lorenz A, Roijakkers N, Vanhaverbeke W (2018) Pulling the plug? investigating firm-level drivers of innovation project termination. IEEE Trans Eng Manag 66(2):180–192. https://doi.org/10.1109/TEM.2018.2798922

Vázquez-Bustelo D, Avella L, Fernández E (2007) Agility drivers, enablers and outcomes: empirical test of an integrated agile manufacturing model. Int J Oper Prod Manag 27(12):1303–1332. https://doi.org/10.1108/01443570710835633

Vickery S, Droge C, Setia P, Sambamurthy V (2010) Supply chain information technologies and organisational initiatives: complementary versus independent effects on agility and firm performance. Int J Oper Prod Manag 48(23):7025–7042. https://doi.org/10.1080/00207540903348353

Vokurka RJ, Fliedner G (1998) The journey toward agility. Ind Manag & Data Syst 98(4):165–171. https://doi.org/10.1108/02635579810219336

Vrontis D, Belas J, Thrassou A, Santoro G, Christofi M (2023) Strategic agility, openness and performance: a mixed method comparative analysis of firms operating in developed and emerging markets. Rev Manag Sci 17(4):1365–1398. https://doi.org/10.1007/s11846-022-00562-4

Walter A-T (2021) Organizational agility: ill-defined and somewhat confusing? A systematic literature review and conceptualization. Manag Rev Quart 71(2):343–391. https://doi.org/10.1007/s11301-020-00186-6

Wamba SF, Akter S (2019) Understanding supply chain analytics capabilities and agility for data-rich environments. Int J Oper Prod Manag 39(6/7/8):887–912. https://doi.org/10.1108/IJOPM-01-2019-0025

Wang Y, Ali Z (2021) Exploring big data use to predict supply chain effectiveness in Chinese organizations: a moderated mediated model link. Asia Pac Bus Rev. https://doi.org/10.1080/13602381.2021.1920704

Whitten GD, Green KW, Zelbst PJ (2012) Triple-A supply chain performance. Int J Oper Prod Manag 32(1):28–48. https://doi.org/10.1108/01443571211195727

Wieland A, Wallenburg CM (2012) Dealing with supply chain risks Linking risk management practices and strategies to performance. Int J Phys Distrib 42(10):887–905. https://doi.org/10.1108/09600031211281411

Wooldridge J (2010) Econometric analysis of cross section and panel data. MIT Press, Cambridge

Xiao Y, Watson M (2019) Guidance on conducting a systematic literature review. J Planning Edu Res 39(1):93–112

Yang C, Liu H-M (2012) Boosting firm performance via enterprise agility and network structure. Manag Decis 50(6):1022–1044. https://doi.org/10.1108/00251741211238319

Yang H, Hao YL, Zhao FR (2023) Assessment and analysis of the role of green human resource on agile innovation management in small- and medium-sized enterprises of digital technologies: the case of Asian economies. J Knowl Econ. https://doi.org/10.1007/s13132-023-01454-y

Yaseen SG, El Refae GA, Dajani DM, Ghanem AA (2021) Conflict management styles and innovation performance: the mediating role of organizational agility. Int J Hum Cap Inf 12(4):31–45. https://doi.org/10.4018/ijhcitp.2021100103

Yildiz T, Aykanat Z (2021) The mediating role of organizational innovation on the impact of strategic agility on firm performance. World J Entrepren Manag Sustain Dev. https://doi.org/10.1108/wjemsd-06-2020-0070

Yusuf YY, Sarhadi M, Gunasekaran A (1999) Agile manufacturing: The drivers, concepts and attributes. Int J Prod Econ 62(1–2):33–43. https://doi.org/10.1016/S0925-5273(98)00219-9

Zahoor N, Khan H, Donbesuur F, Khan Z, Rajwani T (2023a) Grand challenges and emerging market small and medium enterprises: The role of strategic agility and gender diversity. J Product Innov Manag. https://doi.org/10.1111/jpim.12661

Zahoor N, Khan H, Shamim S, Puthusserry P (2023b) Examining the microfoundations for digital business model innovation of developing markets international new ventures. Ieee Trans Eng Manag. https://doi.org/10.1109/tem.2023.3273028

Zhang Z, Sharifi H (2000) A methodology for achieving agility in manufacturing organisations. Int J Oper Prod Manag 20(3–4):496–512. https://doi.org/10.1108/01443570010314818

Zhang M, Wang Y, Olya H (2022) Shaping social media analytics in the pursuit of organisational agility: a real options theory perspective. Tourism Manag. https://doi.org/10.1016/j.tourman.2021.104415

Zhang YM, Gu MH, Huo BF (2023) Antecedents and consequences of supply chain agility: a competence-capability-performance paradigm. J Bus Ind Mark 38(5):1087–1100. https://doi.org/10.1108/jbim-05-2021-0262

Zheng H, Dai J, Li BY, Shou YY (2023) The impact of anticipation of new technologies on operational and environmental performance: a strategy-structure-capabilities-performance perspective. Int J Logist-Res Appl. https://doi.org/10.1080/13675567.2023.2260312

Zhou HD, Wang Q, Li LX, Teo TSH, Yang SL (2023) Supply chain digitalization and performance improvement: a moderated mediation model. Supply Chain Manag Int J 28(6):993–1008. https://doi.org/10.1108/scm-11-2022-0434

Zhou J, Mavondo FT, Saunders SG (2019) The relationship between marketing agility and financial performance under different levels of market turbulence. Ind Mark Manag 83:31–41. https://doi.org/10.1016/j.indmarman.2018.11.008

Zhu M, Gao H (2021) The antecedents of supply chain agility and their effect on business performance: an organizational strategy perspective. Oper Manag Res 14(1):166–176. https://doi.org/10.1007/s12063-020-00174-9

Download references

Open Access funding enabled and organized by CAUL and its Member Institutions. This research is funded by the University of Economics and Law, Vietnam National University Ho Chi Minh City/ VNU-HCM under the grant C2021-34-01.

Author information

Authors and affiliations.

Curtin Singapore - an offshore campus of Curtin University, Australia https://curtin.edu.sg/

Tien Nguyen

University of Economics and Law, Ho Chi Minh City, Vietnam

Cat Vi Le, Minh Nguyen, Gam Nguyen, Tran Thi Hong Lien & Oanh Nguyen

Vietnam National University, Ho Chi Minh City, Vietnam

Corresponding author

Correspondence to Tran Thi Hong Lien.

Ethics declarations

Conflict of interest.

The authors declare that they have no conflict of interest in completing this research.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Nguyen, T., Le, C.V., Nguyen, M. et al. The organisational impact of agility: a systematic literature review. Manag Rev Q (2024). https://doi.org/10.1007/s11301-024-00446-9

Received: 02 January 2023

Accepted: 21 May 2024

Published: 21 June 2024

DOI: https://doi.org/10.1007/s11301-024-00446-9

Keywords

  • Organisational agility
  • Organisational outcomes
  • Systematic literature review
  • Strategic agility
  • Supply chain agility
  • Manufacturing agility

  • Open access
  • Published: 15 June 2024

Title and abstract screening for literature reviews using large language models: an exploratory study in the biomedical domain

  • Fabio Dennstädt (ORCID: orcid.org/0000-0002-5374-8720),
  • Johannes Zink,
  • Paul Martin Putora,
  • Janna Hastings &
  • Nikola Cihoric

Systematic Reviews volume 13, Article number: 158 (2024)

Background

Systematically screening published literature to determine the relevant publications to synthesize in a review is a time-consuming and difficult task. Large language models (LLMs) are an emerging technology with promising capabilities for the automation of language-related tasks that may be useful for such a purpose.

Methods

LLMs were used as part of an automated system to evaluate the relevance of publications to a certain topic based on defined criteria and on the title and abstract of each publication. A Python script was created to generate structured prompts consisting of text strings for instruction, title, abstract, and relevant criteria to be provided to an LLM. The relevance of a publication was evaluated by the LLM on a Likert scale (low relevance to high relevance). By specifying a threshold, different classifiers for inclusion/exclusion of publications could then be defined. The approach was used with four different openly available LLMs on ten published data sets of biomedical literature reviews and on a newly human-created data set for a hypothetical new systematic literature review.

Results

The performance of the classifiers varied depending on the LLM being used and on the data set analyzed. Regarding sensitivity/specificity, the classifiers yielded 94.48%/31.78% for the FlanT5 model, 97.58%/19.12% for the OpenHermes-NeuralChat model, 81.93%/75.19% for the Mixtral model and 97.58%/38.34% for the Platypus 2 model on the ten published data sets. The same classifiers yielded 100% sensitivity at a specificity of 12.58%, 4.54%, 62.47%, and 24.74%, respectively, on the newly created data set. Changing the standard settings of the approach (minor adaptation of the instruction prompt and/or changing the range of the Likert scale from 1–5 to 1–10) had a considerable impact on the performance.

Conclusions

LLMs can be used to evaluate the relevance of scientific publications to a certain review topic and classifiers based on such an approach show some promising results. To date, little is known about how well such systems would perform if used prospectively when conducting systematic literature reviews and what further implications this might have. However, it is likely that in the future researchers will increasingly use LLMs for evaluating and classifying scientific publications.

Systematic literature reviews (SLRs) summarize knowledge about a specific topic and are an essential ingredient for evidence-based medicine. Performing an SLR involves a lot of effort, as it requires researchers to identify, filter, and analyze substantial quantities of literature. Typically, the most relevant out of thousands of publications need to be identified for the topic and key information needs to be extracted for the synthesis. Some estimates indicate that systematic reviews typically take several months to complete [1, 2], which is why the latest evidence may not always be taken into consideration.

Title and abstract screening forms a considerable part of the systematic reviewing workload. In this step, which typically follows defining a search strategy and precedes the full-text screening of a smaller number of search results, researchers determine whether a certain publication is relevant for inclusion in the systematic review based on title and abstract. Automating title and abstract screening has the potential to save time and thereby accelerate the translation of evidence into practice. It may also make the reviewing methodology more consistent and reproducible. Thus, the automation or semi-automation of this part of the reviewing workflow has been of longstanding interest [3, 4, 5].

Several approaches have been developed that use machine learning (ML) to automate or semi-automate screening [1, 6]. For example, systematic review software applications such as Covidence [7] and EPPI-Reviewer [8] (which use the same algorithm) offer ML-assisted ranking algorithms that aim to show the publications most relevant to the search criteria earlier in the review order, to speed up the manual review process. Elicit [9] is a standalone literature discovery tool that also offers an ML-assisted literature search facility. Furthermore, several dedicated tools have been developed to specifically automate title and abstract screening [1, 10]. Examples include Rayyan [11], DistillerSR [12], Abstrackr [13], RobotAnalyst [14], and ASReview [5]. These tools typically work via different technical strategies drawn from ML and topic modeling to enable the system to learn how similar new articles are to a core set of identified ‘good’ results for the topic. These approaches have been found to lead to a considerable reduction in the time taken to complete systematic reviews [15].

Most of these systems require some sort of pre-selection or specific training for the larger corpus of publications to be analyzed (e.g., identification of some “relevant” publications by a human so that the algorithm can select similar papers) and are thus not fully automated.

Furthermore, dedicated models are required that are built for the specific purpose together with appropriate training data. Fully automated systems that achieve high levels of performance and can be flexibly applied to various topics have not yet been realized.

Large language models (LLMs) are an approach to natural language processing in which very large-scale neural networks are trained on vast amounts of textual data to generate sequences of words in response to input text. These capable models are then subjected to different strategies for additional training to improve their performance on a wide range of tasks. Recent technological advancements in model size, architecture, and training strategies have led to general-purpose dialog LLMs achieving and exceeding state-of-the-art performance on many benchmark tasks including medical question answering [16] and text summarization [17].

Recent progress in the development of LLMs has led to very capable models. While models developed by private companies such as GPT-3/GPT-3.5/GPT-4 from OpenAI [18] or PaLM and Gemini from Google [19, 20] are among the most powerful LLMs currently available, openly available models are actively being developed by different stakeholders and in some cases achieve performance not far from the state of the art [21].

LLMs have shown remarkable capabilities in a variety of subjects and tasks that would require a profound understanding of text and knowledge for a human to perform. Among others, LLMs can be used for classification [22], information extraction [23], and knowledge access [24]. Furthermore, they can be flexibly adapted via prompt engineering techniques [25] and parameter settings to behave in a desired way. At the same time, considerable problems with the usage of LLMs such as “hallucinations” of models [26], inherent biases [27, 28], and weak alignment with human evaluation [29] have been described. Therefore, even though the text output generated by LLMs is based on objective statistical calculations, the text output itself is not necessarily factual and correct and furthermore incorporates subjectivity based on the training data. This implies that an LLM-based evaluation system has some fundamental limitations a priori. However, using LLMs for evaluating scientific publications is a novel and interesting approach that may be helpful in creating fully automated and still flexible systems for screening and evaluating scientific literature.

To investigate whether and how well openly available LLMs can be used for evaluating the relevance of publications as part of an automated title and abstract screening system, we conducted a study to evaluate the performance of such an approach in the biomedical domain with modern openly available LLMs.

Using LLMs for title and abstract screening

We designed an approach for evaluating the relevance of publications based on title and abstract using an LLM. This approach is based on the following strategy:

An instruction prompt to evaluate the relevance of a scientific publication for inclusion into an SLR is given to an LLM.

The prompt includes the title and abstract of the publication and the criteria that are considered relevant.

The prompt furthermore includes the request to return just a number as an answer, which corresponds to the relevance of the publication on a Likert scale (“not relevant” to “highly relevant”).

The prompt for each publication is created in a structured and automated way.

A numeric threshold may be defined which separates relevant publications from irrelevant publications (corresponding to the definition of a classifier).

The prompts are created in the following way:

Prompt = [Instruction] + [Title of publication] + [Abstract of publication] + [Relevant Criteria].

(“ + ” is not part of the final prompt but indicates the merge of the text strings).

[Instruction] is the text string describing the general instruction for the LLM to evaluate the publication. The LLM is asked to evaluate the relevance of a publication for an SLR on a numeric scale (low relevance to high relevance) based on the title and abstract of the publication and based on defined relevant criteria.

[Title of publication] is the text string “Title:” together with the title of the publication.

[Abstract of publication] is the text string “, Abstract:” together with the abstract of the publication.

[Relevant Criteria] is the text that describes the criteria to evaluate the relevance of a publication. The relevant criteria are defined beforehand by the researchers depending on the topic to determine which publications are relevant. The [Relevant Criteria] text string remains unchanged for all the publications that should be checked for relevance.
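The prompt assembly described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' script: the function name `build_prompt`, the single-space separators, and the "Relevant criteria:" label in front of the criteria text are assumptions, since the source only specifies the order of the four strings and the "Title:" and ", Abstract:" prefixes.

```python
def build_prompt(instruction: str, title: str, abstract: str, criteria: str) -> str:
    """Merge the four text strings in the order described in the text."""
    title_part = f"Title: {title}"            # "Title:" prefix as described
    abstract_part = f", Abstract: {abstract}"  # ", Abstract:" prefix as described
    # Assumption: the criteria are introduced by a short label; the source
    # does not state how the criteria string is introduced in the prompt.
    criteria_part = f" Relevant criteria: {criteria}"
    return f"{instruction} {title_part}{abstract_part}{criteria_part}"


# The criteria string stays the same for every publication in the corpus;
# only the title and abstract change from one request to the next.
example = build_prompt(
    "Rate the relevance on a scale from 1 to 5.",
    "An example title",
    "An example abstract.",
    "Only original articles.",
)
```

Keeping the assembly in one function makes it easy to vary the instruction or the scale while leaving the rest of the prompt untouched.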

The answer of the LLM usually consists of just a digit on a numeric scale (e.g., 1–5). However, variations are acceptable if the answer can unambiguously be assigned to one of the possible scores on the Likert scale (e.g., the answer “The relevance of the publication is 3.” can unambiguously be assigned to the score 3). This assignment of answers to a score can be automated with a string-search command, that is, a simple regular expression that searches for a positive integer and extracts it from the text string.
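One possible way to implement this extraction step is shown below. It is a sketch under stated assumptions rather than the regular expression actually used in the study: the function name `extract_score` is hypothetical, and the rule of rejecting answers that contain more than one distinct integer (or an integer outside the scale) is a conservative interpretation of "unambiguously".

```python
import re
from typing import Optional


def extract_score(answer: str, scale_max: int = 5) -> Optional[int]:
    """Return the Likert score contained in an LLM answer, or None if the
    answer cannot be assigned unambiguously to one score."""
    # Collect every distinct non-negative integer that appears in the answer.
    numbers = {int(match) for match in re.findall(r"\d+", answer)}
    # Assumption: exactly one distinct integer, lying on the scale, is required.
    if len(numbers) == 1:
        (value,) = numbers
        if 1 <= value <= scale_max:
            return value
    return None
```

Under these assumptions, "3" and "The relevance of the publication is 3." both map to the score 3, whereas an answer without a number, with a number outside the scale, or with several different numbers is treated as invalid.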

A request is sent to the LLM for each publication in the corpus. In cases for which an LLM provided an invalid (unprocessable) response for a publication, that response was excluded from the direct downstream analysis. It was determined for how many publications invalid responses were given and how many of these publications would have been relevant.
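The bookkeeping for invalid responses could look like the following sketch. The function name `summarize_invalid` and the convention of representing an invalid response as `None` are assumptions made for illustration only.

```python
from typing import Dict, Optional, Sequence


def summarize_invalid(scores: Sequence[Optional[int]],
                      labels: Sequence[bool]) -> Dict[str, int]:
    """Count invalid responses and how many of them belong to relevant publications.

    scores: one entry per publication, None marks an invalid (unprocessable) response.
    labels: ground truth, True if the publication is relevant.
    """
    labels_of_invalid = [label for score, label in zip(scores, labels) if score is None]
    return {
        "n_invalid": len(labels_of_invalid),
        "n_invalid_relevant": sum(labels_of_invalid),  # True counts as 1
    }
```

Publications with a `None` score would then be left out of the downstream analysis, while the two counts are reported separately.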

A schematic illustration of the approach is shown in Fig. 1. An example of a prompt is provided in Supplementary material 1: Appendix 1.

Fig. 1 Schematic illustration of the LLM-based approach for evaluating the relevance of a scientific publication. In this example, a 1–5 scale and a 3+ classifier are used

A Python script was created to automate the process and to apply it to a data set with a collection of different publications.

With the publications being sorted into different relevance groups, a threshold can be defined, which is used by a classifier to separate relevant from irrelevant publications. For example, a 3+ classifier would classify publications with a score of ≥ 3 as relevant, and publications with a score < 3 as irrelevant.
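A threshold classifier of this kind, together with the sensitivity and specificity used to evaluate it, can be sketched as follows. The function names and the toy data are made up for illustration and do not come from the study; the sketch also assumes that the data contain at least one relevant and one irrelevant publication.

```python
from typing import Sequence, Tuple


def is_relevant(score: int, threshold: int = 3) -> bool:
    """A 'threshold+' classifier: scores at or above the threshold are relevant."""
    return score >= threshold


def sensitivity_specificity(scores: Sequence[int], labels: Sequence[bool],
                            threshold: int = 3) -> Tuple[float, float]:
    """Compare classifier decisions with ground-truth labels (True = relevant)."""
    tp = fn = tn = fp = 0
    for score, relevant in zip(scores, labels):
        predicted = is_relevant(score, threshold)
        if relevant and predicted:
            tp += 1
        elif relevant:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    # Assumes both classes are present, so neither denominator is zero.
    return tp / (tp + fn), tn / (tn + fp)


# Made-up toy data: six publications, three of them truly relevant.
toy_scores = [5, 4, 2, 1, 3, 2]
toy_labels = [True, True, True, False, False, False]
```

On the toy data, lowering the threshold from 3 to 2 raises sensitivity at the cost of specificity, which is the trade-off that the choice of threshold controls.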

The performance of the approach was tested with different LLMs, data sets and settings as described in the following:

Language models

A variety of different models were tested. To investigate the approach with different LLMs (that are also diverse regarding design and training data), the following four models were used in the experiments:

FlanT5-XXL (FlanT5) is an LLM developed by Google Research. It is a variant of the T5 (text-to-text) model, which uses a unified text-to-text framework that allows it to perform a wide range of NLP tasks with the same model architecture, loss function, and hyperparameters. FlanT5 was enhanced through fine-tuning on over a thousand additional tasks and supports more languages. It is primarily used for research in various areas of natural language processing, such as reasoning and question-answering [30, 31].

OpenHermes-2.5-neural-chat-7b-v3-1-7B (OHNC) [32] is a powerful open-source LLM, which was merged from the two models OpenHermes 2.5 Mistral 7B [33] and Neural-Chat (neural-chat-7b-v3-1) [34]. Despite having only 7 billion parameters, it performs better than some larger models on various benchmarks.

Mixtral-8x7B-Instruct v0.1 (Mixtral) is a pretrained generative Sparse Mixture of Experts LLM developed by Mistral AI [35, 36]. It was reported to outperform powerful models like gpt-3.5-turbo, Claude-2.1, Gemini Pro, and Llama 2 70B-chat on human benchmarks.

Platypus2-70B-Instruct (Platypus 2) is a powerful language model with 70 billion parameters [37]. The model itself is a merge of the models Platypus2-70B and SOLAR-0-70b-16bit (previously published as LLaMa-2-70b-instruct-v2) [38].

Published data sets

A list of several data sets for SLRs is provided to the public by the research group of the ASReview tool [39]. The list contains data sets on a variety of different biomedical subjects of previously published SLRs. For testing the LLM approach on an individual data set, the [Relevant Criteria] string for each data set was created based on the description in the publication of the corresponding SLR. We tested the approach on a total of ten published data sets covering different biomedical topics (Table 1, Supplementary material 2: Appendix 2).

Newly created data set on CDSS in radiation oncology

To test the approach also in a prospective setting on a not previously published review, we created a data set for a new, hypothetical SLR, for which title and abstract screening should be performed.

The use case was an SLR on “Clinical Decision Support System (CDSS) tools for physicians in radiation oncology”. A CDSS is an information technology system developed to support clinical decision-making. This general definition may include diagnostic tools, knowledge bases, prognostic models, or patient decision aids [50]. We decided that the hypothetical SLR should be only about software-based systems to be used by clinicians for decision-making purposes in radiation oncology. We defined the following criteria for the [Relevant Criteria] text of the provided prompt:

Only inclusion of original articles, exclusion of review articles.

Publications examining one or several clinical decision-support systems relevant to radiation therapy.

Decision-support systems are software-based.

Exclusion of systems intended for support of non-clinicians (e.g., patient decision aids).

Publications about models (e.g., prognostic models) should only be included if the model is intended to support clinical decision-making as part of a software application, which may resemble a clinical decision support system.

The following query was used for searching relevant publications on PubMed: “(clinical decision support system) AND (radiotherapy OR radiation therapy)”.

Titles and abstracts of all publications found with the query were collected. A human-based title and abstract screening was performed to obtain the ground truth data set. Two researchers (FD and NC) independently labeled the publications as relevant/not relevant based on the title and abstract and on the [Relevant Criteria] string. The task was to label as relevant those publications that may be of interest and should be analyzed as full text, while all other publications should be labeled irrelevant. After labeling all publications, some of the publications were deemed relevant by only one of the two researchers. To obtain a final decision, a third researcher (PMP) independently labeled these undecided cases.
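The labeling rule described above (two independent labels, with a third reviewer deciding the cases in which they disagree) can be written down compactly. The following is an illustrative sketch; the function name `final_label` and the use of an optional third argument are assumptions, not part of the published workflow.

```python
from typing import Optional


def final_label(reviewer_a: bool, reviewer_b: bool,
                third_reviewer: Optional[bool] = None) -> bool:
    """Return the ground-truth label (True = relevant) for one publication."""
    if reviewer_a == reviewer_b:
        # Both reviewers agree, so no further decision is needed.
        return reviewer_a
    if third_reviewer is None:
        raise ValueError("Reviewers disagree; a third reviewer's label is required.")
    # The third reviewer's independent label settles the undecided case.
    return third_reviewer
```

Raising an error for unresolved disagreements, rather than choosing a default, keeps undecided publications from silently entering the ground truth.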

The aim was to create a human-based data set purely representing the process of title and abstract screening without further information or analysis.

A manual title and abstract screening was conducted on the 521 publications identified in the search, of which 36 were identified as relevant and labeled accordingly in the data set. This data set was named “CDSS_RO”. It should be noted that this data set is qualitatively different from the 10 published data sets: not only the publications that may finally be included in an SLR are labeled as relevant, but all publications that should be analyzed in full text based on title and abstract. The file is provided at https://github.com/med-data-tools/title-abstract-screening-ai .

Parameters and settings of LLM-based title and abstract screening

Standard parameters.

The LLM-based title and abstract screening as described above requires the definition of some parameters. The standard settings for the approach were the following:

[Instruction] string: We used the following standard [Instruction] string:

“On a scale from 1 (very low probability) to X (very high probability), how would you rate the relevance of the following scientific publication to be included in a systematic literature review based on the relevant criteria and based on title and abstract?”

Range of scale: defines the range of the Likert scale mentioned in the [Instruction] string (marked as X in the standard string above). For the standard settings, a value of 5 was used.
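How the [Instruction] string and the range of scale fit together can be illustrated with a short Python sketch. The instruction text is the standard string quoted above; the function name `build_prompt` and the layout of the remaining prompt (order and labels of criteria, title, and abstract) are assumptions made for illustration and may differ from the published script.

```python
INSTRUCTION_TEMPLATE = (
    "On a scale from 1 (very low probability) to {scale_max} (very high "
    "probability), how would you rate the relevance of the following "
    "scientific publication to be included in a systematic literature "
    "review based on the relevant criteria and based on title and abstract?"
)


def build_prompt(title: str, abstract: str, criteria: str, scale_max: int = 5) -> str:
    """Assemble one screening prompt; the section layout is an assumption."""
    if scale_max < 2:
        raise ValueError("scale_max must be at least 2")
    # Insert the upper end of the Likert scale (the "X" of the standard string).
    instruction = INSTRUCTION_TEMPLATE.format(scale_max=scale_max)
    return (
        f"{instruction}\n\n"
        f"Relevant criteria: {criteria}\n\n"
        f"Title: {title}\n\n"
        f"Abstract: {abstract}"
    )


# Hypothetical publication, for illustration only.
prompt = build_prompt(
    title="A decision-support tool for dose planning",
    abstract="We describe a software system that supports clinicians ...",
    criteria="Decision-support systems are software-based.",
)
print(prompt)
```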

Model parameters of the LLMs were defined in the source code. To obtain reproducible results, the model parameters were set so that the models behave deterministically. For example, the temperature value is a parameter that defines how much variation the response of a model should have. Values greater than 0 add a random element to the output, which should be avoided for the reproducibility of the LLM-based title and abstract screening.

Adaptation of instruction prompt and range

The behavior of an LLM is highly dependent on the provided prompt. Adequate adaptation of the prompt may be used to improve the performance of an LLM for certain tasks [ 25 ]. To investigate what impact a slightly adapted version of the Instruction prompt would have on the results, we added the string “(Note: Give a low score if not all criteria are fulfilled. Give only a high score if all or almost all criteria are fulfilled.)” in the instruction prompt as additional instruction and examined the impact on the performance. Furthermore, the range of the scale was changed from 1–5 to 1–10 in some experiments to investigate what impact this would have on the performance.

Statistical analyses

The performance of the approach, depending on models and threshold, was determined by calculating the sensitivity (= recall), specificity, accuracy, precision, and F1-score of the system, based on the amount of correctly and incorrectly included/excluded publications for each data set.
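For reference, these measures follow directly from the counts of true/false positives and negatives. The sketch below uses the standard definitions; the function name `screening_metrics` is an assumption, and the example counts are those reported in the results of this article for the Mixtral model on the Wolters_2018 data set (all 19 relevant publications included, 4410 of 5019 irrelevant publications excluded).

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard screening measures from confusion-matrix counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising if a measure is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # = recall
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# 19 relevant publications, all included; 5019 irrelevant, 4410 excluded.
metrics = screening_metrics(tp=19, fp=5019 - 4410, tn=4410, fn=0)
print(f"sensitivity = {metrics['sensitivity']:.2%}")
print(f"specificity = {metrics['specificity']:.2%}")
```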

Comparison with the automated classifier of Natukunda et al.

The LLM-based title and abstract screening was compared to another, recently published approach for fully automated title and abstract screening. This approach, developed by Natukunda et al., uses an unsupervised Latent Dirichlet Allocation-based topic model for screening [ 51 ]. Unlike the LLM-based approach, it does not require an additional [Relevant Criteria] string, but instead relies on defined search keywords to determine which publications are relevant. The approach was used to do a screening on the ten published data sets as well as on the CDSS_RO data set. To obtain the required keywords, we processed the text of the used search terms by splitting combined text into individual words and removing stop words, duplicates, and punctuation (as described in the original publication of Natukunda et al.).
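The keyword preparation can be sketched as follows, using the PubMed query of the CDSS_RO use case as input. This is a simplified illustration: the stop-word list, the lowercasing step, and the function name `extract_keywords` are assumptions, and the original publication of Natukunda et al. may use a different list and additional steps.

```python
import string
from typing import List

# Assumed minimal stop-word list (including Boolean operators of the query).
STOP_WORDS = {"and", "or", "not", "the", "of", "in", "for", "a", "an", "to", "with"}


def extract_keywords(search_query: str) -> List[str]:
    """Split a search query into unique keywords without stop words."""
    # Replace every punctuation character by a space, then split into words.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = search_query.lower().translate(table).split()
    keywords: List[str] = []
    for word in words:
        # Drop stop words and duplicates while keeping the original order.
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return keywords


query = "(clinical decision support system) AND (radiotherapy OR radiation therapy)"
print(extract_keywords(query))
# ['clinical', 'decision', 'support', 'system', 'radiotherapy', 'radiation', 'therapy']
```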

Performance of LLM-based title and abstract screening of different models on published data sets

The LLM-based screening with a Likert scale of 1–5 provided clear results for evaluating the relevance of a publication in the majority of cases. Out of the total of 44,055 publications among the 10 published data sets, valid and unambiguously assignable answers were given for 44,055 publications (100%) by the FlanT5 model, for 44,052 publications (99.993%) by the OHNC model, for 44,026 publications (99.93%) by the Mixtral model and for 44,054 publications (99.998%) by the Platypus 2 model. The few publications for which an invalid answer was given were excluded from further analysis. None of the excluded publications was relevant. The distribution of scores given was different between the different models. For example, the OHNC model ranked the majority of publications with a score of 3 (47.2%) or 4 (34.2%), while the FlanT5 model ranked almost all publications with a score of either 4 (68.1%) or 2 (31.7%). For all models, the group of publications labeled as relevant in the data sets was ranked with higher scores compared to the overall group of publications (mean score of 3.89 compared to 3.38 for FlanT5, 3.86 compared to 3.14 for OHNC, 4.16 compared to 2.12 for Mixtral and 3.80 compared to 2.92 for Platypus 2). An overview is provided in Fig.  2 .
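What counts as a “valid and unambiguously assignable” answer has to be decided when post-processing the model output. The sketch below shows one deliberately strict rule, which is an assumption and not necessarily the rule used in the study: an answer is accepted only if it contains exactly one distinct integer and that integer lies within the range of the scale. The function name `parse_score` is hypothetical.

```python
import re
from typing import Optional


def parse_score(answer: str, scale_max: int = 5) -> Optional[int]:
    """Return the score if the answer is unambiguous, otherwise None."""
    # Collect all distinct integers that appear in the model answer.
    numbers = {int(token) for token in re.findall(r"\d+", answer)}
    if len(numbers) != 1:
        return None  # no number at all, or several different numbers
    score = numbers.pop()
    return score if 1 <= score <= scale_max else None


for answer in ["4", "Score: 3.", "3 or 4", "7", "I cannot rate this."]:
    print(repr(answer), "->", parse_score(answer))
```

With this rule, answers such as “3 or 4” are treated as invalid, in the same way as answers without any score.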

Fig. 2. Distribution of scores given by the different models

Based on the scores given, classifiers can be defined that label publications with a score greater than or equal to a threshold “X” as relevant. With decreasing threshold (decreasing “X”), such classifiers have higher sensitivity and lower specificity.
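The effect of the threshold can be made concrete with a small sketch. The scores and labels below are hypothetical toy data, not results of the study, and the function name `sensitivity_specificity` is an assumption; the code assumes that both relevant and irrelevant publications are present.

```python
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[int], relevant: List[bool], threshold: int
) -> Tuple[float, float]:
    """Evaluate the classifier "score >= threshold means relevant"."""
    tp = sum(1 for s, r in zip(scores, relevant) if r and s >= threshold)
    fn = sum(1 for s, r in zip(scores, relevant) if r and s < threshold)
    tn = sum(1 for s, r in zip(scores, relevant) if not r and s < threshold)
    fp = sum(1 for s, r in zip(scores, relevant) if not r and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: Likert scores and ground-truth labels.
scores = [5, 4, 3, 2, 4, 3, 2, 1, 1, 2]
relevant = [True, True, True, True, False, False, False, False, False, False]

# Lowering the threshold from 5 to 1 trades specificity for sensitivity.
for threshold in range(5, 0, -1):
    sens, spec = sensitivity_specificity(scores, relevant, threshold)
    print(f"{threshold}+ classifier: sensitivity={sens:.2f}, specificity={spec:.2f}")
```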

Classifiers with a threshold of ≥ 3 (3 + classifiers) were further analyzed, as these classifiers were considered to correctly identify the vast majority of relevant publications (high sensitivity) without including too many irrelevant publications (sufficient specificity). The 3 + classifiers had a sensitivity/specificity of 94.8%/31.8% for the FlanT5 model, of 97.6%/19.1% for the OHNC model, of 81.9%/75.2% for the Mixtral model, and of 97.2%/38.3% for the Platypus 2 model on all ten published data sets. The performance of the classifiers was quite different depending on the data set used (Fig.  3 ). Detailed results on the individual data sets are presented in Supplementary material 3: Appendix 3.

Fig. 3. Sensitivity and specificity of the 3 + classifiers on different data sets using different models. Each data point represents the results of one of the data sets

The highest specificity at 100% sensitivity was seen for the Mixtral model on the data set Wolters_2018 with all 19 relevant publications being scored with 3–5, while 4410 of 5019 irrelevant publications were scored with 1 or 2 (specificity of 87.87%). The lowest sensitivity was observed with the Mixtral model on the dataset Jeyaraman_2021 with 23.96% sensitivity at 94.63% specificity.

Using LLM-based title and abstract screening for a new systematic literature review

On the newly created manually labeled data set, the 3 + classifiers had 100% sensitivity for all four models with specificity ranging from 4.54 to 62.47%. The results of the LLM-based title and abstract screening, dependent on the threshold for the classifiers are presented as receiver operating characteristics (ROC) curves in Fig.  4 as well as in Supplementary material 3: Appendix 3.

Fig. 4. Receiver operating characteristics (ROC) curves of the LLM-based title and abstract screening for the different models on the CDSS_RO data set
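An ROC curve for Likert scores consists of one point per possible threshold. The following sketch computes these points and the area under the curve with the trapezoidal rule. The data are hypothetical toy values, not the CDSS_RO results, and the function names `roc_points` and `area_under_curve` are assumptions.

```python
from typing import List, Tuple


def roc_points(
    scores: List[int], relevant: List[bool], scale_max: int = 5
) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) for each threshold."""
    positives = sum(relevant)
    negatives = len(relevant) - positives
    points = []
    # Threshold scale_max + 1 includes nothing; threshold 1 includes everything.
    for threshold in range(scale_max + 1, 0, -1):
        tp = sum(1 for s, r in zip(scores, relevant) if r and s >= threshold)
        fp = sum(1 for s, r in zip(scores, relevant) if not r and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points: List[Tuple[float, float]]) -> float:
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: Likert scores and ground-truth labels.
toy_scores = [5, 4, 3, 2, 4, 3, 2, 1, 1, 2]
toy_relevant = [True, True, True, True, False, False, False, False, False, False]

points = roc_points(toy_scores, toy_relevant)
print(points)
print(f"AUC = {area_under_curve(points):.3f}")
```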

Dependence of LLM-based title and abstract screening on the Instruction prompt and on the range of scale

Several runs of the Python script with different settings (adapted [Instruction] string and/or range of scale 1–10 instead of 1–5) were performed, which led to different results. Minor adaptation of the Instruction string with an additional demand to focus on the mentioned criteria had a different impact on the performance of the classifiers depending on the LLM used. While the sensitivity of the 3 + classifiers remained at 100% for all four models, the specificity was lower for the OHNC model (2.89% vs. 4.54%), the Mixtral model (56.29% vs. 62.47%) and the Platypus 2 model (15.88% vs. 24.74%), while it was higher for the FlanT5 model (25.15% vs. 12.58%).

Changing the range of scale from 1–5 to 1–10 and using a 6 + classifier instead of a 3 + classifier led to a lower sensitivity for the OHNC model (97.22% vs. 100%), while increasing the specificity (13.49% vs. 4.54%). For the other models, the sensitivity remained at 100% with higher specificity for the Platypus 2 model (51.34% vs. 24.74%) and the FlanT5 model (50.52% vs. 12.58%). The specificity was unchanged for the Mixtral model at 62.47%, which was the highest value among all combinations at 100% sensitivity. No combination of range of scale and prompt adaptation was superior across all models. An overview of the results is provided in Fig.  5 .

Fig. 5. Performance of the classifiers depending on adaptation of the prompt and on the range of scale

Comparison with unsupervised title and abstract screening of Natukunda et al.

The screening approach developed by Natukunda et al. achieved an overall sensitivity of 52.75% at 56.39% specificity on the ten published data sets. As for the LLM-based screening, the performance of this approach was dependent on the data set analyzed. The lowest sensitivity was observed for the Jeyaraman_2021 data set (1.04%), while the highest sensitivity was observed for the Wolters_2018 dataset (100%). Compared to the 3 + classifier with the Mixtral model, the LLM-based approach had higher sensitivity on 9 data sets and equal sensitivity on 1 data set, while it had higher specificity on 6 data sets and lower specificity on 4 data sets.

On the CDSS_RO data set, the approach of Natukunda et al. achieved 94.44% sensitivity (lower than all four LLMs) at 39.59% specificity (lower than the Mixtral model and higher than the FlanT5, OHNC, and Platypus 2 models). Further data on the comparison is provided in Supplementary material 4: Appendix 4.

We developed and elaborated a flexible approach to use LLMs for automated title and abstract screening that has shown some promising results on a variety of biomedical topics. Such an approach could potentially be used to automatically pre-screen the relevance of publications based on title and abstract. While the results are far from perfect, using LLMs for evaluating the relevance of publications could potentially be helpful (e.g., as a pre-processing step) when performing an SLR. Furthermore, the approach is widely applicable without the development of custom tools or training custom models.

Automated and semi-automated screening

A variety of different ML and AI tools have been developed to assist researchers in performing SLRs [ 5 , 10 , 52 , 53 ]. Fully automated systems (like the LLM-based approach presented in our study) still fail to differentiate relevant from irrelevant publications near the level of human evaluation [ 51 , 54 ].

A well-functioning fully automated title and abstract screening system that could be used on different subjects in the biomedical domain and possibly also in other scientific areas would be very valuable. While human-based screening is the current gold standard, it has considerable drawbacks. From a methodological point of view, one major problem of human-based literature evaluation, including title and abstract screening, is the subjectivity of the process [ 55 ]. Evaluating the publications (based on title and abstract) is dependent on the experience and individual judgments of the person doing the screening. To overcome this issue, SLRs of high quality require multiple independent researchers to do the evaluation with specific criteria upon inclusion/exclusion defined beforehand [ 56 ]. Nevertheless, subjectivity remains an unresolved issue, which also limits the reproducibility of results. From a practical point of view, another major problem is the considerable workload needed to be performed by humans, especially if thousands of publications need to be assessed, which is multiplied by the need to have multiple reviewers and to discuss disagreements. The challenge of workload is not just a matter of inconvenience, as SLRs on subjects that require tens of thousands of publications to be searched, may just not be feasible for small research teams to do, or may already be outdated after the time it would take to do the screening and analyze the results.

While fully automated screening approaches may also be affected by subjectivity (since the training data of models is itself generated by processes which are affected by subjectivity), the results would at least be more reproducible, and automation can be applied at scale in order to overcome the problem of practicability.

While current fully automated systems cannot replace humans in title and abstract screening, they may nevertheless be helpful. Such systems are already being used in systematic reviews and most likely their usage will continue to grow [ 57 ].

Ideally, a fully automated system should not miss a single relevant publication (100% sensitivity) while including as few irrelevant publications as possible. This would allow confident exclusion of some of the retrieved search results, which would considerably reduce the time needed for manual screening.

LLMs for title and abstract screening

By creating structured prompts with clear instructions, an LLM can feasibly be used for evaluating the relevance of a scientific publication. In comparison to some other solutions, the LLM-based screening may have some advantages. On the one hand, the flexible nature of the approach allows adaptation to a specific subject. Depending on the question, different prompts for relevant criteria and instructions can be used to address the individual research question. On the other hand, the approach can create reproducible results, given a fixed model, parameters, prompting strategy, and defined threshold. At the same time, it is scalable to process large numbers of publications. As we have seen, such an approach is feasible, with a performance similar to or even better than that of other current solutions such as the approach of Natukunda et al. However, it should be noted that the performance varied considerably depending on which of the 10 + 1 data sets was used.

Further applications of LLMs in literature analysis

While we investigated LLMs for evaluating the relevance of publications and in particular for title and abstract screening, it is being discussed how these models may be used for a variety of tasks in literature analysis [ 58 , 59 ]. For example, Wang et al. obtained promising results when investigating if ChatGPT may be used for writing Boolean Queries for SLRs [ 60 ]. Aydin et al., also using ChatGPT, employed the LLM to write an entire Literature Review about Digital Twins in Healthcare [ 61 ].

Guo et al. recently performed a study using the OpenAI API with gpt-3.5 and gpt-4 to create a classifier for clinical reviews [ 62 ]. They observed promising results when comparing the performance of the classifier against human-based screening with a sensitivity of 76% at 91% specificity on six different review papers. In contrast to our approach, they used a Boolean classifier instead of a Likert scale. Another approach was developed by Akinseloyin et al., who used ChatGPT to create a method for citation screening by ranking the relevance of publications using a question-answering framework [ 63 ].

The question may arise as to why a Likert scale is used instead of a direct binary classifier (especially since some models only rarely use some of the score values; see e.g., FlanT5 in Fig.  2 ). The rationale for using the Likert scale arose out of some preliminary, unsystematic explorations we conducted using different models and ranges of scale (including binary). We realized that using a Likert scale has some advantages, as it sorts the publications into several groups depending on the estimated relevance. This also allows flexible adjustment of the threshold, which may be useful if the user wants to prioritize either sensitivity or specificity.

However, there seem to be several feasible approaches and frameworks to use LLMs for the screening of publications.

It should be noted that an LLM-based approach for evaluating the relevance of publications might just as well be used for a variety of different classification tasks in literature analysis. For example, one may adapt the [Instruction] string to ask the LLM not to evaluate the relevance of a publication on a Likert scale, but to classify it into one of several groups like “original article”, “trial”, “letter to the editor”, etc. From this point of view, the title and abstract screening is just a special use case of LLM-based classification.
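Such a classification variant could be sketched as follows. This is a hypothetical illustration that was not evaluated in the study: the wording of the instruction, the prompt layout, and the function names `build_classification_prompt` and `parse_category` are all assumptions. The category names are the examples mentioned above.

```python
from typing import List, Optional

CATEGORIES = ["original article", "trial", "letter to the editor"]


def build_classification_prompt(
    title: str, abstract: str, categories: List[str] = CATEGORIES
) -> str:
    """Assemble a prompt asking for exactly one category (assumed wording)."""
    options = ", ".join(f'"{category}"' for category in categories)
    return (
        "Which of the following categories best describes the scientific "
        f"publication below? Answer with exactly one of: {options}.\n\n"
        f"Title: {title}\n\n"
        f"Abstract: {abstract}"
    )


def parse_category(answer: str, categories: List[str] = CATEGORIES) -> Optional[str]:
    """Return the category if exactly one is mentioned in the answer."""
    matches = [category for category in categories if category in answer.lower()]
    return matches[0] if len(matches) == 1 else None


print(parse_category("This is a letter to the editor."))
print(parse_category("Could be a trial or an original article."))
```

As with the Likert scores, answers that mention no category or more than one category are treated as invalid here.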

Future developments

The capabilities of LLMs and other AI models will continue to evolve, which will increase the performance of fully automated systems. As we have seen, the results are highly dependent on the LLM used for the approach. In any case, there may still be substantial room for improvement and optimization and it currently is unclear what LLM-based approach with which prompts, models, and settings yields the best results over a large variety of data sets.

Furthermore, LLMs may not only be used for the screening of titles and abstracts but for the analysis of full-text documents. The newest generation of language and multimodal models may process whole articles or potentially also image data from publications [ 64 , 65 ]. Beyond that, LLM-based evaluation of scientific data and publications may only be one of several options for AI assistance in literature analysis. Future systems may combine different ML and AI approaches for optimal automated processing of literature and scientific data.

Limitations of LLM-based title and abstract screening

Even though the LLM-based screening presented in our work shows some promising results, it also has some drawbacks and limitations. While the open framework with adaptable prompts makes the approach flexible, the performance of the approach is highly dependent on the model used, the input parameters/settings, and the data set analyzed. If a slightly different instruction or another scale (1–10 instead of 1–5) is used, this can have a considerable impact on the performance. The classifiers analyzed in our study failed to consistently identify relevant publications at 100% sensitivity without considerably impairing the specificity. In academic research, the bar for automated screening tools needs to be very high, as ideally not a single relevant publication should be missed. The LLM-based title and abstract screening requires the definition of clear criteria for inclusion/exclusion. For research questions with less clear relevance criteria, LLMs may be less useful for the evaluation. This may be one reason why the performance of the approach differed considerably in our study depending on the data set analyzed. Overall, there are still many open questions, and it is unclear if and how high levels of performance can be consistently guaranteed so that such a system can be relied on. It is interesting that the Mixtral model, even though it seemed to have the highest level of performance on average, performed poorly with low sensitivity on one data set (Fig.  3 ). Further research is needed to investigate the requirements for good performance of the LLMs in evaluating scientific literature.

Another limitation of the approach in its current form is a considerable demand for resources regarding calculation power and hardware equipment. Answering thousands of long text prompts with modern, multi-billion-parameter LLMs requires sufficient IT infrastructure and calculation power to perform. The issue of resource demand is especially relevant if many thousand publications are evaluated and if very complex models are used.

Fundamental issues of using LLMs in literature analysis

On a more fundamental level, there are some general issues regarding the use of LLMs for literature studies. LLMs calculate the probability for a sequence of words based on their training data, which derives from past observations and knowledge. They can thereby inherit unwanted features and biases (such as ethnic or gender biases) [ 29 , 66 ]. In a recent study by Koo et al., it was shown that the cognitive biases and preferences of LLMs are not the same as those of humans, as a low correlation between ratings given by LLMs and humans was observed [ 67 ]. The authors therefore stated that LLMs are currently not suitable as fair and reliable automatic evaluators. Considering this, using LLMs for evaluating and processing scientific publications may be seen as a problematic and questionable undertaking. However, the biases present in language models affect different tasks differently, and it remains to be seen how they might differentially affect different screening tasks in literature reviews [ 28 ].

Nevertheless, it is most likely that LLMs and other AI solutions will be increasingly used in conducting and evaluating scientific research [ 68 ]. While this certainly will provide a lot of chances and opportunities, it is also potentially concerning. The amount and proportion of text being written by AI models is increasing. This includes not only public text on the Internet but also scientific literature and publications [ 69 , 70 ]. The fact that ChatGPT has been chosen as one of the top researchers of the year 2023 by Nature and has frequently been listed as co-author, shows how immediate the impact of the development has already been [ 71 ]. At the same time, most LLMs are trained on large amounts of text provided on the Internet. The idea that in the future LLMs might be used to evaluate publications written with the help of LLMs that may themselves be trained on data created by LLMs may lead to disturbing negative feedback loops which decrease the quality of the results over time [ 72 ]. Such a development could actually undermine academia and evidence-based science [ 73 ], also due to the known fact that LLMs tend to “hallucinate”, meaning that a model may generate text with illusory statements not based on correct data [ 26 ]. It is important to be aware that LLMs are not directly coupled to evidence and that there is no restriction preventing a model from generating incorrect statements. As part of a screening tool assigning just a score value to the relevance of a publication, this may be a mere factor impairing the performance of the system – yet for LLM-based analysis in general this is a major problem.

The majority of studies that so far have been published on using LLMs for publication screening used the currently most powerful models that are operated by private companies—most notably the ChatGPT models GPT-3.5 and GPT-4 developed by OpenAI [ 18 , 74 ]. Using models that are owned and controlled by private companies and that may change over time is associated with additional major problems when using them for publication screening, such as a lack of reproducibility. Therefore, after initial experiments with such models, we decided to use openly available models for our study.

Limitations of the study

Our study has some limitations. While we present a strategy for using LLMs to evaluate the relevance of publications for an SLR, our work does not provide a comprehensive analysis of all possible capabilities and limitations. Even though we achieved promising results on ten published data sets and a newly created one in our study, generalization of the results may be limited as it is not clear how the approach would perform on many other subjects within the biomedical domain more broadly and within other domains. To get a more comprehensive understanding, thorough testing with many more data sets about different topics would be needed, which is beyond the scope of this work. Testing the screening approach on retrospective data sets is also per se problematic. While a good performance on retrospective data should hopefully indicate a good performance if used prospectively on a new topic, this does not have to be the case [ 75 ]. Indeed, naively assuming a classifier that was tested on retrospective data will perform equally on a new research question is clearly problematic, since a new research question in science is by definition new and unfamiliar and therefore will not be represented in previously tested data sets.

Furthermore, models that are trained on vast amounts of scientific literature may even have been trained on some of the publications or reviews that are used in the retrospective benchmarking of an LLM-based classifier, which obviously creates a considerable bias. To objectively assess how well an LLM-based solution can evaluate scientific publications for new research questions, large, curated, and independent prospective data sets on many different topics would be needed, which will be very challenging to create. It is interesting that the LLM-based title and abstract screening in our study also performed well on our new hypothetical SLR on CDSS in radiation therapy, but of course, this alone is too limited a data basis from which to draw general conclusions. Therefore, it currently cannot be reliably known in which situations such an LLM-based evaluation may succeed or may fail.

Regarding the ten published data sets, the results also need to be interpreted with caution. These data sets may not truly represent the singular task of title and abstract screening. For example, in the Appenzeller-Herzog_2020 data set, only the 26 publications that were finally included (not only after title and abstract screening but also after further analysis) were labeled as relevant [ 40 ]. While these publications ideally should be correctly identified by an AI-classifier, there may be other publications in the data set, that per se cannot be excluded solely based on title and abstract. Furthermore, we had to retrospectively define the [Relevant Criteria] string based on the text in the publication of the SLR. This obviously is a suboptimal way to define inclusion and exclusion criteria, as the defined string may not completely align with the criteria intended by the researchers of the SLR.

We also want to emphasize that the comparison with the approach of Natukunda et al. needs to be interpreted with caution since the two approaches are not based on exactly the same prerequisites: the LLM-based approach requires a [Relevant Criteria] string, while the approach of Natukunda et al. requires defined keywords.

While overall our work shows that LLM-based title and abstract screening is possible and shows some promising results on the analyzed data sets, our study cannot fully answer the question of how well LLMs would perform if they were used for new research. Even more importantly, we cannot answer the question of to what extent LLMs should be used for conducting literature reviews and for doing research.

Large language models can be used for evaluating the relevance of publications for SLRs. We were able to implement a flexible and cross-domain system with promising results on different biomedical subjects. With the continuing progress in the fields of LLMs and AI, fully automated computer systems may assist researchers in performing SLRs and other forms of scientific knowledge synthesis. However, it remains unclear how well such systems will perform when used prospectively and what implications this will have for the conduct of SLRs.

Availability of data and materials

All data generated and analyzed during this study are either included in this published article (and its supplementary information files) or publicly available on the Internet. The Python script as well as the CDSS_RO data set are available under https://github.com/med-data-tools/title-abstract-screening-ai . The ten published data sets analyzed in our study are available on the GitHub Repository of the research group of the ASReview Tool [ 39 ].

Abbreviations

AI: Artificial intelligence

API: Application programming interface

CDSS: Clinical Decision Support System

FlanT5: FlanT5-XXL model

GPT: Generative pre-trained transformer

Mixtral: Mixtral-8 × 7B-Instruct v0.1 model

ML: Machine learning

LLM: Large language model

OHNC: OpenHermes-2.5-neural-chat-7b-v3-1-7B model

Platypus 2: Platypus2-70B-Instruct model

ROC: Receiver operating characteristic

SLR: Systematic literature review

Khalil H, Ameen D, Zarnegar A. Tools to support the automation of systematic reviews: a scoping review. J Clin Epidemiol. 2022;144:22–42.

Clark J, Scott AM, Glasziou P. Not all systematic reviews can be completed in 2 weeks—But many can be (and should be). J Clin Epidemiol. 2020;126:163.

Clark J, Glasziou P, Del Mar C, Bannach-Brown A, Stehlik P, Scott AM. A full systematic review was completed in 2 weeks using automation tools: a case study. J Clin Epidemiol. 2020;121:81–90.

Pham B, Jovanovic J, Bagheri E, Antony J, Ashoor H, Nguyen TT, et al. Text mining to support abstract screening for knowledge syntheses: a semi-automated workflow. Syst Rev. 2021;10(1):156.

van de Schoot R, de Bruin J, Schram R, Zahedi P, de Boer J, Weijdema F, et al. An open source machine learning framework for efficient and transparent systematic reviews. Nat Mach Intell. 2021;3(2):125–33.

Hamel C, Hersi M, Kelly SE, Tricco AC, Straus S, Wells G, et al. Guidance for using artificial intelligence for title and abstract screening while conducting knowledge syntheses. BMC Med Res Methodol. 2021;21(1):285.

Covidence [Internet]. [cited 2024 Jan 14]. Available from: www.covidence.org .

Machine learning functionality in EPPI-Reviewer [Internet]. [cited 2024 Jan 14]. Available from: https://eppi.ioe.ac.uk/CMS/Portals/35/machine_learning_in_eppi-reviewer_v_7_web_version.pdf .

Elicit [Internet]. [cited 2024 Jan 14]. Available from: https://elicit.org/ .

Harrison H, Griffin SJ, Kuhn I, Usher-Smith JA. Software tools to support title and abstract screening for systematic reviews in healthcare: an evaluation. BMC Med Res Methodol. 2020;20(1):7.

Rayyan [Internet]. [cited 2024 Jan 14]. Available from: https://www.rayyan.ai/ .

DistillerSR [Internet]. [cited 2024 Jan 14]. Available from: https://www.distillersr.com/products/distillersr-systematic-review-software .

Abstrackr [Internet]. [cited 2024 Jan 14]. Available from: http://abstrackr.cebm.brown.edu/account/login .

RobotAnalyst [Internet]. [cited 2024 Jan 14]. Available from: http://www.nactem.ac.uk/robotanalyst/ .

Clark J, McFarlane C, Cleo G, Ishikawa Ramos C, Marshall S. The impact of systematic review automation tools on methodological quality and time taken to complete systematic review tasks: case study. JMIR Med Educ. 2021;7(2):e24418.

Ayers JW, Poliak A, Dredze M, Leas EC, Zhu Z, Kelley JB, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med. 2023 [cited 2024 Jan 14]; Available from: https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2804309 .

Tang L, Sun Z, Idnay B, Nestor JG, Soroush A, Elias PA, et al. Evaluating Large Language Models on Medical Evidence Summarization [Internet]. Health Informatics; 2023 Apr [cited 2024 Jan 14]. Available from: https://doi.org/10.1101/2023.04.22.23288967 .

OpenAI: GPT3-apps [Internet]. [cited 2024 Jan 14]. Available from: https://openai.com/blog/gpt-3-apps .

Google: PaLM [Internet]. [cited 2024 Jan 14]. Available from: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html .

Google: Gemini [Internet]. [cited 2024 Jan 14]. Available from: https://deepmind.google/technologies/gemini/#hands-on .

Zhao WX, Zhou K, Li J, Tang T, Wang X, Hou Y, et al. A Survey of Large Language Models. 2023 [cited 2024 Jan 14]; Available from: https://arxiv.org/abs/2303.18223 .

McNichols H, Zhang M, Lan A. Algebra error classification with large language models [Internet]. arXiv; 2023 [cited 2023 May 25]. Available from: http://arxiv.org/abs/2305.06163 .

Wadhwa S, Amir S, Wallace BC. Revisiting relation extraction in the era of large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2305.05003 .

Trajanoska M, Stojanov R, Trajanov D. Enhancing knowledge graph construction using large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2305.04676 .

Reynolds L, McDonell K. Prompt programming for large language models: beyond the few-shot paradigm [Internet]. arXiv; 2021 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2102.07350 .

Guerreiro NM, Alves D, Waldendorf J, Haddow B, Birch A, Colombo P, et al. Hallucinations in Large Multilingual Translation Models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2303.16104 .

Zack T, Lehman E, Suzgun M, Rodriguez JA, Celi LA, Gichoya J, et al. Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study. Lancet Digital Health. 2024;6(1):e12-22.

Hastings J. Preventing harm from non-conscious bias in medical generative AI. Lancet Digital Health. 2024;6(1):e2-3.

Digutsch J, Kosinski M. Overlap in meaning is a stronger predictor of semantic activation in GPT-3 than in humans. Sci Rep. 2023;13(1):5035.

Huggingface: FlanT5-XXL [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/google/flan-t5-xxl .

Chung HW, Hou L, Longpre S, Zoph B, Tay Y, Fedus W, et al. Scaling Instruction-Finetuned Language Models [Internet]. arXiv; 2022 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2210.11416 .

Huggingface: OpenHermes-2.5-neural-chat-7b-v3–1–7B [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-1-7B .

Huggingface: OpenHermes-2.5-Mistral-7B [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B .

Huggingface: neural-chat-7b-v3–1 [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/Intel/neural-chat-7b-v3-1 .

Huggingface: Mixtral-8x7B-Instruct-v0.1 [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 .

Jiang AQ, Sablayrolles A, Roux A, Mensch A, Savary B, Bamford C, et al. Mixtral of Experts [Internet]. [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2401.04088 .

Huggingface: Platypus2–70B-Instruct [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/garage-bAInd/Platypus2-70B-instruct .

Huggingface: SOLAR-0–70b-16bit [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/upstage/SOLAR-0-70b-16bit#updates .

Systematic Review Datasets: ASReview [Internet]. [cited 2024 Jan 14]. Available from: https://github.com/asreview/systematic-review-datasets .

Appenzeller-Herzog C, Mathes T, Heeres MLS, Weiss KH, Houwen RHJ, Ewald H. Comparative effectiveness of common therapies for Wilson disease: a systematic review and meta-analysis of controlled studies. Liver Int. 2019;39(11):2136–52.

Bos D, Wolters FJ, Darweesh SKL, Vernooij MW, De Wolf F, Ikram MA, et al. Cerebral small vessel disease and the risk of dementia: a systematic review and meta-analysis of population-based evidence. Alzheimer’s & Dementia. 2018;14(11):1482–92.

Donners AAMT, Rademaker CMA, Bevers LAH, Huitema ADR, Schutgens REG, Egberts TCG, et al. Pharmacokinetics and associated efficacy of emicizumab in humans: a systematic review. Clin Pharmacokinet. 2021;60(11):1395–406.

Jeyaraman M, Muthu S, Ganie PA. Does the source of mesenchymal stem cell have an effect in the management of osteoarthritis of the knee? Meta-analysis of randomized controlled trials. CARTILAGE. 2021 Dec;13(1_suppl):1532S-1547S.

Leenaars C, Stafleu F, De Jong D, Van Berlo M, Geurts T, Coenen-de Roo T, et al. A systematic review comparing experimental design of animal and human methotrexate efficacy studies for rheumatoid arthritis: lessons for the translational value of animal studies. Animals. 2020;10(6):1047.

Meijboom RW, Gardarsdottir H, Egberts TCG, Giezen TJ. Patients retransitioning from biosimilar TNFα inhibitor to the corresponding originator after initial transitioning to the biosimilar: a systematic review. BioDrugs. 2022;36(1):27–39.

Muthu S, Ramakrishnan E. Fragility analysis of statistically significant outcomes of randomized control trials in spine surgery: a systematic review. Spine. 2021;46(3):198–208.

Oud M, Arntz A, Hermens ML, Verhoef R, Kendall T. Specialized psychotherapies for adults with borderline personality disorder: a systematic review and meta-analysis. Aust N Z J Psychiatry. 2018;52(10):949–61.

Van De Schoot R, Sijbrandij M, Depaoli S, Winter SD, Olff M, Van Loey NE. Bayesian PTSD-trajectory analysis with informed priors based on a systematic literature search and expert elicitation. Multivar Behav Res. 2018;53(2):267–91.

Wolters FJ, Segufa RA, Darweesh SKL, Bos D, Ikram MA, Sabayan B, et al. Coronary heart disease, heart failure, and the risk of dementia: A systematic review and meta-analysis. Alzheimer’s Dementia. 2018;14(11):1493–504.

Sutton RT, Pincock D, Baumgart DC, Sadowski DC, Fedorak RN, Kroeker KI. An overview of clinical decision support systems: benefits, risks, and strategies for success. npj Digit Med. 2020 Feb 6;3(1):17.

Natukunda A, Muchene LK. Unsupervised title and abstract screening for systematic review: a retrospective case-study using topic modelling methodology. Syst Rev. 2023;12(1):1.

Marshall IJ, Wallace BC. Toward systematic review automation: a practical guide to using machine learning tools in research synthesis. Syst Rev. 2019;8(1):163.

Wallace BC, Trikalinos TA, Lau J, Brodley C, Schmid CH. Semi-automated screening of biomedical citations for systematic reviews. BMC Bioinformatics. 2010;11(1):55.

Li D, Wang Z, Wang L, Sohn S, Shen F, Murad MH, et al. A text-mining framework for supporting systematic reviews. Am J Inf Manag. 2016;1(1):1–9.

de Almeida CPB, de Goulart BNG. How to avoid bias in systematic reviews of observational studies. Rev CEFAC. 2017;19(4):551–5.

Siddaway AP, Wood AM, Hedges LV. How to do a systematic review: a best practice guide for conducting and reporting narrative reviews, meta-analyses, and meta-syntheses. Annu Rev Psychol. 2019;70(1):747–70.

Santos ÁOD, Da Silva ES, Couto LM, Reis GVL, Belo VS. The use of artificial intelligence for automating or semi-automating biomedical literature analyses: a scoping review. J Biomed Inform. 2023;142: 104389.

Haman M, Školník M. Using ChatGPT to conduct a literature review. Account Res. 2023;6:1–3.

Liu R, Shah NB. ReviewerGPT? An exploratory study on using large language models for paper reviewing [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2306.00622

Wang S, Scells H, Koopman B, Zuccon G. Can ChatGPT write a good boolean query for systematic review literature search? [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2302.03495 .

Aydın Ö, Karaarslan E. OpenAI ChatGPT generated literature review: digital twin in healthcare. SSRN Journal [Internet]. 2022 [cited 2024 Jan 14]; Available from: https://www.ssrn.com/abstract=4308687 .

Guo E, Gupta M, Deng J, Park YJ, Paget M, Naugler C. Automated paper screening for clinical reviews using large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2305.00844 .

Akinseloyin O, Jiang X, Palade V. A novel question-answering framework for automated citation screening using large language models [Internet]. Health Informatics; 2023 Dec [cited 2024 Jan 14]. Available from: https://doi.org/10.1101/2023.12.17.23300102 .

Koh JY, Salakhutdinov R, Fried D. Grounding language models to images for multimodal inputs and outputs. 2023 [cited 2024 Jan 14]; Available from: https://arxiv.org/abs/2301.13823 .

Wang L, Lyu C, Ji T, Zhang Z, Yu D, Shi S, et al. Document-level machine translation with large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2304.02210 .

Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. Language Models are Few-Shot Learners [Internet]. arXiv; 2020 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2005.14165 .

Koo R, Lee M, Raheja V, Park JI, Kim ZM, Kang D. Benchmarking cognitive biases in large language models as evaluators [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2309.17012 .

Editorial —Artificial Intelligence language models in scientific writing. EPL. 2023 Jul 1;143(2):20000.

Grimaldi G, Ehrler B. AI et al.: Machines Are About to Change Scientific Publishing Forever. ACS Energy Lett. 2023;8(1):878–80.

Grillo R. The rising tide of artificial intelligence in scientific journals: a profound shift in research landscape. Eur J Ther. 2023;29(3):686–8.

nature: ChatGPT and science: the AI system was a force in 2023 — for good and bad [Internet]. [cited 2024 Jan 14]. Available from: https://www.nature.com/articles/d41586-023-03930-6 .

Chiang CH, Lee H yi. Can large language models be an alternative to human evaluations? 2023 [cited 2024 Jan 6]; Available from: https://arxiv.org/abs/2305.01937 .

Erler A. Publish with AUTOGEN or perish? Some pitfalls to avoid in the pursuit of academic enhancement via personalized large language models. Am J Bioeth. 2023;23(10):94–6.

OpenAI: ChatGPT [Internet]. [cited 2024 Jan 14]. Available from: https://openai.com/blog/chatgpt .

Gates A, Gates M, Sebastianski M, Guitard S, Elliott SA, Hartling L. The semi-automation of title and abstract screening: a retrospective exploration of ways to leverage Abstrackr’s relevance predictions in systematic and rapid reviews. BMC Med Res Methodol. 2020;20(1):139.

Acknowledgements

Not applicable.

Author information

Authors and affiliations.

Department of Radiation Oncology, Cantonal Hospital of St. Gallen, St. Gallen, Switzerland

Fabio Dennstädt & Paul Martin Putora

Institute for Computer Science, University of Würzburg, Würzburg, Germany

Johannes Zink

Department of Radiation Oncology, Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland

Fabio Dennstädt, Paul Martin Putora & Nikola Cihoric

Institute for Implementation Science in Health Care, University of Zurich, Zurich, Switzerland

Janna Hastings

School of Medicine, University of St. Gallen, St. Gallen, Switzerland

Swiss Institute of Bioinformatics, Lausanne, Switzerland

Contributions

All authors contributed to designing the concept and methodology of the presented approach of LLM-based evaluation of the relevance of a publication to an SLR. The Python script was created by FD and JZ. The experiments were conducted by FD and JH. All authors contributed in writing and revising the manuscript. All authors have read and approved the final version of the manuscript.

Corresponding author

Correspondence to Fabio Dennstädt .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

NC is a technical lead for the SmartOncology© project and medical advisor for Wemedoo AG, Steinhausen AG, Switzerland. The authors declare that they have no other competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1: Appendix 1: Sample prompt.

Supplementary material 2: Appendix 2: Relevant criteria of published datasets.

Supplementary material 3: Appendix 3: Performance of models on data sets.

Supplementary material 4: Appendix 4: Comparison with other approach.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Dennstädt, F., Zink, J., Putora, P.M. et al. Title and abstract screening for literature reviews using large language models: an exploratory study in the biomedical domain. Syst Rev 13, 158 (2024). https://doi.org/10.1186/s13643-024-02575-4

Received: 17 June 2023

Accepted: 30 May 2024

Published: 15 June 2024

DOI: https://doi.org/10.1186/s13643-024-02575-4

  • Natural language processing
  • Biomedicine
  • Title and abstract screening
  • Large language models

Systematic Reviews

ISSN: 2046-4053

An overview of methodological approaches in systematic reviews

Prabhakar Veginadu

1 Department of Rural Clinical Sciences, La Trobe Rural Health School, La Trobe University, Bendigo Victoria, Australia

Hanny Calache

2 Lincoln International Institute for Rural Health, University of Lincoln, Brayford Pool, Lincoln UK

Akshaya Pandian

3 Department of Orthodontics, Saveetha Dental College, Chennai Tamil Nadu, India

Mohd Masood

Associated data.

APPENDIX B: List of excluded studies with detailed reasons for exclusion

APPENDIX C: Quality assessment of included reviews using AMSTAR 2

The aim of this overview is to identify and collate evidence from existing published systematic review (SR) articles evaluating various methodological approaches used at each stage of an SR.

The search was conducted in five electronic databases from inception to November 2020 and updated in February 2022: MEDLINE, Embase, Web of Science Core Collection, Cochrane Database of Systematic Reviews, and APA PsycINFO. Title and abstract screening were performed in two stages by one reviewer, supported by a second reviewer. Full‐text screening, data extraction, and quality appraisal were performed by two reviewers independently. The quality of the included SRs was assessed using the AMSTAR 2 checklist.

The search retrieved 41,556 unique citations, of which 9 SRs were deemed eligible for inclusion in the final synthesis. Included SRs evaluated 24 unique methodological approaches used for defining the review scope and eligibility, literature search, screening, data extraction, and quality appraisal in the SR process. Limited evidence supports the following: (a) searching multiple resources (electronic databases, handsearching, and reference lists) to identify relevant literature; (b) excluding non‐English, gray, and unpublished literature; and (c) using text‐mining approaches during title and abstract screening.

The overview identified limited SR‐level evidence on various methodological approaches currently employed during five of the seven fundamental steps in the SR process, as well as some methodological modifications currently used in expedited SRs. Overall, findings of this overview highlight the dearth of published SRs focused on SR methodologies and this warrants future work in this area.

1. INTRODUCTION

Evidence synthesis is a prerequisite for knowledge translation. 1 A well‐conducted systematic review (SR), often in conjunction with meta‐analyses (MA) when appropriate, is considered the “gold standard” of methods for synthesizing evidence related to a topic of interest. 2 The central strength of an SR is the transparency of the methods used to systematically search, appraise, and synthesize the available evidence. 3 Several guidelines, developed by various organizations, are available for the conduct of an SR; 4, 5, 6, 7 among these, Cochrane is considered a pioneer in developing rigorous and highly structured methodology for the conduct of SRs. 8 The guidelines developed by these organizations outline seven fundamental steps required in the SR process: defining the scope of the review and eligibility criteria, literature searching and retrieval, selecting eligible studies, extracting relevant data, assessing risk of bias (RoB) in included studies, synthesizing results, and assessing certainty of evidence (CoE) and presenting findings. 4, 5, 6, 7

The methodological rigor involved in an SR can require a significant amount of time and resources, which may not always be available. 9 As a result, there has been a proliferation of modifications made to the traditional SR process, such as refining, shortening, bypassing, or omitting one or more steps. 10, 11 Examples include limits on the number and type of databases searched; limits on publication date, language, and types of studies included; and the use of one reviewer for screening and selection of studies, as opposed to two or more reviewers. 10, 11 These methodological modifications are made to accommodate the needs and resource constraints of the reviewers and stakeholders (e.g., organizations, policymakers, health care professionals, and other knowledge users). While such modifications are considered time and resource efficient, they may introduce bias into the review process, reducing the usefulness of the resulting reviews. 5

Substantial research has been conducted examining various approaches used in the standardized SR methodology and their impact on the validity of SR results. There are a number of published reviews examining the approaches or modifications corresponding to single 12, 13 or multiple steps 14 involved in an SR. However, there is yet to be a comprehensive summary of the SR‐level evidence for all seven fundamental steps in an SR. Such a holistic evidence synthesis will provide an empirical basis to confirm the validity of currently accepted practices in the conduct of SRs. Furthermore, a balance sometimes needs to be struck between resource availability and the need to synthesize the evidence in the best way possible, given the constraints. This evidence base will also inform the choice of modifications to be made to the SR methods, as well as the potential impact of these modifications on the SR results. An overview is considered the approach of choice for summarizing existing evidence on a broad topic, directing the reader to evidence, or highlighting the gaps in evidence, where the evidence is derived exclusively from SRs. 15 Therefore, for this review, an overview approach was used to (a) identify and collate evidence from existing published SR articles evaluating various methodological approaches employed in each of the seven fundamental steps of an SR and (b) highlight both the gaps in the current research and the potential areas for future research on the methods employed in SRs.

2. METHODS

An a priori protocol was developed for this overview but was not registered with the International Prospective Register of Systematic Reviews (PROSPERO), as the review was primarily methodological in nature and did not meet PROSPERO eligibility criteria for registration. The protocol is available from the corresponding author upon reasonable request. This overview was conducted based on the guidelines for the conduct of overviews as outlined in The Cochrane Handbook. 15 Reporting followed the Preferred Reporting Items for Systematic reviews and Meta‐analyses (PRISMA) statement. 3

2.1. Eligibility criteria

Only published SRs, with or without associated MA, were included in this overview. We adopted the defining characteristics of SRs from The Cochrane Handbook. 5 According to The Cochrane Handbook, a review was considered systematic if it satisfied the following criteria: (a) clearly states the objectives and eligibility criteria for study inclusion; (b) provides reproducible methodology; (c) includes a systematic search to identify all eligible studies; (d) reports assessment of validity of findings of included studies (e.g., RoB assessment of the included studies); and (e) systematically presents all the characteristics or findings of the included studies. 5 Reviews that did not meet all of the above criteria were not considered SRs for this study and were excluded. MA‐only articles were included if they stated that the MA was based on an SR.

SRs and/or MA of primary studies evaluating methodological approaches used in defining review scope and study eligibility, literature search, study selection, data extraction, RoB assessment, data synthesis, and CoE assessment and reporting were included. The methodological approaches examined in these SRs and/or MA can also be related to the substeps or elements of these steps; for example, limits on date or type of publication are elements of the literature search. Included SRs examined or compared various aspects of a method or methods, and the associated factors, including but not limited to: precision or effectiveness; accuracy or reliability; impact on the SR and/or MA results; reproducibility of SR steps or bias introduced; and time and/or resource efficiency. SRs assessing the methodological quality of SRs (e.g., adherence to reporting guidelines), evaluating techniques for building search strategies or the use of specific database filters (e.g., use of Boolean operators or search filters for randomized controlled trials), examining various tools used for RoB or CoE assessment (e.g., ROBINS vs. Cochrane RoB tool), or evaluating statistical techniques used in meta‐analyses were excluded. 14

2.2. Search

The search for published SRs was performed in the following databases, initially from inception to the third week of November 2020 and updated in the last week of February 2022: MEDLINE (via Ovid), Embase (via Ovid), Web of Science Core Collection, Cochrane Database of Systematic Reviews, and American Psychological Association (APA) PsycINFO. The search was restricted to English language publications. In line with the objectives of this study, study design filters within databases were used, where available, to restrict the search to SRs and MA. The reference lists of included SRs were also searched for potentially relevant publications.

The search terms included keywords, truncations, and subject headings for the key concepts in the review question: SRs and/or MA, methods, and evaluation. Some of the terms were adopted from the search strategy used in a previous review by Robson et al., which reviewed primary studies on methodological approaches used in the study selection, data extraction, and quality appraisal steps of the SR process. 14 Individual search strategies were developed for the respective databases by combining the search terms using appropriate proximity and Boolean operators, along with the related subject headings, in order to identify SRs and/or MA. 16, 17 A senior librarian was consulted in the design of the search terms and strategy. Appendix A presents the detailed search strategies for all five databases.
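As a rough illustration of how search terms for several concepts can be combined with Boolean operators, the Python sketch below ORs the synonyms within each concept and ANDs the concepts together. The term lists are hypothetical placeholders and are not the strategies in Appendix A; proximity operators, subject headings, and truncation syntax differ between databases and are not modeled here.

```python
def build_query(concepts):
    """Combine synonym groups into one Boolean search string.

    Terms within a concept are joined with OR; concepts are joined with AND.
    Multi-word terms are wrapped in double quotes as phrases.
    """
    groups = []
    for terms in concepts:
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical term lists for the three key concepts named in the text:
# SRs and/or MA, methods, and evaluation.
concepts = [
    ["systematic review*", "meta-analys*"],
    ["method*", "approach*"],
    ["evaluat*", "compar*"],
]

query = build_query(concepts)
print(query)
```

A string built this way would still need to be adapted to each database's own syntax before use.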

2.3. Study selection and data extraction

Title and abstract screening of references was performed in three steps. First, one reviewer (PV) screened all the titles and excluded obviously irrelevant citations, for example, articles on topics not related to SRs and non‐SR publications (such as randomized controlled trials, observational studies, and scoping reviews). Next, from the remaining citations, a random sample of 200 titles and abstracts was screened against the predefined eligibility criteria by two reviewers (PV and MM), independently, in duplicate. Discrepancies were discussed and resolved by consensus. This step ensured that the responses of the two reviewers were calibrated for consistency in the application of the eligibility criteria in the screening process. Finally, all the remaining titles and abstracts were reviewed by a single “calibrated” reviewer (PV) to identify potential full‐text records. Full‐text screening was performed by at least two authors independently (PV screened all the records, and duplicate assessment was conducted by MM, HC, or MG), with discrepancies resolved via discussions or by consulting a third reviewer.
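The calibration step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the agreement statistics (percent agreement and Cohen's kappa) are an assumption added here, since the text reports only that discrepancies were discussed and resolved by consensus.

```python
import random


def draw_calibration_sample(citation_ids, k=200, seed=0):
    """Draw a reproducible random sample of citations for duplicate screening."""
    rng = random.Random(seed)
    ids = list(citation_ids)
    return rng.sample(ids, min(k, len(ids)))


def compare_decisions(decisions_a, decisions_b):
    """Compare two reviewers' decisions on the same citations.

    Both arguments map citation id -> decision label (e.g. "include").
    Returns (percent agreement, Cohen's kappa, ids needing discussion).
    """
    ids = sorted(decisions_a)
    n = len(ids)
    if n == 0:
        raise ValueError("no decisions to compare")
    p_observed = sum(decisions_a[i] == decisions_b[i] for i in ids) / n
    labels = set(decisions_a.values()) | set(decisions_b.values())
    # Chance agreement: product of each reviewer's marginal rate per label.
    p_expected = sum(
        (sum(decisions_a[i] == lab for i in ids) / n)
        * (sum(decisions_b[i] == lab for i in ids) / n)
        for lab in labels
    )
    kappa = 1.0 if p_expected == 1 else (p_observed - p_expected) / (1 - p_expected)
    conflicts = [i for i in ids if decisions_a[i] != decisions_b[i]]
    return p_observed, kappa, conflicts
```

The list of conflicting ids is the part that maps directly onto the text: those are the citations the two reviewers would discuss before the single reviewer screens the remainder.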

Data related to review characteristics, results, key findings, and conclusions were extracted by at least two reviewers independently (PV performed data extraction for all the reviews and duplicate extraction was performed by AP, HC, or MG).

2.4. Quality assessment of included reviews

The quality assessment of the included SRs was performed using the AMSTAR 2 (A MeaSurement Tool to Assess systematic Reviews). The tool consists of a 16‐item checklist addressing critical and noncritical domains. 18 For the purpose of this study, the domain related to MA was reclassified from critical to noncritical, as SRs with and without MA were included. The other six critical domains were used according to the tool guidelines. 18 Two reviewers (PV and AP) independently responded to each of the 16 items in the checklist with either “yes,” “partial yes,” or “no.” Based on the interpretations of the critical and noncritical domains, the overall quality of the review was rated as high, moderate, low, or critically low. 18 Disagreements were resolved through discussion or by consulting a third reviewer.
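The rating logic can be sketched in Python under a few stated assumptions. The thresholds follow the published AMSTAR 2 guidance (more than one critical flaw gives critically low, one critical flaw gives low, more than one noncritical weakness gives moderate, otherwise high). Two things are assumptions of this sketch rather than details given in the text: the reclassified MA domain is taken to be item 11, and only a "no" response is counted as a weakness.

```python
# Assumption: AMSTAR 2's critical items (2, 4, 7, 9, 11, 13, 15) minus
# item 11 (meta-analysis methods), treated here as noncritical.
CRITICAL_ITEMS = frozenset({2, 4, 7, 9, 13, 15})


def rate_review(responses, critical_items=CRITICAL_ITEMS):
    """Rate overall confidence in a review from its AMSTAR 2 responses.

    responses maps item number (1-16) to "yes", "partial yes", or "no".
    Assumption: only "no" counts as a weakness.
    """
    weaknesses = {item for item, answer in responses.items() if answer == "no"}
    critical = len(weaknesses & critical_items)
    noncritical = len(weaknesses - critical_items)
    if critical > 1:
        return "critically low"
    if critical == 1:
        return "low"
    if noncritical > 1:
        return "moderate"
    return "high"


# Hypothetical example: a review with weaknesses on items 1 and 3 only.
example = {item: "yes" for item in range(1, 17)}
example[1] = "no"
example[3] = "no"
print(rate_review(example))
```

Passing a different `critical_items` set would reproduce other classifications, for example the original seven critical items.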

2.5. Data synthesis

To provide an understandable summary of existing evidence syntheses, characteristics of the methods evaluated in the included SRs were examined and key findings were categorized and presented based on the corresponding step in the SR process. The categories of key elements within each step were discussed and agreed by the authors. Results of the included reviews were tabulated and summarized descriptively, along with a discussion on any overlap in the primary studies. 15 No quantitative analyses of the data were performed.

3. RESULTS

From 41,556 unique citations identified through literature search, 50 full‐text records were reviewed, and nine systematic reviews 14, 19, 20, 21, 22, 23, 24, 25, 26 were deemed eligible for inclusion. The flow of studies through the screening process is presented in Figure 1. A list of excluded studies with reasons can be found in Appendix B.
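The counts above imply the number of records excluded at each stage. The short Python tally below derives them from the three reported figures only; it assumes that every citation not reviewed in full text was excluded at title and abstract screening, and the variable names are illustrative.

```python
identified = 41_556  # unique citations from the literature search
full_text = 50       # full-text records reviewed
included = 9         # SRs included in the final synthesis

# Assumption: citations not taken to full-text review were excluded
# during title and abstract screening.
excluded_title_abstract = identified - full_text
excluded_full_text = full_text - included

print(f"Excluded at title/abstract screening: {excluded_title_abstract:,}")
print(f"Excluded at full-text screening: {excluded_full_text}")
```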

Figure 1. Study selection flowchart

3.1. Characteristics of included reviews

Table 1 summarizes the characteristics of included SRs. The majority of the included reviews (six of nine) were published after 2010. 14, 22, 23, 24, 25, 26 Four of the nine included SRs were Cochrane reviews. 20, 21, 22, 23 The number of databases searched in the reviews ranged from 2 to 14; 2 reviews searched gray literature sources, 24, 25 and 7 reviews included a supplementary search strategy to identify relevant literature. 14, 19, 20, 21, 22, 23, 26 Three of the included SRs (all Cochrane reviews) included an integrated MA. 20, 21, 23

Table 1. Characteristics of included studies

| Author, year | Search strategy (year last searched; no. of databases; supplementary searches) | SR design (type of review; no. of studies included) | Topic; subject area | SR objectives | SR authors’ comments on study quality |
| --- | --- | --- | --- | --- | --- |
| Crumley, 2005 | 2004; seven databases; four journals handsearched, reference lists and contacting authors | SR; n = 64 | RCTs and CCTs; not specified | To identify and quantitatively review studies comparing two or more different resources (e.g., databases, Internet, handsearching) used to identify RCTs and CCTs for systematic reviews. | Most of the studies adequately described reproducible search methods, expected search yield. Poor quality in studies was mainly due to lack of rigor in reporting selection methodology. Majority of the studies did not indicate the number of people involved in independently screening the searches or applying eligibility criteria to identify potentially relevant studies. |
| Hopewell, 2007 | 2002; eight databases; selected journals and published abstracts handsearched, and contacting authors | SR and MA; n = 34 (34 in quantitative analysis) | RCTs; health care | To review systematically empirical studies, which have compared the results of handsearching with the results of searching one or more electronic databases to identify reports of randomized trials. | The electronic search was designed and carried out appropriately in majority of the studies, while the appropriateness of handsearching was unclear in half the studies because of limited information. The screening studies methods used in both groups were comparable in most of the studies. |
| Hopewell, 2007 | 2005; two databases; selected journals and published abstracts handsearched, reference lists, citations and contacting authors | SR and MA; n = 5 (5 in quantitative analysis) | RCTs; health care | To review systematically research studies, which have investigated the impact of gray literature in meta‐analyses of randomized trials of health care interventions. | In majority of the studies, electronic searches were designed and conducted appropriately, and the selection of studies for eligibility was similar for handsearching and database searching. Insufficient data for most studies to assess the appropriateness of handsearching and investigator agreeability on the eligibility of the trial reports. |
| Horsley, 2011 | 2008; three databases; reference lists, citations and contacting authors | SR; n = 12 | Any topic or study area | To investigate the effectiveness of checking reference lists for the identification of additional, relevant studies for systematic reviews. Effectiveness is defined as the proportion of relevant studies identified by review authors solely by checking reference lists. | Interpretability and generalizability of included studies was difficult. Extensive heterogeneity among the studies in the number and type of databases used. Lack of control in majority of the studies related to the quality and comprehensiveness of searching. |
| Morrison, 2012 | 2011; six databases and gray literature | SR; n = 5 | RCTs; conventional medicine | To examine the impact of English language restriction on systematic review‐based meta‐analyses. | The included studies were assessed to have good reporting quality and validity of results. Methodological issues were mainly noted in the areas of sample power calculation and distribution of confounders. |
| Robson, 2019 | 2016; three databases; reference lists and contacting authors | SR; n = 37 | N/R | To identify and summarize studies assessing methodologies for study selection, data abstraction, or quality appraisal in systematic reviews. | The quality of the included studies was generally low. Only one study was assessed as having low RoB across all four domains. Majority of the studies were assessed to having unclear RoB across one or more domains. |
| Schmucker, 2017 | 2016; four databases; reference lists | SR; n = 10 | Study data; medicine | To assess whether the inclusion of data that were not published at all and/or published only in the gray literature influences pooled effect estimates in meta‐analyses and leads to different interpretation. | Majority of the included studies could not be judged on the adequacy of matching or adjusting for confounders of the gray/unpublished data in comparison to published data. Also, generalizability of results was low or unclear in four research projects. |
| Morissette, 2011 | 2009; five databases; reference lists and contacting authors | SR and MA; n = 6 (5 included in quantitative analysis) | N/R | To determine whether blinded versus unblinded assessments of risk of bias result in similar or systematically different assessments in studies included in a systematic review. | Four studies had unclear risk of bias, while two studies had high risk of bias. |
| O'Mara‐Eves, 2015 | 2013; 14 databases and gray literature | SR; n = 44 | N/R | To gather and present the available research evidence on existing methods for text mining related to the title and abstract screening stage in a systematic review, including the performance metrics used to evaluate these technologies. | Quality appraised based on two criteria: sampling of test cases and adequacy of methods description for replication. No study was excluded based on the quality (author contact). |

SR = systematic review; MA = meta‐analysis; RCT = randomized controlled trial; CCT = controlled clinical trial; N/R = not reported.

The included SRs evaluated 24 unique methodological approaches (26 in total) used across five steps in the SR process; 8 SRs evaluated 6 approaches, 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 while 1 review evaluated 18 approaches. 14 Exclusion of gray or unpublished literature 21 , 26 and blinding of reviewers for RoB assessment 14 , 23 were evaluated in two reviews each. Included SRs evaluated methods used in five different steps in the SR process, including methods used in defining the scope of review ( n  = 3), literature search ( n  = 3), study selection ( n  = 2), data extraction ( n  = 1), and RoB assessment ( n  = 2) (Table  2 ).

Summary of findings from review evaluating systematic review methods

Key elements | Author, year | Method assessed | Evaluations/outcomes (P: primary; S: secondary) | Summary of SR authors’ conclusions | Quality of review
Excluding study data based on publication status | Hopewell, 2007 | Gray vs. published literature | Pooled effect estimate | Published trials are usually larger and show an overall greater treatment effect than gray trials. Excluding trials reported in gray literature from SRs and MAs may exaggerate the results. | Moderate
 | Schmucker, 2017 | Gray and/or unpublished vs. published literature | P: Pooled effect estimate; S: Impact on interpretation of MA | Excluding unpublished trials had no or only a small effect on the pooled estimates of treatment effects. Insufficient evidence to conclude the impact of including unpublished or gray study data on MA conclusions. | Moderate
Excluding study data based on language of publication | Morrison, 2012 | English language vs. non‐English language publications | P: Bias in summary treatment effects; S: number of included studies and patients, methodological quality and statistical heterogeneity | No evidence of a systematic bias from the use of English language restrictions in systematic review‐based meta‐analyses in conventional medicine. Conflicting results on the methodological and reporting quality of English and non‐English language RCTs. Further research required. | Low
Resources searching | Crumley, 2005 | Two or more resources searching vs. resource‐specific searching | Recall and precision | Multiple‐source comprehensive searches are necessary to identify all RCTs for a systematic review. For electronic databases, using the Cochrane HSS or complex search strategy in consultation with a librarian is recommended. | Critically low
Supplementary searching | Hopewell, 2007 | Handsearching only vs. one or more electronic database(s) searching | Number of identified randomized trials | Handsearching is important for identifying trial reports for inclusion in systematic reviews of health care interventions published in nonindexed journals. Where time and resources are limited, majority of the full English‐language trial reports can be identified using a complex search or the Cochrane HSS. | Moderate
 | Horsley, 2011 | Checking reference list (no comparison) | P: additional yield of checking reference lists; S: additional yield by publication type, study design or both and data pertaining to costs | There is some evidence to support the use of checking reference lists to complement literature search in systematic reviews. | Low
Reviewer characteristics | Robson, 2019 | Single vs. double reviewer screening | P: Accuracy, reliability, or efficiency of a method; S: factors affecting accuracy or reliability of a method | Using two reviewers for screening is recommended. If resources are limited, one reviewer can screen, and other reviewer can verify the list of excluded studies. | Low
 | | Experienced vs. inexperienced reviewers for screening | | Screening must be performed by experienced reviewers |
 | | Screening by blinded vs. unblinded reviewers | | Authors do not recommend blinding of reviewers during screening as the blinding process was time‐consuming and had little impact on the results of MA |
Use of technology for study selection | Robson, 2019 | Use of dual computer monitors vs. nonuse of dual monitors for screening | P: Accuracy, reliability, or efficiency of a method; S: factors affecting accuracy or reliability of a method | There are no significant differences in the time spent on abstract or full‐text screening with the use and nonuse of dual monitors | Low
 | | Use of Google translate to translate non‐English citations to facilitate screening | | Use of Google translate to screen German language citations |
 | O'Mara‐Eves, 2015 | Use of text mining for title and abstract screening | Any evaluation concerning workload reduction | Text mining approaches can be used to reduce the number of studies to be screened, increase the rate of screening, improve the workflow with screening prioritization, and replace the second reviewer. The evaluated approaches reported saving a workload of between 30% and 70% | Critically low
Order of screening | Robson, 2019 | Title‐first screening vs. title‐and‐abstract simultaneous screening | P: Accuracy, reliability, or efficiency of a method; S: factors affecting accuracy or reliability of a method | Title‐first screening showed no substantial gain in time when compared to simultaneous title and abstract screening. | Low
Reviewer characteristics | Robson, 2019 | Single vs. double reviewer data extraction | P: Accuracy, reliability, or efficiency of a method; S: factors affecting accuracy or reliability of a method | Use two reviewers for data extraction. Single reviewer data extraction followed by the verification of outcome data by a second reviewer (where statistical analysis is planned), if resources preclude | Low
 | | Experienced vs. inexperienced reviewers for data extraction | | Experienced reviewers must be used for extracting continuous outcomes data |
 | | Data extraction by blinded vs. unblinded reviewers | | Authors do not recommend blinding of reviewers during data extraction as it had no impact on the results of MA |
Use of technology for data extraction | | Use of dual computer monitors vs. nonuse of dual monitors for data extraction | | Using two computer monitors may improve the efficiency of data extraction |
 | | Data extraction by two English reviewers using Google translate vs. data extraction by two reviewers fluent in respective languages | | Google translate provides limited accuracy for data extraction |
 | | Computer‐assisted vs. double reviewer extraction of graphical data | | Use of computer‐assisted programs to extract graphical data |
Obtaining additional data | | Contacting study authors for additional data | | Recommend contacting authors for obtaining additional relevant data |
Reviewer characteristics | Robson, 2019 | Quality appraisal by blinded vs. unblinded reviewers | P: Accuracy, reliability, or efficiency of a method; S: factors affecting accuracy or reliability of a method | Inconsistent results on RoB assessments performed by blinded and unblinded reviewers. Blinding reviewers for quality appraisal not recommended | Low
 | Morissette, 2011 | Risk of bias (RoB) assessment by blinded vs. unblinded reviewers | P: Mean difference and 95% confidence interval between RoB assessment scores; S: qualitative level of agreement, mean RoB scores and measures of variance for the results of the RoB assessments, and inter‐rater reliability between blinded and unblinded reviewers | Findings related to the difference between blinded and unblinded RoB assessments are inconsistent from the studies. Pooled effects show no differences in RoB assessments for assessments completed in a blinded or unblinded manner. | Moderate
 | Robson, 2019 | Experienced vs. inexperienced reviewers for quality appraisal | P: Accuracy, reliability, or efficiency of a method; S: factors affecting accuracy or reliability of a method | Reviewers performing quality appraisal must be trained. Quality assessment tool must be pilot tested. | Low
 | | Use of additional guidance vs. nonuse of additional guidance for quality appraisal | | Providing guidance and decision rules for quality appraisal improved the inter‐rater reliability in RoB assessments. |
Obtaining additional data | | Contacting study authors for obtaining additional information/use of supplementary information available in the published trials vs. no additional information for quality appraisal | | Additional data related to study quality obtained by contacting study authors improved the quality assessment. |
RoB assessment of qualitative studies | | Structured vs. unstructured appraisal of qualitative research studies | | Use of structured tool if qualitative and quantitative studies designs are included in the review. For qualitative reviews, either structured or unstructured quality appraisal tool can be used. |

There was some overlap in the primary studies evaluated in the included SRs on the same topics: Schmucker et al. 26 and Hopewell et al. 21 ( n  = 4), Hopewell et al. 20 and Crumley et al. 19 ( n  = 30), and Robson et al. 14 and Morissette et al. 23 ( n  = 4). There were no conflicting results between any of the identified SRs on the same topic.

3.2. Methodological quality of included reviews

Overall, the quality of the included reviews was assessed as moderate at best (Table  2 ). The most common critical weakness in the reviews was failure to provide justification for excluding individual studies (four reviews). Detailed quality assessment is provided in Appendix C .

3.3. Evidence on systematic review methods

3.3.1. Methods for defining review scope and eligibility

Two SRs investigated the effect of excluding data obtained from gray or unpublished sources on the pooled effect estimates of MA. 21, 26 Hopewell et al. 21 reviewed five studies that compared the impact of gray literature on the results of a cohort of MA of RCTs in health care interventions. Gray literature was defined as information published in “print or electronic sources not controlled by commercial or academic publishers.” Findings showed an overall greater treatment effect for published trials than for trials reported in gray literature. In a more recent review, Schmucker et al. 26 addressed similar objectives by investigating gray and unpublished data in medicine. In addition to gray literature, defined similarly to the previous review by Hopewell et al., the authors also evaluated unpublished data, defined as “supplemental unpublished data related to published trials, data obtained from the Food and Drug Administration or other regulatory websites or postmarketing analyses hidden from the public.” The review found that in the majority of the MA, excluding gray literature had little or no effect on the pooled effect estimates. There was insufficient evidence to conclude whether data from gray and unpublished literature had an impact on the conclusions of MA. 26
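
To make the idea of a pooled effect estimate concrete, the following is a minimal sketch of inverse-variance fixed-effect pooling, one common way of combining trial results. It is not the method of either review discussed above; the function name `pooled_effect` and all effect sizes and standard errors are hypothetical and chosen only to illustrate how adding a gray-literature trial with a smaller effect can pull the pooled estimate toward the null.

```python
import math

def pooled_effect(studies):
    """Inverse-variance fixed-effect pooling.

    studies: list of (effect, standard_error) pairs, e.g. log odds ratios.
    Returns the pooled estimate and its approximate 95% confidence interval.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    estimate = sum(w * effect for w, (effect, _) in zip(weights, studies)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return estimate, (estimate - 1.96 * pooled_se, estimate + 1.96 * pooled_se)

# Hypothetical data, for illustration only (negative values favor treatment).
published = [(-0.50, 0.10), (-0.40, 0.20)]
gray = [(-0.10, 0.20)]

est_published, ci_published = pooled_effect(published)
est_all, ci_all = pooled_effect(published + gray)
print(f"published only: {est_published:.3f}")
print(f"published + gray: {est_all:.3f}")
```

In this made-up example the estimate based on published trials alone is further from zero than the estimate that also includes the gray trial, which is the direction of effect Hopewell et al. describe; real meta-analyses often use random-effects models instead.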

Morrison et al. 24 examined five studies measuring the effect of excluding non‐English language RCTs on the summary treatment effects of SR‐based MA in various fields of conventional medicine. Although none of the included studies reported a major difference in the treatment effect estimates between English‐only and non‐English inclusive MA, the review found inconsistent evidence regarding the methodological and reporting quality of English and non‐English trials. 24 As such, there might be a risk of introducing “language bias” when excluding non‐English language RCTs. The authors also noted that the numbers of non‐English trials vary across medical specialties, as does the impact of these trials on MA results. Based on these findings, Morrison et al. 24 conclude that literature searches must include non‐English studies when resources and time are available, to minimize the risk of introducing “language bias.”

3.3.2. Methods for searching studies

Crumley et al. 19 analyzed recall (also referred to as “sensitivity” by some researchers; defined as “percentage of relevant studies identified by the search”) and precision (defined as “percentage of studies identified by the search that were relevant”) when searching a single resource to identify randomized controlled trials and controlled clinical trials, as opposed to searching multiple resources. The studies included in their review frequently compared a MEDLINE only search with the search involving a combination of other resources. The review found low median recall estimates (median values between 24% and 92%) and very low median precisions (median values between 0% and 49%) for most of the electronic databases when searched singularly. 19 A between‐database comparison, based on the type of search strategy used, showed better recall and precision for complex and Cochrane Highly Sensitive search strategies (CHSSS). In conclusion, the authors emphasize that literature searches for trials in SRs must include multiple sources. 19
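
The two definitions quoted above translate directly into code. The sketch below computes recall and precision for a search, given the set of records it retrieved and the set of records known to be relevant; the function name `recall_precision` and the record identifiers are hypothetical, and the numbers do not come from Crumley et al.

```python
def recall_precision(retrieved, relevant):
    """Return (recall %, precision %) for one search.

    recall    = share of relevant studies that the search identified
    precision = share of identified studies that were relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    recall = 100 * len(hits) / len(relevant) if relevant else 0.0
    precision = 100 * len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical record IDs, for illustration only.
relevant_trials = {"t1", "t2", "t3", "t4"}
single_database = {"t1", "t2", "x1", "x2", "x3", "x4"}
multiple_sources = single_database | {"t3", "t4", "x5"}

for label, records in [("single database", single_database),
                       ("multiple sources", multiple_sources)]:
    recall, precision = recall_precision(records, relevant_trials)
    print(f"{label}: recall {recall:.0f}%, precision {precision:.0f}%")
```

In this toy example, adding sources raises recall because the extra sources contribute trials the single database missed; whether precision rises or falls depends on how many irrelevant records the added sources bring in.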

In an SR comparing handsearching and electronic database searching, Hopewell et al. 20 found that handsearching retrieved more relevant RCTs (retrieval rate of 92%–100%) than searching in a single electronic database (retrieval rates of 67% for PsycINFO/PsycLIT, 55% for MEDLINE, and 49% for Embase). The retrieval rates varied depending on the quality of handsearching, type of electronic search strategy used (e.g., simple, complex or CHSSS), and type of trial reports searched (e.g., full reports, conference abstracts, etc.). The authors concluded that handsearching was particularly important in identifying full trials published in nonindexed journals and in languages other than English, as well as those published as abstracts and letters. 20

The effectiveness of checking reference lists to retrieve additional relevant studies for an SR was investigated by Horsley et al. 22 The review reported that checking reference lists yielded 2.5%–40% more studies depending on the quality and comprehensiveness of the electronic search used. The authors conclude that there is some evidence, although from poor quality studies, to support use of checking reference lists to supplement database searching. 22

3.3.3. Methods for selecting studies

Three approaches relevant to reviewer characteristics, including the number, experience, and blinding of reviewers involved in the screening process, were highlighted in an SR by Robson et al. 14 Based on the retrieved evidence, the authors recommended that two independent, experienced, and unblinded reviewers be involved in study selection. 14 The review authors also suggested a modified approach for when resources are limited, in which one reviewer screens and the other reviewer verifies the list of excluded studies. It should be noted, however, that this suggestion is likely based on the authors’ opinion, as there was no evidence related to this from the studies included in the review.

Robson et al. 14 also reported two methods describing the use of technology for screening studies: use of Google Translate for translating languages (for example, German language articles to English) to facilitate screening was considered a viable method, while using two computer monitors for screening did not increase the screening efficiency in SR. Title‐first screening showed no substantial gain in time when compared with simultaneous screening of titles and abstracts. Therefore, considering that the search results are routinely exported as titles and abstracts, Robson et al. 14 recommend screening titles and abstracts simultaneously. However, the authors note that these conclusions were based on a very limited number (in most instances one study per method) of low‐quality studies. 14

3.3.4. Methods for data extraction

Robson et al. 14 examined three approaches for data extraction relevant to reviewer characteristics, including number, experience, and blinding of reviewers (similar to the study selection step). Although based on limited evidence from a small number of studies, the authors recommended use of two experienced and unblinded reviewers for data extraction. The experience of the reviewers was suggested to be especially important when extracting continuous outcomes (or quantitative) data. However, when the resources are limited, data extraction by one reviewer and a verification of the outcomes data by a second reviewer was recommended.

As for the methods involving use of technology, Robson et al. 14 identified limited evidence on the use of two monitors to improve the data extraction efficiency and computer‐assisted programs for graphical data extraction. However, use of Google Translate for data extraction in non‐English articles was not considered to be viable. 14 In the same review, Robson et al. 14 identified evidence supporting contacting authors for obtaining additional relevant data.

3.3.5. Methods for RoB assessment

Two SRs examined the impact of blinding of reviewers for RoB assessments. 14, 23 Morissette et al. 23 investigated the mean differences between the blinded and unblinded RoB assessment scores and found inconsistent differences among the included studies, which allowed no definitive conclusions. Similar conclusions were drawn in a more recent review by Robson et al., 14 which included four studies on reviewer blinding for RoB assessment that completely overlapped with Morissette et al. 23

Use of experienced reviewers and provision of additional guidance for RoB assessment were examined by Robson et al. 14 The review concluded that providing the reviewers with intensive training and guidance on assessing studies that report insufficient data improves RoB assessments. 14 Obtaining additional data related to quality assessment by contacting study authors was also found to help the RoB assessments, although based on limited evidence. When assessing qualitative or mixed method reviews, Robson et al. 14 recommend the use of a structured RoB tool as opposed to an unstructured tool. No SRs were identified on the data synthesis and CoE assessment and reporting steps.

4. DISCUSSION

4.1. Summary of findings

Nine SRs examining 24 unique methods used across five steps in the SR process were identified in this overview. The collective evidence supports some current traditional and modified SR practices, while challenging other approaches. However, the quality of the included reviews was assessed to be moderate at best and in the majority of the included SRs, evidence related to the evaluated methods was obtained from very limited numbers of primary studies. As such, the interpretations from these SRs should be made cautiously.

The evidence gathered from the included SRs corroborates a few current SR approaches. 5 For example, it is important to search multiple resources for identifying relevant trials (RCTs and/or CCTs). The resources must include a combination of electronic database searching, handsearching, and reference lists of retrieved articles. 5 However, no SRs have been identified that evaluated the impact of the number of electronic databases searched. A recent study by Halladay et al. 27 found that articles on therapeutic intervention, retrieved by searching databases other than PubMed (including Embase), contributed only a small amount of information to the MA and also had a minimal impact on the MA results. The authors concluded that when resources are limited and when a large number of studies is expected to be retrieved for the SR or MA, a PubMed‐only search can yield reliable results. 27

Findings from the included SRs also reiterate some methodological modifications currently employed to “expedite” the SR process. 10, 11 For example, excluding non‐English language trials and gray/unpublished trials from MA has been shown to have minimal or no impact on the results of MA. 24, 26 However, the efficiency of these SR methods, in terms of time and the resources used, has not been evaluated in the included SRs. 24, 26 Of the SRs included, only two have focused on the aspect of efficiency 14, 25; O'Mara‐Eves et al. 25 report some evidence to support the use of text‐mining approaches for title and abstract screening in order to increase the rate of screening. Moreover, only one included SR 14 considered primary studies that evaluated reliability (inter‐ or intra‐reviewer consistency) and accuracy (validity when compared against a “gold standard” method) of the SR methods. This can be attributed to the limited number of primary studies that evaluated these outcomes when evaluating the SR methods. 14 Lack of outcome measures related to reliability, accuracy, and efficiency precludes making definitive recommendations on the use of these methods/modifications. Future research studies must focus on these outcomes.
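
As an illustration of how inter‐reviewer consistency can be quantified, the sketch below computes Cohen's kappa for two reviewers' screening decisions. Cohen's kappa is one commonly used agreement statistic; the overview does not state which statistic the primary studies used, so this is an assumption, and the function name `cohens_kappa` and the decisions shown are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must judge the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical screening decisions for four citations, for illustration only.
reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude"]
print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")
```

A kappa of 1 indicates perfect agreement and a kappa of 0 indicates agreement no better than chance, so the statistic is more informative than raw percentage agreement when most citations are excluded.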

Some evaluated methods may be relevant to multiple steps; for example, exclusions based on publication status (gray/unpublished literature) and language of publication (non‐English language studies) can be outlined in the a priori eligibility criteria or can be incorporated as search limits in the search strategy. SRs included in this overview focused on the effect of study exclusions on pooled treatment effect estimates or MA conclusions. Excluding studies from the search results, after conducting a comprehensive search, based on different eligibility criteria may yield different results when compared to the results obtained when limiting the search itself. 28 Further studies are required to examine this aspect.

Although we acknowledge the lack of standardized quality assessment tools for methodological study designs, we adhered to the Cochrane criteria for identifying SRs in this overview. This was done to ensure consistency in the quality of the included evidence. As a result, we excluded three reviews that did not provide any form of discussion on the quality of the included studies. The methods investigated in these reviews concern supplementary search, 29 data extraction, 12 and screening. 13 However, methods reported in two of these three reviews, by Mathes et al. 12 and Waffenschmidt et al., 13 have also been examined in the SR by Robson et al., 14 which was included in this overview; in most instances (with the exception of one study included in Mathes et al. 12 and Waffenschmidt et al. 13 each), the studies examined in these excluded reviews overlapped with those in the SR by Robson et al. 14

One of the key knowledge gaps observed in this overview was the dearth of SRs on the methods used in the data synthesis component of SR. Narrative and quantitative syntheses are the two most commonly used approaches for synthesizing data in evidence synthesis. 5 There are some published studies on the proposed indications and implications of these two approaches. 30, 31 These studies found that both data synthesis methods produced comparable results and have their own advantages, suggesting that the choice of the method must be based on the purpose of the review. 31 With the increasing number of “expedited” SR approaches (so‐called “rapid reviews”) avoiding MA, 10, 11 further research studies are warranted in this area to determine the impact of the type of data synthesis on the results of the SR.

4.2. Implications for future research

The findings of this overview highlight several areas of paucity in primary research and evidence synthesis on SR methods. First, no SRs were identified on methods used in two important components of the SR process, namely data synthesis and CoE assessment and reporting. As for the included SRs, a limited number of evaluation studies have been identified for several methods. This indicates that further research is required to corroborate many of the methods recommended in current SR guidelines. 4, 5, 6, 7 Second, some SRs evaluated the impact of methods on the results of quantitative synthesis and MA conclusions. Future research studies must also focus on the interpretations of SR results. 28, 32 Finally, most of the included SRs were conducted on specific topics related to the field of health care, limiting the generalizability of the findings to other areas. It is important that future research studies evaluating evidence syntheses broaden the objectives and include studies on different topics within the field of health care.

4.3. Strengths and limitations

To our knowledge, this is the first overview summarizing current evidence from SRs and MA on different methodological approaches used in several fundamental steps in SR conduct. The overview methodology followed well‐established guidelines and strict criteria defined for the inclusion of SRs.

There are several limitations related to the nature of the included reviews. Evidence for most of the methods investigated in the included reviews was derived from a limited number of primary studies. Also, the majority of the included SRs may be considered outdated as they were published (or last updated) more than 5 years ago 33 ; only three of the nine SRs have been published in the last 5 years. 14 , 25 , 26 Therefore, important and recent evidence related to these topics may not have been included. Substantial numbers of included SRs were conducted in the field of health, which may limit the generalizability of the findings. Some method evaluations in the included SRs focused on quantitative analyses components and MA conclusions only. As such, the applicability of these findings to SR more broadly is still unclear. 28 Considering the methodological nature of our overview, limiting the inclusion of SRs according to the Cochrane criteria might have resulted in missing some relevant evidence from those reviews without a quality assessment component. 12 , 13 , 29 Although the included SRs performed some form of quality appraisal of the included studies, most of them did not use a standardized RoB tool, which may impact the confidence in their conclusions. Due to the type of outcome measures used for the method evaluations in the primary studies and the included SRs, some of the identified methods have not been validated against a reference standard.

Some limitations in the overview process must be noted. While our literature search was exhaustive, covering five bibliographic databases and a supplementary search of reference lists, no gray sources or other evidence resources were searched. Also, the search was primarily conducted in health databases, which might have resulted in missing SRs published in other fields. Moreover, only English language SRs were included for feasibility. As the literature search retrieved a large number of citations (i.e., 41,556), the title and abstract screening was performed by a single reviewer, calibrated for consistency in the screening process by another reviewer, owing to time and resource limitations. These might have potentially resulted in some errors when retrieving and selecting relevant SRs. The SR methods were grouped based on key elements of each recommended SR step, as agreed by the authors. This categorization pertains to the identified set of methods and should be considered subjective.

5. CONCLUSIONS

This overview identified limited SR‐level evidence on various methodological approaches currently employed during five of the seven fundamental steps in the SR process. Limited evidence was also identified on some methodological modifications currently used to expedite the SR process. Overall, findings highlight the dearth of SRs on SR methodologies, warranting further work to confirm several current recommendations on conventional and expedited SR processes.

CONFLICT OF INTEREST

The authors declare no conflicts of interest.

Supporting information

APPENDIX A: Detailed search strategies

ACKNOWLEDGMENTS

The first author is supported by a La Trobe University Full Fee Research Scholarship and a Graduate Research Scholarship.

Open Access Funding provided by La Trobe University.

Veginadu P, Calache H, Gussy M, Pandian A, Masood M. An overview of methodological approaches in systematic reviews. J Evid Based Med. 2022;15:39–54. doi: 10.1111/jebm.12468

Open Access

Peer-reviewed

Research Article

Functional connectivity changes in the brain of adolescents with internet addiction: A systematic literature review of imaging studies

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

Affiliation Child and Adolescent Mental Health, Department of Brain Sciences, Great Ormond Street Institute of Child Health, University College London, London, United Kingdom

Roles Conceptualization, Supervision, Validation, Writing – review & editing

* E-mail: [email protected]

Affiliation Behavioural Brain Sciences Unit, Population Policy Practice Programme, Great Ormond Street Institute of Child Health, University College London, London, United Kingdom

  • Max L. Y. Chang, 
  • Irene O. Lee

PLOS

  • Published: June 4, 2024
  • https://doi.org/10.1371/journal.pmen.0000022

Fig 1

Internet usage has seen a stark global rise over the last few decades, particularly among adolescents and young people, who have also been diagnosed increasingly with internet addiction (IA). IA impacts several neural networks that influence an adolescent’s behaviour and development. This article presents a literature review of resting-state and task-based functional magnetic resonance imaging (fMRI) studies to inspect the consequences of IA on the functional connectivity (FC) in the adolescent brain and its subsequent effects on their behaviour and development. A systematic search of two databases, PubMed and PsycINFO, was conducted to select eligible articles according to the inclusion and exclusion criteria. Eligibility criteria were especially stringent regarding the adolescent age range (10–19) and formal diagnosis of IA. Bias and quality of individual studies were evaluated. The fMRI results from 12 articles demonstrated that the effects of IA were seen throughout multiple neural networks: a mix of increases/decreases in FC in the default mode network; an overall decrease in FC in the executive control network; and no clear increase or decrease in FC within the salience network and reward pathway. The FC changes led to addictive behaviour and tendencies in adolescents. The subsequent behavioural changes are associated with the mechanisms relating to the areas of cognitive control, reward valuation, motor coordination, and the developing adolescent brain. Our results presented the FC alterations in numerous brain regions of adolescents with IA leading to the behavioural and developmental changes. Research on this topic has rarely used adolescent samples and has primarily been produced in Asian countries. Future studies comparing results from Western adolescent samples could provide more insight on therapeutic intervention.

Citation: Chang MLY, Lee IO (2024) Functional connectivity changes in the brain of adolescents with internet addiction: A systematic literature review of imaging studies. PLOS Ment Health 1(1): e0000022. https://doi.org/10.1371/journal.pmen.0000022

Editor: Kizito Omona, Uganda Martyrs University, UGANDA

Received: December 29, 2023; Accepted: March 18, 2024; Published: June 4, 2024

Copyright: © 2024 Chang, Lee. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The behavioural addiction brought on by excessive internet use has become a rising source of concern over the last decade [ 1 ]. According to clinical studies, individuals with Internet Addiction (IA) or Internet Gaming Disorder (IGD) may experience a range of biopsychosocial effects, and the condition is classified as an impulse-control disorder owing to its resemblance to pathological gambling and substance addiction [ 2 , 3 ]. IA has been defined by researchers as a person’s inability to resist the urge to use the internet, which has negative effects on their psychological well-being as well as their social, academic, and professional lives [ 4 ]. The symptoms can have serious physical and interpersonal repercussions and are linked to mood modification, salience, tolerance, impulsivity, and conflict [ 5 ]. In severe cases, people may experience severe bodily pain or health issues such as carpal tunnel syndrome, dry eyes, irregular eating and disrupted sleep [ 6 ]. Additionally, IA is significantly linked to comorbidities with other psychiatric disorders [ 7 ].

Stevens et al. (2021) reviewed 53 studies covering 17 countries and reported that the global prevalence of IA was 3.05% [ 8 ]. Asian countries had a higher prevalence (5.1%) than European countries (2.7%) [ 8 ]. Strikingly, adolescents and young adults had a global IGD prevalence rate of 9.9%, which matches previous literature that reported historically higher prevalence among adolescent populations compared to adults [ 8 , 9 ]. Over 80% of the adolescent population in the UK, the USA, and Asia have direct access to the internet [ 10 ]. Children and adolescents frequently spend more time on media (possibly 7 hours and 22 minutes per day) than at school or sleeping [ 11 ]. Developing nations have also shown a sharp rise in teenage internet usage despite having lower internet penetration rates [ 10 ]. This surge, especially given the significant impact of the COVID-19 pandemic, has raised concerns regarding the possible harms that overt internet use could do to adolescents and their development [ 12 ]. The growing prevalence and neurocognitive consequences of IA among adolescents make this population a vital area of study [ 13 ].

Adolescence is a crucial developmental stage during which people go through significant changes in their biology, cognition, and personalities [ 14 ]. Adolescents’ emotional-behavioural functioning is hyperactivated, which creates risk of psychopathological vulnerability [ 15 ]. In accordance with clinical study results [ 16 ], this emotional hyperactivity is supported by a high level of neuronal plasticity. This plasticity enables teenagers to adapt to the numerous physical and emotional changes that occur during puberty as well as develop communication techniques and gain independence [ 16 ]. However, the strong neuronal plasticity is also associated with risk-taking and sensation seeking [ 17 ] which may lead to IA.

Although the precise neuronal mechanisms underlying IA are still largely unclear, functional magnetic resonance imaging (fMRI) has been used by scientists as an important framework to examine the neuropathological changes occurring in IA, particularly in the form of functional connectivity (FC) [ 18 ]. fMRI research has shown that IA alters both the functional and structural makeup of the brain [ 3 ].

We hypothesise that IA has widespread neurological effects rather than being limited to a few specific brain regions. We further hypothesise that adolescents with IA experience behavioural changes as a result of these alterations of FC between brain regions or within certain neural networks. An investigation of these domains could be useful for creating better procedures and standards as well as minimising the negative effects of overt internet use. This literature review aims to summarise and analyse the evidence from imaging studies that have investigated the effects of IA on FC in adolescents. This will be addressed through two research questions:

  • How does internet addiction affect the functional connectivity in the adolescent brain?
  • How are adolescent behaviour and development impacted by functional connectivity changes due to internet addiction?

The review protocol was conducted in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (see S1 Checklist ).

Search strategy and selection process

A systematic search of two databases, PubMed and PsycINFO, was conducted up until April 2023 using a range of terms relevant to the title and research questions (see the full list of search terms in S1 Appendix ). All the retrieved articles can be accessed in the S1 Data . Eligible articles were selected according to the inclusion and exclusion criteria. The inclusion criteria for the present review were: (i) participants with a clinical diagnosis of IA; (ii) participants between the ages of 10 and 19; (iii) imaging research investigations; (iv) works published between January 2013 and April 2023; (v) written in English; (vi) peer-reviewed papers; and (vii) full text available. The numbers of articles excluded for not meeting the inclusion criteria are shown in Fig 1 . Each study’s title and abstract were screened for eligibility.
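The seven inclusion criteria can be read as a record-level filter. The following is a minimal sketch under stated assumptions, not the procedure the authors used: the `Record` fields and the `meets_inclusion_criteria` helper are hypothetical names, and criterion (ii) is interpreted here as requiring the whole sample age range to fall within 10 to 19.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical record structure; all field names are assumptions for illustration.
@dataclass
class Record:
    clinical_ia_diagnosis: bool
    min_age: int
    max_age: int
    is_imaging_study: bool
    published: date
    language: str
    peer_reviewed: bool
    full_text_available: bool


WINDOW_START = date(2013, 1, 1)  # criterion (iv): January 2013
WINDOW_END = date(2023, 4, 30)   # criterion (iv): April 2023


def meets_inclusion_criteria(r: Record) -> bool:
    """Return True only if all seven inclusion criteria hold."""
    return (
        r.clinical_ia_diagnosis                         # (i) clinical diagnosis of IA
        and 10 <= r.min_age and r.max_age <= 19         # (ii) sample aged 10 to 19 (assumed reading)
        and r.is_imaging_study                          # (iii) imaging investigation
        and WINDOW_START <= r.published <= WINDOW_END   # (iv) publication window
        and r.language.lower() == "english"             # (v) written in English
        and r.peer_reviewed                             # (vi) peer-reviewed
        and r.full_text_available                       # (vii) full text
    )
```

A record fails as soon as any single criterion is not met, which mirrors how the exclusion reasons are tallied in Fig 1.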

https://doi.org/10.1371/journal.pmen.0000022.g001

Quality appraisal

Full texts of all potentially relevant studies were then retrieved and further appraised for eligibility. Furthermore, articles were critically appraised using the GRADE (Grading of Recommendations, Assessment, Development, and Evaluations) framework to evaluate each individual study for both quality and bias. Each article was then assigned a quality level of low, moderate, or high.

Data collection process

Data that satisfied the inclusion requirements were entered into an Excel sheet for data extraction and further selection. Each article’s author, publication year, country, age range, participant sample size, sex, area of interest, measures, outcome and quality were all included in the data extraction spreadsheet. Studies looking at FC, for instance, were grouped, while studies looking at FC in a specific area were further divided into sub-groups.

Data synthesis and analysis

Articles were classified according to the brain location they examined as well as the network or pathway that location was part of, to create a coherent narrative across the selected studies. Conclusions concerning research trends relevant to particular groupings were drawn from these groupings and subgroupings. To keep this information readily accessible, these conclusions were entered into the data extraction Excel spreadsheet.

The search of the selected databases identified 238 articles in total (see Fig 1 ). Fifteen duplicate articles were eliminated, and another 6 items were removed for various other reasons. Title and abstract screening eliminated 184 articles because they were not in English (number of articles, n = 7), did not include imaging components (n = 47), had adult participants (n = 53), did not have a clinical diagnosis of IA (n = 19), did not address FC in the brain (n = 20), or were published outside the desired timeframe (n = 38). The remaining 33 articles underwent full-text eligibility screening, and a further 21 papers were eliminated for failing to meet inclusion requirements. A total of 12 papers were deemed eligible for this review.
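The selection counts reported above can be checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are ours, and the intermediate figure of 217 screened records is derived rather than reported.

```python
identified = 238
duplicates_removed = 15
removed_other_reasons = 6

# Title/abstract exclusions by reason, as reported in the text.
screening_exclusions = {
    "not in English": 7,
    "no imaging component": 47,
    "adult participants": 53,
    "no clinical diagnosis of IA": 19,
    "did not address FC": 20,
    "outside publication timeframe": 38,
}
full_text_exclusions = 21

# Walk the flow from identification to inclusion.
screened = identified - duplicates_removed - removed_other_reasons
excluded_at_screening = sum(screening_exclusions.values())
full_text_assessed = screened - excluded_at_screening
included = full_text_assessed - full_text_exclusions

print(screened, excluded_at_screening, full_text_assessed, included)
# prints: 217 184 33 12
```

The six exclusion reasons sum to the reported 184, and the flow ends at the 12 included papers, so the reported counts are internally consistent.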

Characteristics of the included studies, as depicted in the data extraction sheet in Table 1 , comprise the author(s), publication year, sample size, study location, age range, gender, area of interest, outcome, measures used and quality appraisal. Most of the studies in this review utilised resting-state functional magnetic resonance imaging techniques (n = 7), with several studies using task-based fMRI procedures (n = 3), and the remaining studies utilising whole-brain imaging measures (n = 2). The studies were all conducted in Asian countries, specifically China (n = 8), Korea (n = 3), and Indonesia (n = 1). Sample sizes ranged from 12 to 31 participants, with most of the imaging studies having comparable sample sizes. The majority of the studies included a mix of male and female participants (n = 8), with several studies having a male-only participant pool (n = 3). All except one of the mixed-gender studies had a majority male participant pool. One study did not disclose the gender demographics of its sample. Publication years ranged from 2013 to 2022, with 2 studies in 2013, 3 studies in 2014, 3 studies in 2015, 1 study in 2017, 1 study in 2020, 1 study in 2021, and 1 study in 2022.
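Each breakdown of the included studies should account for all 12 papers. The short sketch below, which uses only the counts given in the text (the dictionary labels are ours), checks that every breakdown sums to the total.

```python
TOTAL_STUDIES = 12

# Counts as reported in the text; the grouping labels are illustrative.
breakdowns = {
    "imaging approach": {"resting-state fMRI": 7, "task-based fMRI": 3, "whole-brain measures": 2},
    "country": {"China": 8, "Korea": 3, "Indonesia": 1},
    "participant sex": {"mixed": 8, "male only": 3, "not disclosed": 1},
    "publication year": {2013: 2, 2014: 3, 2015: 3, 2017: 1, 2020: 1, 2021: 1, 2022: 1},
}

for name, counts in breakdowns.items():
    total = sum(counts.values())
    status = "ok" if total == TOTAL_STUDIES else "mismatch"
    print(f"{name}: {total} ({status})")
```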

https://doi.org/10.1371/journal.pmen.0000022.t001

(1) How does internet addiction affect the functional connectivity in the adolescent brain?

The included studies were organised according to the brain region or network that they were observing. The specific networks affected by IA were the default mode network, executive control system, salience network and reward pathway. These networks are vital components of adolescent behaviour and development [ 31 ]. The studies in each section were then grouped into subsections according to their specific brain regions within their network.

Default mode network (DMN)/reward network.

Out of the 12 studies, 3 specifically studied the default mode network (DMN), and 3 observed whole-brain FC that partially included components of the DMN. The effect of IA on the various centres of the DMN was not uniform. The findings illustrate a complex mix of increases and decreases in FC depending on the specific region in the DMN (see Table 2 and Fig 2 ). The posterior cingulate cortex (PCC), a DMN region involved in attentional processes [ 32 ], was the most frequently reported site of altered FC in adolescents with IA, but Lee et al. (2020) additionally found alterations of FC in other brain regions, such as the anterior insula cortex, a node in the DMN that controls the integration of motivational and cognitive processes [ 20 ].

https://doi.org/10.1371/journal.pmen.0000022.g002

The overall changes of functional connectivity in the brain network including default mode network (DMN), executive control network (ECN), salience network (SN) and reward network. IA = Internet Addiction, FC = Functional Connectivity.

https://doi.org/10.1371/journal.pmen.0000022.t002

Ding et al. (2013) revealed altered FC in the cerebellum, the middle temporal gyrus, and the medial prefrontal cortex (mPFC) [ 22 ]. They found that the bilateral inferior parietal lobule, left superior parietal lobule, and right inferior temporal gyrus had decreased FC, while the bilateral posterior lobe of the cerebellum and the middle temporal gyrus had increased FC [ 22 ]. The right middle temporal gyrus was found to have 111 cluster voxels (t = 3.52, p<0.05) and the right inferior parietal lobule was found to have 324 cluster voxels (t = -4.07, p<0.05), with an extent threshold of 54 voxels (clusters above this threshold are deemed significant) [ 22 ]. Additionally, there was a negative correlation, with 95 cluster voxels (p<0.05), between the FC of the left superior parietal lobule with the PCC and scores on the Chen Internet Addiction Scale (CIAS), which is used to determine the severity of IA [ 22 ]. On the other hand, in regions of the reward system, connectivity with the PCC was positively correlated with CIAS scores [ 22 ]. The most significant was the right praecuneus with 219 cluster voxels (p<0.05) [ 22 ]. Wang et al. (2017) also discovered that adolescents with IA had 33% less FC in the left inferior parietal lobule and 20% less FC in the dorsal mPFC [ 24 ]. A potential connection between the effects of substance use and overt internet use is revealed by the generally decreased FC in these areas of the DMN in teenagers with drug addiction and in those with IA [ 35 ].
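The extent-threshold rule mentioned for the Ding et al. (2013) clusters can be illustrated with a few lines of code. This is a sketch only: the cluster sizes, t-values and threshold are taken from the text, while the `describe` helper and the reading of a positive t-value as increased FC in the IA group (consistent with the directions reported above) are our assumptions.

```python
# Cluster sizes and t-values as reported for Ding et al. (2013) in the text.
EXTENT_THRESHOLD = 54  # voxels

clusters = [
    {"region": "right middle temporal gyrus", "voxels": 111, "t": 3.52},
    {"region": "right inferior parietal lobule", "voxels": 324, "t": -4.07},
]


def describe(cluster, threshold=EXTENT_THRESHOLD):
    """Return (exceeds extent threshold, direction of FC change).

    Assumes positive t means higher FC in the IA group than in controls.
    """
    survives = cluster["voxels"] > threshold
    direction = "increased FC" if cluster["t"] > 0 else "decreased FC"
    return survives, direction


for c in clusters:
    survives, direction = describe(c)
    print(c["region"], survives, direction)
```

Both reported clusters exceed the 54-voxel extent threshold, and the signs of their t-values match the increased and decreased FC described for the two regions.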

The putamen was one of the main regions of reduced FC in adolescents with IA [ 19 ]. The putamen and the insula-operculum demonstrated significant group differences regarding functional connectivity with a cluster size of 251 and an extent threshold of 250 (Z = 3.40, p<0.05) [ 19 ]. The molecular mechanisms behind addiction disorders have been intimately connected to decreased striatal dopaminergic function [ 19 ], making this function crucial.

Executive Control Network (ECN).

Five of the 12 studies specifically examined parts of the executive control network (ECN), and 3 studies observed whole-brain FC. The effects of IA on the ECN’s constituent parts were consistent across all the studies examined for this analysis (see Table 2 and Fig 3 ). The results showed a notable decline in FC in all the ECN’s major centres. Li et al. (2014) used fMRI and a behavioural task to study response inhibition in adolescents with IA and found decreased activation at the striatum and frontal gyrus, particularly a reduction in FC at the inferior frontal gyrus, in the IA group compared to controls [ 25 ]. The reduction in FC at the inferior frontal gyrus had a cluster size of 71 (t = 4.18, p<0.05) [ 25 ]. In addition, the frontal-basal ganglia pathways in the adolescents with IA showed little effective connection between areas and increased degrees of response inhibition [ 25 ].

https://doi.org/10.1371/journal.pmen.0000022.g003

Lin et al. (2015) found that adolescents with IA demonstrated disrupted corticostriatal FC compared to controls [ 33 ]. The corticostriatal circuitry experienced decreased connectivity with the caudate, bilateral anterior cingulate cortex (ACC), as well as the striatum and frontal gyrus [ 33 ]. The inferior ventral striatum showed significantly reduced FC with the subcallosal ACC and caudate head with cluster size of 101 (t = -4.64, p<0.05) [ 33 ]. Decreased FC in the caudate implies dysfunction of the corticostriatal-limbic circuitry involved in cognitive and emotional control [ 36 ]. The decrease in FC in both the striatum and frontal gyrus is related to inhibitory control, a common deficit seen with disruptions with the ECN [ 33 ].

The dorsolateral prefrontal cortex (DLPFC), ACC, and right supplementary motor area (SMA) of the prefrontal cortex were all found to have significantly decreased grey matter volume [ 29 ]. In addition, the DLPFC, insula, temporal cortices, as well as significant subcortical regions like the striatum and thalamus, showed decreased FC [ 29 ]. According to Tremblay (2009), the striatum plays a significant role in the processing of rewards, decision-making, and motivation [ 37 ]. Chen et al. (2020) reported that the IA group demonstrated increased impulsivity as well as decreased reaction inhibition using a Stroop colour-word task [ 26 ]. Furthermore, Chen et al. (2020) observed that the left DLPFC and dorsal striatum experienced a negative connection efficiency value, specifically demonstrating that the dorsal striatum activity suppressed the left DLPFC [ 27 ].

Salience network (SN).

Out of the 12 chosen studies, 3 specifically looked at the salience network (SN) and 3 observed whole-brain FC. Relative to the DMN and ECN, the findings on the SN were slightly sparser. Despite this, adolescents with IA demonstrated a moderate decrease in FC, as well as in other measures like fibre connectivity and cognitive control, when compared to healthy controls (see Table 2 and Fig 4 ).

https://doi.org/10.1371/journal.pmen.0000022.g004

Xing et al. (2014) used both the dorsal anterior cingulate cortex (dACC) and insula to test FC changes in the SN of adolescents with IA and found decreased structural connectivity in the SN as well as decreased fractional anisotropy (FA) that correlated with behavioural performance in the Stroop colour-word task [ 21 ]. They examined the dACC and insula to determine whether the SN’s disrupted connectivity may be linked to disrupted regulation by the SN, which would explain the impaired cognitive control seen in adolescents with IA. However, the researchers did not find significant FC differences in the SN when compared to the controls [ 21 ]. These results provided evidence for structural changes in the interconnectivity within the SN in adolescents with IA.

Wang et al. (2017) investigated network interactions between the DMN, ECN, SN and reward pathway in IA subjects (see Fig 5 ) and found a 40% reduction of FC between the DMN and specific regions of the SN, such as the insula, in comparison to the controls (p = 0.008) [ 24 ]. The anterior insula and dACC are two areas that are impacted by this altered FC [ 24 ]. This finding supports the idea that IA shares neurobiological abnormalities with other addictive illnesses, which is in line with a study that discovered disruptive changes in the interaction between the SN and DMN in cocaine addiction [ 38 ]. The insula has also been linked to the intensity of symptoms and has been implicated in the development of IA [ 39 ].

“+” indicates an increase in behaviour; “-” indicates a decrease in behaviour; solid arrows indicate a direct network interaction; and dotted arrows indicate a reduction in network interaction. This diagram depicts network interactions juxtaposed with engagement in internet-related behaviours. Through the neural interactions, the diagram illustrates how the networks inhibit or amplify internet usage and vice versa. Furthermore, it demonstrates how the SN mediates both the DMN and ECN.

https://doi.org/10.1371/journal.pmen.0000022.g005

(2) How are adolescent behaviour and development impacted by functional connectivity changes due to internet addiction?

The finding that individuals with IA demonstrate an overall decrease in FC in the DMN is supported by numerous studies [ 24 ]. Populations with drug addiction also exhibited a similar decline in FC in the DMN [ 40 ]. DMN anomalies in FC were then hypothesised to cause the disruption of attentional orientation and self-referential processing in both substance and behavioural addiction [ 41 ].

In adolescents with IA, the decline of FC in the parietal lobule affects visuospatial task-related behaviour [ 22 ], short-term memory [ 42 ], and the ability to control attention or restrain motor responses during response inhibition tests [ 42 ]. Cue-induced gaming cravings are influenced by the DMN [ 43 ]. A visual processing area called the praecuneus links gaming cues to internal information [ 22 ]. A meta-analysis found that the posterior cingulate cortex activity of individuals with IA during cue-reactivity tasks was correlated with their gaming time [ 44 ], suggesting that excessive gaming may impair DMN function and that individuals with IA exert more cognitive effort to control it. Findings on the behavioural consequences of FC changes in the DMN illustrate its underlying role in regulating impulsivity, self-monitoring, and cognitive control.

Furthermore, Ding et al. (2013) reported an activation of components of the reward pathway, including areas like the nucleus accumbens, praecuneus, SMA, caudate, and thalamus, in connection to the DMN [ 22 ]. The increased FC of the limbic and reward networks has been confirmed to be a major biomarker for IA [ 45 , 46 ]. The increased reinforcement in these networks increases the strength of reward stimuli and makes it more difficult for other networks, namely the ECN, to down-regulate the increased attention [ 29 ] (see Fig 5 ).

Executive control network (ECN).

The numerous IA-affected components in the ECN have a role in a variety of behaviours that are connected to both response inhibition and emotional regulation [ 47 ]. For instance, brain regions like the striatum, which are linked to impulsivity and the reward system, are heavily involved in the act of playing online games [ 47 ]. Online game play activates the striatum, which suppresses the left DLPFC in the ECN [ 48 ]. As a result, people with IA may find it difficult to control their urge to play online games [ 48 ]. This system thus leads to impulsive and protracted gaming, and to a lack of inhibitory control that results in continued overt internet use despite a variety of negative effects, personal distress, and signs of psychological dependence [ 33 ] (see Fig 5 ).

Wang et al. (2017) report that disruptions in cognitive control networks within the ECN are frequently linked to characteristics of substance addiction [ 24 ]. With samples that were addicted to heroin and cocaine, previous studies discovered abnormal FC in the ECN and the PFC [ 49 ]. Electronic gaming is known to promote striatal dopamine release, similar to drug addiction [ 50 ]. According to Drgonova and Walther (2016), it is hypothesised that dopamine could stimulate the reward system of the striatum in the brain, leading to a loss of impulse control and a failure of prefrontal lobe executive inhibitory control [ 51 ]. In the end, IA’s resemblance to drug use disorders may point to vital biomarkers or underlying mechanisms that explain how cognitive control and impulsive behaviour are related.

A task-related fMRI study found that the decrease in FC between the left DLPFC and dorsal striatum was congruent with an increase in impulsivity in adolescents with IA [ 26 ]. The lack of response inhibition from the ECN results in a loss of control over internet usage and a reduced capacity to display goal-directed behaviour [ 33 ]. Previous studies have linked the alteration of the ECN in IA with higher cue reactivity and impaired ability to self-regulate internet specific stimuli [ 52 ].

Salience network (SN)/ other networks.

Xing et al. (2014) investigated the significance of the SN regarding cognitive control in teenagers with IA [ 21 ]. The SN, which is composed of the ACC and insula, has been demonstrated to control dynamic changes in other networks to modify cognitive performance [ 21 ]. The ACC is engaged in conflict monitoring and cognitive control, according to previous neuroimaging research [ 53 ]. The insula is a region that integrates interoceptive states into conscious feelings [ 54 ]. The results from Xing et al. (2014) showed declines in the SN regarding its structural connectivity and fractional anisotropy, even though they did not observe any appreciable change in FC in the IA participants [ 21 ]. Due to the small sample size, the results may have indicated that FC methods are not sensitive enough to detect the significant functional changes [ 21 ]. However, task performance behaviours associated with impaired cognitive control in adolescents with IA were correlated with these findings [ 21 ]. Our comprehension of the SN’s broader function in IA can be enhanced by this relationship.

Research supports the idea that different psychological issues are caused by the functional reorganisation of expansive brain networks, such that strong association between the SN and DMN may provide neurological underpinnings at the system level for the uncontrollable character of internet-using behaviours [ 24 ]. In the study by Wang et al. (2017), the decreased interconnectivity between the SN and DMN, comprising regions such as the DLPFC and the insula, suggests that adolescents with IA may struggle to effectively inhibit DMN activity during internally focused processing, leading to poorly managed desires or preoccupations to use the internet [ 24 ] (see Fig 5 ). Subsequently, this may cause a failure to inhibit DMN activity as well as a restriction of ECN functionality [ 55 ]. As a result, the adolescent experiences increased salience of and sensitivity towards internet-related cues, making it difficult to avoid these triggers [ 56 ].

The primary aim of this review was to present a summary of how internet addiction impacts the functional connectivity of the adolescent brain. The influence of IA on the adolescent brain was divided into three sections: alterations of FC at various brain regions, specific FC relationships, and behavioural/developmental changes. Overall, the specific effects of IA on the adolescent brain were not completely clear, given the variety of FC changes. However, the evidence supported overarching behavioural, network and developmental trends that provided insight into adolescent development.

The first hypothesis was that the effects of IA would be widespread and regionally similar to those of substance-use and gambling addiction. After reviewing the information in the chosen articles, the hypothesis was supported, as expected. The regions of the brain affected by IA are widespread and influence multiple networks, mainly the DMN, ECN, SN and reward pathway. In the DMN, there was a complex mix of increases and decreases in FC within the network. In the ECN, however, FC was more uniformly decreased, while the findings for the SN and reward pathway were not quite clear. Overall, the FC changes in adolescents with IA are very much network specific and lay a solid foundation from which to understand the subsequent behaviour changes that arise from the disorder.

The second hypothesis emphasised the importance of between-network and within-network interactions in the continuation of IA and the development of its behavioural symptoms. The findings involving the DMN, SN, ECN and reward system support this hypothesis (see Fig 5 ). Studies confirm the influence of all these neural networks on reward valuation, impulsivity, salience to stimuli, cue reactivity and other changes that alter behaviour towards internet use. Many of these changes are connected to the inherent nature of the adolescent brain.

There are multiple explanations that underlie the vulnerability of the adolescent brain towards IA related urges. Several of them have to do with the inherent nature and underlying mechanisms of the adolescent brain. Children’s emotional, social, and cognitive capacities grow exponentially during childhood and adolescence [ 57 ]. Early teenagers go through a process called “social reorientation” that is characterised by heightened sensitivity to social cues and peer connections [ 58 ]. Adolescents’ improvements in their social skills coincide with changes in their brains’ anatomical and functional organisation [ 59 ]. Functional hubs exhibit growing connectivity strength [ 60 ], suggesting increased functional integration during development. During this time, the brain’s functional networks change from an anatomically dominant structure to a scattered architecture [ 60 ].

The adolescent brain is very responsive to synaptic reorganisation and experience cues [ 61 ]. As a result, one of the distinguishing traits of the maturation of adolescent brains is the variation in neural network trajectory [ 62 ]. Important weaknesses of the adolescent brain that may explain the neurobiological change brought on by external stimuli are illustrated by features like the functional gaps between networks and the inadequate segregation of networks [ 62 ].

The implications of these findings for adolescent behaviour are significant. Although the exact changes and mechanisms are not fully clear, the observed changes in functional connectivity have the capacity to influence several aspects of adolescent development. For example, functional connectivity has been utilised to investigate attachment styles in adolescents [ 63 ]. It was observed that adolescent attachment styles were negatively associated with caudate-prefrontal connectivity, but positively with putamen-visual area connectivity [ 63 ]. Both named areas were also influenced by the onset of internet addiction, possibly providing a connection between the two. Another study associated neighbourhood/socioeconomic disadvantage with functional connectivity alterations in the DMN and dorsal attention network [ 64 ]. The study also found multivariate brain-behaviour relationships between the altered functional connectivity and mental health and cognition [ 64 ]. This supports the notion that the functional connectivity alterations observed in IA are associated with specific adolescent behaviours, and that functional connectivity can be utilised as a platform on which to compare various neurological conditions.

Limitations/strengths

There were several limitations related to the conduct of the review as well as the data extracted from the articles. Firstly, the study followed a systematic literature review design when analysing the fMRI studies. The data pulled from these imaging studies were mainly qualitative and were subject to bias, in contrast to the quantitative nature of statistical analysis. Components of the studies, such as sample sizes, effect sizes, and demographics, were not weighted or controlled. The second limitation, also raised by a similar review, was the lack of a universal consensus on terminology for IA [ 47 ]. Globally, authors writing about this topic use an array of terminology including online gaming addiction, internet addiction, internet gaming disorder, and problematic internet use. Often, authors use multiple terms interchangeably, which makes it difficult to depict the subtle similarities and differences between the terms.

Reviewing the explicit limitations in each of the included studies, two major limitations were brought up in many of the articles. One was relating to the cross-sectional nature of the included studies. Due to the inherent qualities of a cross-sectional study, the studies did not provide clear evidence that IA played a causal role towards the development of the adolescent brain. While several biopsychosocial factors mediate these interactions, task-based measures that combine executive functions with imaging results reinforce the assumed connection between the two that is utilised by the papers studying IA. Another limitation regarded the small sample size of the included studies, which averaged to around 20 participants. The small sample size can influence the generalisation of the results as well as the effectiveness of statistical analyses. Ultimately, both included study specific limitations illustrate the need for future studies to clarify the causal relationship between the alterations of FC and the development of IA.

Another vital limitation was that the limited number of studies applying imaging techniques to investigate IA in adolescents came uniformly from the Far East. This was because the studies included in this review were the only fMRI studies found that adhered to the strict adolescent age restriction. The adolescent age range given by the WHO (10–19 years old) [ 65 ] was strictly followed. It is important to note that many studies found in the initial search used an older adolescent demographic that slightly exceeded the WHO age range and had a mean age outside of the limits. As a result, the results of this review are biased and based on the 12 studies that met the inclusion and exclusion criteria.

Regarding the global nature of the research, although the journals in which the studies were published were all established western journals, the studies themselves all originated from Asian countries, namely China and Korea. Consequently, this calls into question whether the results and measures from these studies are generalisable to a western population. As stated previously, Asian countries have a higher prevalence of IA, which may be the reason why the majority of studies are from there [ 8 ]. However, an additional search including other age groups found that a large majority of all FC studies on IA were done in Asian countries. Interestingly, western papers studying fMRI FC were primarily focused on gambling and substance-use addiction disorders. The western papers on IA focused less on fMRI FC and more on other components of IA such as sleep, game genre, and other non-imaging-related factors. This demonstrated an overall lack of western fMRI studies on IA. It is important to note that both western and eastern fMRI studies on IA showed an overall lack of research on children and adolescents in general.

Despite these limitations, this review provided a clear reflection of the state of the data. The strengths of the review include the strict inclusion/exclusion criteria, which filtered the studies so that only those with a purely adolescent sample were included. As a result, the information presented in this review was specific to the review’s aims. Given the sparse nature of adolescent-specific fMRI studies on the FC changes in IA, this review successfully provided a much-needed niche representation of adolescent-specific results. Furthermore, the review provided a thorough functional explanation of the DMN, ECN, SN and reward pathway, making it accessible to readers new to the topic.

Future directions and implications

During the search process for this review, more imaging studies were found that focused on older adolescence and adulthood. Furthermore, finding a review that covered a strictly adolescent population, focused on FC changes, and specifically addressed IA proved difficult. Many related reviews, such as Tereshchenko and Kasparov (2019), looked at risk factors related to the biopsychosocial model, but did not tackle specific structural or functional alterations in the brain [ 66 ]. Weinstein (2017) reported similar structural and functional results, as well as the role IA has in altering response inhibition and reward valuation in adolescents with IA [ 47 ]. Overall, the accumulated findings only paint an emerging pattern, which aligns with similar substance-use and gambling disorders. Future studies require more specificity in depicting the interactions between neural networks, as well as more literature on adolescent and comorbid populations. One future field of interest is the incorporation of more task-based fMRI data. Advances in resting-state fMRI methods have yet to be reflected or confirmed in task-based fMRI methods [ 62 ]. Because network connectivity is shaped by different tasks, it is critical to confirm that the findings of the resting-state fMRI studies also apply to the task-based ones [ 62 ]. Work in this area will then confirm whether intrinsic connectivity networks that function in the resting state function similarly during goal-directed behaviour [ 62 ]. An elevated focus on adolescent populations as well as task-based fMRI methodology will help uncover to what extent adolescent network connectivity maturation facilitates behavioural and cognitive development [ 62 ].

A treatment implication is the potential use of bupropion for the treatment of IA. Bupropion has previously been used to treat patients with gambling disorder and has been effective in decreasing overall gambling behaviour as well as money spent while gambling [ 67 ]. Bae et al. (2018) found a decrease in clinical symptoms of IA in line with a 12-week bupropion treatment [ 31 ]. The study found that bupropion altered the FC of both the DMN and ECN, which in turn decreased impulsivity and attentional deficits in individuals with IA [ 31 ]. Interventions like bupropion illustrate the importance of understanding the fundamental mechanisms that underlie disorders like IA.

The goal of this review was to summarise the current literature on functional connectivity changes in adolescents with internet addiction. The findings answered the primary research questions, which concerned FC alterations within several networks of the adolescent brain and how those alterations influence adolescents' behaviour and development. Overall, the research demonstrated several wide-ranging effects that influenced the DMN, SN, ECN, and reward centres. Additionally, the findings highlighted important details such as the maturation of the adolescent brain, the high prevalence of studies originating in Asia, and the importance of task-based studies in this field. The process of making this review allowed for a thorough understanding of IA and adolescent brain interactions.

Given the influx of technology and media in the lives and education of children and adolescents, an increased focus on the prevalence of internet-related behavioural changes is imperative for the future mental health of children and adolescents. Events such as COVID-19 expose the consequences of extended internet usage for the development and lifestyle of young people in particular. While it is important for parents and older generations to be wary of these changes, it is also important for them to develop a base understanding of the issue and not dismiss it as an all-bad or all-good scenario. Future research on IA will aim to better understand the causal relationship between IA and the psychological symptoms that coincide with it. The current literature regarding functional connectivity changes in adolescents is limited and requires future studies to test with larger sample sizes, comorbid populations, and populations outside Far East Asia.

This review aimed to demonstrate the inner workings of how IA alters the connection between the primary behavioural networks in the adolescent brain. Predictably, the present answers merely paint an unfinished picture that does not necessarily depict internet usage as overwhelmingly positive or negative. Instead, the research points towards emerging patterns that can inform individuals about the consequences of certain variables or risk factors. A clearer depiction of the mechanisms of IA would allow physicians to screen for and treat the onset of IA more effectively. Clinically, this could take the form of more streamlined and accurate sessions of CBT or family therapy targeting key symptoms of IA. Alternatively, clinicians could potentially prescribe treatment such as bupropion to target FC in certain regions of the brain. Furthermore, parental education on IA is another possible avenue of prevention from a public health standpoint. Parents who are aware of the early signs and onset of IA will more effectively handle screen time and impulsivity, and minimise the risk factors surrounding IA.

Additionally, increased attention to internet-related fMRI research is needed in the West, as mentioned previously. Despite cultural differences, Western countries may hold similarities to the Eastern countries with a high prevalence of IA, like China and Korea, regarding the implications of the internet and IA. The increasing influence of the internet on the world may contribute to an overall increase in the global prevalence of IA. Nonetheless, the many Eastern studies in this field should be replicated with Western samples to determine whether the same FC alterations occur. A growing interest in internet-related research and education within the West will hopefully lead to knowledge of healthier internet habits and coping strategies among parents with children and adolescents. Furthermore, IA research has the potential to become a crucial proxy through which to study adolescent brain maturation and development.

Supporting information

S1 Checklist. PRISMA checklist.

https://doi.org/10.1371/journal.pmen.0000022.s001

S1 Appendix. Search strategies with all the terms.

https://doi.org/10.1371/journal.pmen.0000022.s002

S1 Data. Article screening records with details of categorized content.

https://doi.org/10.1371/journal.pmen.0000022.s003

Acknowledgments

The authors thank https://www.stockio.com/free-clipart/brain-01 (with attribution to Stockio.com) and https://www.rawpixel.com/image/6442258/png-sticker-vintage for the free images used to create Figs 2–4.

  • 2. American Psychiatric Association. Diagnostic and statistical manual of mental disorders: DSM-5. 5th ed. Washington, D.C.: American Psychiatric Publishing; 2013.
  • 10. Internet World Stats. World Internet Users Statistics and World Population Stats. 2013. http://www.internetworldstats.com/stats.htm
  • 11. Rideout V, Robb MB. The common sense census: media use by tweens and teens. San Francisco, CA: Common Sense Media; 2019.
  • 37. Tremblay L. The ventral striatum. In: Handbook of Reward and Decision Making. Academic Press; 2009.
  • 57. Bhana A. Middle childhood and pre-adolescence. In: Promoting mental health in scarce-resource contexts: emerging evidence and practice. Cape Town: HSRC Press; 2010. p. 124–42.
  • 65. World Health Organization. Adolescent Health. 2023. https://www.who.int/health-topics/adolescent-health#tab=tab_1

COMMENTS

  1. PDF Systematic Literature Reviews: an Introduction

    Systematic literature reviews (SRs) are a way of synthesising scientific evidence to answer a particular research question in a way that is transparent and reproducible, while seeking to include all published evidence on the topic and appraising the quality of this evidence. SRs have become a major methodology

  2. Guidance on Conducting a Systematic Literature Review

    Literature reviews establish the foundation of academic inquires. However, in the planning field, we lack rigorous systematic reviews. In this article, through a systematic search on the methodology of literature review, we categorize a typology of literature reviews, discuss steps in conducting a systematic literature review, and provide suggestions on how to enhance rigor in literature ...

  3. How-to conduct a systematic literature review: A quick guide for

    A Systematic Literature Review (SLR) is a research methodology to collect, identify, and critically analyze the available research studies (e.g., articles, conference proceedings, books, dissertations) through a systematic procedure [12]. An SLR updates the reader with current literature about a subject [6].

  4. Systematic Review

    Systematic review vs. literature review. A literature review is a type of review that uses a less systematic and formal approach than a systematic review. Typically, an expert in a topic will qualitatively summarize and evaluate previous work, without using a formal, explicit method.

  5. How to Do a Systematic Review: A Best Practice Guide for ...

    The best reviews synthesize studies to draw broad theoretical conclusions about what a literature means, linking theory to evidence and evidence to theory. This guide describes how to plan, conduct, organize, and present a systematic review of quantitative (meta-analysis) or qualitative (narrative review, meta-synthesis) information.

  6. Systematic reviews: Structure, form and content

    A systematic review collects secondary data, and is a synthesis of all available, relevant evidence which brings together all existing primary studies for review (Cochrane 2016). A systematic review differs from other types of literature review in several major ways.

  7. (PDF) Systematic Literature Reviews: An Introduction

    Systematic literature reviews (SRs) are a way of synthesising scientific evidence to answer a particular research question in a way that is transparent and reproducible, while seeking to include ...

  8. Systematic reviews: Structure, form and content

    A systematic review collects secondary data, and is a synthesis of all available, relevant evidence which brings together all existing primary studies for review (Cochrane 2016). A systematic review differs from other types of literature review in several major ways.

  9. Introduction to Systematic Reviews

    A systematic review identifies and synthesizes all relevant studies that fit prespecified criteria to answer a research question. Systematic review methods can be used to answer many types of research questions. ... Second, systematic reviews conduct a search of other literature that is outside of traditional peer-reviewed journals. Examples of ...

  10. Literature review as a research methodology: An overview and guidelines

    2.1.1. Systematic literature review. What is it and when should we use it? Systematic reviews have foremost been developed within medical science as a way to synthesize research findings in a systematic, transparent, and reproducible way and have been referred to as the gold standard among reviews (Davis et al., 2014). Despite all the advantages of this method, its use has not been overly ...

  11. Introduction to systematic review and meta-analysis

    A systematic review collects all possible studies related to a given topic and design, and reviews and analyzes their results [ 1 ]. During the systematic review process, the quality of studies is evaluated, and a statistical meta-analysis of the study results is conducted on the basis of their quality. A meta-analysis is a valid, objective ...

  12. Carrying out systematic literature reviews: an introduction

    Systematic reviews provide a synthesis of evidence for a specific topic of interest, summarising the results of multiple studies to aid in clinical decisions and resource allocation. They remain among the best forms of evidence, and reduce the bias inherent in other methods. A solid understanding of ….

  13. How to Write a Systematic Review of the Literature

    This article provides a step-by-step approach to conducting and reporting systematic literature reviews (SLRs) in the domain of healthcare design and discusses some of the key quality issues associated with SLRs. SLR, as the name implies, is a systematic way of collecting, critically evaluating, integrating, and presenting findings from across ...

  14. Systematic review

    A systematic review is a scholarly synthesis of the evidence on a clearly presented topic using critical methods to identify, define and assess research on the topic. A systematic review extracts and interprets data from published studies on the topic (in the scientific literature), then analyzes, describes, critically appraises and summarizes interpretations into a refined evidence-based ...

  15. What are systematic reviews?

    Systematic reviews are a type of literature review of research which require equivalent standards of rigour as primary research. They have a clear, logical rationale that is reported to the reader of the review. They are used in research and policymaking to inform evidence-based decisions and practice. They differ from traditional literature ...

  16. Types of Literature Reviews

    Rapid review. Assessment of what is already known about a policy or practice issue, by using systematic review methods to search and critically appraise existing research. Completeness of searching determined by time constraints. Time-limited formal quality assessment. Typically narrative and tabular.

  17. (PDF) A guide to systematic literature reviews

    The first stage in conducting a systematic review is to develop a protocol that clearly defines: 1) the aims and objectives of the review; 2) the inclusion and exclusion criteria for studies ...

  18. Description of the Systematic Literature Review Method

    A systematic literature review (SLR) is an independent academic method that aims to identify and evaluate all relevant literature on a topic in order to derive conclusions about the question under consideration. "Systematic reviews are undertaken to clarify the state of existing research and the implications that should be drawn from this."

  19. How-to conduct a systematic literature review: A quick guide for

    Abstract. Performing a literature review is a critical first step in research to understanding the state-of-the-art and identifying gaps and challenges in the field. A systematic literature review is a method which sets out a series of steps to methodically organize the review. In this paper, we present a guide designed for researchers and in ...

  20. Home

    A systematic review is a literature review that gathers all of the available evidence matching pre-specified eligibility criteria to answer a specific research question. It uses explicit, systematic methods, documented in a protocol, to minimize bias, provide reliable findings, and inform decision-making.

  21. Full article: Systematic literature reviews over the years

    Nowadays, systematic literature reviews (SLRs) and meta-analyses are often placed at the top of the evidence hierarchy, usually depicted as a pyramid, ordered by the design and risk of bias of included studies [Citation 7]. In contrast to narrative reviews, systematic reviews address a specific research question [Citation 8].

  22. How to write a systematic literature review [9 steps]

    Screen the literature. Assess the quality of the studies. Extract the data. Analyze the results. Interpret and present the results. 1. Decide on your team. When carrying out a systematic literature review, you should employ multiple reviewers in order to minimize bias and strengthen analysis.

  23. Systematic Reviews and Meta-analysis: Understanding the Best Evidence

    A systematic review is a summary of the medical literature that uses explicit and reproducible methods to systematically search, critically appraise, and synthesize on a specific issue. It synthesizes the results of multiple primary studies related to each other by using strategies that reduce biases and random errors.[ 7 ]

  24. Systematic, Scoping, and Other Literature Reviews: Overview

    A systematic review, however, is a comprehensive literature review conducted to answer a specific research question. Authors of a systematic review aim to find, code, appraise, and synthesize all of the previous research on their question in an unbiased and well-documented manner.

  25. Systematic Literature Review: Easy Guide

    Is that the same as a systematic literature review? In this case, there actually is a difference, albeit a relatively small one. The methodology for both types of reviews will be the same (whew!), but the reason for conducting one versus the other will be a bit different. Let me give you an example based on my own research.

  26. Between-hospital variation in indicators of quality of care: a

    This systematic literature review aims to synthesise the results of studies that quantify the extent to which hospitals contribute to variation in quality indicator scores. Methods Embase, Medline, Web of Science, Cochrane and Google Scholar were systematically searched from 2010 to November 2023. We included studies that reported a measure of ...

  27. The organisational impact of agility: a systematic literature review

    This paper adopts the widely used systematic review methodology in literature review studies to collect and analyse data because it is comprehensive, transparent, evidence-based, and unbiased (Khan et al. 2003; Snyder 2019; Tranfield et al. 2003). Figure 1 explains the strategy and steps taken to conduct this literature review.

  28. Title and abstract screening for literature reviews using large

    Systematically screening published literature to determine the relevant publications to synthesize in a review is a time-consuming and difficult task. Large language models (LLMs) are an emerging technology with promising capabilities for the automation of language-related tasks that may be useful for such a purpose. LLMs were used as part of an automated system to evaluate the relevance of ...

  29. An overview of methodological approaches in systematic reviews

    1. INTRODUCTION. Evidence synthesis is a prerequisite for knowledge translation. 1 A well conducted systematic review (SR), often in conjunction with meta‐analyses (MA) when appropriate, is considered the "gold standard" of methods for synthesizing evidence related to a topic of interest. 2 The central strength of an SR is the transparency of the methods used to systematically search ...

  30. Functional connectivity changes in the brain of adolescents with

    Internet usage has seen a stark global rise over the last few decades, particularly among adolescents and young people, who have also been diagnosed increasingly with internet addiction (IA). IA impacts several neural networks that influence an adolescent's behaviour and development. This article issued a literature review on the resting-state and task-based functional magnetic resonance ...