Systematic Reviews: Step 6: Assess Quality of Included Studies

Created by health science librarians.

About Step 6: Assess Quality of Included Studies

In step 6 you will evaluate the articles you included in your review for quality and bias. To do so, you will:

  • Use quality assessment tools to grade each article.
  • Create a summary of the quality of literature included in your review.
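As a rough illustration of the summary step, the sketch below tallies overall judgements across included studies. The study labels and rating categories are hypothetical placeholders, not output from any particular tool, and `summarize_ratings` is a name introduced here for illustration only.

```python
from collections import Counter

# Hypothetical example data: study label -> overall judgement assigned by the review team.
ratings = {
    "Study A": "low risk",
    "Study B": "some concerns",
    "Study C": "high risk",
    "Study D": "low risk",
}


def summarize_ratings(ratings):
    """Return (judgement, count, percent) rows, most common judgement first."""
    counts = Counter(ratings.values())
    total = len(ratings)
    return [
        (label, n, round(100 * n / total, 1))
        for label, n in counts.most_common()
    ]


for label, n, pct in summarize_ratings(ratings):
    print(f"{label}: {n} of {len(ratings)} studies ({pct}%)")
```

A table like this could sit alongside the per-study assessments in the written review; the categories should match whichever tool the team actually used.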

This page has links to quality assessment tools you can use to evaluate different study types. Librarians can help you find widely used tools to evaluate the articles in your review.

Reporting your review with PRISMA

If you reach the quality assessment step and choose to exclude articles for any reason, update the number of included and excluded studies in your PRISMA flow diagram.
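The bookkeeping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name, field names, starting counts, and exclusion reason are all hypothetical, and the PRISMA flow diagram itself is still filled in by hand or in your review software.

```python
from dataclasses import dataclass, field


@dataclass
class FlowCounts:
    """Hypothetical running totals for the included/excluded boxes of a flow diagram."""
    included: int
    excluded: int = 0
    reasons: dict = field(default_factory=dict)

    def exclude(self, n, reason):
        """Move n studies from included to excluded and record the reason."""
        if n < 0 or n > self.included:
            raise ValueError("n must be between 0 and the current included count")
        self.included -= n
        self.excluded += n
        self.reasons[reason] = self.reasons.get(reason, 0) + n


# Hypothetical numbers, for illustration only.
counts = FlowCounts(included=42, excluded=118)
counts.exclude(3, "excluded at quality assessment")
print(counts.included, counts.excluded, counts.reasons)
```

Recording the reason next to each change keeps the diagram and the written account of exclusions in step.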

Managing your review with Covidence

Covidence includes the Cochrane Risk of Bias 2.0 quality assessment template, but you can also create your own custom quality assessment template.

How a librarian can help with Step 6

A librarian can advise you on:

  • What the quality assessment or risk of bias stage of the review entails
  • How to choose an appropriate quality assessment tool
  • Best practices for reporting quality assessment results in your review

After the screening process is complete, the systematic review team must assess each article for quality and bias. There are various types of bias, some of which are outlined in the table below from the Cochrane Handbook.

The most important thing to remember when choosing a quality assessment tool is to pick one that was created and validated to assess the study design(s) of your included articles.

For example, if your inclusion criteria limit the review to randomized controlled trials (RCTs), pick a quality assessment tool specifically designed for RCTs (for example, the Cochrane Risk of Bias tool).
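The matching of tool to study design can be pictured as a simple lookup. The sketch below is illustrative only: the dictionary covers just a few of the designs and tools named on this page, the design keys are labels chosen here, and `candidate_tools` is a hypothetical helper rather than part of any review software.

```python
# Partial, illustrative mapping built from tools named on this page.
TOOLS_BY_DESIGN = {
    "randomized controlled trial": [
        "Cochrane Risk of Bias (ROB) 2.0 Tool",
        "Checklist for Randomized Controlled Trials (JBI)",
    ],
    "cohort": [
        "Newcastle-Ottawa Scale (NOS)",
        "Checklist for Cohort Studies (JBI)",
    ],
    "case-control": [
        "Newcastle-Ottawa Scale (NOS)",
        "Checklist for Case-Control Studies (JBI)",
    ],
    "diagnostic": ["QUADAS-2"],
    "systematic review": ["AMSTAR Checklist"],
}


def candidate_tools(study_designs):
    """Map each study design in the review to the tools listed for it (empty if none)."""
    return {
        design: TOOLS_BY_DESIGN.get(design.strip().lower(), [])
        for design in study_designs
    }


for design, tools in candidate_tools(["Randomized controlled trial", "Cohort"]).items():
    print(design, "->", tools)
```

An empty list for a design is a prompt to look further or ask a librarian, not a sign that no suitable tool exists.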

Once you have gathered your included studies, you will need to appraise the evidence for its relevance, reliability, validity, and applicability.

Ask questions like:

Relevance:

  • Is the research method/study design appropriate for answering the research question?
  • Are specific inclusion/exclusion criteria used?

Reliability:

  • Is the effect size practically relevant? How precise is the estimate of the effect? Were confidence intervals given?

Validity:

  • Were there enough subjects in the study to establish that the findings did not occur by chance?
  • Were subjects randomly allocated? Were the groups comparable? If not, could this have introduced bias?
  • Are the measurements/tools validated by other studies?
  • Could there be confounding factors?

Applicability:

  • Can the results be applied to my organization and my patient?

What are Quality Assessment tools?

Quality assessment tools are questionnaires created to help you assess the quality of a variety of study designs. Each questionnaire is tailored to the study design it covers and asks specific questions about the methodology of the study. There are appraisal tools for most kinds of study designs. Choose a quality assessment tool that matches the types of studies you expect to see in your results. If your review includes multiple study designs, you may wish to use several tools from one organization, such as the CASP or LEGEND tools, which cover many study designs.

Examples of quality assessment tools for each study design are listed below.

Randomized Controlled Trials (RCTs)

  • Cochrane Risk of Bias (ROB) 2.0 Tool Templates are tailored to randomized parallel-group trials, cluster-randomized parallel-group trials (including stepped-wedge designs), and randomized cross-over trials and other matched designs.
  • CASP- Randomized Controlled Trial Appraisal Tool A checklist for RCTs created by the Critical Appraisal Skills Programme (CASP)
  • The Jadad Scale A scale that assesses the quality of published clinical trials based on methods relevant to random assignment, double blinding, and the flow of patients
  • CEBM-RCT A critical appraisal tool for RCTs from the Centre for Evidence Based Medicine (CEBM)
  • Checklist for Randomized Controlled Trials (JBI) A critical appraisal checklist from the Joanna Briggs Institute (JBI)
  • Scottish Intercollegiate Guidelines Network (SIGN) Checklists for quality assessment
  • LEGEND Evidence Evaluation Tools A series of critical appraisal tools from the Cincinnati Children's Hospital. Contains tools for a wide variety of study designs, including prospective, retrospective, qualitative, and quantitative designs.

Cohort Studies

  • CASP- Cohort Studies A checklist created by the Critical Appraisal Skills Programme (CASP) to assess key criteria relevant to cohort studies
  • Checklist for Cohort Studies (JBI) A checklist for cohort studies from the Joanna Briggs Institute
  • The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses A validated tool for assessing case-control and cohort studies
  • STROBE Checklist A checklist for quality assessment of case-control, cohort, and cross-sectional studies

Case-Control Studies

  • CASP- Case Control Study A checklist created by the Critical Appraisal Skills Programme (CASP) to assess key criteria relevant to case-control studies
  • Tool to Assess Risk of Bias in Case Control Studies by the CLARITY Group at McMaster University A quality assessment tool for case-control studies from the CLARITY Group at McMaster University
  • Checklist for Case-Control Studies A checklist created by the Joanna Briggs Institute

Cross-Sectional Studies

Diagnostic Studies

  • CASP- Diagnostic Studies A checklist for diagnostic studies created by the Critical Appraisal Skills Programme (CASP)
  • QUADAS-2 A quality assessment tool developed by a team at the Bristol Medical School: Population Health Sciences at the University of Bristol
  • Critical Appraisal Checklist for Diagnostic Test Accuracy Studies (JBI) A checklist for quality assessment of diagnostic studies developed by the Joanna Briggs Institute

Economic Studies

  • Consensus Health Economic Criteria (CHEC) List 19 yes-or-no questions, one for each category to assess economic evaluations
  • CASP- Economic Evaluation A checklist for quality assessment of economic studies by the Critical Appraisal Skills Programme

Mixed Methods

  • McGill Mixed Methods Appraisal Tool (MMAT) 2018 User Guide See full site for additional information, including FAQs, references and resources, earlier versions, and more

Qualitative Studies

  • CASP- Qualitative Studies 10 questions to help assess qualitative research from the Critical Appraisal Skills Programme

Systematic Reviews and Meta-Analyses

  • Critical Appraisal Checklist for Systematic Reviews and Research Syntheses An 11-item checklist for evaluating systematic reviews
  • AMSTAR Checklist A 16-question measurement tool to assess systematic reviews
  • AHRQ Methods Guide for Effectiveness and Comparative Effectiveness Reviews A guide to selecting eligibility criteria, searching the literature, extracting data, assessing quality, and completing other steps in the creation of a systematic review
  • CASP - Systematic Review A checklist for quality assessment of systematic reviews from the Critical Appraisal Skills Programme

Clinical Practice Guidelines

  • National Guideline Clearinghouse Extent of Adherence to Trustworthy Standards (NEATS) Instrument A 15-item instrument using a scale of 1-5 to evaluate a guideline's adherence to the Institute of Medicine's standards for trustworthy guidelines
  • AGREE-II Appraisal of Guidelines for Research and Evaluation The Appraisal of Guidelines for Research and Evaluation (AGREE) Instrument evaluates the process of practice guideline development and the quality of reporting

Other Study Designs

  • NTACT Quality Checklists Quality indicator checklists for correlational studies, group experimental studies, single case research studies, and qualitative studies developed by the National Technical Assistance Center on Transition (NTACT). (Users must make an account.)

Below, you will find a sample of four popular quality assessment tools and some basic information about each. For more quality assessment tools, see the lists above, organized by study design.

More information about popular quality assessment tools

Study design: Randomized controlled trials (RCTs)
About: The Cochrane Risk of Bias 2.0 tool asks questions about five types of potential bias for individually randomized trials.

Study design: Non-randomized studies
About: The Newcastle-Ottawa scale assesses the quality of nonrandomized studies based on three broad perspectives.

Study design: Mixed methods
About: These quality assessment checklists ask 11 or 12 questions each. Available study designs include randomized controlled trials, systematic reviews, qualitative studies, cohort studies, diagnostic studies, case control studies, economic evaluations, and clinical prediction rules.

Study design: Mixed methods
About: These evidence evaluation tools ask questions across the clinical question domains of intervention, diagnosis & assessment, prognosis, etiology & risk factors, incidence, prevalence, and meaning. Available study designs include systematic review / meta analysis, meta-synthesis, randomized controlled trials, controlled clinical trials, psychometric studies, cohort-prospective / retrospective, case control, longitudinal, cross sectional, descriptive / epidemiology / case series, qualitative study, quality improvement, mixed methods, decision analysis / economic analysis / computer simulation, case report / n-of-1 study, published expert opinion, bench studies, and guidelines.

Covidence uses Cochrane Risk of Bias (which is designed for rating RCTs and cannot be used for other study types) as the default tool for quality assessment of included studies. You can opt to manually customize the quality assessment template and use a different tool better suited to your review. More information about quality assessment using Covidence, including how to customize the quality assessment template, can be found below. If you decide to customize the quality assessment template, you cannot switch back to using the Cochrane Risk of Bias template.

More Information

  • Quality Assessment on the Covidence Guide
  • Covidence FAQs on Quality Assessment Commonly asked questions about quality assessment using Covidence
  • Covidence YouTube Channel A collection of Covidence-created videos
  • Last Updated: Jul 15, 2024 4:55 PM
  • URL: https://guides.lib.unc.edu/systematic-reviews

Child Care and Early Education Research Connections

Assessing Research Quality

Using research is one method to identify effective strategies to help inform practice and policymaking decisions. Effective use of research in decision-making can help agencies and organizations:

  • Understand social and educational challenges and their root causes,
  • Make well-informed decisions about programs and policies, and
  • Leverage public and private resources effectively.

Early care and education leaders are responsible for developing policy and making practice decisions that impact providers, families, and children. Often, leaders will seek input and guidance from different sources when making these decisions. This may include assessing community needs, reviewing administrative data, understanding financial resources, and searching for research and other sources of information.

As beneficial as research can be to help shape policy and practice decisions, understanding research can feel overwhelming. The technical language used in research may make it hard to understand and prevent you from fully using it as a tool. Also, it might not be easy to determine how trustworthy or useful a research-based resource might be for your decision-making process.

This resource is designed to help you work through key questions about the relevance, credibility, and rigor of research as you consider decisions about investments in early care and education practice and policy.

How can I learn more about understanding research?

For more information on understanding research, consider exploring other tools available on the Research Connections website. 

  • Consider using this research glossary when reading research articles.
  • Use this quick reference to better understand different types of study designs.
  • Check out the Research Assessment Tools for more information on assessing the rigor of qualitative or quantitative research.
  • Use the Assessing Research Quality: Key Questions to Ask tool as you think about the usefulness of research when considering policy and practice decisions.

Contact the authors of the study if you have additional questions. Most researchers would be open to hearing from practitioners and policymakers.

Key Questions to Ask

This section provides key questions to ask when assessing the usefulness of research when considering policy and practice decisions.

Research Assessment Tools

This section provides resources related to quantitative and qualitative assessment tools.

Ethics of Research

This section provides an overview of three basic ethical principles.

Defining and assessing research quality in a transdisciplinary context

Brian M. Belcher, Katherine E. Rasmussen, Matthew R. Kemshaw, Deborah A. Zornes, Defining and assessing research quality in a transdisciplinary context, Research Evaluation , Volume 25, Issue 1, January 2016, Pages 1–17, https://doi.org/10.1093/reseval/rvv025

Research increasingly seeks both to generate knowledge and to contribute to real-world solutions, with strong emphasis on context and social engagement. As boundaries between disciplines are crossed, and as research engages more with stakeholders in complex systems, traditional academic definitions and criteria of research quality are no longer sufficient; there is a need for a parallel evolution of principles and criteria to define and evaluate research quality in a transdisciplinary research (TDR) context. We conducted a systematic review to help answer the question: What are appropriate principles and criteria for defining and assessing TDR quality? Articles were selected and reviewed seeking: arguments for or against expanding definitions of research quality, purposes for research quality evaluation, proposed principles of research quality, proposed criteria for research quality assessment, proposed indicators and measures of research quality, and proposed processes for evaluating TDR. We used the information from the review and our own experience in two research organizations that employ TDR approaches to develop a prototype TDR quality assessment framework, organized as an evaluation rubric. We provide an overview of the relevant literature and summarize the main aspects of TDR quality identified there. Four main principles emerge: relevance, including social significance and applicability; credibility, including criteria of integration and reflexivity, added to traditional criteria of scientific rigor; legitimacy, including criteria of inclusion and fair representation of stakeholder interests; and effectiveness, with criteria that assess actual or potential contributions to problem solving and social change.

Contemporary research in the social and environmental realms places strong emphasis on achieving ‘impact’. Research programs and projects aim to generate new knowledge but also to promote and facilitate the use of that knowledge to enable change, solve problems, and support innovation (Clark and Dickson 2003). Reductionist and purely disciplinary approaches are being augmented or replaced with holistic approaches that recognize the complex nature of problems and that actively engage within complex systems to contribute to change ‘on the ground’ (Gibbons et al. 1994; Nowotny, Scott and Gibbons 2001, Nowotny, Scott and Gibbons 2003; Klein 2006; Hemlin and Rasmussen 2006; Chataway, Smith and Wield 2007; Erno-Kjolhede and Hansson 2011). Emerging fields such as sustainability science have developed out of a need to address complex and urgent real-world problems (Komiyama and Takeuchi 2006). These approaches are inherently applied and transdisciplinary, with explicit goals to contribute to real-world solutions and strong emphasis on context and social engagement (Kates 2000).

While there is an ongoing conceptual and theoretical debate about the nature of the relationship between science and society (e.g. Hessels 2008), we take a more practical starting point based on the authors’ experience in two research organizations. The first author has been involved with the Center for International Forestry Research (CIFOR) for almost 20 years. CIFOR, as part of the Consultative Group on International Agricultural Research (CGIAR), began a major transformation in 2010 that shifted the emphasis from a primary focus on delivering high-quality science to a focus on ‘…producing, assembling and delivering, in collaboration with research and development partners, research outputs that are international public goods which will contribute to the solution of significant development problems that have been identified and prioritized with the collaboration of developing countries.’ (CGIAR 2011). It was always intended that CGIAR research would be relevant to priority development and conservation issues, with emphasis on high-quality scientific outputs. The new approach puts much stronger emphasis on welfare and environmental results; research centers, programs, and individual scientists now assume shared responsibility for achieving development outcomes. This requires new ways of working, with more and different kinds of partnerships and more deliberate and strategic engagement in social systems.

Royal Roads University (RRU), the home institute of all four authors, is a relatively new (created in 1995) public university in Canada. It is deliberately interdisciplinary by design, with just two faculties (Faculty of Social and Applied Science; Faculty of Management) and strong emphasis on problem-oriented research. Faculty and student research is typically ‘applied’ in the Organization for Economic Co-operation and Development (2012) sense of ‘original investigation undertaken in order to acquire new knowledge … directed primarily towards a specific practical aim or objective’.

An increasing amount of the research done within both of these organizations can be classified as transdisciplinary research (TDR). TDR crosses disciplinary and institutional boundaries, is context specific, and problem oriented (Klein 2006; Carew and Wickson 2010). It combines and blends methodologies from different theoretical paradigms, includes a diversity of both academic and lay actors, and is conducted with a range of research goals, organizational forms, and outputs (Klein 2006; Boix-Mansilla 2006a; Erno-Kjolhede and Hansson 2011). The problem-oriented nature of TDR and the importance placed on societal relevance and engagement are broadly accepted as defining characteristics of TDR (Carew and Wickson 2010).

The experience developing and using TDR approaches at CIFOR and RRU highlights the need for a parallel evolution of principles and criteria for evaluating research quality in a TDR context. Scientists appreciate and often welcome the need and the opportunity to expand the reach of their research, to contribute more effectively to change processes. At the same time, they feel the pressure of added expectations and are looking for guidance.

In any activity, we need principles, guidelines, criteria, or benchmarks that can be used to design the activity, assess its potential, and evaluate its progress and accomplishments. Effective research quality criteria are necessary to guide the funding, management, ongoing development, and advancement of research methods, projects, and programs. The lack of quality criteria to guide and assess research design and performance is seen as hindering the development of transdisciplinary approaches (Bergmann et al. 2005; Feller 2006; Chataway, Smith and Wield 2007; Ozga 2008; Carew and Wickson 2010; Jahn and Keil 2015). Appropriate quality evaluation is essential to ensure that research receives support and funding, and to guide and train researchers and managers to realize high-quality research (Boix-Mansilla 2006a; Klein 2008; Aagaard-Hansen and Svedin 2009; Carew and Wickson 2010).

Traditional disciplinary research is built on well-established methodological and epistemological principles and practices. Within disciplinary research, quality has been defined narrowly, with the primary criteria being scientific excellence and scientific relevance (Feller 2006; Chataway, Smith and Wield 2007; Erno-Kjolhede and Hansson 2011). Disciplines have well-established (often implicit) criteria and processes for the evaluation of quality in research design (Erno-Kjolhede and Hansson 2011). TDR that is highly context specific, problem oriented, and includes nonacademic societal actors in the research process is challenging to evaluate (Wickson, Carew and Russell 2006; Aagaard-Hansen and Svedin 2009; Andrén 2010; Carew and Wickson 2010; Huutoniemi 2010). There is no one definition or understanding of what constitutes quality, nor a set guide for how to do TDR (Lincoln 1995; Morrow 2005; Oberg 2008; Andrén 2010; Huutoniemi 2010). When epistemologies and methods from more than one discipline are used, disciplinary criteria may be insufficient and criteria from more than one discipline may be contradictory; cultural conflicts can arise as a range of actors use different terminology for the same concepts or the same terminology for different concepts (Chataway, Smith and Wield 2007; Oberg 2008).

Current research evaluation approaches as applied to individual researchers, programs, and research units are still based primarily on measures of academic outputs (publications and the prestige of the publishing journal), citations, and peer assessment (Boix-Mansilla 2006a; Feller 2006; Erno-Kjolhede and Hansson 2011). While these indicators of research quality remain relevant, additional criteria are needed to address the innovative approaches and the diversity of actors, outputs, outcomes, and long-term social impacts of TDR. It can be difficult to find appropriate outlets for TDR publications simply because the research does not meet the expectations of traditional discipline-oriented journals. Moreover, a wider range of inputs and of outputs means that TDR may result in fewer academic outputs. This has negative implications for transdisciplinary researchers, whose performance appraisals and long-term career progression are largely governed by traditional publication and citation-based metrics of evaluation. Research managers, peer reviewers, academic committees, and granting agencies all struggle with how to evaluate and how to compare TDR projects (ex ante or ex post) in the absence of appropriate criteria to address epistemological and methodological variability. The extent of engagement of stakeholders 1 in the research process will vary by project, from information sharing through to active collaboration (Brandt et al. 2013), but at any level, the involvement of stakeholders adds complexity to the conceptualization of quality. We need to know what ‘good research’ is in a transdisciplinary context.

As Tijssen (2003: 93) put it: ‘Clearly, in view of its strategic and policy relevance, developing and producing generally acceptable measures of “research excellence” is one of the chief evaluation challenges of the years to come’. Clear criteria are needed for research quality evaluation to foster excellence while supporting innovation: ‘A principal barrier to a broader uptake of TD research is a lack of clarity on what good quality TD research looks like’ (Carew and Wickson 2010: 1154). In the absence of alternatives, many evaluators, including funding bodies, rely on conventional, discipline-specific measures of quality which do not address important aspects of TDR.

There is an emerging literature that reviews, synthesizes, or empirically evaluates knowledge and best practice in research evaluation in a TDR context and that proposes criteria and evaluation approaches (Defila and Di Giulio 1999; Bergmann et al. 2005; Wickson, Carew and Russell 2006; Klein 2008; Carew and Wickson 2010; ERIC 2010; de Jong et al. 2011; Spaapen and Van Drooge 2011). Much of it comes from a few fields, including health care, education, and evaluation; little comes from the natural resource management and sustainability science realms, despite these areas needing guidance. National-scale reviews have begun to recognize the need for broader research evaluation criteria but have had difficulty dealing with it and have made little progress in addressing it (Donovan 2008; KNAW 2009; REF 2011; ARC 2012; TEC 2012). A summary of the national reviews that we reviewed in the development of this research is provided in Supplementary Appendix 1. While there are some published evaluation schemes for TDR and interdisciplinary research (IDR), there is ‘substantial variation in the balance different authors achieve between comprehensiveness and over-prescription’ (Wickson and Carew 2014: 256) and still a need to develop standardized quality criteria that are ‘uniquely flexible to provide valid, reliable means to evaluate and compare projects, while not stifling the evolution and responsiveness of the approach’ (Wickson and Carew 2014: 256).

There is a need and an opportunity to synthesize current ideas about how to define and assess quality in TDR. To address this, we conducted a systematic review of the literature that discusses the definitions of research quality as well as the suggested principles and criteria for assessing TDR quality. The aim is to identify appropriate principles and criteria for defining and measuring research quality in a transdisciplinary context and to organize those principles and criteria as an evaluation framework.

The review question was: What are appropriate principles, criteria, and indicators for defining and assessing research quality in TDR?

This article presents the method used for the systematic review and our synthesis, followed by key findings. Theoretical concepts about why new principles and criteria are needed for TDR, along with associated discussions about the evaluation process, are presented. A framework of principles and criteria for TDR quality evaluation, derived from our synthesis of the literature, is presented along with guidance on its application. Finally, recommendations for next steps in this research and needs for future research are discussed.

2.1 Systematic review

Systematic review is a rigorous, transparent, and replicable methodology that has become widely used to inform evidence-based policy, management, and decision making (Pullin and Stewart 2006; CEE 2010). Systematic reviews follow a detailed protocol with explicit inclusion and exclusion criteria to ensure a repeatable and comprehensive review of the target literature. Review protocols are shared and often published as peer reviewed articles before undertaking the review to invite critique and suggestions. Systematic reviews are most commonly used to synthesize knowledge on an empirical question by collating data and analyses from a series of comparable studies, though methods used in systematic reviews are continually evolving and are increasingly being developed to explore a wider diversity of questions (Chandler 2014). The current study question is theoretical and methodological, not empirical. Nevertheless, with a diverse and diffuse literature on the quality of TDR, a systematic review approach provides a method for a thorough and rigorous review. The protocol is published and available at http://www.cifor.org/online-library/browse/view-publication/publication/4382.html. A schematic diagram of the systematic review process is presented in Fig. 1.

Fig. 1. Search process.

2.2 Search terms

Search terms were designed to identify publications that discuss the evaluation or assessment of quality or excellence 2 of research 3 that is done in a TDR context. Search terms are listed online in Supplementary Appendices 2 and 3. The search strategy favored sensitivity over specificity to ensure that we captured the relevant information.

2.3 Databases searched

ISI Web of Knowledge (WoK) and Scopus were searched between 26 June 2013 and 6 August 2013. The combined searches yielded 15,613 unique citations. Additional searches to update the first searches were carried out in June 2014 and March 2015, for a total of 19,402 titles scanned. Google Scholar (GS) was searched separately by two reviewers during each search period. The first reviewer’s search was done on 2 September 2013 (Search 1) and 3 September 2013 (Search 2), yielding 739 and 745 titles, respectively. The second reviewer’s search was done on 19 November 2013 (Search 1) and 25 November 2013 (Search 2), yielding 769 and 774 titles, respectively. A third search done on 17 March 2015 by one reviewer yielded 98 new titles. Reviewers found high redundancy between the WoK/Scopus searches and the GS searches.
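The article does not describe how duplicate records across databases were identified. Purely as an illustration of what reducing combined results to unique citations can involve, the sketch below merges result lists and drops titles that match after simple normalization. The function names and sample titles are hypothetical, and real deduplication usually also compares authors, year, and DOI.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_unique(*result_sets):
    """Merge several lists of titles, keeping the first occurrence of each title."""
    seen = set()
    unique = []
    for results in result_sets:
        for title in results:
            key = normalize_title(title)
            if key and key not in seen:
                seen.add(key)
                unique.append(title)
    return unique


# Hypothetical search results, for illustration only.
database_a = ["Evaluating transdisciplinary research quality", "Peer review in interdisciplinary science"]
database_b = ["Evaluating Transdisciplinary Research Quality.", "Indicators for research impact"]

unique_titles = merge_unique(database_a, database_b)
print(len(unique_titles), "unique titles")
```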

2.4 Targeted journal searches

Highly relevant journals, including Research Evaluation, Evaluation and Program Planning, Scientometrics, Research Policy, Futures, American Journal of Evaluation, Evaluation Review, and Evaluation, were comprehensively searched using broader, more inclusive search strings that would have been unmanageable for the main database search.

2.5 Supplementary searches

References in included articles were reviewed to identify additional relevant literature. td-net’s ‘Tour d’Horizon of Literature’ lists important inter- and transdisciplinary publications collected through an invitation to experts in the field to submit publications (td-net 2014). Six additional articles were identified via supplementary search.

2.6 Limitations of coverage

The review was limited to English-language published articles and material available through internet searches. There was no systematic way to search the gray (unpublished) literature, but relevant material identified through supplementary searches was included.

2.7 Inclusion of articles

This study sought articles that review, critique, discuss, and/or propose principles, criteria, indicators, and/or measures for the evaluation of quality relevant to TDR. As noted, this yielded a large number of titles. We then selected only those articles with an explicit focus on the meaning of IDR and/or TDR quality and how to achieve, measure or evaluate it. Inclusion and exclusion criteria were developed through an iterative process of trial article screening and discussion within the research team. Through this process, inter-reviewer agreement was tested and strengthened. Inclusion criteria are listed in Tables 1 and 2 .

Table 1. Inclusion criteria for title and abstract screening

Topic coverage
Document type
Geographic | No geographic barriers
Date | No temporal barriers
Discipline/field | Discussion must be relevant to environment, natural resources management, sustainability, livelihoods, or related areas of human–environmental interactions. The discussion need not explicitly reference any of the above subject areas.

Table 2. Inclusion criteria for abstract and full article screening

Theme | Inclusion criteria
Relevance to review objectives (all articles must meet this criterion) | Intention of article, or part of article, is to discuss the meaning of research quality and how to measure/evaluate it.
Theoretical discussion
Quality definitions and criteria | Offers an explicit definition or criteria of inter- and/or transdisciplinary research quality.
Evaluation process | Suggests approaches to evaluate inter- and/or transdisciplinary research quality. (Will only be included if there is relevant discussion of research quality criteria and/or measurement.)
Research ‘impact’ | Discusses research outcomes (diffusion, uptake, utilization, impact) as an indicator or consequence of research quality.

Article screening was done in parallel by two reviewers in three rounds: (1) title, (2) abstract, and (3) full article. In cases of uncertainty, papers were advanced to the next round. Final decisions on inclusion of contested papers were made by consensus among the four team members.
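The parallel screening procedure can be sketched in code. The example below is an illustration under stated assumptions: the review does not say which agreement statistic was used, so Cohen's kappa is shown only as one widely used option, and the rule that a paper advances unless both reviewers exclude it is an assumed reading of the uncertainty rule. The decision labels and sample data are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must label the same non-empty item list.")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1 - expected)


def advances(decision_a, decision_b):
    """Assumed rule: a paper advances unless both reviewers exclude it."""
    return not (decision_a == "exclude" and decision_b == "exclude")


# Hypothetical screening decisions for four titles.
reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
next_round = [advances(a, b) for a, b in zip(reviewer_1, reviewer_2)]
print(kappa, next_round)
```

Under this rule, a disagreement or an uncertain rating keeps a paper in the pool, which matches the review's stated preference for sensitivity over specificity.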

2.8 Critical appraisal

In typical systematic reviews, individual articles are appraised to ensure that they are adequate for answering the research question and to assess the methods of each study for susceptibility to bias that could influence the outcome of the review (Petticrew and Roberts 2006). Most papers included in this review are theoretical and methodological papers, not empirical studies. Most do not have explicit methods that can be appraised with existing quality assessment frameworks. Our critical appraisal considered four criteria adapted from Spencer et al. (2003): (1) relevance to the review question, (2) clarity and logic of how information in the paper was generated, (3) significance of the contribution (are new ideas offered?), and (4) generalizability (is the context specified; do the ideas apply in other contexts?). Disagreements were discussed to reach consensus.

2.9 Data extraction and management

The review sought information on: arguments for or against expanding definitions of research quality, purposes for research quality evaluation, principles of research quality, criteria for research quality assessment, indicators and measures of research quality, and processes for evaluating TDR. Four reviewers independently extracted data from selected articles using the parameters listed in Supplementary Appendix 4 .

2.10 Data synthesis and TDR framework design

Our aim was to synthesize ideas, definitions, and recommendations for TDR quality criteria into a comprehensive and generalizable framework for the evaluation of quality in TDR. Key ideas were extracted from each article and summarized in an Excel database. We classified these ideas into themes and ultimately into overarching principles and associated criteria of TDR quality organized as a rubric ( Wickson and Carew 2014 ). Definitions of each principle and criterion were developed and rubric statements formulated based on the literature and our experience. These criteria (adjusted appropriately to be applied ex ante or ex post ) are intended to be used to assess a TDR project. The reviewer should consider whether the project fully satisfies, partially satisfies, or fails to satisfy each criterion. More information on application is provided in Section 4.3 below.

We tested the framework on a set of completed RRU graduate theses that used transdisciplinary approaches, with an explicit problem orientation and intent to contribute to social or environmental change. Three rounds of testing were done, with revisions after each round to refine and improve the framework.
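The three-level rating described above (fully satisfies, partially satisfies, fails to satisfy) can be recorded per criterion and tallied. The sketch below is a minimal illustration, not the authors' tool: the criterion names come from the framework table, while the ratings, the `Rating` enum, and the `summarize` helper are hypothetical. It deliberately reports counts rather than a numeric score, since the text does not assign weights to the ratings.

```python
from collections import Counter
from enum import Enum


class Rating(Enum):
    FULLY = "fully satisfies"
    PARTIALLY = "partially satisfies"
    FAILS = "fails to satisfy"


def summarize(assessment):
    """Count how many criteria received each rating."""
    counts = Counter(assessment.values())
    return {rating.value: counts.get(rating, 0) for rating in Rating}


# Hypothetical assessment of one project against four of the criteria.
assessment = {
    "Clearly defined socio-ecological context": Rating.FULLY,
    "Explicit theory of change": Rating.PARTIALLY,
    "Limitations stated": Rating.FAILS,
    "Research is ethical": Rating.FULLY,
}

summary = summarize(assessment)
print(summary)
```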

3.1 Overview of the selected articles

Thirty-eight papers satisfied the inclusion criteria. A wide range of terms are used in the selected papers, including: cross-disciplinary; interdisciplinary; transdisciplinary; methodological pluralism; mode 2; triple helix; and supradisciplinary. Eight included papers specifically focused on sustainability science or TDR in natural resource management, or identified sustainability research as a growing TDR field that needs new forms of evaluation ( Cash et al. 2002 ; Bergmann et al. 2005 ; Chataway, Smith and Wield 2007 ; Spaapen, Dijstelbloem and Wamelink 2007 ; Andrén 2010 ; Carew and Wickson 2010 ; Lang et al. 2012 ; Gaziulusoy and Boyle 2013 ). Wickson and Carew (2014) build on the experience in the TDR realm to propose criteria and indicators of quality for ‘responsible research and innovation’.

The selected articles are written from three main perspectives. One set is primarily interested in advancing TDR approaches. These papers recognize the need for new quality measures to encourage and promote high-quality research and to overcome perceived biases against TDR approaches in research funding and publishing. A second set of papers is written from an evaluation perspective, with a focus on improving evaluation of TDR. The third set is written from the perspective of qualitative research characterized by methodological pluralism, with many characteristics and issues relevant to TDR approaches.

The majority of the articles focus on the project scale, some on the organization level, and some do not specify a scale. Some articles explicitly focus on ex ante evaluation (e.g. proposal evaluation), others on ex post evaluation, and many are not explicit about the project stage they are concerned with. The methods used in the reviewed articles include authors’ reflection and opinion, literature review, expert consultation, document analysis, and case study. Summaries of report characteristics are available online ( Supplementary Appendices 5–8 ). Eight articles provide comprehensive evaluation frameworks and quality criteria specifically for TDR and research-in-context. The rest of the articles discuss aspects of quality related to TDR and recommend quality definitions, criteria, and/or evaluation processes.

3.2 The need for quality criteria and evaluation methods for TDR

Many of the selected articles highlight the lack of widely agreed principles and criteria of TDR quality. They note that, in the absence of TDR quality frameworks, disciplinary criteria are used ( Morrow 2005 ; Boix-Mansilla 2006a , b ; Feller 2006 ; Klein 2006 , 2008 ; Wickson, Carew and Russell 2006 ; Scott 2007 ; Spaapen, Dijstelbloem and Wamelink 2007 ; Oberg 2008 ; Ernø-Kjølhede and Hansson 2011 ), and evaluations are often carried out by reviewers who lack cross-disciplinary experience and do not have a shared understanding of quality ( Aagaard-Hansen and Svedin 2009 ). Quality is discussed by many as a relative concept, developed within disciplines, and therefore defined and understood differently in each field ( Morrow 2005 ; Klein 2006 ; Oberg 2008 ; Mitchell and Willets 2009 ; Huutoniemi 2010 ; Hellstrom 2011 ). Jahn and Keil (2015) point out the difficulty of creating a common set of quality criteria for TDR in the absence of a standard agreed-upon definition of TDR. Many of the selected papers argue the need to move beyond narrowly defined ideas of ‘scientific excellence’ to incorporate a broader assessment of quality which includes societal relevance ( Hemlin and Rasmussen 2006 ; Chataway, Smith and Wield 2007 ; Ozga 2007 ; Spaapen, Dijstelbloem and Wamelink 2007 ). This shift includes greater focus on research organization, research process, and continuous learning, rather than primarily on research outputs ( Hemlin and Rasmussen 2006 ; de Jong et al. 2011 ; Wickson and Carew 2014 ; Jahn and Keil 2015 ). This responds to and reflects societal expectations that research should be accountable and have demonstrated utility ( Cloete 1997 ; Defila and Di Giulio 1999 ; Wickson, Carew and Russell 2006 ; Spaapen, Dijstelbloem and Wamelink 2007 ; Stige 2009 ).

A central aim of TDR is to achieve socially relevant outcomes, and TDR quality criteria should demonstrate accountability to society ( Cloete 1997 ; Hemlin and Rasmussen 2006 ; Chataway, Smith and Wield 2007 ; Ozga 2007 ; Spaapen, Dijstelbloem and Wamelink 2007 ; de Jong et al. 2011 ). Integration and mutual learning are a core element of TDR; it is not enough to transcend boundaries and incorporate societal knowledge but, as Carew and Wickson ( 2010 : 1147) summarize: ‘…the TD researcher needs to put effort into integrating these potentially disparate knowledges with a view to creating useable knowledge. That is, knowledge that can be applied in a given problem context and has some prospect of producing desired change in that context’. The inclusion of societal actors in the research process, the unique and often dispersed organization of research teams, and the deliberate integration of different traditions of knowledge production all fall outside of conventional assessment criteria ( Feller 2006 ).

Not only does the range of criteria need to be updated, expanded, and agreed upon, with assumptions made explicit ( Boix-Mansilla 2006a ; Klein 2006 ; Scott 2007 ), but, given the specific problem orientation of TDR, reviewers beyond disciplinary academic peers also need to be included in the assessment of quality ( Cloete 1997 ; Scott 2007 ; Spaapen, Dijstelbloem and Wamelink 2007 ; Klein 2008 ). Several authors discuss the lack of reviewers with strong cross-disciplinary experience ( Aagaard-Hansen and Svedin 2009 ) and the lack of common criteria, philosophical foundations, and language for use by peer reviewers ( Klein 2008 ; Aagaard-Hansen and Svedin 2009 ). Peer review of TDR could be improved with explicit TDR quality criteria, and appropriate processes in place to ensure clear dialog between reviewers.

Finally, there is the need for increased emphasis on evaluation as part of the research process ( Bergmann et al. 2005 ; Hemlin and Rasmussen 2006 ; Meyrick 2006 ; Chataway, Smith and Wield 2007 ; Stige, Malterud and Midtgarden 2009 ; Hellstrom 2011 ; Lang et al. 2012 ; Wickson and Carew 2014 ). This is particularly true in large, complex, problem-oriented research projects. Ongoing monitoring of the research organization and process contributes to learning and adaptive management while research is underway and so helps improve quality. As stated by Wickson and Carew ( 2014 : 262): ‘We believe that in any process of interpreting, rearranging and/or applying these criteria, open negotiation on their meaning and application would only positively foster transformative learning, which is a valued outcome of good TD processes’.

3.3 TDR quality criteria and assessment approaches

Many of the papers provide quality criteria and/or describe constituent parts of quality. Aagaard-Hansen and Svedin (2009) define three key aspects of quality: societal relevance, impact, and integration. Meyrick (2006) states that quality research is transparent and systematic. Boaz and Ashby (2003) describe quality in four dimensions: methodological quality, quality of reporting, appropriateness of methods, and relevance to policy and practice. Although each article deconstructs quality in different ways and with different foci and perspectives, there is significant overlap and recurring themes in the papers reviewed. There is a broadly shared perspective that TDR quality is a multidimensional concept shaped by the specific context within which research is done ( Spaapen, Dijstelbloem and Wamelink 2007 ; Klein 2008 ), making a universal definition of TDR quality difficult or impossible ( Huutoniemi 2010 ).

Huutoniemi (2010) identifies three main approaches to conceptualizing quality in IDR and TDR: (1) using existing disciplinary standards adapted as necessary for IDR; (2) building on the quality standards of disciplines while fundamentally incorporating ways to deal with epistemological integration, problem focus, context, stakeholders, and process; and (3) radical departure from any disciplinary orientation in favor of external, emergent, context-dependent quality criteria that are defined and enacted collaboratively by a community of users.

The first approach is prominent in current research funding and evaluation protocols. Conservative approaches of this kind are criticized for privileging disciplinary research and for failing to provide guidance and quality control for transdisciplinary projects. The third approach would ‘undermine the prevailing status of disciplinary standards in the pursuit of a non-disciplinary, integrated knowledge system’ ( Huutoniemi 2010 : 313). No predetermined quality criteria are offered, only contextually embedded criteria that need to be developed within a specific research project. To some extent, this is the approach taken by Spaapen, Dijstelbloem and Wamelink (2007) and de Jong et al. (2011) . Such a sui generis approach cannot be used to compare across projects. Most of the reviewed papers take the second approach, and recommend TDR quality criteria that build on a disciplinary base.

Eight articles present comprehensive frameworks for quality evaluation, each with a unique approach, perspective, and goal. Two of these build comprehensive lists of criteria with associated questions to be chosen based on the needs of the particular research project ( Defila and Di Giulio 1999 ; Bergmann et al. 2005 ). Carew and Wickson (2010) develop a reflective heuristic tool with questions to guide researchers through ongoing self-evaluation. They also list criteria for external evaluation and to compare between projects. Spaapen, Dijstelbloem and Wamelink (2007) design an approach that evaluates a research project against its own goals and is not meant for comparison between projects. Wickson and Carew (2014) developed a comprehensive rubric for the evaluation of Research and Innovation that builds on their extensive previous work in TDR. Finally, Lang et al. (2012) , Mitchell and Willets (2009) , and Jahn and Keil (2015) develop criteria checklists that can be applied across transdisciplinary projects.

Bergmann et al. (2005) and Carew and Wickson (2010) organize their frameworks into managerial elements of the research project, concerning problem context, participation, management, and outcomes. Lang et al. (2012) and Defila and Di Giulio (1999) focus on the chronological stages in the research process and identify criteria at each stage. Mitchell and Willets (2009) , with a focus on doctoral studies, adapt standard dissertation evaluation criteria to accommodate broader, pluralistic, and more complex studies. Spaapen, Dijstelbloem and Wamelink (2007) focus on evaluating ‘research-in-context’. Wickson and Carew (2014) created a rubric based on criteria that span the research process, stages, and all actors included. Jahn and Keil (2015) organized their quality criteria into three categories: quality of the research problems, quality of the research process, and quality of the research results.

The remaining papers highlight key themes that must be considered in TDR evaluation. Dominant themes include: engagement with problem context, collaboration and inclusion of stakeholders, heightened need for explicit communication and reflection, integration of epistemologies, recognition of diverse outputs, the focus on having an impact, and reflexivity and adaptation throughout the process. The focus on societal problems in context and the increased engagement of stakeholders in the research process introduce higher levels of complexity that cannot be accommodated by disciplinary standards ( Defila and Di Giulio 1999 ; Bergmann et al. 2005 ; Wickson, Carew and Russell 2006 ; Spaapen, Dijstelbloem and Wamelink 2007 ; Klein 2008 ).

Finally, authors discuss process ( Defila and Di Giulio 1999 ; Bergmann et al. 2005 ; Boix-Mansilla 2006b ; Spaapen, Dijstelbloem and Wamelink 2007 ) and utilitarian values ( Hemlin 2006 ; Ernø-Kjølhede and Hansson 2011 ; Bornmann 2013 ) as essential aspects of quality in TDR. Common themes include: (1) the importance of formative and process-oriented evaluation ( Bergmann et al. 2005 ; Hemlin 2006 ; Stige 2009 ); (2) emphasis on the evaluation process itself (not just criteria or outcomes) and reflexive dialog for learning ( Bergmann et al. 2005 ; Boix-Mansilla 2006b ; Klein 2008 ; Oberg 2008 ; Stige, Malterud and Midtgarden 2009 ; Aagaard-Hansen and Svedin 2009 ; Carew and Wickson 2010 ; Huutoniemi 2010 ); (3) the need for peers who are experienced and knowledgeable about TDR for fair peer review ( Boix-Mansilla 2006a , b ; Klein 2006 ; Hemlin 2006 ; Scott 2007 ; Aagaard-Hansen and Svedin 2009 ); (4) the inclusion of stakeholders in the evaluation process ( Bergmann et al. 2005 ; Scott 2007 ; Andrén 2010 ); and (5) the importance of evaluations that are built in-context ( Defila and Di Giulio 1999 ; Feller 2006 ; Spaapen, Dijstelbloem and Wamelink 2007 ; de Jong et al. 2011 ).

While each reviewed approach offers helpful insights, none adequately fulfills the need for a broad and adaptable framework for assessing TDR quality. Wickson and Carew ( 2014 : 257) highlight the need for quality criteria that achieve balance between ‘comprehensiveness and over-prescription’: ‘any emerging quality criteria need to be concrete enough to provide real guidance but flexible enough to adapt to the specificities of varying contexts’. Based on our experience, such a framework should be:

Comprehensive: It should accommodate the main aspects of TDR, as identified in the review.

Time/phase adaptable: It should be applicable across the project cycle.

Scalable: It should be useful for projects of different scales.

Versatile: It should be useful to researchers and collaborators as a guide to research design and management, and to internal and external reviewers and assessors.

Comparable: It should allow comparison of quality between and across projects/programs.

Reflexive: It should encourage and facilitate self-reflection and adaptation based on ongoing learning.

In this section, we synthesize the key principles and criteria of quality in TDR that were identified in the reviewed literature. Principles are the essential elements of high-quality TDR. Criteria are the conditions that need to be met in order to achieve a principle. We conclude by providing a framework for the evaluation of quality in TDR ( Table 3 ) and guidance for its application.

Table 3. Transdisciplinary research quality assessment framework

Criteria | Definition | Rubric scale
Clearly defined socio-ecological context | The context is well defined and described and analyzed sufficiently to identify research entry points. | The context is well defined, described, and analyzed sufficiently to identify research entry points.
Socially relevant research problem | Research problem is relevant to the problem context. | The research problem is defined and framed in a way that clearly shows its relevance to the context and that demonstrates that consideration has been given to the practical application of research activities and outputs.
Engagement with problem context | Researchers demonstrate appropriate breadth and depth of understanding of and sufficient interaction with the problem context. | The documentation demonstrates that the researcher/team has interacted appropriately and sufficiently with the problem context to understand it and to have potential to influence it (e.g. through site visits, meeting participation, discussion with stakeholders, document review) in planning and implementing the research.
Explicit theory of change | The research explicitly identifies its main intended outcomes and how they are intended/expected to be realized and to contribute to longer-term outcomes and/or impacts. | The research explicitly identifies its main intended outcomes and how they are intended/expected to be realized and to contribute to longer-term outcomes and/or impacts.
Relevant research objectives and design | The research objectives and design are relevant, timely, and appropriate to the problem context, including attention to stakeholder needs and values. | The documentation clearly demonstrates, through sufficient analysis of key factors, needs, and complexity within the context, that the research objectives and design are relevant and appropriate.
Appropriate project implementation | Research execution is suitable to the problem context and the socially relevant research objectives. | The documentation reflects effective project implementation that is appropriate to the context, with reflection and adaptation as needed.
Effective communication | Communication during and after the research process is appropriate to the context and accessible to stakeholders, users, and other intended audiences. | The documentation indicates that the research project planned and achieved appropriate communications with all necessary actors during the research process.
Broad preparation | The research is based on a strong integrated theoretical and empirical foundation that is relevant to the context. | The documentation demonstrates critical understanding of an appropriate breadth and depth of literature and theory from across disciplines relevant to the context, and of the context itself.
Clear research problem definition | The research problem is clearly defined, researchable, grounded in the academic literature, and relevant to the context. | The research problem is clearly stated and defined, researchable, and grounded in the academic literature and the problem context.
Objectives stated and met | Research objectives are clearly stated. | The research objectives are clearly stated, logically and appropriately related to the context and the research problem, and achieved, with any necessary adaptation explained.
Feasible research project | The research design and resources are appropriate and sufficient to meet the objectives as stated, and sufficiently resilient to adapt to unexpected opportunities and challenges throughout the research process. | The research design and resources are appropriate and sufficient to meet the objectives as stated, and sufficiently resilient to adapt to unexpected opportunities and challenges throughout the research process.
Adequate competencies | The skills and competencies of the researcher/team/collaboration (including academic and societal actors) are sufficient and in appropriate balance (without unnecessary complexity) to succeed. | The documentation recognizes the limitations and biases of individuals’ knowledge and identifies the knowledge, skills, and expertise needed to carry out the research and provides evidence that they are represented in the research team in the appropriate measure to address the problem.
Research approach fits purpose | Disciplines, perspectives, epistemologies, approaches, and theories are combined appropriately to create an approach that is appropriate to the research problem and the objectives. | The documentation explicitly states the rationale for the inclusion and integration of different epistemologies, disciplines, and methodologies, justifies the approach taken in reference to the context, and discusses the process of integration, including how paradoxes and conflicts were managed.
Appropriate methods | Methods are fit to purpose and well-suited to answering the research questions and achieving the objectives. | Methods are clearly described, and documentation demonstrates that the methods are fit to purpose, systematic yet adaptable, and transparent. Novel (unproven) methods or adaptations are justified and explained, including why they were used and how they maintain scientific rigor.
Clearly presented argument | The movement from analysis through interpretation to conclusions is transparently and logically described. Sufficient evidence is provided to clearly demonstrate the relationship between evidence and conclusions. | Results are clearly presented. Analyses and interpretations are adequately explained, with clearly described terminology and full exposition of the logic leading to conclusions, including exploration of possible alternate explanations.
Transferability/generalizability of research findings | Appropriate and rigorous methods ensure the study’s findings are externally valid (generalizable). In some cases, findings may be too context specific to be generalizable in which case research would be judged on its ability to act as a model for future research. | Document clearly explains how the research findings are transferable to other contexts OR, in cases that are too context-specific to be generalizable, discusses aspects of the research process or findings that may be transferable to other contexts and/or used as learning cases.
Limitations stated | Researchers engage in ongoing individual and collective reflection in order to explicitly acknowledge and address limitations. | Limitations are clearly stated and adequately accounted for on an ongoing basis through the research project.
Ongoing monitoring and reflexivity | Researchers engage in ongoing reflection and adaptation of the research process, making changes as new obstacles, opportunities, circumstances, and/or knowledge surface. | Processes of reflection, individually and as a research team, are clearly documented throughout the research process along with clear descriptions and justifications for any changes to the research process made as a result of reflection.
Disclosure of perspective | Actual, perceived, and potential bias is clearly stated and accounted for. This includes aspects of: researchers’ position, sources of support, financing, collaborations, partnerships, research mandate, assumptions, goals, and bounds placed on commissioned research. | The documentation identifies potential or actual bias, including aspects of researchers’ positions, sources of support, financing, collaborations, partnerships, research mandate, assumptions, goals, and bounds placed on commissioned research.
Effective collaboration | Appropriate processes are in place to ensure effective collaboration (e.g. clear and explicit roles and responsibilities agreed upon, transparent and appropriate decision-making structures). | The documentation explicitly discusses the collaboration process, with adequate demonstration that the opportunities and process for collaboration are appropriate to the context and the actors involved (e.g. clear and explicit roles and responsibilities agreed upon, transparent and appropriate decision-making structures).
Genuine and explicit inclusion | Inclusion of diverse actors in the research process is clearly defined. Representation of actors' perspectives, values, and unique contexts is ensured through adequate planning, explicit agreements, communal reflection, and reflexivity. | The documentation explains the range of participants and perspectives/cultural backgrounds involved, clearly describes what steps were taken to ensure the respectful inclusion of diverse actors/views, and explains the roles and contributions of all participants in the research process.
Research is ethical | Research adheres to standards of ethical conduct. | The documentation describes the ethical review process followed and, considering the full range of stakeholders, explicitly identifies any ethical challenges and how they were resolved.
Research builds social capacity | Change takes place in individuals, groups, and at the institutional level through shared learning. This can manifest as a change in knowledge, understanding, and/or perspective of participants in the research project. | There is evidence of observed changes in knowledge, behavior, understanding, and/or perspectives of research participants and/or stakeholders as a result of the research process and/or findings.
Contribution to knowledge | Research contributes to knowledge and understanding in academic and social realms in a timely, relevant, and significant way. | There is evidence that knowledge created through the project is being/has been used by intended audiences and end-users.
Practical application | Research has a practical application. The findings, process, and/or products of research are used. | There is evidence that innovations developed through the research and/or the research process have been (or will be applied) in the real world.
Significant outcome | Research contributes to the solution of the targeted problem or provides unexpected solutions to other problems. This can include a variety of outcomes: building societal capacity, learning, use of research products, and/or changes in behaviors. | There is evidence that the research has contributed to positive change in the problem context and/or innovations that have positive social or environmental impacts.

a Research problems are the particular topic, area of concern, question to be addressed, challenge, opportunity, or focus of the research activity. Research problems are related to the societal problem but take on a specific focus, or framing, within a societal problem.

b Problem context refers to the social and environmental setting(s) that gives rise to the research problem, including aspects of: location; culture; scale in time and space; social, political, economic, and ecological/environmental conditions; resources and societal capacity available; uncertainty, complexity, and novelty associated with the societal problem; and the extent of agency that is held by stakeholders (Carew and Wickson 2010).

c Words such as ‘appropriate’, ‘suitable’, and ‘adequate’ are used deliberately to allow for quality criteria to be flexible and specific enough to the needs of individual research projects (Oberg 2008).

d Research process refers to the series of decisions made and actions taken throughout the entire duration of the research project and encompassing all aspects of the research project.

e Reflexivity refers to an iterative process of formative, critical reflection on the important interactions and relationships between a research project’s process, context, and product(s).

f In an ex ante evaluation, ‘evidence of’ would be replaced with ‘potential for’.

There is a strong trend in the reviewed articles to recognize the need for appropriate measures of scientific quality (usually adapted from disciplinary antecedents), but also to consider broader sets of criteria regarding the societal significance and applicability of research, and the need for engagement and representation of stakeholder values and knowledge. Cash et al. (2002) nicely conceptualize three key aspects of effective sustainability research as: salience (or relevance), credibility, and legitimacy. These are presented as necessary attributes for research to successfully produce transferable, useful information that can cross boundaries between disciplines, across scales, and between science and society. Many of the papers also refer to the principle that high-quality TDR should be effective in terms of contributing to the solution of problems. These four principles are discussed in the following sections.

4.1.1 Relevance

Relevance is the importance, significance, and usefulness of the research project's objectives, process, and findings to the problem context and to society. This includes the appropriateness of the timing of the research, the questions being asked, the outputs, and the scale of the research in relation to the societal problem being addressed. Good-quality TDR addresses important social/environmental problems and produces knowledge that is useful for decision making and problem solving (Cash et al. 2002; Klein 2006). As Erno-Kjolhede and Hansson (2011: 140) explain, quality ‘is first and foremost about creating results that are applicable and relevant for the users of the research’. Researchers must demonstrate an in-depth knowledge of and ongoing engagement with the problem context in which their research takes place (Wickson, Carew and Russell 2006; Stige, Malterud and Midtgarden 2009; Mitchell and Willets 2009). From the early steps of problem formulation and research design through to the appropriate and effective communication of research findings, the applicability and relevance of the research to the societal problem must be explicitly stated and incorporated.

4.1.2 Credibility

Credibility refers to whether or not the research findings are robust and the knowledge produced is scientifically trustworthy. This includes clear demonstration that the data are adequate, with well-presented methods and logical interpretations of findings. High-quality research is authoritative, transparent, defensible, believable, and rigorous. This is the traditional purview of science, and traditional disciplinary criteria can be applied in TDR evaluation to an extent. Additional and modified criteria are needed to address the integration of epistemologies and methodologies and the development of novel methods through collaboration, the broad preparation and competencies required to carry out the research, and the need for reflection and adaptation when operating in complex systems. Having researchers actively engaged in the problem context and including extra-scientific actors as part of the research process helps to achieve relevance and legitimacy of the research; it also adds complexity and heightened requirements of transparency, reflection, and reflexivity to ensure objective, credible research is carried out.

Active reflexivity is a criterion of credibility of TDR that may seem to contradict more rigid disciplinary methodological traditions (Carew and Wickson 2010). Practitioners of TDR recognize that credible work in these problem-oriented fields requires active reflexivity, epitomized by ongoing learning, flexibility, and adaptation to ensure the research approach and objectives remain relevant and fit-to-purpose (Lincoln 1995; Bergmann et al. 2005; Wickson, Carew and Russell 2006; Mitchell and Willets 2009; Andrén 2010; Carew and Wickson 2010; Wickson and Carew 2014). Changes made during the research process must be justified and reported transparently and explicitly to maintain credibility.

The need for critical reflection on potential bias and limitations becomes more important to maintain credibility of research-in-context (Lincoln 1995; Bergmann et al. 2005; Mitchell and Willets 2009; Stige, Malterud and Midtgarden 2009). Transdisciplinary researchers must ensure they maintain a high level of objectivity and transparency while actively engaging in the problem context. This point demonstrates the fine balance between different aspects of quality, in this case relevance and credibility, and the need to be aware of tensions and to seek complementarities (Cash et al. 2002).

4.1.3 Legitimacy

Legitimacy refers to whether the research process is perceived as fair and ethical by end-users. In other words, is it acceptable and trustworthy in the eyes of those who will use it? This requires the appropriate inclusion and consideration of diverse values, interests, and the ethical and fair representation of all involved. Legitimacy may be achieved in part through the genuine inclusion of stakeholders in the research process. Whereas credibility refers to technical aspects of sound research, legitimacy deals with sociopolitical aspects of the knowledge production process and products of research. Do stakeholders trust the researchers and the research process, including funding sources and other sources of potential bias? Do they feel represented? Legitimate TDR ‘considers appropriate values, concerns, and perspectives of different actors’ (Cash et al. 2002: 2) and incorporates these perspectives into the research process through collaboration and mutual learning (Bergmann et al. 2005; Chataway, Smith and Wield 2007; Andrén 2010; Huutoniemi 2010). A fair and ethical process is important to uphold standards of quality in all research. However, there are additional considerations that are unique to TDR.

Because TDR happens in-context and often in collaboration with societal actors, the disclosure of researcher perspective and a transparent statement of all partnerships, financing, and collaboration is vital to ensure an unbiased research process (Lincoln 1995; Defila and Di Giulio 1999; Boaz and Ashby 2003; Barker and Pistrang 2005; Bergmann et al. 2005). The disclosure of perspective has both internal and external aspects: on one hand, it ensures that the researchers themselves explicitly reflect on and account for their own position, potential sources of bias, and limitations throughout the process; on the other hand, it makes the process transparent to those external to the research group, who can then judge the legitimacy based on their perspective of fairness (Cash et al. 2002).

TDR includes the engagement of societal actors along a continuum of participation from consultation to co-creation of knowledge (Brandt et al. 2013). Regardless of the depth of participation, all processes that engage societal actors must ensure that inclusion/engagement is genuine, roles are explicit, and processes for effective and fair collaboration are present (Bergmann et al. 2005; Wickson, Carew and Russell 2006; Spaapen, Dijstelbloem and Wamelink 2007; Hellstrom 2012). Important considerations include: the accurate representation of those involved; explicit and agreed-upon roles and contributions of actors; and adequate planning and procedures to ensure all values, perspectives, and contexts are adequately and appropriately incorporated. Mitchell and Willets (2009) consider cultural competence as a key criterion that can support researchers in navigating diverse epistemological perspectives. This is similar to what Morrow terms ‘social validity’, a criterion that asks researchers to be responsive to and critically aware of the diversity of perspectives and cultures influenced by their research. Several authors highlight that in order to develop this critical awareness of the diversity of cultural paradigms that operate within a problem situation, researchers should practice responsive, critical, and/or communal reflection (Bergmann et al. 2005; Wickson, Carew and Russell 2006; Mitchell and Willets 2009; Carew and Wickson 2010). Reflection and adaptation are important quality criteria that cut across multiple principles and facilitate learning throughout the process, which is a key foundation to TD inquiry.

4.1.4 Effectiveness

We define effective research as research that contributes to positive change in the social, economic, and/or environmental problem context. Transdisciplinary inquiry is rooted in the objective of solving real-world problems (Klein 2008; Carew and Wickson 2010) and must have the potential to (ex ante) or actually (ex post) make a difference if it is to be considered of high quality (Erno-Kjolhede and Hansson 2011). Potential research effectiveness can be indicated and assessed at the proposal stage and during the research process through: a clear and stated intention to address and contribute to a societal problem, the establishment of the research process and objectives in relation to the problem context, and the continuous reflection on the usefulness of the research findings and products to the problem (Bergmann et al. 2005; Lahtinen et al. 2005; de Jong et al. 2011).

Assessing research effectiveness ex post remains a major challenge, especially in complex transdisciplinary approaches. Conventional and widely used measures of ‘scientific impact’ count outputs such as journal articles and other publications and citations of those outputs (e.g. H index; i10 index). While these are useful indicators of scholarly influence, they are insufficient and inappropriate measures of research effectiveness where research aims to contribute to social learning and change. We need to also (or alternatively) focus on other kinds of research and scholarship outputs and outcomes and the social, economic, and environmental impacts that may result.
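
To make the two indicators named above concrete, the following is a minimal Python sketch of how they are commonly defined: the h index is the largest h such that h outputs have at least h citations each, and the i10 index is the number of outputs with at least ten citations. The function names and the citation counts are hypothetical and serve only as illustration; they are not drawn from the reviewed articles.

```python
def h_index(citations):
    """Largest h such that h outputs have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` outputs all have >= rank citations
        else:
            break      # counts only decrease from here, so h cannot grow
    return h


def i10_index(citations):
    """Number of outputs with at least ten citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical citation counts for six outputs (illustrative values only).
counts = [25, 12, 10, 4, 3, 0]
print(h_index(counts), i10_index(counts))  # prints: 4 3
```

Both figures are computed from citation counts alone, which is why they can indicate scholarly influence but carry no information about social learning or change.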

For many authors, contributing to learning and building of societal capacity are central goals of TDR (Defila and Di Giulio 1999; Spaapen, Dijstelbloem and Wamelink 2007; Carew and Wickson 2010; Erno-Kjolhede and Hansson 2011; Hellstrom 2011), and so are considered part of TDR effectiveness. Learning can be characterized as changes in knowledge, attitudes, or skills and can be assessed directly, or through observed behavioral changes and network and relationship development. Some evaluation methodologies (e.g. Outcome Mapping; Earl, Carden and Smutylo 2001) specifically measure these kinds of changes. Other evaluation methodologies consider the role of research within complex systems and assess effectiveness in terms of contributions to changes in policy and practice and resulting social, economic, and environmental benefits (ODI 2004, 2012; White and Phillips 2012; Mayne et al. 2013).

4.2 TDR quality criteria

TDR quality criteria and their definitions (explicit or implicit) were extracted from each article and summarized in an Excel database. These criteria were classified into themes corresponding to the four principles identified above, sorted and refined to develop sets of criteria that are comprehensive, mutually exclusive, and representative of the ideas presented in the reviewed articles. Within each principle, the criteria are organized roughly in the sequence of a typical project cycle (e.g. with research design following problem identification and preceding implementation). Definitions of each criterion were developed to reflect the concepts found in the literature, tested and refined iteratively to improve clarity. Rubric statements were formulated based on the literature and our own experience.

The complete set of principles, criteria, and definitions is presented as the TDR Quality Assessment Framework (Table 3).

4.3 Guidance on the application of the framework

4.3.1 Timing

Most criteria can be applied at each stage of the research process (ex ante, mid-term, and ex post), using appropriate interpretations at each stage. Ex ante (i.e. proposal) assessment should focus on a project’s explicitly stated intentions and approaches to address the criteria. Mid-term assessment should focus on the research process and whether or not it is being implemented in a way that will satisfy the criteria. Ex post assessment should consider whether the research has been done appropriately for the purpose and whether the desired results have been achieved.

4.3.2 New meanings for familiar terms

Many of the terms used in the framework are extensions of disciplinary criteria and share the same or similar names and perhaps similar but nuanced meaning. The principles and criteria used here extend beyond disciplinary antecedents and include new concepts and understandings that encapsulate the unique characteristics and needs of TDR and allow for evaluation and definition of quality in TDR. This is especially true in the criteria related to credibility. These criteria are analogous to traditional disciplinary criteria, but with much stronger emphasis on grounding in both the scientific and the social/environmental contexts. We urge readers to pay close attention to the definitions provided in Table 3 as well as the detailed descriptions of the principles in Section 4.1.

4.3.3 Using the framework

The TDR quality framework (Table 3) is designed to be used to assess TDR according to a project’s purpose; i.e. the criteria must be interpreted with respect to the context and goals of an individual research activity. The framework lists the main criteria synthesized from the literature and our experience, organized within the principles of relevance, credibility, legitimacy, and effectiveness. The table presents the criteria within each principle, ordered to approximate a typical process of identifying a research problem and designing and implementing research. We recognize that the actual process in any given project will be iterative and will not necessarily follow this sequence, but this provides a logical flow. A concise definition is provided in the second column to explain each criterion. We then provide a rubric statement in the third column, phrased to be applied when the research has been completed. In most cases, the same statement can be used at the proposal stage with a simple tense change or other minor grammatical revision, except for the criteria relating to effectiveness. As discussed above, assessing effectiveness in terms of outcomes and/or impact requires evaluation research. At the proposal stage, it is only possible to assess potential effectiveness.

Many rubrics offer a set of statements for each criterion that represent progressively higher levels of achievement; the evaluator is asked to select the best match. In practice, this often results in vague and relative statements of merit that are difficult to apply. We have opted to present a single rubric statement in absolute terms for each criterion. The assessor can then rank how well a project satisfies each criterion using a simple three-point Likert scale. If a project fully satisfies a criterion—that is, if there is evidence that the criterion has been addressed in a way that is coherent, explicit, sufficient, and convincing—it should be ranked as a 2 for that criterion. A score of 2 means that the evaluator is persuaded that the project addressed that criterion in an intentional, appropriate, explicit, and thorough way. A score of 1 would be given when there is some evidence that the criterion was considered, but it is lacking completion, intention, and/or is not addressed satisfactorily. For example, a score of 1 would be given when a criterion is explicitly discussed but poorly addressed, or when there is some indication that the criterion has been considered and partially addressed but it has not been treated explicitly, thoroughly, or adequately. A score of 0 indicates that there is no evidence that the criterion was addressed or that it was addressed in a way that was misguided or inappropriate.
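
The scoring scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the framework itself: the class and function names are hypothetical, the example scores are invented, the assignment of the example criteria to principles is assumed for illustration, and summing scores per principle is only one possible way to summarize ratings, which the framework does not prescribe.

```python
from dataclasses import dataclass

# Meaning of each point on the three-point scale, paraphrased from the text.
SCORE_MEANINGS = {
    0: "no evidence, or addressed in a misguided or inappropriate way",
    1: "some evidence, but partial or not addressed satisfactorily",
    2: "fully satisfied: coherent, explicit, sufficient, and convincing",
}


@dataclass(frozen=True)
class Rating:
    principle: str  # e.g. "relevance" (grouping assumed for illustration)
    criterion: str  # criterion name from the framework table
    score: int      # must be 0, 1, or 2

    def __post_init__(self):
        if self.score not in SCORE_MEANINGS:
            raise ValueError(f"score must be 0, 1, or 2, got {self.score!r}")


def summarize(ratings):
    """Return {principle: (total, maximum)} for a list of Rating objects.

    Summation is an assumed summary method, not one the framework mandates.
    """
    summary = {}
    for rating in ratings:
        total, maximum = summary.get(rating.principle, (0, 0))
        summary[rating.principle] = (total + rating.score, maximum + 2)
    return summary


# Hypothetical ratings for a single project (illustrative values only).
example = [
    Rating("relevance", "Socially relevant research problem", 2),
    Rating("relevance", "Effective communication", 1),
    Rating("credibility", "Appropriate methods", 2),
    Rating("credibility", "Limitations stated", 0),
]
print(summarize(example))
```

Rejecting any value other than 0, 1, or 2 keeps the ratings on the absolute scale; an evaluator would still need to record the evidence behind each score, which this sketch does not model.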

It is critical that the evaluation be done in context, keeping in mind the purpose, objectives, and resources of the project, as well as other contextual information, such as the intended purpose of grant funding or relevant partnerships. Each project will be unique in its complexities; what is sufficient or adequate in one criterion for one research project may be insufficient or inappropriate for another. Words such as ‘appropriate’, ‘suitable’, and ‘adequate’ are used deliberately to encourage application of criteria to suit the needs of individual research projects (Oberg 2008). Evaluators must consider the objectives of the research project and the problem context within which it is carried out as the benchmark for evaluation. For example, we tested the framework with RRU master’s theses. These are typically small projects with limited scope, carried out by a single researcher. Expectations for ‘effective communication’ or ‘competencies’ or ‘effective collaboration’ are much different in these kinds of projects than in a multi-year, multi-partner CIFOR project. All criteria should be evaluated through the lens of the stated research objectives, research goals, and context.

The systematic review identified relevant articles from a diverse literature that have a strong central focus. Collectively, they highlight the complexity of contemporary social and environmental problems and emphasize that addressing such issues requires combinations of new knowledge and innovation, action, and engagement. Traditional disciplinary research has often failed to provide solutions because it cannot adequately cope with complexity. New forms of research are proliferating, crossing disciplinary and academic boundaries, integrating methodologies, and engaging a broader range of research participants, as a way to make research more relevant and effective. Theoretically, such approaches appear to offer great potential to contribute to transformative change. However, because these approaches are new and because they are multidimensional, complex, and often unique, it has been difficult to know what works, how, and why. In the absence of the kinds of methodological and quality standards that guide disciplinary research, there are no generally agreed criteria for evaluating such research.

Criteria are needed to guide and to help ensure that TDR is of high quality, to inform the teaching and learning of new researchers, and to encourage and support the further development of transdisciplinary approaches. The lack of a standard and broadly applicable framework for the evaluation of quality in TDR is perceived to cause an implicit or explicit devaluation of high-quality TDR or may prevent quality TDR from being done. There is a demonstrated need for an operationalized understanding of quality that addresses the characteristics, contributions, and challenges of TDR. The reviewed articles approach the topic from different perspectives and fields of study, using different terminology for similar concepts, or the same terminology for different concepts, and with unique ways of organizing and categorizing the dimensions and quality criteria. We have synthesized and organized these concepts as key TDR principles and criteria in a TDR Quality Framework, presented as an evaluation rubric. We have tested the framework on a set of masters’ theses and found it to be broadly applicable, usable, and useful for analyzing individual projects and for comparing projects within the set. We anticipate that further testing with a wider range of projects will help further refine and improve the definitions and rubric statements. We found that the three-point Likert scale (0–2) offered sufficient variability for our purposes, and rating is less subjective than with relative rubric statements. It may be possible to increase the rating precision with more points on the scale to increase the sensitivity for comparison purposes, for example in a review of proposals for a particular grant application.

Many of the articles we reviewed emphasize the importance of the evaluation process itself. The formative, developmental role of evaluation in TDR is seen as essential to the goals of mutual learning as well as to ensure that research remains responsive and adaptive to the problem context. In order to adequately evaluate quality in TDR, the process, including who carries out the evaluations, when, and in what manner, must be revised to be suitable to the unique characteristics and objectives of TDR. We offer this review and synthesis, along with a proposed TDR quality evaluation framework, as a contribution to an important conversation. We hope that it will be useful to researchers and research managers to help guide research design, implementation and reporting, and to the community of research organizations, funders, and society at large. As underscored in the literature review, there is a need for an adapted research evaluation process that will help advance problem-oriented research in complex systems, ultimately to improve research effectiveness.

This work was supported by funding from the Canada Research Chairs program. Funding support from the Canadian Social Sciences and Humanities Research Council (SSHRC) and technical support from the Evidence Based Forestry Initiative of the Centre for International Forestry Research (CIFOR), funded by UK DfID are also gratefully acknowledged.

Supplementary data is available here

The authors thank Barbara Livoreil and Stephen Dovers for valuable comments and suggestions on the protocol and Gillian Petrokofsky for her review of the protocol and a draft version of the manuscript. Two anonymous reviewers and the editor provided insightful critique and suggestions in two rounds that have helped to substantially improve the article.

Conflict of interest statement . None declared.

1. ‘Stakeholders’ refers to individuals and groups of societal actors who have an interest in the issue or problem that the research seeks to address.

2. The terms ‘quality’ and ‘excellence’ are often used in the literature with similar meaning. Technically, ‘excellence’ is a relative concept, referring to the superiority of a thing compared to other things of its kind. Quality is an attribute or a set of attributes of a thing. We are interested in what these attributes are or should be in high-quality research. Therefore, the term ‘quality’ is used in this discussion.

3. The terms ‘science’ and ‘research’ are not always clearly distinguished in the literature. We take the position that ‘science’ is a more restrictive term that is properly applied to systematic investigations using the scientific method. ‘Research’ is a broader term for systematic investigations using a range of methods, including but not restricted to the scientific method. We use the term ‘research’ in this broad sense.

Aagaard-Hansen J. Svedin U. ( 2009 ) ‘Quality Issues in Cross-disciplinary Research: Towards a Two-pronged Approach to Evaluation’ , Social Epistemology , 23 / 2 : 165 – 76 . DOI: 10.1080/02691720902992323

Google Scholar

Andrén S. ( 2010 ) ‘A Transdisciplinary, Participatory and Action-Oriented Research Approach: Sounds Nice but What do you Mean?’ [unpublished working paper] Human Ecology Division: Lund University, 1–21. < https://lup.lub.lu.se/search/publication/1744256 >

Australian Research Council (ARC) ( 2012 ) ERA 2012 Evaluation Handbook: Excellence in Research for Australia . Australia : ARC . < http://www.arc.gov.au/pdf/era12/ERA%202012%20Evaluation%20Handbook_final%20for%20web_protected.pdf >

Google Preview

Balsiger P. W. ( 2004 ) ‘Supradisciplinary Research Practices: History, Objectives and Rationale’ , Futures , 36 / 4 : 407 – 21 .

Bantilan M. C. et al.  . ( 2004 ) ‘Dealing with Diversity in Scientific Outputs: Implications for International Research Evaluation’ , Research Evaluation , 13 / 2 : 87 – 93 .

Barker C. Pistrang N. ( 2005 ) ‘Quality Criteria under Methodological Pluralism: Implications for Conducting and Evaluating Research’ , American Journal of Community Psychology , 35 / 3-4 : 201 – 12 .

Bergmann M. et al.  . ( 2005 ) Quality Criteria of Transdisciplinary Research: A Guide for the Formative Evaluation of Research Projects . Central report of Evalunet – Evaluation Network for Transdisciplinary Research. Frankfurt am Main, Germany: Institute for Social-Ecological Research. < http://www.isoe.de/ftp/evalunet_guide.pdf >

Boaz A. Ashby D. ( 2003 ) Fit for Purpose? Assessing Research Quality for Evidence Based Policy and Practice .

Boix-Mansilla V. ( 2006a ) ‘Symptoms of Quality: Assessing Expert Interdisciplinary Work at the Frontier: An Empirical Exploration’ , Research Evaluation , 15 / 1 : 17 – 29 .

Boix-Mansilla V. . ( 2006b ) ‘Conference Report: Quality Assessment in Interdisciplinary Research and Education’ , Research Evaluation , 15 / 1 : 69 – 74 .

Bornmann L. ( 2013 ) ‘What is Societal Impact of Research and How can it be Assessed? A Literature Survey’ , Journal of the American Society for Information Science and Technology , 64 / 2 : 217 – 33 .

Brandt P. et al.  . ( 2013 ) ‘A Review of Transdisciplinary Research in Sustainability Science’ , Ecological Economics , 92 : 1 – 15 .

Cash D. Clark W.C. Alcock F. Dickson N. M. Eckley N. Jäger J . ( 2002 ) Salience, Credibility, Legitimacy and Boundaries: Linking Research, Assessment and Decision Making (November 2002). KSG Working Papers Series RWP02-046. Available at SSRN: http://ssrn.com/abstract=372280 .

Carew A. L. Wickson F. ( 2010 ) ‘The TD Wheel: A Heuristic to Shape, Support and Evaluate Transdisciplinary Research’ , Futures , 42 / 10 : 1146 – 55 .

Collaboration for Environmental Evidence (CEE) . ( 2013 ) Guidelines for Systematic Review and Evidence Synthesis in Environmental Management . Version 4.2. Environmental Evidence < www.environmentalevidence.org/Documents/Guidelines/Guidelines4.2.pdf >

Chandler J. ( 2014 ) Methods Research and Review Development Framework: Policy, Structure, and Process . < http://methods.cochrane.org/projects-developments/research >

Chataway J. Smith J. Wield D. ( 2007 ) ‘Shaping Scientific Excellence in Agricultural Research’ , International Journal of Biotechnology 9 / 2 : 172 – 87 .

Clark W. C. Dickson N. ( 2003 ) ‘Sustainability Science: The Emerging Research Program’ , PNAS 100 / 14 : 8059 – 61 .

Consultative Group on International Agricultural Research (CGIAR) ( 2011 ) A Strategy and Results Framework for the CGIAR . < http://library.cgiar.org/bitstream/handle/10947/2608/Strategy_and_Results_Framework.pdf?sequence=4 >

Cloete N. ( 1997 ) ‘Quality: Conceptions, Contestations and Comments’, African Regional Consultation Preparatory to the World Conference on Higher Education , Dakar, Senegal, 1-4 April 1997 .

Defila R. DiGiulio A. ( 1999 ) ‘Evaluating Transdisciplinary Research,’ Panorama: Swiss National Science Foundation Newsletter , 1 : 4 – 27 . < www.ikaoe.unibe.ch/forschung/ip/Specialissue.Pano.1.99.pdf >

Donovan C. ( 2008 ) ‘The Australian Research Quality Framework: A Live Experiment in Capturing the Social, Economic, Environmental, and Cultural Returns of Publicly Funded Research. Reforming the Evaluation of Research’ , New Directions for Evaluation , 118 : 47 – 60 .

Earl S. Carden F. Smutylo T. ( 2001 ) Outcome Mapping. Building Learning and Reflection into Development Programs . Ottawa, ON : International Development Research Center .

Ernø-Kjølhede E. Hansson F. ( 2011 ) ‘Measuring Research Performance during a Changing Relationship between Science and Society’ , Research Evaluation , 20 / 2 : 130 – 42 .

Feller I. ( 2006 ) ‘Assessing Quality: Multiple Actors, Multiple Settings, Multiple Criteria: Issues in Assessing Interdisciplinary Research’ , Research Evaluation 15 / 1 : 5 – 15 .

Gaziulusoy A. İ. Boyle C. ( 2013 ) ‘Proposing a Heuristic Reflective Tool for Reviewing Literature in Transdisciplinary Research for Sustainability’ , Journal of Cleaner Production , 48 : 139 – 47 .

Gibbons M. et al.  . ( 1994 ) The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies . London : Sage Publications .

Hellstrom T. ( 2011 ) ‘Homing in on Excellence: Dimensions of Appraisal in Center of Excellence Program Evaluations’ , Evaluation , 17 / 2 : 117 – 31 .

Hellstrom T. . ( 2012 ) ‘Epistemic Capacity in Research Environments: A Framework for Process Evaluation’ , Prometheus , 30 / 4 : 395 – 409 .

Hemlin S. Rasmussen S. B . ( 2006 ) ‘The Shift in Academic Quality Control’ , Science, Technology & Human Values , 31 / 2 : 173 – 98 .

Hessels L. K. Van Lente H. ( 2008 ) ‘Re-thinking New Knowledge Production: A Literature Review and a Research Agenda’ , Research Policy , 37 / 4 , 740 – 60 .

Huutoniemi K. ( 2010 ) ‘Evaluating Interdisciplinary Research’ , in Frodeman R. Klein J. T. Mitcham C. (eds) The Oxford Handbook of Interdisciplinarity , pp. 309 – 20 . Oxford : Oxford University Press .

de Jong S. P. L. et al.  . ( 2011 ) ‘Evaluation of Research in Context: An Approach and Two Cases’ , Research Evaluation , 20 / 1 : 61 – 72 .

Jahn T. Keil F. ( 2015 ) ‘An Actor-Specific Guideline for Quality Assurance in Transdisciplinary Research’ , Futures , 65 : 195 – 208 .

Kates R. ( 2000 ) ‘Sustainability Science’ , World Academies Conference Transition to Sustainability in the 21st Century 5/18/00 , Tokyo, Japan .

Klein J. T . ( 2006 ) ‘Afterword: The Emergent Literature on Interdisciplinary and Transdisciplinary Research Evaluation’ , Research Evaluation , 15 / 1 : 75 – 80 .

Klein J. T . ( 2008 ) ‘Evaluation of Interdisciplinary and Transdisciplinary Research: A Literature Review’ , American Journal of Preventive Medicine , 35 / 2 Supplment S116–23. DOI: 10.1016/j.amepre.2008.05.010

Royal Netherlands Academy of Arts and Sciences, Association of Universities in the Netherlands, Netherlands Organization for Scientific Research (KNAW) . ( 2009 ) Standard Evaluation Protocol 2009-2015: Protocol for Research Assessment in the Netherlands . Netherlands : KNAW . < www.knaw.nl/sep >

Komiyama H. Takeuchi K. ( 2006 ) ‘Sustainability Science: Building a New Discipline’ , Sustainability Science , 1 : 1 – 6 .

Lahtinen E. et al.  . ( 2005 ) ‘The Development of Quality Criteria For Research: A Finnish approach’ , Health Promotion International , 20 / 3 : 306 – 15 .

Lang D. J. et al.  . ( 2012 ) ‘Transdisciplinary Research in Sustainability Science: Practice , Principles , and Challenges’, Sustainability Science , 7 / S1 : 25 – 43 .

Lincoln Y. S . ( 1995 ) ‘Emerging Criteria for Quality in Qualitative and Interpretive Research’ , Qualitative Inquiry , 1 / 3 : 275 – 89 .

Mayne J. Stern E. ( 2013 ) Impact Evaluation of Natural Resource Management Research Programs: A Broader View . Australian Centre for International Agricultural Research, Canberra .

Meyrick J . ( 2006 ) ‘What is Good Qualitative Research? A First Step Towards a Comprehensive Approach to Judging Rigour/Quality’ , Journal of Health Psychology , 11 / 5 : 799 – 808 .

Mitchell C. A. Willetts J. R. ( 2009 ) ‘Quality Criteria for Inter and Trans - Disciplinary Doctoral Research Outcomes’ , in Prepared for ALTC Fellowship: Zen and the Art of Transdisciplinary Postgraduate Studies ., Sydney : Institute for Sustainable Futures, University of Technology .

Morrow S. L . ( 2005 ) ‘Quality and Trustworthiness in Qualitative Research in Counseling Psychology’ , Journal of Counseling Psychology , 52 / 2 : 250 – 60 .

Nowotny H. Scott P. Gibbons M. ( 2001 ) Re-Thinking Science . Cambridge : Polity .

Nowotny H. Scott P. Gibbons M. . ( 2003 ) ‘‘Mode 2’ Revisited: The New Production of Knowledge’ , Minerva , 41 : 179 – 94 .

Öberg G . ( 2008 ) ‘Facilitating Interdisciplinary Work: Using Quality Assessment to Create Common Ground’ , Higher Education , 57 / 4 : 405 – 15 .

Ozga J . ( 2007 ) ‘Co - production of Quality in the Applied Education Research Scheme’ , Research Papers in Education , 22 / 2 : 169 – 81 .

Ozga J . ( 2008 ) ‘Governing Knowledge: research steering and research quality’ , European Educational Research Journal , 7 / 3 : 261 – 272 .

OECD ( 2012 ) Frascati Manual 6th ed. < http://www.oecd.org/innovation/inno/frascatimanualproposedstandardpracticeforsurveysonresearchandexperimentaldevelopment6thedition >

Overseas Development Institute (ODI) ( 2004 ) ‘Bridging Research and Policy in International Development: An Analytical and Practical Framework’, ODI Briefing Paper. < http://www.odi.org/sites/odi.org.uk/files/odi-assets/publications-opinion-files/198.pdf >

Overseas Development Institute (ODI) . ( 2012 ) RAPID Outcome Assessment Guide . < http://www.odi.org/sites/odi.org.uk/files/odi-assets/publications-opinion-files/7815.pdf >

Pullin A. S. Stewart G. B. ( 2006 ) ‘Guidelines for Systematic Review in Conservation and Environmental Management’ , Conservation Biology , 20 / 6 : 1647 – 56 .

Research Excellence Framework (REF) . ( 2011 ) Research Excellence Framework 2014: Assessment Framework and Guidance on Submissions. Reference REF 02.2011. UK: REF. < http://www.ref.ac.uk/pubs/2011-02/ >

Scott A . ( 2007 ) ‘Peer Review and the Relevance of Science’ , Futures , 39 / 7 : 827 – 45 .

Spaapen J. Dijstelbloem H. Wamelink F. ( 2007 ) Evaluating Research in Context: A Method for Comprehensive Assessment . Netherlands: Consultative Committee of Sector Councils for Research and Development. < http://www.qs.univie.ac.at/fileadmin/user_upload/qualitaetssicherung/PDF/Weitere_Aktivit%C3%A4ten/Eric.pdf >

Spaapen J. Van Drooge L. ( 2011 ) ‘Introducing “Productive Interactions” in Social Impact Assessment’ , Research Evaluation , 20 : 211 – 18 .

Stige B. Malterud K. Midtgarden T. ( 2009 ) ‘Toward an Agenda for Evaluation of Qualitative Research’ , Qualitative Health Research , 19 / 10 : 1504 – 16 .

td-net ( 2014 ) td-net. < www.transdisciplinarity.ch/e/Bibliography/new.php >

Tertiary Education Commission (TEC) . ( 2012 ) Performance-based Research Fund: Quality Evaluation Guidelines 2012. New Zealand: TEC. < http://www.tec.govt.nz/Documents/Publications/PBRF-Quality-Evaluation-Guidelines-2012.pdf >

Tijssen R. J. W. ( 2003 ) ‘Quality Assurance: Scoreboards of Research Excellence’ , Research Evaluation , 12 : 91 – 103 .

White H. Phillips D. ( 2012 ) ‘Addressing Attribution of Cause and Effect in Small n Impact Evaluations: Towards an Integrated Framework’. Working Paper 15. New Delhi: International Initiative for Impact Evaluation .

Wickson F. Carew A. ( 2014 ) ‘Quality Criteria and Indicators for Responsible Research and Innovation: Learning from Transdisciplinarity’ , Journal of Responsible Innovation , 1 / 3 : 254 – 73 .

Wickson F. Carew A. Russell A. W. ( 2006 ) ‘Transdisciplinary Research: Characteristics, Quandaries and Quality,’ Futures , 38 / 9 : 1046 – 59

What makes a good-quality research study?

Judging the quality of a study is not an exact science, but there are certain hallmarks that we can look for to figure out how much faith we should place in the findings of a piece of research.

There are two aspects that it’s important to think about here: the design of the study, and whether this makes the research capable of answering the question that you are interested in; and any problems with the study that might represent “threats” to the validity of the findings.

When trying to judge a study, you are ultimately trying to assess whether it has been well designed, well conducted and well reported. It should be based on a clear research question, and it should use a sound research methodology, with limitations fully acknowledged. The data should be collected and analysed carefully, and all the analyses that were conducted should be reported transparently.

How does it work in the classroom?

Speaking to Tes, Professor Becky Francis, chief executive of the Education Endowment Foundation, said that in order for research evidence to have a “positive and empowering effect”, it is vital that all of those involved in education, from teachers and school leaders to policymakers, take the time to really “question, critique and discriminate” between the many claims of evidence-informed approaches that they encounter, rather than simply accepting them at face value.

As responsible professionals, teachers must “exercise professional judgement when assessing the rigour with which research evidence has been produced”, she said, and take time to consider the relevance of a research study to the particular context. Thinking about the feasibility of implementing evidenced approaches in your environment is one way of doing this, she offered, and so it may also be helpful to ask some simple but significant questions:

  • Will the approach need to be adapted to fit my local context?
  • How much organisational capacity might it require to embed the practice? Can we afford to make this commitment?
  • Are teachers and others likely to want to adopt the practice?  

The Education Endowment Foundation (EEF) is an independent charity dedicated to breaking the link between family income and educational achievement.

To achieve this, it summarises the best available evidence for teachers; its Teaching and Learning Toolkit, for example, is used by 70 per cent of secondary schools.

The charity also generates new evidence of “what works” to improve teaching and learning, by funding independent evaluations of high-potential projects, and supports teachers and senior leaders to use the evidence to achieve the maximum possible benefit for young people.

Assessing the quality of research

Paul Glasziou

1 Department of Primary Health Care, University of Oxford, Oxford OX3 7LF

Jan Vandenbroucke

2 Leiden University Medical School, Leiden 9600 RC, Netherlands

Iain Chalmers

3 James Lind Initiative, Oxford OX2 7LG

Short abstract

Inflexible use of evidence hierarchies confuses practitioners and irritates researchers. So how can we improve the way we assess research?

The widespread use of hierarchies of evidence that grade research studies according to their quality has helped to raise awareness that some forms of evidence are more trustworthy than others. This is clearly desirable. However, the simplifications involved in creating and applying hierarchies have also led to misconceptions and abuses. In particular, criteria designed to guide inferences about the main effects of treatment have been uncritically applied to questions about aetiology, diagnosis, prognosis, or adverse effects. So should we assess evidence the way Michelin guides assess hotels and restaurants? We believe five issues should be considered in any revision or alternative approach to helping practitioners to find reliable answers to important clinical questions.

Different types of question require different types of evidence

Ever since two American social scientists introduced the concept in the early 1960s, 1 hierarchies have been used almost exclusively to determine the effects of interventions. This initial focus was appropriate but has also engendered confusion. Although interventions are central to clinical decision making, practice relies on answers to a wide variety of types of clinical questions, not just the effect of interventions. 2 Other hierarchies might be necessary to answer questions about aetiology, diagnosis, disease frequency, prognosis, and adverse effects. 3 Thus, although a systematic review of randomised trials would be appropriate for answering questions about the main effects of a treatment, it would be ludicrous to attempt to use it to ascertain the relative accuracy of computerised versus human reading of cervical smears, the natural course of prion diseases in humans, the effect of carriership of a mutation on the risk of venous thrombosis, or the rate of vaginal adenocarcinoma in the daughters of pregnant women given diethylstilboestrol. 4

To answer their everyday questions, practitioners need to understand the “indications and contraindications” for different types of research evidence. 5 Randomised trials can give good estimates of treatment effects but poor estimates of overall prognosis; comprehensive non-randomised inception cohort studies with prolonged follow up, however, might provide the reverse.

Systematic reviews of research are always preferred

With rare exceptions, no study, whatever the type, should be interpreted in isolation. Systematic reviews are required of the best available type of study for answering the clinical question posed. 6 A systematic review does not necessarily involve quantitative pooling in a meta-analysis.

Although case reports are a less than perfect source of evidence, they are important in alerting us to potential rare harms or benefits of an effective treatment. 7 Standardised reporting is certainly needed, 8 but too few people know about a study showing that more than half of suspected adverse drug reactions were confirmed by subsequent, more detailed research. 9 For reliable evidence on rare harms, therefore, we need a systematic review of case reports rather than a haphazard selection of them. 10 Qualitative studies can also be incorporated in reviews—for example, the systematic compilation of the reasons for non-compliance with hip protectors derived from qualitative research. 11

Level alone should not be used to grade evidence

The first substantial use of a hierarchy of evidence to grade health research was by the Canadian Task Force on the Preventive Health Examination. 12 Although such systems are preferable to ignoring research evidence or failing to provide justification for selecting particular research reports to support recommendations, they have three big disadvantages. Firstly, the definitions of the levels vary within hierarchies so that level 2 will mean different things to different readers. Secondly, novel or hybrid research designs are not accommodated in these hierarchies—for example, reanalysis of individual data from several studies or case crossover studies within cohorts. Thirdly, and perhaps most importantly, hierarchies can lead to anomalous rankings. For example, a statement about one intervention may be graded level 1 on the basis of a systematic review of a few, small, poor quality randomised trials, whereas a statement about an alternative intervention may be graded level 2 on the basis of one large, well conducted, multicentre, randomised trial.

This ranking problem arises because of the objective of collapsing the multiple dimensions of quality (design, conduct, size, relevance, etc) into a single grade. For example, randomisation is a key methodological feature in research into interventions, 13 but reducing the quality of evidence to a single level reflecting proper randomisation ignores other important dimensions of randomised clinical trials. These might include:

  • Other design elements, such as the validity of measurements and blinding of outcome assessments
  • Quality of the conduct of the study, such as loss to follow up and success of blinding
  • Absolute and relative size of any effects seen
  • Confidence intervals around the point estimates of effects.

None of the current hierarchies of evidence includes all these dimensions, and recent methodological research suggests that it may be difficult for them to do so. 14 Moreover, some dimensions are more important for some clinical problems and outcomes than for others, which necessitates a tailored approach to appraising evidence. 15 Thus, for important recommendations, it may be preferable to present a brief summary of the central evidence (such as “double-blind randomised controlled trials with a high degree of follow up over three years showed that...”), coupled with a brief appraisal of why particular quality dimensions are important. This broader approach to the assessment of evidence applies not only to randomised trials but also to observational studies. In the final recommendations, there will also be a role for other types of scientific evidence—for example, on aetiological and pathophysiological mechanisms—because concordance between theoretical models and the results of empirical investigations will increase confidence in the causal inferences. 16 , 17
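The ranking anomaly described above can be made concrete with a small sketch. It keeps the quality dimensions as separate fields and prints a brief descriptive summary, next to a single hierarchy level for contrast. All field names and all numbers are invented for illustration; none come from the article or from real studies.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    design: str               # free-text description of the study design
    hierarchy_level: int      # single grade; 1 is "best" in this sketch
    n_participants: int
    blinded_outcomes: bool
    loss_to_follow_up: float  # proportion lost, 0.0 to 1.0
    relative_risk: float
    ci_low: float
    ci_high: float


def describe(e: Evidence) -> str:
    """Summarise the quality dimensions instead of collapsing them to a level."""
    blinding = "blinded" if e.blinded_outcomes else "unblinded"
    return (
        f"{e.design}, n={e.n_participants}, {blinding} outcome assessment, "
        f"{e.loss_to_follow_up:.0%} lost to follow up, "
        f"RR {e.relative_risk:.2f} (95% CI {e.ci_low:.2f} to {e.ci_high:.2f})"
    )


# Invented values: a review of a few small, weak trials versus one large trial.
small_review = Evidence("systematic review of 3 small RCTs", 1, 120, False, 0.30, 0.80, 0.40, 1.60)
large_trial = Evidence("single multicentre RCT", 2, 5000, True, 0.05, 0.85, 0.75, 0.96)

# Ranking on level alone prefers the small review ...
top_by_level = min([small_review, large_trial], key=lambda e: e.hierarchy_level)
print("Top by level:", top_by_level.design)

# ... while the descriptive summaries expose the dimensions the level hides.
for e in (small_review, large_trial):
    print(describe(e))
```

The design choice here is simply to postpone any ranking: the reader sees size, blinding, follow up, and precision side by side and can weigh whichever dimensions matter for the clinical problem at hand.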

What to do when systematic reviews are not available

Although hierarchies can be misleading as a grading system, they can help practitioners find the best relevant evidence among a plethora of studies of diverse quality. For example, to answer a therapeutic question, the hierarchy would suggest first looking for a systematic review of randomised controlled trials. However, only a fraction of the hundreds of thousands of reports of randomised trials have been considered for possible inclusion in systematic reviews. 18 So when there is no existing review, a busy clinician might next try to identify the best of several randomised trials. If the search fails to identify any randomised trials, non-randomised cohort studies might be informative. For non-therapeutic questions, however, search strategies should accommodate the need for observational designs that answer questions about aetiology, prognosis, or adverse effects. 19 Whatever evidence is found, this should be clearly described rather than simply assigned to a level. Such considerations have led the authors of the BMJ's Clinical Evidence to use a hierarchy for finding evidence but to forgo grading evidence into levels. Instead, they make explicit the type of evidence on which their conclusions are based.
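The search order for a therapeutic question described above can be sketched as a simple fallback. The function name, the design labels, and the study identifiers are hypothetical; the sketch uses the hierarchy only to decide where to look first and returns the type of evidence found rather than a level.

```python
# Order of preference for a therapeutic question, following the text:
# systematic review first, then randomised trials, then cohort studies.
THERAPY_ORDER = [
    "systematic review of randomised trials",
    "randomised trial",
    "non-randomised cohort study",
]


def best_available(found, order=THERAPY_ORDER):
    """Return (design, studies) for the first design in `order` with any studies.

    `found` maps a design label to the list of studies a search turned up.
    Returns None when nothing in the preferred designs was found.
    """
    for design in order:
        studies = found.get(design, [])
        if studies:
            return design, studies
    return None


# Invented search result: no systematic review exists, but two trials do.
found = {
    "randomised trial": ["trial A", "trial B"],
    "non-randomised cohort study": ["cohort C"],
}
print(best_available(found))
```

For non-therapeutic questions a different `order` would be passed in, for example one that starts from observational designs, in line with the caveat in the paragraph above.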

Balanced assessments should draw on a variety of types of research

For interventions, the best available evidence for each outcome of potential importance to patients is needed. 20 Often this will require systematic reviews of several different types of study. As an example, consider a woman interested in oral contraceptives. Evidence is available from controlled trials showing their contraceptive effectiveness. Although contraception is the main intended beneficial effect, some women will also be interested in the effects of oral contraceptives on acne or dysmenorrhoea. These may have been assessed in short term randomised controlled trials comparing different contraceptives. Any beneficial intended effect needs to be weighed against possible harms, such as increases in thromboembolism and breast cancer. The best evidence for such potential harms is likely to come from non-randomised cohort studies or case-control studies. For example, fears about negative consequences on fertility after long term use of oral contraceptives were allayed by such non-randomised studies. The figure gives an example of how all this information might be amalgamated into a balance sheet. 21 , 22

Figure: Example of possible evidence table for short and long term effects of oral contraceptives. (Absolute effects will vary with age and other risk factors such as smoking and blood pressure. RCT = randomised controlled trial)

Sometimes, rare, dramatic adverse effects detected with case reports or case control studies prompt further investigation and follow up of existing randomised cohorts to detect related but less severe adverse effects. For example, the case reports and case-control studies showing that intrauterine exposure to diethylstilboestrol could cause vaginal adenocarcinoma led to further investigation and follow up of the mothers and children (male as well as female) who had participated in the relevant randomised trials. These investigations showed several less serious but more frequent adverse effects of diethylstilboestrol that would have otherwise been difficult to detect. 4

Conclusions

Given the flaws in evidence hierarchies that we have described, how should we proceed? We suggest that there are two broad options: firstly, to extend, improve, and standardise current evidence hierarchies 22 ; and, secondly, to abolish the notion of evidence hierarchies and levels of evidence, and concentrate instead on teaching practitioners general principles of research so that they can use these principles to appraise the quality and relevance of particular studies. 5

We have been unable to reach a consensus on which of these approaches is likely to serve the current needs of practitioners more effectively. Practitioners who seek immediate answers cannot embark on a systematic review every time a new question arises in their practice. Clinical guidelines are increasingly prepared professionally—for example, by organisations of general practitioners and of specialist physicians or the NHS National Institute for Clinical Excellence—and this work draws on the results of systematic reviews of research evidence. Such organisations might find it useful to reconsider their approach to evidence and broaden the type of problems that they examine, especially when they need to balance risks and benefits. Most importantly, however, the practitioners who use their products should understand the approach used and be able to judge easily whether a review or a guideline has been prepared reliably.

Evidence hierarchies with the randomised trial at the apex have been pivotal in the ascendancy of numerical reasoning in medicine over the past quarter century. 17 Now that this principle is widely appreciated, however, we believe that it is time to broaden the scope by which evidence is assessed, so that the principles of other types of research, addressing questions on aetiology, diagnosis, prognosis, and unexpected effects of treatment, will become equally widely understood. Indeed, maybe we do have something to learn from Michelin guides: they have separate grading systems for hotels and restaurants, provide the details of the several quality dimensions behind each star rating, and add a qualitative commentary ( www.viamichelin.com ).

Summary points

  • Different types of research are needed to answer different types of clinical questions
  • Irrespective of the type of research, systematic reviews are necessary
  • Adequate grading of quality of evidence goes beyond the categorisation of research design
  • Risk-benefit assessments should draw on a variety of types of research
  • Clinicians need efficient search strategies for identifying reliable clinical research

Supplementary Material

We thank Andy Oxman and Mike Rawlins for helpful suggestions.

Contributors and sources: As a general practitioner, PG uses his own and others' evidence assessments, and as a teacher of evidence based medicine helps others find and appraise research. JV is an internist and epidemiologist by training; he has extensively collaborated in clinical research, which made him strongly aware of the diverse types of evidence that clinicians use and need. IC's interest in these issues arose from witnessing the harm done to patients from eminence based medicine.

Competing interests: None declared.

Assessing the quality of research

This article has a correction. Please see:

  • Errata - September 09, 2004
  • Paul Glasziou , reader 1 ,
  • Jan Vandenbroucke , professor of clinical epidemiology 2 ,
  • Iain Chalmers , editor, James Lind library 3
  • 1 Department of Primary Health Care, University of Oxford, Oxford OX3 7LF
  • 2 Leiden University Medical School, Leiden 9600 RC, Netherlands
  • 3 James Lind Initiative, Oxford OX2 7LG
  • Correspondence to: P Glasziou
  • Accepted 20 October 2003

Inflexible use of evidence hierarchies confuses practitioners and irritates researchers. So how can we improve the way we assess research?

The widespread use of hierarchies of evidence that grade research studies according to their quality has helped to raise awareness that some forms of evidence are more trustworthy than others. This is clearly desirable. However, the simplifications involved in creating and applying hierarchies have also led to misconceptions and abuses. In particular, criteria designed to guide inferences about the main effects of treatment have been uncritically applied to questions about aetiology, diagnosis, prognosis, or adverse effects. So should we assess evidence the way Michelin guides assess hotels and restaurants? We believe five issues should be considered in any revision or alternative approach to helping practitioners to find reliable answers to important clinical questions.

Different types of question require different types of evidence

Ever since two American social scientists introduced the concept in the early 1960s, 1 hierarchies have been used almost exclusively to determine the effects of interventions. This initial focus was appropriate but has also engendered confusion. Although interventions are central to clinical decision making, practice relies on answers to a wide variety of types of clinical questions, not just the effect of interventions. 2 Other hierarchies might be necessary to answer questions about aetiology, diagnosis, disease frequency, prognosis, and adverse effects. 3 Thus, although a systematic review of randomised trials would be appropriate for answering questions about the main effects of a treatment, it would be ludicrous to attempt to use it to ascertain the relative accuracy of computerised versus human reading of cervical smears, the natural course of prion diseases in humans, the effect of carriership of a mutation on the risk of venous thrombosis, or the rate of vaginal adenocarcinoma in the daughters of pregnant women given diethylstilboesterol. 4

To answer their everyday questions, practitioners …

  • Open access
  • Published: 03 September 2024

Evaluating coupling coordination between urban smart performance and low-carbon level in China’s pilot cities with mixed methods

  • Xiongwei Zhu 1 ,
  • Dezhi Li 1 , 2 ,
  • Shenghua Zhou 1 ,
  • Shiyao Zhu 3 &
  • Lugang Yu 1  

Scientific Reports volume 14, Article number: 20461 (2024)

  • Climate-change adaptation
  • Climate-change impacts
  • Environmental impact
  • Sustainability

The construction models of smart cities and low-carbon cities are crucial for advancing global urbanization, enhancing urban governance, and addressing major urban challenges. Despite significant advancements in smart and low-carbon city research, a consensus on their coupling coordination remains elusive. This study employs mixed-method research, combining qualitative and quantitative analyses, to investigate the coupling coordination between urban smart performance (SCP) and low-carbon level (LCL) across 52 typical smart and low-carbon pilot cities in China. Independent evaluation models for SCP and LCL qualitatively assess the current state of smart and low-carbon city construction. Additionally, an Entropy–TOPSIS–Pearson correlation–Coupling coordination degree (ETPC) analysis model quantitatively examines their relationship. The results reveal that smart city initiatives in China significantly outperform low-carbon city development, with notable disparities in SCP and LCL between eastern, non-resource-based, and central cities versus western, resource-dependent, and peripheral cities. A strong positive correlation exists between urban SCP and overall LCL, with significant correlations in management, society, and economy, and moderate to weak correlations in environmental quality and culture. As SCP levels improve, the coupling coordination degree between the urban SCP and LCL systems also increases, driven primarily by economic, management, and societal factors. Conversely, the subsystems of low-carbon culture and environmental quality show poorer integration. Based on these findings, this study proposes an evaluation system for smart and low-carbon coupling coordination development, outlining pathways for future development from the perspective of urban complex systems.

Introduction

Cities, as centers of population and economy, play crucial roles in cultural exchange, social integration, transportation, communication, and disaster response in modern societal development 1 , 2 . According to the United Nations Human Settlements program’s “2022 World Cities Report”, as of 2021, the global urbanization rate has reached 56%, and it is projected that by 2050, an additional 2.2 billion people will live in cities, increasing the urbanization rate to 68% 3 . North America and European countries are approaching urbanization saturation, with little fluctuation expected, while urbanization in Asia and Africa will accelerate notably 4 . Particularly in China, the world’s second-largest economy, as of 2022, the urbanization rate is only 64.7%, ranking 96th globally, indicating significant potential for growth compared to developed countries like the USA and the UK 5 . The Chinese government places high importance on urbanization development. It was clearly stated in the “2020 State Council Government Work Report” that new urbanization is a key measure for achieving China’s modernization. Moreover, in the “14th Five-Year Plan (2021–2025) and the Long-Range Objectives Through the Year 2035”, detailed strategies are outlined for optimizing the urban layout and promoting urban–rural integration, among other policies to advance urbanization 6 . However, urbanization, as a process of continuous concentration of population and industrial elements in cities, while bringing opportunities for economic growth and social development, also presents a series of challenges such as environmental pressure, resource constraints, and increased demand for services 7 , 8 .

In 2008, the American company IBM introduced the concept of a “Smart Planet”, which garnered widespread attention globally 9 . The concept of a smart city, as a specific application within this framework, aims to enhance urban management and service efficiency through the integration and innovative application of Information and Communication Technology (ICT), thereby improving the quality of life for residents, optimizing resource use, reducing environmental impact, and promoting economic development and social progress 10 , 11 . Currently, the smart city construction model is seen as one of the effective means to advance global urbanization, improve urban governance, and solve major urban issues 12 . In 2009, IBM released the “Smart Planet: Winning in China” plan, outlining China’s five major thematic tasks in constructing a “Smart Planet” (sustainable economic development, corporate competitiveness, energy efficiency, environmental protection, and social harmony) 13 . The construction of smart cities, as a key measure to achieve these thematic tasks, has received significant attention from the Chinese government. In 2014, the Chinese government elevated smart city construction to a “national strategy”, considering it a cornerstone of China’s future economic and urban development strategies. By 2016, over 500 Chinese cities had initiated or announced smart city pilot construction plans, accounting for nearly half of all such projects planned or underway globally 14 . In recent years, with the continuous release of policy benefits related to smart city construction in China and substantial capital investment, China has become a leader in driving global smart city initiatives 15 . 
However, an undeniable fact is that while smart city construction models promote economic development and improve the quality of life for residents, the new infrastructure supporting the operation of smart cities, such as big data centers, 5G shared base stations, and Beidou ground-based augmentation stations, result in substantial energy consumption and significant carbon emissions 16 . Research shows that in 2018, the total electricity consumption of data centers in China supporting IT infrastructure reached 160.9 billion kilowatt-hours, exceeding the total electricity consumption of Shanghai for that year and accounting for about 2% of China’s total electricity consumption, with carbon emissions nearing 100 million tons 17 . The Environmental Defense Fund (EDF) predicts that by 2035, the total electricity consumption of China’s data centers and 5G base stations will reach 695.1–782 billion kilowatt-hours, accounting for 5–7% of China’s total electricity consumption, with total carbon emissions reaching 230–310 million tons 18 .

In 2022, global energy-related CO 2 emissions increased by 0.9%, reaching a record high of over 36.8 Gt. Concurrently, atmospheric CO 2 concentrations continued to rise, averaging 417.06 parts per million, marking the eleventh consecutive year with an increase exceeding 2 ppm 19 . According to the World Meteorological Organization (WMO), the global surface temperature in September 2023 was 1.44 °C higher than the twentieth century average, setting a new historical record 20 . The continuous rise in global temperatures has led to frequent occurrences of disastrous events such as extreme heat, torrential rains, floods, forest fires, and hurricanes in recent years, causing significant loss of life and property damage 21 . World Health Organization (WHO) data indicates that in 2022, there were at least 29 weather disaster events globally causing billions of dollars in losses, with approximately 61,672 deaths in Europe due to heatwave-related causes 22 . As global climate issues become increasingly severe, the call for global carbon emission reduction is growing louder. Cities, as highly concentrated areas of population and economic activities, according to the Global Report by the United Nations Human Settlements Programme (UN-Habitat), consume 60–80% of the global energy and contribute to over 75% of global CO 2 emissions 23 . As the largest global emitter of carbon, China’s CO 2 emissions in 2022 accounted for 27% of the global total 24 . Given China’s influence in the global economy, technological innovation, and international cooperation, international organizations and global climate policies generally believe that China’s efforts in carbon reduction are crucial to achieving the global 1.5 °C climate goal 25 . In recent years, the Chinese government has actively promoted the construction of low-carbon pilot cities. To date, three batches of low-carbon pilot cities have been implemented in China, bringing the total number of such cities to 81 26 .

However, the report “China’s Digital Infrastructure Decarburization Path: Data Centers and 5G Carbon Reduction Potential and Challenges (2020–2035)” indicates that compared to peak carbon emissions expected around 2025 in key sectors like steel, building materials, and non-ferrous metals in China, the “lock-in effect” of carbon emissions from digital infrastructure poses a significant challenge to achieving China’s peak carbon and carbon neutrality goals 27 , 28 , 29 . Given the urgency of global climate change, it raises the question of the correlation between smart cities and low-carbon cities: is it positive, negative, or non-existent? Should the pace of smart city development be slowed to achieve sustainable urban development goals, considering the significant carbon dioxide emissions resulting from current technological choices, social habits, and policy frameworks? To address these practical issues, it is first essential to conduct an objective and accurate assessment of urban SCP and LCL. However, due to the complexity and diversity of urban carbon emissions sources, current measurement and estimation techniques fail to capture all emission types. This limitation hampers the ability to obtain comprehensive, accurate, and timely city-level carbon emission data 30 , 31 . To address this challenge, this paper decomposes smart cities and low-carbon cities into their interdependent and interactive subsystems (i.e., economic, political, cultural, social, and ecological) viewed through the lens of urban complex systems. It then develops evaluation models for both city types and conducts empirical analyses in 52 representative Chinese pilot cities. Based on these analyses, the paper elucidates the coupling coordination degree between SCP and LCL and proposes a specific pathway for their coordinated development.

This paper is therefore structured as follows: “ Literature review ” section offers an overview of the relevant literature, laying the foundation for the introduction of SCP and LCL. Subsequently, SCP and LCL are identified clearly, and measurement based on a mixed method for the coupling coordination degree is established in “ Methodology ” section, followed by a case demonstration for the introduced method in “ Results ” section and the demonstration results analysis in “ Discussions and implications ” section. Finally, “ Conclusions ” section summarizes the study’s main findings and contributions, discusses its limitations, and suggests directions for future research.

Literature review

Evaluation of smart city: contents, methods, and subjects

The evaluation of smart cities is a central research area within the smart city development field. Developing standardized evaluation criteria serves the dual purpose of defining smart city development boundaries and scientifically measuring its effectiveness. This, in turn, facilitates the achievement of development goals centered on evaluation-driven construction, improvement, and management 32 . We searched the Web of Science (WoS) Core Collection database for articles published from January 2019 to January 2024 using the query “smart city*” AND “evaluation”, which resulted in the selection of 82 articles.

To facilitate a clearer understanding for readers of current research on smart city evaluation, we have categorized it by evaluation contents , evaluation methods , and evaluation subjects .

Cluster 1-evaluation contents (What to evaluate) , including smart city evaluation dimensions and indicators. Analysis of the article content shows that most smart city evaluation approaches align with six core dimensions: economy, quality of life, governance, people, mobility, and environment 13 , 15 . Centered around these six dimensions, international organizations (ISO, ETSI, UN, and ITU) and scholars have established various sets of smart city evaluation indicators, considering the interdependencies among urban economic, environmental, and social factors, all in alignment with the goals of sustainable urban development 32 , 33 , 34 . Notably, Sharifi 35 compiled a comprehensive list of indicators incorporating a wide range of assessment schemes. This list not only covers the scope of the evaluation indicators (project/community/city) and their data types (primary/secondary) but also considers the stages of smart city development (planning/operation) and stakeholder involvement 36 . Subsequent research predominantly utilizes the same criteria as Sharifi 35 to identify indicator sets, taking into account the specific needs of each city and defining the spatial and temporal scales of the indicator sets 37 .

Cluster 2-evaluation methods (How to evaluate) , including smart city evaluation methods and tools. Research in this field focuses on three main areas: identifying evaluation indicators for smart cities, computing composite index, and developing evaluation models 38 , 39 . Methods for indicator identification mainly include literature review, case studies, brainstorming, the Delphi method, and data-driven techniques 40 , 41 . The Analytic Hierarchy Process (AHP) is commonly used for calculating composite indices, yet it faces issues like subjective biases and data size limitations 42 . Alternative methods, such as the Analytical Network Process (ANP) and the Decision-Making Trial and Evaluation Laboratory (DEMATEL), are used to address these drawbacks by simulating inter-indicator interactions. Additionally, techniques like Principal Component Analysis (PCA) and Data Envelopment Analysis (DEA) are applied for indicator weighting. Finally, smart city evaluation models are constructed to aggregate various dimensions and indicators into a unified score, facilitating project comparison and ranking, and highlighting areas needing improvement 43 , 44 .
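As an illustration of how AHP turns expert judgements into indicator weights, the following is a minimal sketch only. It assumes the row geometric-mean approximation of the priority vector rather than the exact principal eigenvector, and the function names and the three-dimension judgement matrix are hypothetical, not taken from the studies cited above.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important indicator i is than j
    (so pairwise[j][i] should be its reciprocal).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_index(pairwise, weights):
    """Saaty's consistency index CI = (lambda_max - n) / (n - 1), for n >= 2."""
    n = len(pairwise)
    # Estimate lambda_max as the mean of (A w)_i / w_i over all rows.
    lambda_max = sum(
        sum(pairwise[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ) / n
    return (lambda_max - n) / (n - 1)

# Hypothetical judgements for three dimensions (illustrative values only).
judgements = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(judgements)
ci = consistency_index(judgements, weights)
```

A consistency index near zero indicates that the judgements agree with one another. Because the weights depend entirely on the pairwise matrix supplied by the experts, this sketch also shows where the subjective bias mentioned above enters.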

Cluster 3-evaluation subjects (Who performs the evaluation) , including smart city stakeholders and participants. Smart city evaluations involve various stakeholders and participants. These complex processes see each entity, including government agencies, international organizations, academic institutions, industry sectors, and NGOs, contributing to the smart cities’ planning, development, and management 45 , 46 . Key organizations in this realm are the International Organization for Standardization (ISO), International Telecommunication Union (ITU), United Nations Human Settlements Programme (UN-Habitat), Smart Cities Council, European Institute of Innovation and Technology (EIT Urban Mobility), and World Council on City Data (WCCD). Additionally, numerous countries have established their own smart city evaluation standards to direct and review smart city progress 11 . Notable examples are the “One New York: The Plan for a Strong and Just City” in the USA, the “BSI PAS 180” in the UK, Singapore's “Smart Nation Initiative”, and China’s “National New-type Smart City Evaluation Indicator System”.

Evaluation of low-carbon city: contents, methods, and subjects

As more countries integrate low-carbon city development into their national strategies and plans, conducting scientific evaluations of cities’ current low-carbon development levels to encourage them to adopt corresponding measures for improvement has become a key strategy in advancing cities towards a low-carbon future 47 . We searched the WoS Core Collection database for studies spanning January 2018 to January 2023 using “low-carbon city*” AND “evaluation” as keywords, and identified 98 pertinent articles through two rounds of screening.

This section, maintaining the research framework of “ Evaluation of smart city: contents, methods, and subjects ” section ( evaluation contents, methods, and subjects ), organizes low-carbon city research to enable comparison with smart city evaluations.

Cluster 1-evaluation contents (what to evaluate), including low-carbon city evaluation systems, dimensions, and indicators. Current research focusing on low-carbon cities primarily spans six key domains: urban low-carbon scale, energy, behavior, policy, mobility, and carbon sinks. The evaluation dimensions for low-carbon cities are mainly divided into two types: single-criterion systems concentrating on specific low-carbon aspects (such as low-carbon economy, low-carbon energy, etc.), and comprehensive multi-criteria systems assessing the overall urban low-carbon development 48 , 49 . Compared to single-criterion evaluation systems, comprehensive and multi-criteria evaluation systems are increasingly gaining attention from scholars. These scholars share the view that low-carbon city construction is a diverse, dynamic, interconnected process that requires comprehensive consideration of various urban aspects, including economy, society, and environment, and involves coordinating the actions of different stakeholders to achieve sustainable urban development 50 , 51 . Additionally, international institutions and many national governments have also published low-carbon city evaluation frameworks from the perspective of comprehensive and multi-criteria evaluation systems. The most notable examples include the United Nations Commission on Sustainable Development, which set 30 indicators from four dimensions: social, environmental, economic, and institutional, to evaluate the level of urban low-carbon development. The Chinese Academy of Social Sciences proposed the “China Low Carbon City Indicator System”, covering 8 dimensions such as economy, energy, facilities, and 25 specific indicators including energy intensity, per capita carbon emissions, and forest coverage rate.

Cluster 2-evaluation methods (How to evaluate) , including low-carbon city evaluation methods and tools. Firstly, identifying evaluation indicators is the initial step in constructing a low-carbon city evaluation model. Current research methods include not only traditional methods like literature review and expert interviews but also, increasingly, dynamic perspectives based on urban complex systems, applying models like DPSR (Driving forces-Pressures-State-Response), STIRPAT (Stochastic Impacts by Regression on Population, Affluence, and Technology), the Environmental Kuznets Curve (EKC), and STEEP (Social, Technological, Economic, Ecological, and Political) for indicator identification 52 , 53 . Secondly, weighting evaluation indicators, an essential part of model construction, typically involves methods like subjective weighting (expert scoring, Delphi method, AHP) 54 , objective weighting (PCA, Entropy weight method, variance analysis), and combined weighting (DEA) 55 . Each method has its characteristics and suitable scenarios and should be selected according to specific circumstances. Additionally, quantitative assessment of regional carbon emissions using methods like carbon footprint analysis, baseline emission comparison, and Life Cycle Assessment (LCA) is also becoming a research focus 56 .
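To make the objective weighting idea concrete, here is a minimal sketch of the entropy weight method in its commonly described form. It assumes the indicator matrix is already normalised to non-negative, larger-is-better values; the function name and the toy data are hypothetical and are not drawn from the reviewed studies.

```python
import math

def entropy_weights(matrix):
    """Return one weight per indicator (column) via the entropy weight method.

    matrix: list of rows (one per city), each a list of non-negative,
    already normalised indicator values where larger is better.
    """
    n = len(matrix)          # number of cities (needs n >= 2)
    m = len(matrix[0])       # number of indicators
    k = 1.0 / math.log(n)    # scales entropy into the range [0, 1]
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 1.0 if total == 0 else 0.0  # an all-zero column carries no information
        if total > 0:
            for value in column:
                p = value / total
                if p > 0:
                    entropy -= k * p * math.log(p)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    total_div = sum(divergences)
    if total_div == 0:
        return [1.0 / m] * m  # no indicator discriminates: fall back to equal weights
    return [d / total_div for d in divergences]

# Toy data: indicator 0 is identical across cities, indicator 1 varies.
weights = entropy_weights([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
```

An indicator that takes the same value in every city has maximal entropy and receives a weight of about zero, while indicators that differ across cities receive more weight. This is what makes the method data-driven rather than judgement-driven.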

Cluster 3-evaluation subjects (Who performs the evaluation) , including low-carbon city stakeholders and participants. The evaluation of low-carbon cities also involves multiple stakeholders (government, enterprises, residents, etc.) 57 . Among them, international organizations like the International Organization for Standardization (ISO), the International Energy Agency (IEA), and the World Meteorological Organization (WMO) have played significant roles in establishing low-carbon city evaluation standards and promoting global low-carbon city development. Additionally, due to economic, policy, and perception factors, current low-carbon city construction relies primarily on government financial input, with social capital and public participation in low-carbon city construction noticeably lacking 58 . Therefore, how to enhance the awareness of enterprises and residents as main actors in low-carbon city construction has become a current research focus.

Coupling coordination analysis between SCP and LCL

Smart cities and low-carbon cities are important urban development models for the future, and their interrelation has drawn increasing attention from scholars in recent years, becoming an emerging research hotspot in the field. In the WoS Core Collection database, we searched for studies from January 2018 to January 2024 using the keywords “smart city*”, “low-carbon city*”, “correlation analysis”, “coupling coordination analysis”, and “urban sustainability”. After two rounds of screening, 24 related studies were selected for analysis.

From the perspective of research results, the current research conclusions about the correlation between low-carbon cities and smart cities primarily include two main points: (i) SCP and LCL cannot achieve coupling coordination development. Some scholars argue that SCP and LCL differ in their focus: SCP emphasizes urban technological and economic development, while LCL focuses more on urban ecological construction 17 . Particularly, De Jong identified 12 urban development concepts, including smart city, low-carbon city, eco-city, and green city. He believes that a clear distinction must be made in the conceptual definition of these types of cities to more accurately guide future urban planning 59 . Furthermore, some scholars argue that SCP and LCL are negatively correlated. Deakin believes that the direct environmental benefits of IoT technology are insufficient to achieve urban sustainability goals 60 . Barr et al. argue that the logic of smart cities often leads city administrations to prioritize superficial changes and promote individual behavioral shifts, detracting from the crucial task of reconfiguring urban infrastructure for low-carbon lifestyles 61 , 62 . (ii) SCP and LCL can achieve coupling coordination development. Some scholars believe there is a positive correlation between SCP and LCL, with SCP potentially promoting the development of LCL. Specifically, the intelligent systems built by SCP can effectively match urban energy supply and demand, reducing urban carbon emissions, such as through smart grids and intelligent transportation networks 18 . It is worth noting that most of the studies on the coupling coordination relationship between urban SCP and LCL are based on perspectives of individual urban subsystems such as technology, economy, management, industrial structure, and society. They lack a comprehensive consideration of the city as a complex system 59 , 61 , 63 .

From the perspective of research methodologies, coupling coordination analysis is a fundamental statistical approach for examining relationships between two or more variables. This analysis typically employs techniques such as Pearson’s correlation coefficient, Spearman’s rank correlation coefficient, Kendall’s tau, partial correlation, point-biserial correlation, and multiple correlations. Each technique offers unique insights into the nature and strength of the interdependencies among variables 61 . The selection of an appropriate method depends on the data type (continuous, ordinal, or categorical), its distribution (e.g., normal distribution), and the specific objectives of the research.
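The difference between two of the techniques named above can be seen in a small sketch. It implements Pearson's coefficient directly and obtains Spearman's coefficient as Pearson's coefficient on average ranks; the function names and the sample data are hypothetical and purely illustrative.

```python
import math

def pearson(x, y):
    """Pearson's r: strength of the linear relationship between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson(ranks(x), ranks(y))

# Illustrative data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_linear = pearson(x, y)
r_rank = spearman(x, y)
```

For a monotonic but non-linear relationship, the rank-based coefficient reaches 1 while Pearson's coefficient stays below 1. This is one reason the choice of method depends on the data type and distribution.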

In summary, although existing research has made significant contributions to the independent evaluation and advancement of smart cities and low-carbon cities, including their relevant construction content, main actors, as well as some specific measures such as empowering cities with data intelligence for low-carbon economic development and transitioning industrial structure to low-carbon, there are still some important knowledge gaps. On the one hand, current research primarily analyzes the coupling coordination relationship between urban SCP and LCL from the micro-perspective of individual urban subsystems such as economic and energy systems. This approach lacks a macroscopic perspective from the complex urban system, which is detrimental to the comprehensive development of cities 60 , 64 , 65 . On the other hand, current studies often conduct only basic comparisons of the development levels of urban SCP and LCL from either a quantitative or a qualitative perspective. They lack a comprehensive analytical approach that integrates both qualitative and quantitative analyses for further exploration of the coupling coordination relationship between urban SCP and LCL. This shortfall hinders the sustainable development of cities.

To fill these knowledge gaps, this study employs a mixed-methods approach, combining qualitative and quantitative analyses, to examine the model of coupling coordination between urban SCP and LCL. It also develops recommendations to enhance this coupling coordination, aiming to support sustainable development goals. Furthermore, this research selects 52 typical low-carbon and smart pilot cities in China as case studies, ensuring both scientific validity and practical applicability of the findings. Additionally, to enhance the logical coherence and readability of this study, we posit that a coupling coordination relationship exists between urban SCP and LCL and thus propose Hypothesis 1 .

Hypothesis 1

There is a substantial degree of coupling coordination between the overall urban system’s SCP and LCL, yet there are disparities in this coordination degree among the subsystems of economy, society, politics, culture, and ecology.

Methodology

Research framework

The construction of low-carbon and smart cities, as key pathways to urban sustainability, necessitates examining their interplay and fostering their collaborative development for achieving sustainability goals 66 . This research employs a sequential framework, including Conceptual, Data, Analysis, and Decision-making Layers, to methodically explore the coupling coordination relationship between SCP and LCL, with the framework illustrated in Fig.  1 .

Fig. 1. Research framework.

First, in the Conceptual Layer, this study aligns with the United Nations’ objectives for sustainable cities, encompassing economic growth, social equity, better living conditions, and improved urban environments. Integrating these with China’s “Five-Sphere Integrated Plan” for urban development (economy, politics, culture, society, and ecological environment construction), the research dissects the components of smart city systems (such as information infrastructure, information security, and public welfare services) and low-carbon city systems (including low-carbon construction, transportation, and industry) in order to collect indicators. Second, in the Data Layer, this research develops smart city and low-carbon city evaluation systems, grounded in national standards and official statistics, to qualitatively examine the correlation between SCP and LCL from a macro perspective. Third, in the Analysis Layer, this study selects 52 cities that are both smart and low-carbon pilot cities in China as samples for quantitative analysis. The process involves standardizing indicators and then scoring and ranking the cities based on their smart performance and low-carbon levels, followed by applying Pearson’s correlation coefficient and the coupling coordination degree model to analyze the correlation between SCP and LCL. Finally, in the Decision-making Layer, the study examines the coupling coordination relationship between urban smart performance, the overall low-carbon level, and the low-carbon level across five dimensions, which is key to testing Hypothesis 1. It also formulates development paths for the coupling coordination of smart and low-carbon cities.

SCP index system construction

Since the concept of smart cities was introduced in 2008, many national governments have established smart city evaluation standards. Due to varying national conditions, SCP evaluation indicators differ across countries. As the sample cities in this study are Chinese smart pilot cities, the selection of SCP evaluation indicators primarily references relevant Chinese national standards. As a global pioneer in smart city development, China released the “Evaluation indicators for new-type smart cities (GB/T 33356-2016)” in 2016 and revised it in 2022. This national standard, with its evaluative indicators, clearly defines the key construction content and development direction of new smart cities, aiming to specifically enhance the effectiveness and level of smart city construction, gaining significant recognition within the industry.

This study, grounded in the concept of a city’s “Five-in-One” sustainable development, is guided by three principles: “Inclusive well-being & Ecological harmony”, “Digital space & Physical space”, and “New IT technologies & Comprehensive services”. It also adheres to the “people-oriented concept” and adopts an “urban complex dynamic perspective” in the process of smart city construction. Additionally, it follows the principle of “similar attributes of evaluation objects”. On these foundations, the study establishes three criteria for selecting evaluation indicators: scientific soundness, coordination, and representativeness. Drawing on the Chinese government’s smart city evaluation standards and a literature review, this research constructs an SCP evaluation indicator system for cities, as detailed in Supplementary Appendix Table A1. The SCP index system comprises six primary indicators: smart public service (SPE), precise governance (PG), information infrastructure (II), digital economy (DE), innovative development environment (IDE), and citizen satisfaction (SCS). It also features 24 secondary indicators, such as traffic information services, grassroots smart governance, and spatio-temporal information platforms. Importantly, to explore the correlation between smart cities and low-carbon cities more effectively, the study deliberately omits “Internet + Green Ecology” related indicators from the smart city evaluation system, so that the SCP score does not already contain low-carbon content. To ensure the accuracy and representativeness of these indicators, they were validated through expert consultation, public participation, and comprehensive statistical methods.

LCL index system construction

Current international organizations and academic perspectives on low-carbon city evaluation systems are predominantly based on the urban complex systems approach, considering the interplay and interaction of aspects such as low-carbon society, economy, and technology. Consistent with the principles for selecting SCP evaluation indicators, the choice of LCL evaluation indicators in this study primarily adheres to relevant Chinese national standards and related literature.

As a proactive practitioner in global low-carbon city development, the Chinese government released the “Sustainable Cities and Communities—Guides for low-carbon development evaluation (GB/T 41152-2021)” in 2021. This national standard evaluates the level of urban low-carbon development, clarifies the key directions for such development, and serves as the current guide for low-carbon city construction in China. Thus, this study, grounded in the “Five-in-One” sustainable urban development framework and guided by the principles of “carbon reduction & pollution reduction”, “green economic growth”, and “enhanced carbon sequestration capacity”, applies the same three criteria used for the SCP indicators (scientific soundness, coordination, and representativeness) to select evaluation indicators. It establishes an LCL index system based on the Chinese government’s evaluation standards and relevant literature. Specifically, the LCL evaluation index system constructed in this study comprises five primary indicators: low-carbon economy (LCE), low-carbon society (LCS), low-carbon environmental quality (LCEQ), low-carbon management (LCM), and low-carbon culture (LCC). It also includes 22 secondary indicators, such as energy consumption per unit of GDP and carbon emission intensity, as shown in Supplementary Appendix Table A2. Similarly, to ensure the accuracy and representativeness of the indicators, they were validated through expert consultation, public participation, and comprehensive statistical methods.

Analysis model construction

In this study, an Entropy-TOPSIS-Pearson correlation-Coupling coordination degree (ETPC) analysis model is constructed to quantitatively analyze the coupling coordination relationship between Urban SCP and LCL. The entropy method is first applied for objective weighting of evaluation indices, ensuring data objectivity and reducing subjective bias, thus enhancing the model’s accuracy and fairness. Next, the TOPSIS method is used to rank sample cities based on their smart performance and low-carbon levels, providing a straightforward and intuitive ranking mechanism. The Pearson correlation method then examines the correlation between SCP and LCL, offering data-driven insights into the dynamic relationships between these variables. Finally, the coupling coordination model calculates the degree of coordination between SCP and LCL, providing a theoretical basis for subsequent enhancement pathways and policy recommendations. The ETPC model constructed in this study has several advantages and complementarities, allowing for a comprehensive analysis and evaluation of the research question from various perspectives. Additionally, the ETPC model can be broadly applied to other multidimensional evaluation and decision analysis issues, such as the coupling coordination between various public health interventions and community health levels, and the comprehensive effects of different economic policies on regional economic development and environmental impact. Specific analysis steps are outlined as follows.

Step 1: Conduct the data normalization process.

where \({x}_{ij}\) and \({y}_{ij}\) represent respectively the original and standardized values of indicator j for sample case i ( i = 1, 2, 3, …, m; j = 1, 2, 3, …, n ), \(\max ({x}_{j})\) and \(\min ({x}_{j})\) denote respectively the largest and smallest values of indicator j among all m samples, and \({P}_{ij}\) represents the proportion of indicator j in sample case i relative to the sum of that indicator over all cases.
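The normalization step can be illustrated with a short Python sketch. It assumes the common min-max form, \({y}_{ij}=({x}_{ij}-\min ({x}_{j}))/(\max ({x}_{j})-\min ({x}_{j}))\) for indicators where larger is better and the mirrored form for indicators where smaller is better, followed by \({P}_{ij}={y}_{ij}/\sum_{i}{y}_{ij}\). The exact equations are not reproduced in this text, so the formulas, the function names, the sample values, and the handling of a constant indicator are all assumptions made for illustration.

```python
def min_max_normalize(column, positive=True):
    """Scale one indicator column to [0, 1] (assumed min-max form)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant indicator carries no information, so map it to 0.
        return [0.0 for _ in column]
    if positive:
        # Larger raw values are better.
        return [(x - lo) / (hi - lo) for x in column]
    # Smaller raw values are better (e.g. energy use per unit of GDP).
    return [(hi - x) / (hi - lo) for x in column]


def proportions(normalized):
    """Share of each case in the column total, i.e. P_ij for a fixed j."""
    total = sum(normalized)
    if total == 0:
        return [0.0 for _ in normalized]
    return [y / total for y in normalized]


# Hypothetical values of one "smaller is better" indicator for four cities.
energy_per_gdp = [0.8, 0.5, 0.6, 0.3]
y = min_max_normalize(energy_per_gdp, positive=False)
p = proportions(y)
```

Here the city with the highest energy use per unit of GDP receives a normalized value of 0 and the city with the lowest receives 1.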

Step 2: Calculate the weight and measure the comprehensive level based on entropy method.

The entropy weight method is an objective approach that derives weights from the characteristics of the sample itself, which mitigates expert bias and enhances the objectivity and credibility of indicator weighting 67. This study employs this method, determining weights by calculating each indicator’s information entropy, and then measures the comprehensive level of the subsystem.

where m is the total number of sample cases, \({e}_{j}\) is the entropy value of indicator j , \({\omega }_{j}\) denotes the weight of indicator j , and V represents the comprehensive level.
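A minimal Python sketch of entropy weighting follows. It assumes the textbook form of the method: \({e}_{j}=-\frac{1}{\ln m}\sum_{i}{P}_{ij}\ln {P}_{ij}\), \({\omega }_{j}=(1-{e}_{j})/\sum_{j}(1-{e}_{j})\), and \({V}_{i}=\sum_{j}{\omega }_{j}{y}_{ij}\). These formulas, the function names, and the sample matrix are assumptions, since the original equations are not shown here.

```python
import math


def entropy_weights(norm_rows):
    """Entropy weights for a matrix of normalized values (rows = cities)."""
    m = len(norm_rows)
    n = len(norm_rows[0])
    diversities = []
    for j in range(n):
        column = [row[j] for row in norm_rows]
        total = sum(column)
        entropy = 0.0
        for y in column:
            p = y / total if total > 0 else 0.0
            if p > 0:  # convention: 0 * ln(0) is treated as 0
                entropy -= p * math.log(p)
        entropy /= math.log(m)
        diversities.append(1.0 - entropy)
    spread = sum(diversities)
    if spread == 0:
        # Assumption: fall back to equal weights when no indicator discriminates.
        return [1.0 / n for _ in range(n)]
    return [d / spread for d in diversities]


def comprehensive_levels(norm_rows, weights):
    """Weighted sum of normalized indicators for each city (V)."""
    return [sum(w * y for w, y in zip(weights, row)) for row in norm_rows]


# Hypothetical normalized data: three cities, two indicators.
rows = [
    [1.0, 0.50],
    [0.0, 0.50],
    [0.5, 0.45],
]
w = entropy_weights(rows)
v = comprehensive_levels(rows, w)
```

In this example the first indicator varies widely across cities while the second is nearly constant, so almost all of the weight goes to the first indicator.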

Step 3: Conduct a ranking of evaluation objects based on TOPSIS method.

A key limitation of the entropy method is its tendency to neglect the significance of indicators. The TOPSIS method, addressing this issue, is an ideal-solution-based ranking technique that aids in multi-objective decision-making among finite options 68 . In this approach, the study first determines positive and negative ideal solutions, measures each objective’s distance to these ideals, and subsequently ranks the subjects by the proximity of each objective to the ideal solution.

where \({ V}^{+}\) and \({V}^{-}\) respectively represent the best ideal solution and the worst ideal solution, \({D}_{i}^{+}\) and \({D}_{i}^{-}\) represent the distances from the objective to the positive and negative ideal solutions, respectively. \({C}_{i}\) indicates the closeness of the evaluation objective to the optimal solution, with \({C}_{i}\in \left[\text{0,1}\right]\) . A larger \({C}_{i}\) value suggests stronger smart and low-carbon development capabilities of the sample city.
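The ranking step can be sketched in Python as follows. The sketch assumes the standard TOPSIS formulation with Euclidean distances and \({C}_{i}={D}_{i}^{-}/({D}_{i}^{+}+{D}_{i}^{-})\), which matches the description of \({C}_{i}\in [0,1]\) above but is not confirmed by the equations, which are missing from this text. Function names, weights, and data are hypothetical.

```python
import math


def topsis_closeness(norm_rows, weights):
    """Closeness coefficient C_i for each row of a normalized matrix."""
    weighted = [[w * y for w, y in zip(weights, row)] for row in norm_rows]
    columns = list(zip(*weighted))
    best = [max(col) for col in columns]    # positive ideal solution V+
    worst = [min(col) for col in columns]   # negative ideal solution V-
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_neg = math.sqrt(sum((v - u) ** 2 for v, u in zip(row, worst)))
        denom = d_pos + d_neg
        # Assumption: if all rows are identical, report 0.0 instead of dividing by zero.
        scores.append(d_neg / denom if denom > 0 else 0.0)
    return scores


# Hypothetical normalized data for three cities and two indicators.
rows = [[1.0, 1.0], [0.0, 0.0], [0.5, 0.5]]
weights = [0.6, 0.4]
c = topsis_closeness(rows, weights)

# Indices of the cities from best to worst.
ranking = sorted(range(len(c)), key=lambda i: c[i], reverse=True)
```

A city that is best on every indicator scores 1, one that is worst on every indicator scores 0, and a city exactly halfway between them scores 0.5.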

Step 4: Analyze the correlation based on Pearson correlation method.

The Pearson correlation method is commonly used to measure the correlation coefficient between two continuous random variables, thereby assessing the degree of correlation between them 69 . In this study, based on the results from Steps 1–3, two sets of data are obtained representing the smart development level and low-carbon development level of sample cities, \(A:\left\{{A}_{1},{A}_{2},\dots ,{A}_{n}\right\}\) and \(B:\left\{{B}_{1},{B}_{2},\dots ,{B}_{n}\right\}\) . The overall means and covariance of both data sets are calculated, resulting in the Pearson correlation coefficient between the two variables.

where \({A}_{i}\) and \({B}_{i}\) respectively represent the SCP and LCL of sample cities, \(E\left(A\right)\) and \(E\left(B\right)\) are the overall means of the two data sets, \({\sigma }_{A}\text{ and }{\sigma }_{B}\) are their respective standard deviations, \(cov(A,B)\) is the covariance, and \({\rho }_{AB}\) is the Pearson correlation coefficient. The closer the coefficient is to 0, the weaker the linear relationship; the closer it is to −1 or +1, the stronger the relationship.
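The definition above, \({\rho }_{AB}=cov(A,B)/({\sigma }_{A}{\sigma }_{B})\), translates directly into a few lines of Python. The function name and the sample values below are illustrative only and are not taken from the study's data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient from population covariance and std devs."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least two values")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    if sd_a == 0 or sd_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_a * sd_b)


# Hypothetical closeness coefficients for four cities.
scp = [1.0, 2.0, 3.0, 4.0]
lcl = [1.0, 3.0, 2.0, 4.0]
rho = pearson(scp, lcl)
```

Because the same divisor n is used for the covariance and both standard deviations, the result is identical to the sample-based (n − 1) version of the coefficient.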

Step 5: Analyze the coupling coordination degree based on the coupling coordination model.

The coupling coordination degree characterizes the level of interaction between different systems and serves as a scientific model for measuring the coordinated development level of multiple subsystems or elements 70 . This study has developed a model to measure the coupling coordination degree between two systems.

where C denotes the coupling degree, \({f}_{1}\) and \({f}_{2}\) are the evaluation values of SCP and LCL respectively, and CPD represents the coupling coordination degree. \(\alpha\) and \(\beta\) are coefficients to be determined that indicate the relative importance of the two systems. This study assumes that the two systems are equally important, so \(\alpha =\beta =1/2.\)
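For two systems, a widely used form of the model is \(C=2\sqrt{{f}_{1}{f}_{2}}/({f}_{1}+{f}_{2})\), \(T=\alpha {f}_{1}+\beta {f}_{2}\), and coupling coordination degree \(=\sqrt{C\cdot T}\). The Python sketch below uses that form as an assumption, because the study's own equations are not reproduced in this text. The function name and the intermediate symbol T are likewise hypothetical.

```python
import math


def coupling_coordination(f1, f2, alpha=0.5, beta=0.5):
    """Return (coupling degree C, coupling coordination degree) for two systems."""
    if f1 < 0 or f2 < 0:
        raise ValueError("evaluation values must be non-negative")
    if f1 + f2 == 0:
        # Assumption: two systems at zero are treated as fully uncoordinated.
        return 0.0, 0.0
    c = 2 * math.sqrt(f1 * f2) / (f1 + f2)   # coupling degree, in [0, 1]
    t = alpha * f1 + beta * f2               # weighted overall development level
    return c, math.sqrt(c * t)


# Two balanced systems versus two unbalanced systems with a lower mean.
c_bal, d_bal = coupling_coordination(0.64, 0.64)
c_unbal, d_unbal = coupling_coordination(0.9, 0.1)
```

With equal scores the coupling degree is 1 and the coordination degree reduces to the square root of the common score. Unequal scores lower the coupling degree, so a city cannot reach a high coordination degree through one strong system alone.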

In this study, building upon the framework established by a preceding study, a classification system for the coupling coordination degree was developed. This system delineates the various types of coupling-coordinated development among SCP, LCL, LCS, LCM, LCEQ, and LCC. Current research on the division of coupling coordination degree intervals often uses an average distribution within the [0, 1] range 70 . However, due to the large sample size and the wide distribution range of coupling coordination degrees in this study, we have categorized these types into ten distinct levels based on their rank, as detailed in Table 1 .
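As a rough illustration of such a ten-level scheme, the sketch below maps a coupling coordination degree to a level. It assumes ten equal-width intervals on [0, 1], with each lower bound belonging to its interval. This is consistent with the level names and values quoted elsewhere in the text, but Table 1 remains the authoritative definition. Only the five level names that appear in the text are used, and the remaining levels are deliberately left unnamed.

```python
# Level names that appear in the text, keyed by 1-based level number.
# The interval assigned to each name is an assumption (equal 0.1-wide bins).
LEVEL_NAMES = {
    5: "on the verge of dysfunctional recession",
    6: "barely coupling coordination",
    7: "primary coupling coordination",
    8: "intermediate coupling coordination",
    9: "good coupling coordination",
}


def coordination_level(degree):
    """Return (level number 1..10, level name) for a coordination degree."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("coupling coordination degree must lie in [0, 1]")
    level = min(int(degree * 10), 9) + 1   # a degree of 1.0 falls in the top level
    name = LEVEL_NAMES.get(level, "level name not given in the text")
    return level, name


lowest = coordination_level(0.5201)    # lowest city value reported in the text
highest = coordination_level(0.8622)   # highest city value reported in the text
```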

Selection of sample cities and data collection

The Chinese government has prioritized the development of smart and low-carbon cities. Since 2010, it has launched 290 smart city pilots and 81 low-carbon city pilots across various regions, reflecting different levels of development, resource allocations, and operational foundations. To maintain the scientific integrity of our study, we established stringent criteria for selecting sample cities: (i) each city must be concurrently identified as both a smart and a low-carbon city pilot, and (ii) their government agencies must have issued data on key performance indicators for these initiatives. Following these criteria, our research has ultimately selected 52 cities as samples, as detailed in Fig.  2 . It is noteworthy that these 52 typical case cities are almost all provincial capitals in China, mostly located within the Yangtze River Delta, Pearl River Delta, Jingjinji (Beijing–Tianjin–Hebei), and Western Triangle economic regions. Additionally, according to the “Globalization and World Cities Research Network (GaWC) World Cities Roster 2022 (GaWC2022)”, these cities are ranked within the top 200 globally. Therefore, given the scope of this research, these case cities offer significant representativeness and can serve as valuable models for promoting development in other urban areas. The data for this paper were sourced from the “China Low-Carbon Yearbook (2010–2023)”, the “China Environmental Statistics Yearbook (2010–2023)”, and low-carbon city data published by the governments of the sample cities. Additionally, this study addressed any missing data by averaging the data from adjacent years and applying exponential smoothing.
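The two gap-filling techniques mentioned above can be sketched as follows. The text does not say how the two are combined or which smoothing factor is used. The sketch therefore assumes that interior gaps are filled with the mean of the adjacent years, that any remaining gaps take the current simple exponential smoothing level, and that the smoothing factor is 0.5. All names and values are hypothetical.

```python
def fill_by_adjacent_mean(series):
    """Fill a missing year (None) with the mean of its two neighbouring years."""
    filled = list(series)
    for i in range(1, len(series) - 1):
        prev_value, next_value = series[i - 1], series[i + 1]
        if series[i] is None and prev_value is not None and next_value is not None:
            filled[i] = (prev_value + next_value) / 2
    return filled


def fill_by_exponential_smoothing(series, alpha=0.5):
    """Fill remaining gaps with the current simple-exponential-smoothing level."""
    filled = list(series)
    level = None
    for i, value in enumerate(filled):
        if value is None:
            if level is not None:
                filled[i] = level   # one-step-ahead forecast
            continue
        level = value if level is None else alpha * value + (1 - alpha) * level
    return filled


# Hypothetical yearly indicator values with one interior and one trailing gap.
raw = [10.0, None, 14.0, 16.0, None]
step1 = fill_by_adjacent_mean(raw)
step2 = fill_by_exponential_smoothing(step1, alpha=0.5)
```

Averaging handles gaps that have observed values on both sides, while smoothing covers a gap at the end of the series, where no later year is available.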

Fig. 2. 52 sample cities and their geographic locations.

Weighting values between evaluation indicators

The entropy weights of the 20 SCP indicators and the 19 LCL indicators are calculated by applying the data described in the “Selection of sample cities and data collection” section to formulas (1)–(5), and the results are shown in Supplementary Appendix Tables A3 and A4. Specifically, within the SCP evaluation framework, SPE and II are assigned the highest weights, while LCS and LCM are allocated the highest weights within the LCL evaluation framework. Conversely, SCS and LCC are assigned the lowest weights in their respective frameworks.

Evaluation of SCP and LCL in sample cities

Using the data from the “Selection of sample cities and data collection” section and the weights derived in the “Weighting values between evaluation indicators” section, we determine the SCP and LCL of the sample cities with the TOPSIS method, as outlined in formulas (6)–(9). The results are shown in Supplementary Appendix Table A5 and Fig. 3. In this study, the closeness coefficient (\({C}_{i}\)) indicates the position of a sample city relative to the negative ideal point: the farther a city lies from that point, the larger its coefficient 71. The negative ideal point represents the worst solution, in which every attribute takes its worst value across the alternatives. Therefore, a larger closeness value indicates better smart city performance or a better low-carbon level for a sample city 72. \({C}_{LCL}\) and \({C}_{SCP}\) respectively represent the low-carbon level closeness coefficient and the smart city performance closeness coefficient. According to Supplementary Appendix Table A5, the three best cities for SCP are Shenzhen, Shanghai, and Hangzhou, whilst the three worst are Yan’an, Jincheng, and Xining. Furthermore, Chengdu, Qingdao, and Beijing are the three best low-carbon performers, whilst Jincheng, Urumqi, and Huhehaote are the three worst.

Fig. 3. TOPSIS-based analysis of SCP with LCL in 52 sample cities.

In referencing Fig.  3 , this study considers SCP data of sample cities as the control variable and ranks them in ascending order based on TOPSIS results. We then examine changes in LCL data to ascertain the correlation between these variables, yielding two key research conclusions: on one hand, analysis of 52 sample cities demonstrates a general ascending trend in both SCP and LCL data curves. This trend suggests a positive correlation between these two parameters. On the other hand, the LCL data, in contrast to the consistent rise in SCP, exhibits notable fluctuations and wider dispersion. This indicates that the positive correlation between SCP and LCL, while present, is not markedly robust.

Correlation results of SCP and LCL in sample cities

Correlation analysis of urban SCP and overall-LCL. This analysis employs the closeness coefficient (C i ) to assess SCP and overall-LCL in sample cities for Hypothesis 1 in Eqs. ( 10 ) and ( 11 ). The results are presented in Table 2 . Additionally, a linear regression analysis is conducted to determine the presence and magnitude of the relationship between SCP and LCL in these cities, as shown in Fig.  4 .

Fig. 4. The scatter and regression of SCP and LCL: (A) SCP & Overall-LCL; (B) SCP & LCM; (C) SCP & LCS; (D) SCP & LCE; (E) SCP & LCEQ; (F) SCP & LCC.

Based on the absolute value of the correlation coefficient, correlation is categorized into five levels: very weak ( \(\left|{\rho }_{AB}\right|<0.1\) ), weak ( \(0.1\le \left|{\rho }_{AB}\right|<0.3\) ), moderate ( \(0.3\le \left|{\rho }_{AB}\right|<0.5\) ), strong ( \(0.5\le \left|{\rho }_{AB}\right|<0.7\) ), and very strong ( \(0.7\le \left|{\rho }_{AB}\right|<1.0\) ) 73. Table 2 indicates a strong positive correlation between SCP and overall LCL. Linear regression analysis in Fig. 4A demonstrates a significant correlation between SCP and urban LCL ( \({R}^{2}\) = 0.42, p < 0.001), with notable differences among cities, consistent with Hypothesis 1.
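The five-level scheme is easy to express as a small Python function. In a simple linear regression with a single predictor, \({R}^{2}\) equals the square of the Pearson coefficient, so the reported \({R}^{2}\) of 0.42 corresponds to a coefficient of about 0.65, which falls in the "strong" band. The function name is hypothetical, and treating a coefficient of exactly 1.0 as "very strong" is an assumption, since the stated upper bound is strict.

```python
import math


def correlation_strength(rho):
    """Classify a Pearson coefficient by the five bands quoted in the text."""
    magnitude = abs(rho)
    if magnitude > 1.0:
        raise ValueError("a correlation coefficient cannot exceed 1 in magnitude")
    if magnitude < 0.1:
        return "very weak"
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.5:
        return "moderate"
    if magnitude < 0.7:
        return "strong"
    return "very strong"   # assumption: includes a coefficient of exactly 1.0


# Coefficient implied by the reported R^2 for SCP and overall LCL.
r_overall = math.sqrt(0.42)
label_overall = correlation_strength(r_overall)
```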

Correlation analysis of SCP and each low-carbon dimension. Pearson correlation analysis effectively measures the strength of the linear relationship between two variables, but it does not identify causal relationships between them. To explore the interaction between the two variables in more detail, this study computes the closeness coefficient for each low-carbon dimension: low-carbon economy (\({C}_{LCE}\)), low-carbon society (\({C}_{LCS}\)), low-carbon environmental quality (\({C}_{LCEQ}\)), low-carbon management (\({C}_{LCM}\)), and low-carbon culture (\({C}_{LCC}\)). It then calculates the correlation between SCP and each low-carbon dimension to test Hypothesis 1, as shown in Table 2. Furthermore, the results of the linear regression analysis are presented in Fig. 4.

In detail, strong correlations exist between SCP and LCM, LCS, and LCE. The correlation is moderate with LCEQ and weak with LCC. Furthermore, linear regression analysis shows that the links between SCP and low-carbon levels across the five dimensions are significant with minimal variance. Cities with higher SCP typically show higher values in LCM ( \({R}^{2}\) = 0.38, p < 0.001), LCS ( \({R}^{2}\) = 0.35, p < 0.001), and LCE ( \({R}^{2}\) = 0.32, p < 0.001), as depicted in Fig. 4B–D. However, this trend is less pronounced in LCEQ ( \({R}^{2}\) = 0.17, p < 0.001) and LCC ( \({R}^{2}\) = 0.06, p = 0.001), which exhibit greater dispersion, as shown in Fig. 4E,F. The lower \({R}^{2}\) values for LCEQ and LCC compared with the other dimensions suggest a greater influence of factors not included in the model. Furthermore, to ensure the credibility and reliability of the findings, this study conducted a sensitivity analysis in addition to the Pearson correlation analysis, identifying and removing outliers from the sample dataset with the Z-score method. The Pearson correlation coefficient for the original dataset of city SCP and LCL is 0.65, with a significant p-value. After the outliers are removed, the coefficient is 0.61 and the p-value remains significant. Therefore, the correlation between city SCP and LCL proposed in Hypothesis 1 is robust.
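A sensitivity check of this kind can be sketched in Python as shown below. The cutoff used in the study is not stated. The sketch takes the cutoff as a parameter, defaulting to the conventional value of 3, and the demonstration uses a cutoff of 2 on a small set of made-up data. Dropping a city when either of its two scores is extreme is also an assumption.

```python
import statistics


def z_scores(values):
    """Standard scores using the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)   # assumes the series is not constant
    return [(v - mean) / sd for v in values]


def drop_outliers(a, b, threshold=3.0):
    """Remove pairs where either value has |z| above the threshold."""
    kept = [
        (x, y)
        for x, y, zx, zy in zip(a, b, z_scores(a), z_scores(b))
        if abs(zx) <= threshold and abs(zy) <= threshold
    ]
    return [x for x, _ in kept], [y for _, y in kept]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)
    return cov / (statistics.pstdev(a) * statistics.pstdev(b))


# Made-up scores for seven cities; the last city is an obvious outlier in SCP.
scp = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.95]
lcl = [0.32, 0.30, 0.42, 0.41, 0.52, 0.50, 0.20]

r_before = pearson(scp, lcl)
scp_clean, lcl_clean = drop_outliers(scp, lcl, threshold=2.0)
r_after = pearson(scp_clean, lcl_clean)
```

In this toy data a single extreme city reverses the sign of the correlation. In the study, by contrast, the coefficient barely moves (0.65 versus 0.61), which is what supports the claim of robustness.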

Coupling coordination degree of SCP and LCL in sample cities

The degree of coupling coordination comprehensively considers multiple aspects of urban complex systems, including economic, social, and environmental dimensions. By systematically evaluating the coordinated development level of urban SCP and LCL, this approach enables the analysis of the coupling and coordination relationships between SCP and LCL, as well as among various subsystems such as LCM, LCS, LCE, LCEQ, and LCC. This reveals the dynamic interactions and causality between SCP and LCL within urban complex systems. The coupling coordination degrees of SCP and LCL, along with their subsystems, in 52 typical smart and low-carbon pilot cities in China, are illustrated in Fig.  5 .

Fig. 5. Coupled coordination degree of SCP and LCL, LCS, LCEQ, LCE, LCM, LCC.

Characteristics of objective changes in the coupled coordination degree between SCP and LCL. Based on the coupling coordination model and Eqs. ( 12 ) to ( 14 ), the coupling coordination degree of the urban complex system in SCP and LCL regions is calculated for Hypothesis 1 , as illustrated in Fig.  5 .

From the holistic perspective of urban complex systems, as the level of urban SCP improves, the coupling coordination degree between SCP and LCL among the 52 pilot cities in China shows an upward trend. This indicates that as urban SCP and LCL both strengthen, their interaction and coordination are also enhanced. Among the sample cities, Jincheng has the lowest coupling coordination degree at 0.5201, while Beijing has the highest at 0.8622. Of the 52 pilot cities, 5.77% exhibit a barely coupling coordination level, 51.92% display a primary coupling coordination level, 25% achieve an intermediate coupling coordination level, and 17.31% reach a good coupling coordination level. Moreover, the average coupling coordination degree of the 52 pilot cities is 0.598, suggesting that the SCP and LCL of the pilot cities can achieve coupled coordinated development.

Characteristics of objective changes in the coupled coordination degree among SCP, LCM, LCS, LCE, LCEQ, and LCC for Hypothesis 1 are illustrated in Fig.  5 .

From the perspective of urban subsystems, the coupling coordination degrees of LCS & SCP, LCE & SCP, and LCM & SCP all exhibit characteristics of steady fluctuations with an upward trend, while the coupling coordination degree of LCC & SCP shows greater volatility in its upward trend. The coupling coordination degree of LCEQ & SCP demonstrates a trend of initially rising and then declining. Furthermore, the average values of the coupling coordination degrees for LCS & SCP, LCE & SCP, LCM & SCP, LCEQ & SCP, and LCC & SCP are 0.478, 0.761, 0.779, 0.710, and 0.485, respectively. Among these, the pilot cities’ subsystems of LCE, LCM, and LCEQ with SCP exhibit an intermediate level of coupling coordination, while the coupling coordination degrees of LCS and LCC with SCP are on the verge of a dysfunctional recession. This indicates that the causal relationships between urban SCP and the subsystems of urban LCM, LCS, LCE, LCEQ, and LCC vary. Overall, Hypothesis 1 holds true both from the perspective of the city's overall system and from the perspective of its various subsystems.

Discussions and implications

Relationship between SCP and LCL of different cities

Based on the evaluation results for urban SCP and LCL, the overall scores can be classified into four grades: excellent (0.7–1.0), average (0.5–0.7), below average (0.4–0.5), and poor (0–0.4). The sample cities in Supplementary Appendix Table A5 were then classified according to these grades. In the sample, cities with excellent SCP constitute 9.62%, about double the proportion with excellent LCL. Cities with average SCP account for 48.08%, whereas those with average LCL represent only 26.92%. Notably, cities with poor LCL comprise 26.92%, nearly triple the share of those with poor SCP. The findings suggest that China’s SCP currently outperforms its low-carbon city initiatives, largely because of the rapid advancement of the Internet and Information and Communication Technology (ICT) in recent years. Moreover, Fig. 4 illustrates that urban SCP has a significant positive association with urban LCL, though substantial variations exist among cities. The cities can be grouped into the following four categories.
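The grading rule can be written as a small Python helper. The text gives the four score ranges but not which grade a boundary value such as 0.7 belongs to. The sketch assigns each boundary to the higher grade, which is an assumption, and the scores in the example are made up.

```python
from collections import Counter

GRADES = ("excellent", "average", "below average", "poor")


def grade(score):
    """Map a closeness score in [0, 1] to one of the four grades."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score >= 0.7:
        return "excellent"
    if score >= 0.5:
        return "average"
    if score >= 0.4:
        return "below average"
    return "poor"


def grade_shares(scores):
    """Percentage of cities in each grade, rounded to two decimals."""
    counts = Counter(grade(s) for s in scores)
    return {g: round(100 * counts[g] / len(scores), 2) for g in GRADES}


# Made-up scores for five cities.
shares = grade_shares([0.82, 0.66, 0.58, 0.47, 0.31])
```

With 52 sample cities, each city contributes about 1.92 percentage points, so a share of 9.62% corresponds to 5 cities and 26.92% to 14 cities.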

Quadrant I (high SCP and high LCL) includes only six cities: Shenzhen, Shanghai, Beijing, Ningbo, Xiamen, and Qingdao. These cities are not only among China’s earliest smart city pilots but also recent focus areas for the government’s “Carbon Peak Pioneer Cities” initiative. By actively exploring innovative models, systems, and technologies for smart and low-carbon co-development, these cities provide valuable practical experience for others. For instance, Shenzhen has developed a multi-level, multi-component greenhouse gas monitoring network and technology system for “carbon flux, carbon concentration, carbon emissions”, while Ningbo has constructed a “smart zero-carbon” comprehensive demonstration port area.

Quadrant II (poor SCP and poor LCL) contains numerous cities in Fig. 4A, such as Jincheng, Lhasa, and Urumqi. Despite China having the most smart and low-carbon city pilots globally, its development level in these areas still lags significantly behind that of typical developed countries. While China’s infrastructure, such as networking and computing power, has reached a certain scale, problems persist: infrastructure construction and operation are insufficiently integrated and intensive, facilities are aging, and the level of intelligence is low. Furthermore, although China’s low-carbon pilot cities have made positive progress in promoting low-carbon development, most still have incomplete carbon emission statistical systems and inadequate operational mechanisms, leading to generally poor overall low-carbon development levels.

Quadrant III (high LCL but poor SCP) includes cities such as Kunming, Xining, and Guiyang. These cities possess resources conducive to low-carbon development: Kunming and Guiyang have rich forest carbon sinks, and Xining has abundant clean energy sources such as solar and wind power. However, they are mostly situated in China’s central and southwest areas, where physical and economic conditions are underdeveloped. A significant future development direction for these cities is to leverage their abundant low-carbon resources, together with big data and IoT technology, to achieve sustainable green economic growth through carbon credits, carbon trading markets, and green finance.

Quadrant IV (high SCP but poor LCL) includes Suzhou, Jinhua, and Zhongshan. Decoupling economic development from carbon emissions presents a significant development challenge for these cities. Specifically, for Suzhou, one of the world’s largest industrial cities, the main challenge is achieving decarbonization in the energy sector and transitioning high-emission manufacturing industries to low-carbon alternatives.
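The four categories amount to a two-by-two classification on the SCP and LCL scores. The text does not state the cutoffs that separate "high" from "poor". The sketch below therefore takes a single threshold as a parameter and defaults it to 0.5, the lower bound of the "average" grade, purely as an assumption. The scores in the example are made up and do not describe any real city.

```python
def quadrant(scp, lcl, threshold=0.5):
    """Assign a city to quadrants I to IV as numbered in the text."""
    high_scp = scp >= threshold
    high_lcl = lcl >= threshold
    if high_scp and high_lcl:
        return "I"     # high SCP and high LCL
    if not high_scp and not high_lcl:
        return "II"    # poor SCP and poor LCL
    if high_lcl:
        return "III"   # high LCL but poor SCP
    return "IV"        # high SCP but poor LCL


# Made-up (SCP, LCL) score pairs for four hypothetical cities.
examples = {
    "city_a": (0.80, 0.75),
    "city_b": (0.30, 0.35),
    "city_c": (0.40, 0.65),
    "city_d": (0.70, 0.38),
}
labels = {name: quadrant(s, l) for name, (s, l) in examples.items()}
```

Note that the quadrant numbering follows the order used in the text, not the conventional counter-clockwise numbering of a Cartesian plane.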

In addition, as illustrated in Fig. 5, the degree of interaction between SCP and LCL across the 52 pilot cities in China positively impacts the balanced and comprehensive performance of these cities. This, in turn, fosters the coordinated development of urban systems as a whole. Moreover, the continual increase in the coupled coordination degree between SCP and LCL as SCP improves in pilot cities indicates that smart city construction contributes to urban low-carbon development. Future urban development in China should fully leverage the industrial upgrading effect, carbon sequestration effect, and energy utilization effect of smart city construction. However, the increasing slope of the SCP & LCL coupled coordination degree curve in Fig. 5 suggests significant regional differences in the level of SCP & LCL coupled coordination development across Chinese cities. Smart city construction has a more pronounced decarbonization effect in central and western cities, southern cities, non-environmentally focused cities, and resource-based cities, while cities in the northwest show notably poorer levels of SCP & LCL coupled coordination development. This serves as a warning for future urban development in China.

Relationships between SCP and LCL in each urban subsystem

The relationship between urban SCP and LCL across five dimensions is illustrated in Fig.  4 B–F. There is a strong positive correlation between SCP and LCM, LCS, and LCE, while a moderate correlation is observed with LCEQ, and a weak correlation with LCC. Furthermore, the degree of coupling coordination between SCP and subsystems such as LCS, LCEQ, LCE, LCM, and LCC is examined in Fig.  5 . The results of the coupling coordination vividly illustrate the synergistic interactions and developmental harmony between urban SCP and various systems.

Among these, the coupling coordination degree curve for SCP & LCM fluctuates stably at an intermediate coupling coordination level, indicating the dominant role of the Chinese government in the construction of smart cities and low-carbon cities, as well as the effectiveness of policy implementation. However, this also suggests that in promoting urban smart and low-carbon construction, China faces the risk of adopting “one-size-fits-all” mandatory policies and of neglecting to advance construction in prioritized phases tailored to each city's resource endowment and economic development status. The coupling coordination degree curves for SCP & LCE and SCP & LCL show the highest degree of fit, indicating that the low-carbon economic development brought about by digital empowerment and the upgrading of the urban industrial structure is a key driving factor in promoting the coupled coordination development of urban smart and low-carbon initiatives. Transforming traditional industrial structures and pursuing low-carbon upgrades of the economic structure present challenges for urban development in China today. The coupled coordination degree of SCP & LCS is on the verge of a dysfunctional recession, highlighting the imbalance between the development of China's SCP and LCS, especially in terms of new infrastructure construction such as smart transportation and logistics facilities, smart energy systems, and smart environmental resources facilities. The current construction of new infrastructure in China is far from meeting the everyday needs of the general public.

It is noteworthy that as the SCP of sample cities improves, the coupling coordination degree between SCP and LCEQ exhibits two phases: an initial stage of synergistic enhancement followed by a stage of diminished synergy. In the early phase of synergistic development, the SCP and LCEQ systems of cities, driven by shared goals of sustainable urban development, strategy adjustments, resource sharing, and technological progress, collaborated and integrated effectively. However, upon reaching a certain stage, intensified resource competition, declining management efficiency, and environmental changes led to internal system fatigue, resulting in weakened synergy. This indicates that once the technological effects generated by smart city construction reach a certain level, it becomes crucial to enhance the city's capacity for autonomous innovation. Addressing the bottleneck issues of core technologies and moving the development mode of smart low-carbon technology away from “imitative innovation” would be significant breakthroughs for further promoting the coupled coordination of SCP and LCEQ in China’s future.

Moreover, as the SCP of sample cities continuously improves, the coupled coordination degree between SCP and LCC passes through two phases: initial stable fluctuation followed by rapid growth. The turning point in the curve occurs at a coupled coordination degree of 0.6, denoted the primary coupling coordination point. The low-carbon awareness rate of urban residents, a key indicator of LCC, shows that most urban residents in China are still in the cognitive awakening stage of low-carbon consciousness. At this stage, residents begin to recognize the severity of climate change and environmental degradation, along with the importance of smart low-carbon lifestyles in mitigating these issues. The government continuously promotes this awareness through media reports, educational activities, official propaganda, and community initiatives. As residents gain a deeper understanding of the issues, their attitudes shift from initial indifference or skepticism to a stronger identification with and support for the values and concepts of smart low-carbon living. This shift encourages residents to experiment with new smart low-carbon lifestyles and gradually find suitable behavioral patterns that become habitual. Ultimately, when smart low-carbon lifestyles are fully internalized as part of residents’ values, residents not only practice smart low-carbon living individually but also actively help promote society’s smart low-carbon construction. Therefore, this study posits that the turning point in the coupled coordination degree between SCP and LCC reflects not only individual behavioral change but also social and cultural transformation. This process takes time and is influenced by multiple factors, including policy guidance, economic incentives, educational dissemination, and the social atmosphere.
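To make the coupled coordination degree and its 0.6 threshold concrete, the following is a minimal sketch of the two-system coupling coordination model as it is commonly written in the literature. It is not necessarily the exact model used in this study: the formulas, the equal weights `alpha` and `beta`, the ten 0.1-wide grading bands, and all function names are assumptions made for illustration.

```python
import math
from bisect import bisect_right

# Assumed ten-level grading commonly seen in the literature (one label per
# 0.1-wide band of D). The study's own grading table is not shown in this section.
LEVELS = [
    "extreme disorder", "severe disorder", "moderate disorder",
    "mild disorder", "verge of disorder", "barely coordinated",
    "primary coordination", "intermediate coordination",
    "good coordination", "high-quality coordination",
]
BOUNDS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def coupling_coordination(u1, u2, alpha=0.5, beta=0.5):
    """Return (C, T, D) for two subsystem scores u1, u2 in [0, 1].

    C = 2*sqrt(u1*u2)/(u1+u2)  coupling degree
    T = alpha*u1 + beta*u2     comprehensive development index
    D = sqrt(C*T)              coupling coordination degree
    """
    if not (0 <= u1 <= 1 and 0 <= u2 <= 1):
        raise ValueError("subsystem scores must lie in [0, 1]")
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta must sum to 1")
    if u1 + u2 == 0:
        return 0.0, 0.0, 0.0
    c = 2 * math.sqrt(u1 * u2) / (u1 + u2)
    t = alpha * u1 + beta * u2
    d = math.sqrt(c * t)
    return c, t, d


def grade(d):
    """Map a coupling coordination degree in [0, 1] to its assumed level."""
    if not 0 <= d <= 1:
        raise ValueError("D must lie in [0, 1]")
    return LEVELS[bisect_right(BOUNDS, d)]


if __name__ == "__main__":
    # Hypothetical scores for one city (e.g. SCP and one low-carbon subsystem).
    c, t, d = coupling_coordination(0.72, 0.55)
    print(f"C={c:.3f} T={t:.3f} D={d:.3f} -> {grade(d)}")
```

Under this assumed grading, a value of exactly 0.6 falls at the lower edge of the "primary coordination" band, which is consistent with the turning point named above.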

Implications for promoting coupling coordination development between urban SCP and LCL

Low-carbon development and smartness are vital features of modern, sustainable urban development and key supports for it. This study posits that urban low-carbon and smart development should not be disjointed but rather synergistic and complementary. To better achieve sustainable urban development goals, a model should be constructed with “low-carbon” as the cornerstone of sustainable development and “smartness” as the technological assurance for low-carbon growth. Specifically, this study proposes the “urban smart low-carbon co-development model”, which entails a deep integration of intelligent technologies such as the Internet of Things (IoT) and big data with urban construction, governance services, and economic development. This model leverages digitalization to facilitate decarbonization, thereby achieving urban sustainable development goals such as energy-efficient and green urbanization, ecological and livable environments, and streamlined governance services.

Furthermore, to better coordinate smart development with low-carbon city construction, enhance low-carbon city building through digitalization, and explore exemplary practices and models of smart low-carbon city construction, this study finds it necessary to establish an evaluation system for smart and low-carbon urban co-development. Therefore, building on the aforementioned urban SCP and LCL evaluation indicator system, this study first reviewed past research, selecting 5 primary indicators and 20 secondary indicators from 48 articles to evaluate the degree of coupling coordination development between urban SCP and LCL. Subsequently, the Delphi method was employed to finalize the list of evaluation indicators, with 10 experts from various regions and diverse backgrounds in China refining the list and determining the weight of each indicator, as shown in Supplementary Appendix Table A6. The final Smart Low-Carbon City Coupling Coordination Development Evaluation Indicator System, presented in Table 3, comprises 5 primary indicators and 18 secondary indicators. This evaluation system emphasizes the use of next-generation information technologies such as 5G, artificial intelligence, cloud computing, and blockchain to expand urban green ecological spaces, strengthen ecological environment governance, and enhance the level of intelligent urban governance, meeting the development needs of smart low-carbon cities.
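As an illustration of how expert-assigned weights can turn indicator values into a single score per city, here is a minimal sketch using min-max normalization and a weighted sum. This section does not state how the study aggregates its indicators, so the aggregation rule, the treatment of every indicator as benefit-type (higher is better), the function names, and the sample numbers are all assumptions.

```python
def min_max(values):
    """Scale a list of raw indicator values to [0, 1] (benefit-type assumed)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant indicator carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(matrix, weights):
    """Weighted-sum score for each city.

    matrix[i][j] is the raw value of indicator j for city i.
    weights[j] is the (hypothetical) expert weight of indicator j; weights sum to 1.
    """
    if not matrix or any(len(row) != len(weights) for row in matrix):
        raise ValueError("every row needs one value per weight")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    columns = list(zip(*matrix))                 # one tuple per indicator
    normalized = [min_max(list(col)) for col in columns]
    return [
        sum(w * normalized[j][i] for j, w in enumerate(weights))
        for i in range(len(matrix))
    ]


if __name__ == "__main__":
    # Three hypothetical cities, two hypothetical indicators.
    raw = [[1.0, 10.0], [3.0, 20.0], [2.0, 30.0]]
    print(composite_scores(raw, [0.5, 0.5]))
```

A cost-type indicator (lower is better) would need the normalization reversed before weighting; that case is left out to keep the sketch short.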

The policy implications of these results suggest that government departments in China should act to reduce the uneven performance of urban SCP and LCL across cities. First, policies should guide the innovative development of urban SCP and LCL, for example by enhancing government digital services and administrative platforms, continuously promoting the development of emerging industries and the upgrading of traditional industries, and actively promoting green energy technologies. Second, the coordinated development of smart and low-carbon cities should be advanced by category: large cities in eastern and central China should pursue comprehensive development on all fronts simultaneously, while smaller cities in western China should prioritize enhancing urban innovation capabilities and improving infrastructure to lay a solid foundation for the coupled coordination of urban SCP and LCL. Third, a multi-stakeholder governance system should be constructed to maximize the leading role of the government, the main role of enterprises, and the active participation of residents. Fostering a positive social atmosphere and culture will enhance the sense of participation and achievement among different social groups, creating a sustainable development model for urban SCP and LCL coordination. Lastly, the coordination of SCP and LCL in county-level cities deserves emphasis. While large Chinese cities have already begun to form a pattern of coordinated SCP and LCL development, county-level cities, despite weaker infrastructure, possess tremendous potential. Focusing on low-carbon production, circulation, and consumption, and strengthening smart and low-carbon construction in county-level cities, will be a vital task for future urban development in China.

Conclusions

The global urbanization process brings opportunities for economic growth and social development, but also presents a series of challenges, such as environmental pressures and resource constraints 3. The evaluation of urban SCP and LCL links policy-making in urban resource and environmental management to the sustainable development goals (SDGs 11.4, 11.6, and 11.b) at the city level 74. Currently, there is no consensus on the coupling coordination development between urban SCP and LCL. This study proposes a method combining qualitative and quantitative analysis, from the perspective of urban complex systems, to analyze the coupling coordination relationship between SCP and LCL. The method reveals a strong positive correlation between urban smart performance and the overall low-carbon level. Specifically, SCP correlates strongly with LCM, LCS, and LCE, moderately with LCEQ, and weakly with LCC. Three innovative aspects of this method are highlighted: (i) sustainable development based on SCP and LCL assessment; (ii) emphasizing the “people-centric” concept in urban development; (iii) analyzing from the perspective of urban complex systems.
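The strong, moderate, and weak correlations described above rest on a correlation coefficient; the reference list points to Pearson's coefficient. The sketch below computes Pearson's r from its definition and labels its strength. The cut-offs of 0.6 and 0.3 and the function names are assumptions for illustration, since this section does not give the thresholds the study applied.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def strength(r, strong=0.6, moderate=0.3):
    """Label |r| using assumed cut-offs (not taken from the study)."""
    magnitude = abs(r)
    if magnitude >= strong:
        return "strong"
    if magnitude >= moderate:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    # Hypothetical SCP scores and one low-carbon subsystem score for five cities.
    scp = [0.42, 0.55, 0.61, 0.70, 0.83]
    lcm = [0.40, 0.50, 0.66, 0.68, 0.80]
    r = pearson(scp, lcm)
    print(f"r={r:.3f} ({strength(r)})")
```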

This study selected 52 typical smart and low-carbon pilot cities in China as sample cities to analyze the coupled coordination relationship between urban SCP and LCL. The main findings can be summarized as follows: (i) Smart city initiatives outperform low-carbon city development, with notable differences in SCP and LCL effectiveness between eastern, central, and non-resource-based cities and western, peripheral, and resource-dependent ones in China. (ii) There is a strong positive link between urban SCP and low-carbon levels, especially between SCP and LCM, LCS, and LCE, with moderate and weak correlations to LCEQ and LCC, respectively. (iii) Increasing urban SCP levels enhance the coupling coordination within the urban SCP and LCL system. The SCP & LCE, SCP & LCM, and SCP & LCS subsystems align well with the overall system, driving the coupled coordination of urban SCP and LCL. In contrast, SCP & LCC and SCP & LCEQ align less well, affected by factors such as technology, policy, economic incentives, education, and societal attitudes. Based on the evaluation results, this study posits that the development of urban low-carbon and smart initiatives should not be disjointed but rather synergistic and complementary. This study constructs an evaluation indicator system for the co-development of smart low-carbon cities, aimed at better guiding the future coupling coordination development of smart and low-carbon cities.

The novelty of this study lies not only in addressing the practical difficulty of obtaining comprehensive, accurate, and timely city-level carbon emission data, a difficulty that arises because existing measurement and estimation technologies cannot capture all types of carbon emissions, but also in assessing urban SCP and LCL. At the same time, by combining qualitative and quantitative analysis methods, it fills the research gap on the nature of the coupled coordination relationship between urban SCP and LCL. Moreover, from the perspective of urban complex systems, this study dissects the urban low-carbon level into LCC, LCM, LCE, LCEQ, and LCS, exploring their respective coupled coordination relationships with SCP. This clarifies the impact mechanism between SCP and LCL, providing a theoretical basis for smart low-carbon city co-development. The limitations of the study are also acknowledged. First, the study sampled only cities in China, and the limited number of samples may not fully substantiate the research conclusions. Second, the indicator system constructed in this study is still imperfect, which introduces some inaccuracy into the evaluation results. Future studies are therefore recommended to compare the coupled coordination relationship between SCP and LCL more comprehensively at city, regional, and national levels, which would help guide the practice of urban sustainability.

Data availability

All data generated or analysed during this study are included in this published article and its Supplementary Information files.

Zheng, H. W., Shen, G. Q. & Wang, H. A review of recent studies on sustainable urban renewal. Habit. Int. 41 , 272–279. https://doi.org/10.1016/j.habitatint.2013.08.006 (2014).

Bibri, S. E. & Krogstie, J. Smart sustainable cities of the future: An extensive interdisciplinary literature review. Sustain. Cities Soc. 31 , 183–212. https://doi.org/10.1016/j.scs.2017.02.016 (2017).

Chen, M., Liu, W. & Lu, D. Challenges and the way forward in China’s new-type urbanization. Land Use Policy 55 , 334–339. https://doi.org/10.1016/j.landusepol.2015.07.025 (2016).

Liang, W. & Yang, M. Urbanization, economic growth and environmental pollution: Evidence from China. Sustain. Comput. Inform. Syst. 21 , 1–9. https://doi.org/10.1016/j.suscom.2018.11.007 (2019).

Guan, X., Wei, H., Lu, S., Dai, Q. & Su, H. Assessment on the urbanization strategy in China: Achievements, challenges and reflections. Habit. Int. 71 , 97–109. https://doi.org/10.1016/j.habitatint.2017.11.009 (2018).

Wu, H., Hao, Y. & Weng, J.-H. How does energy consumption affect China’s urbanization? New evidence from dynamic threshold panel models. Energy Policy 127 , 24–38. https://doi.org/10.1016/j.enpol.2018.11.057 (2019).

Liu, H., Cui, W. & Zhang, M. Exploring the causal relationship between urbanization and air pollution: Evidence from China. Sustain. Cities Soc. 80 , 783. https://doi.org/10.1016/j.scs.2022.103783 (2022).

Tang, F. et al. Spatio-temporal variation and coupling coordination relationship between urbanisation and habitat quality in the Grand Canal, China. Land Use Policy 117 , 6119. https://doi.org/10.1016/j.landusepol.2022.106119 (2022).

Kim, J. Smart city trends: A focus on 5 countries and 15 companies. Cities 123 , 551. https://doi.org/10.1016/j.cities.2021.103551 (2022).

Silva, B. N., Khan, M. & Han, K. Towards sustainable smart cities: A review of trends, architectures, components, and open challenges in smart cities. Sustain. Cities Soc. 38 , 697–713. https://doi.org/10.1016/j.scs.2018.01.053 (2018).

Yigitcanlar, T., Kankanamge, N. & Vella, K. How are smart city concepts and technologies perceived and utilized? A systematic geo-twitter analysis of smart cities in Australia. J. Urban Technol. 28 , 135–154. https://doi.org/10.1080/10630732.2020.1753483 (2021).

Yigitcanlar, T. et al. Can cities become smart without being sustainable? A systematic review of the literature. Sustain. Cities Soc. 45 , 348–365. https://doi.org/10.1016/j.scs.2018.11.033 (2019).

Guo, Q. & Zhong, J. The effect of urban innovation performance of smart city construction policies: Evaluate by using a multiple period difference-in-differences model. Technol. Forecast. Soc. Change 184 , 2003. https://doi.org/10.1016/j.techfore.2022.122003 (2022).

Ismagilova, E., Hughes, L., Dwivedi, Y. K. & Raman, K. R. Smart cities: Advances in research—An information systems perspective. Int. J. Inf. Manag. 47 , 88–100. https://doi.org/10.1016/j.ijinfomgt.2019.01.004 (2019).

Caragliu, A. & Del Bo, C. F. Smart innovative cities: The impact of Smart City policies on urban innovation. Technol. Forecast. Soc. Change 142 , 373–383. https://doi.org/10.1016/j.techfore.2018.07.022 (2019).

Yigitcanlar, T. et al. Understanding ‘smart cities’: Intertwining development drivers with desired outcomes in a multidimensional framework. Cities 81 , 145–160. https://doi.org/10.1016/j.cities.2018.04.003 (2018).

Liu, Z. et al. Decision optimization of low-carbon dual-channel supply chain of auto parts based on smart city architecture. Complexity 2020 , 5951. https://doi.org/10.1155/2020/2145951 (2020).

Guo, Q., Wang, Y. & Dong, X. Effects of smart city construction on energy saving and CO2 emission reduction: Evidence from China. Appl. Energy 313 , 879. https://doi.org/10.1016/j.apenergy.2022.118879 (2022).

Cheng, J., Yi, J., Dai, S. & Xiong, Y. Can low-carbon city construction facilitate green growth? Evidence from China’s pilot low-carbon city initiative. J. Clean. Prod. 231 , 1158–1170. https://doi.org/10.1016/j.jclepro.2019.05.327 (2019).

Sun, W. & Huang, C. Predictions of carbon emission intensity based on factor analysis and an improved extreme learning machine from the perspective of carbon emission efficiency. J. Clean. Prod. 338 , 414. https://doi.org/10.1016/j.jclepro.2022.130414 (2022).

Shi, B., Li, N., Gao, Q. & Li, G. Market incentives, carbon quota allocation and carbon emission reduction: Evidence from China’s carbon trading pilot policy. J. Environ. Manag. 319 , 650. https://doi.org/10.1016/j.jenvman.2022.115650 (2022).

Sun, L. et al. Carbon emission transfer strategies in supply chain with lag time of emission reduction technologies and low-carbon preference of consumers. J. Clean. Prod. 264 , 664. https://doi.org/10.1016/j.jclepro.2020.121664 (2020).

Matsumura, E. M., Prakash, R. & Vera-Munoz, S. C. Firm-value effects of carbon emissions and carbon disclosures. Acc. Rev. 89 , 695–724. https://doi.org/10.2308/accr-50629 (2014).

Lv, M. & Bai, M. Evaluation of China’s carbon emission trading policy from corporate innovation. Financ. Res. Lett. 39 , 565. https://doi.org/10.1016/j.frl.2020.101565 (2021).

Jia, Z. & Lin, B. Rethinking the choice of carbon tax and carbon trading in China. Technol. Forecast. Soc. Change 159 , 187. https://doi.org/10.1016/j.techfore.2020.120187 (2020).

Huo, T., Xu, L., Liu, B., Cai, W. & Feng, W. China’s commercial building carbon emissions toward 2060: An integrated dynamic emission assessment model. Appl. Energy 325 , 828. https://doi.org/10.1016/j.apenergy.2022.119828 (2022).

Lin, B. & Huang, C. Analysis of emission reduction effects of carbon trading: Market mechanism or government intervention? Sustain. Prod. Consump. 33 , 28–37. https://doi.org/10.1016/j.spc.2022.06.016 (2022).

Zhang, M. & Liu, Y. Influence of digital finance and green technology innovation on China’s carbon emission efficiency: Empirical analysis based on spatial metrology. Sci. Total Environ. 838 , 463. https://doi.org/10.1016/j.scitotenv.2022.156463 (2022).

Zhu, X. & Li, D. How to promote the construction of low-carbon cities in China? An urban complex ecosystem perspective. Sustain. Dev. https://doi.org/10.1002/sd.2897 (2024).

He, C., Zhang, D., Huang, Q. & Zhao, Y. Assessing the potential impacts of urban expansion on regional carbon storage by linking the LUSD-urban and InVEST models. Environ. Model. Softw. 75 , 44–58. https://doi.org/10.1016/j.envsoft.2015.09.015 (2016).

Nowak, D. J., Greenfield, E. J., Hoehn, R. E. & Lapoint, E. Carbon storage and sequestration by trees in urban and community areas of the United States. Environ. Pollut. 178 , 229–236. https://doi.org/10.1016/j.envpol.2013.03.019 (2013).

Wang, T. et al. Mobility based trust evaluation for heterogeneous electric vehicles network in smart cities. IEEE Trans. Intell. Transp. Syst. 22 , 1797–1806. https://doi.org/10.1109/tits.2020.2997377 (2021).

Huovila, A., Bosch, P. & Airaksinen, M. Comparative analysis of standardized indicators for Smart sustainable cities: What indicators and standards to use and when? Cities 89 , 141–153. https://doi.org/10.1016/j.cities.2019.01.029 (2019).

Nizetic, S., Djilali, N., Papadopoulos, A. & Rodrigues, J. J. P. C. Smart technologies for promotion of energy efficiency, utilization of sustainable resources and waste management. J. Clean. Prod. 231 , 565–591. https://doi.org/10.1016/j.jclepro.2019.04.397 (2019).

Sharifi, S., Saman, W. & Alemu, A. Identification of overheating in the top floors of energy-efficient multilevel dwellings. Energy Build. https://doi.org/10.1016/j.enbuild.2019.109452 (2019).

Shafiq, M., Tian, Z., Sun, Y., Du, X. & Guizani, M. Selection of effective machine learning algorithm and Bot–IoT attacks traffic identification for internet of things in smart city. Future Gener. Comput. Syst. Int. J. Esci. 107 , 433–442. https://doi.org/10.1016/j.future.2020.02.017 (2020).

Huang, S., Liu, A., Zhang, S., Wang, T. & Xiong, N. N. BD-VTE: A novel baseline data based verifiable trust evaluation scheme for smart network systems. IEEE Trans. Netw. Sci. Eng. 8 , 2087–2105. https://doi.org/10.1109/tnse.2020.3014455 (2021).

Reed, M. S. et al. Evaluating impact from research: A methodological framework. Res. Policy 50 , 147. https://doi.org/10.1016/j.respol.2020.104147 (2021).

Venable, J., Pries-Heje, J. & Baskerville, R. FEDS: A framework for evaluation in design science research. Eur. J. Inf. Syst. 25 , 77–89. https://doi.org/10.1057/ejis.2014.36 (2016).

Kristan, M. et al. A novel performance evaluation methodology for single-target trackers. IEEE Trans. Pattern Anal. Mach. Intell. 38 , 2137–2155. https://doi.org/10.1109/tpami.2016.2516982 (2016).

Li, H. Research progress on evaluation methods and factors influencing shale brittleness: A review. Energy Rep. 8 , 4344–4358. https://doi.org/10.1016/j.egyr.2022.03.120 (2022).

Lyu, H.-M., Zhou, W.-H., Shen, S.-L. & Zhou, A.-N. Inundation risk assessment of metro system using AHP and TFN-AHP in Shenzhen. Sustain. Cities Soc. 56 , 103. https://doi.org/10.1016/j.scs.2020.102103 (2020).

Buyukozkan, G. & Guleryuz, S. An integrated DEMATEL-ANP approach for renewable energy resources selection in Turkey. Int. J. Prod. Econ. 182 , 435–448. https://doi.org/10.1016/j.ijpe.2016.09.015 (2016).

Ervural, B. C., Zaim, S., Demirel, O. F., Aydin, Z. & Delen, D. An ANP and fuzzy TOPSIS-based SWOT analysis for Turkey’s energy planning. Renew. Sustain. Energy Rev. 82 , 1538–1550. https://doi.org/10.1016/j.rser.2017.06.095 (2018).

Gao, Z. et al. EEG-based spatio-temporal convolutional neural network for driver fatigue evaluation. IEEE Trans. Neural Netw. Learn. Syst. 30 , 2755–2763. https://doi.org/10.1109/tnnls.2018.2886414 (2019).

Manzano, A. The craft of interviewing in realist evaluation. Evaluation 22 , 342–360. https://doi.org/10.1177/1356389016638615 (2016).

Zeng, S., Jin, G., Tan, K. & Liu, X. Can low-carbon city construction reduce carbon intensity? Empirical evidence from low-carbon city pilot policy in China. J. Environ. Manag. 332 , 363. https://doi.org/10.1016/j.jenvman.2023.117363 (2023).

Liu, X., Li, Y., Chen, X. & Liu, J. Evaluation of low carbon city pilot policy effect on carbon abatement in China: An empirical evidence based on time-varying DID model. Cities 123 , 582. https://doi.org/10.1016/j.cities.2022.103582 (2022).

Tan, S. et al. A holistic low carbon city indicator framework for sustainable development. Appl. Energy 185 , 1919–1930. https://doi.org/10.1016/j.apenergy.2016.03.041 (2017).

Shi, X. & Xu, Y. Evaluation of China’s pilot low-carbon city program: A perspective of industrial carbon emission efficiency. Atmos. Pollut. Res. 13 , 446. https://doi.org/10.1016/j.apr.2022.101446 (2022).

Yang, S., Pan, Y. & Zeng, S. Decision making framework based Fermatean fuzzy integrated weighted distance and TOPSIS for green low-carbon port evaluation. Eng. Appl. Artif. Intell. 114 , 5048. https://doi.org/10.1016/j.engappai.2022.105048 (2022).

Fang, G., Gao, Z., Tian, L. & Fu, M. What drives urban carbon emission efficiency?—Spatial analysis based on nighttime light data. Appl. Energy 312 , 772. https://doi.org/10.1016/j.apenergy.2022.118772 (2022).

Yang, S., Jahanger, A. & Hossain, M. R. How effective has the low-carbon city pilot policy been as an environmental intervention in curbing pollution? Evidence from Chinese industrial enterprises. Energy Econ. 118 , 523. https://doi.org/10.1016/j.eneco.2023.106523 (2023).

Huang, G., Li, D., Zhu, X. & Zhu, J. Influencing factors and their influencing mechanisms on urban resilience in China. Sustain. Cities Soc. 74 , 210. https://doi.org/10.1016/j.scs.2021.103210 (2021).

Li, W. et al. Carbon emission and economic development trade-offs for optimizing land-use allocation in the Yangtze River Delta, China. Ecol. Indic. 147 , 950. https://doi.org/10.1016/j.ecolind.2023.109950 (2023).

Wu, H. et al. Exploring the impact of urban form on urban land use efficiency under low-carbon emission constraints: A case study in China’s Yellow River Basin. J. Environ. Manag. 311 , 866. https://doi.org/10.1016/j.jenvman.2022.114866 (2022).

Zhao, S. et al. Has China’s low-carbon strategy pushed forward the digital transformation of manufacturing enterprises? Evidence from the low-carbon city pilot policy. Environ. Impact Assess. Rev. 102 , 184. https://doi.org/10.1016/j.eiar.2023.107184 (2023).

Pan, A., Zhang, W., Shi, X. & Dai, L. Climate policy and low-carbon innovation: Evidence from low-carbon city pilots in China. Energy Econ. 112 , 129. https://doi.org/10.1016/j.eneco.2022.106129 (2022).

De Jong, M., Joss, S., Schraven, D., Zhan, C. & Weijnen, M. Sustainable-smart-resilient-low carbon-eco-knowledge cities; making sense of a multitude of concepts promoting sustainable urbanization. J. Clean. Prod. 109 , 25–38. https://doi.org/10.1016/j.jclepro.2015.02.004 (2015).

He, B.-J. et al. Co-benefits approach: Opportunities for implementing sponge city and urban heat island mitigation. Land Use Policy 86 , 147–157. https://doi.org/10.1016/j.landusepol.2019.05.003 (2019).

Nizetic, S., Solic, P., Lopez-de-Ipina, D. & Patrono, L. Internet of Things (IoT): Opportunities, issues and challenges towards a smart and sustainable future. J. Clean. Prod. 274 , 877. https://doi.org/10.1016/j.jclepro.2020.122877 (2020).

Abduljabbar, R. L., Liyanage, S. & Dia, H. The role of micro-mobility in shaping sustainable cities: A systematic literature review. Transp. Res. D Transp. Environ. 92 , 734. https://doi.org/10.1016/j.trd.2021.102734 (2021).

Anh Tuan, H., Van Viet, P. & Xuan Phuong, N. Integrating renewable sources into energy system for smart city as a sagacious strategy towards clean and sustainable process. J. Clean. Prod. 305 , 7161. https://doi.org/10.1016/j.jclepro.2021.127161 (2021).

March, H. & Ribera-Fumaz, R. Smart contradictions: The politics of making Barcelona a self-sufficient city. Eur. Urban Reg. Stud. 23 , 816–830. https://doi.org/10.1177/0969776414554488 (2016).

Yigitcanlar, T. & Lee, S. H. Korean ubiquitous-eco-city: A smart-sustainable urban form or a branding hoax? Technol. Forecast. Soc. Change 89 , 100–114. https://doi.org/10.1016/j.techfore.2013.08.034 (2014).

Kumar, S., Sharma, D., Rao, S., Lim, W. M. & Mangla, S. K. Past, present, and future of sustainable finance: Insights from big data analytics through machine learning of scholarly research. Ann. Oper. Res. https://doi.org/10.1007/s10479-021-04410-8 (2022).

Li, Y., Gao, P., Tang, B., Yi, Y. & Zhang, J. Double feature extraction method of ship-radiated noise signal based on slope entropy and permutation entropy. Entropy 24 , 22. https://doi.org/10.3390/e24010022 (2022).

Zavadskas, E. K., Mardani, A., Turskis, Z., Jusoh, A. & Nor, K. M. D. Development of TOPSIS method to solve complicated decision-making problems: An overview on developments from 2000 to 2015. Int. J. Inf. Technol. Decis. Making 15 , 645–682. https://doi.org/10.1142/s0219622016300019 (2016).

Edelmann, D., Mori, T. F. & Szekely, G. J. On relationships between the Pearson and the distance correlation coefficients. Stat. Probab. Lett. 169 , 960. https://doi.org/10.1016/j.spl.2020.108960 (2021).

Wang, S., Kong, W., Ren, L., Zhi, D. & Dai, B. Research on misuses and modification of coupling coordination degree model in China. J. Nat. Resour. 36 , 793–810 (2021).

Baak, M., Koopman, R., Snoek, H. & Klous, S. A new correlation coefficient between categorical, ordinal and interval variables with Pearson characteristics. Comput. Stat. Data Anal. 152 , 7043. https://doi.org/10.1016/j.csda.2020.107043 (2020).

Saadatmorad, M., Talookolaei, R.-A.J., Pashaei, M.-H., Khatir, S. & Wahab, M. A. Pearson correlation and discrete wavelet transform for crack identification in steel beams. Mathematics 10 , 689. https://doi.org/10.3390/math10152689 (2022).

De Winter, J. C. F., Gosling, S. D. & Potter, J. Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: A tutorial using simulations and empirical data. Psychol. Methods 21 , 273–290. https://doi.org/10.1037/met0000079 (2016).

Belz, F. M. & Binder, J. K. Sustainable entrepreneurship: A convergent process model. Bus. Strategy Environ. 26 , 1–17. https://doi.org/10.1002/bse.1887 (2017).

This work was funded by the Government of Jiangsu Province (BE2022606 and BM2022035).

Author information

Authors and affiliations.

Department of Construction and Real Estate, School of Civil Engineering, Southeast University, Nanjing, 210018, China

Xiongwei Zhu, Dezhi Li, Shenghua Zhou & Lugang Yu

Engineering Research Center of Building Equipment, Energy, and Environment, Ministry of Education, Southeast University, Nanjing, 210018, China

Department of Wood Science, The University of British Columbia, Vancouver, V6T 1Z4, Canada

Contributions

X.Z: conceptualization, methodology, formal analysis, investigation, writing-original draft. D. L: supervision, project administration, funding acquisition. S. Z: writing-review & editing. S.Z: writing-review & editing. L.Y: data curation.

Corresponding author

Correspondence to Dezhi Li.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary tables.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Zhu, X., Li, D., Zhou, S. et al. Evaluating coupling coordination between urban smart performance and low-carbon level in China’s pilot cities with mixed methods. Sci Rep 14 , 20461 (2024). https://doi.org/10.1038/s41598-024-68417-4

Received: 28 March 2024

Accepted: 23 July 2024

Published: 03 September 2024

DOI: https://doi.org/10.1038/s41598-024-68417-4

  • Smart city performance (SCP)
  • Low-carbon level (LCL)
  • ETPC analysis model
  • Coupling coordination degree
  • Smart low-carbon coupling coordination development paths

  • Open access
  • Published: 03 September 2024

A scoping review of early childhood caries, poverty and the first sustainable development goal

  • Maha El Tantawi 1 , 2 ,
  • Dina Attia 1 , 2 ,
  • Jorma I. Virtanen 1 , 3 ,
  • Carlos Alberto Feldens 1 , 4 ,
  • Robert J. Schroth 1 , 5 ,
  • Ola B. Al-Batayneh 1 , 6 , 7 ,
  • Arheiam Arheiam 1 , 8 &
  • Morẹ́nikẹ́ Oluwátóyìn Foláyan 1 , 9  

BMC Oral Health volume 24, Article number: 1029 (2024)

Poverty is a well-known risk factor for poor health. This scoping review (ScR) mapped research linking early childhood caries (ECC) and poverty using the targets and indicators of the Sustainable Development Goal 1 (SDG1).

We searched PubMed, Web of Science, and Scopus in December 2023 using search terms derived from SDG1. Studies were included if they addressed clinically assessed or reported ECC, used indicators of monetary or multidimensional poverty or both, and were published in English, with no date restriction. We excluded books and studies from which data on children under 6 years of age could not be extracted. We charted the publication year, study location (categorized by income level and continent), children's age, sample size, study design, measures of ECC, types and levels of poverty indicators, and adjusted analysis. The publications were also classified by how the relation between poverty and ECC was conceptualized.

In total, 193 publications were included with 3.4 million children. The studies were published from 1989 to 2023. Europe and North America produced the highest number of publications, predominantly from the UK and the US, respectively. Age-wise, 3–5-year-olds were the most studied (62.2%). Primary studies (83.9%) were the majority, primarily of cross-sectional design (69.8%). Non-primary studies (16.1%) included reviews and systematic reviews. ECC was mainly measured using the dmf indices (79.3%), while poverty indicators varied, with the most common used indicator being income (46.1%). Most studies measured poverty at family (48.7%) and individual (30.1%) levels. The greatest percentage of publications addressed poverty as an exposure or confounder (53.4%), with some studies using poverty to describe groups (11.9%) or report policies or programs addressing ECC in disadvantaged communities (11.4%). In addition, 24.1% of studies requiring adjusted analysis lacked it. Only 13% of publications aligned with SDG1 indicators and targets.

The ScR highlights the need for studies to use indicators that provide a comprehensive understanding of poverty and thoroughly examine the social, political, and economic determinants and impact of ECC. More studies in low and middle-income countries and country-level studies may help design interventions that are setting- and economic context-relevant.

Introduction

The National Library of Medicine’s Medical Subject Heading (MeSH) defines poverty as the situation where people have a living standard that is below that of the community because of their income level [ 1 ], thus using income to define poverty. This monetary poverty definition considers a person or a household to be poor if their standard of living is below the national poverty line which captures the ability to meet basic needs in food, shelter, clothing, and other goods obtained through purchase [ 2 , 3 ].

Poverty can be classified, based on how it is measured, into absolute poverty, which refers to the percentage of the population with an income below the national or international poverty line and thus unable to meet basic needs [ 4 ], and relative poverty, where persons have living standards lower than others in their community or country, which thus represents income inequality [ 5 ]. Income inequalities are measured by indices of inequality such as the Gini coefficient or the Lorenz curve [ 6 ]. Monetary poverty equates poverty with financial resources. On the other hand, multi-dimensional poverty uses a wider perspective of poverty [ 7 , 8 ] which includes, in addition to the monetary dimension, education, basic infrastructure such as water, sanitation and electricity, as well as health, nutrition and security [ 9 ].
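
As an illustration of the inequality indices mentioned above, the following is a minimal sketch of how a Gini coefficient can be computed from a list of incomes. It is not taken from the reviewed studies; the function name `gini` and the sample incomes are assumptions made for the example, and it uses the common rank-weighted formula for a sorted sample.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = perfect equality)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted form for sorted data:
    # G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical incomes, for illustration only.
equal_society = [1, 1, 1, 1]
unequal_society = [0, 0, 0, 10]
print(gini(equal_society), gini(unequal_society))
```

With this formula, a sample in which everyone has the same income gives 0, and a sample of n people in which one person holds all income gives (n - 1)/n.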

The Sustainable Development Goals (SDGs), established in 2015, replaced the Millennium Development Goals (MDGs) [ 10 ]. Compared to the MDGs, the SDGs targeted all countries and not only developing countries, and addressed economic growth, social inclusion and environmental protection and not only social development [ 11 , 12 ]. The first goal of the SDGs, however, remained the same as for the MDGs: to end poverty.

The SDG1 has seven targets aiming to build the resilience of the poor and those in vulnerable situations by reducing their exposure to environmental, economic, social shocks and disasters [ 12 ]. The seven targets of SDG1 define poverty in the monetary dimension as absolute poverty such as living below the international poverty lines (target 1.1) and below the national poverty lines (target 1.2) indicating extreme poverty. The targets also refer to multidimensional poverty including people in need of social protection systems (target 1.3) and people in need of basic services (target 1.4) [ 12 ]. Two targets, 1.a.2 and 1.b.1 aim to promote governmental spending to support the poor by providing essential services. A central goal in the SDGs is to end extreme poverty in all its forms everywhere by 2030.

Early childhood caries (ECC), which is the presence of any untreated, restored or extracted tooth with caries in a child below the age of 72 months [ 13 ], is linked to poverty. At country level, low and middle-income countries, which are resource-limited economies, have higher percentage of children with ECC than high income countries [ 14 ]. Monetary poverty is also associated with higher risk of ECC with higher ECC prevalence in countries with greater percentage of population living below the poverty line [ 15 ]. Also, multi-dimensional poverty is associated with high prevalence of ECC [ 16 ]. However, an ecological study suggested that not all forms of poverty are similarly linked to the risk of ECC [ 17 ].

The theory of economic, political, and social distortions suggests that poverty is linked to economic, political, and social systems, which limit the opportunities and resources to achieve well-being [ 18 ]. The SDG1 recognises the multiple dimensions of poverty, and how inequitable economic, political, and social structures impact health. The efforts to achieve the SDG1 targets may also have direct and indirect impact on shaping children’s oral health and their risk of ECC [ 14 , 19 ]. The eradication of poverty, institution of social protection systems, efforts to reduce the impact of disasters, increased governmental spending on essential services including health and education, and increased pro poor spending can have a positive impact on reducing the risk of ECC. On the other hand, ECC treatment and cost of care may have great financial implications on families of children with ECC. However, the pathways for this possible bidirectional relationship between ECC and poverty need to be identified.

The aim of this scoping review (ScR), therefore, was to identify research linking ECC and various types of poverty using the SDG1 targets and indicators. This ScR was guided by the question: what is the status of research on the relationship between ECC, poverty and the SDG1?

This ScR was performed in accordance with the Joanna Briggs Institute guidelines [ 20 ] and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews guidelines (PRISMA-ScR) [ 21 ].

Eligibility criteria

Studies were included if they addressed ECC, which is caries including untreated, filled or extracted primary teeth in children younger than 6 years of age as defined by Drury et al. [ 22 ] and endorsed by the American Academy of Pediatric Dentistry [ 13 ], whether assessed clinically or reported by parents. Various ECC indicators were used including the presence of ECC experience (yes/no), the presence of untreated ECC (yes/no), ECC severity measured by the number of teeth/surfaces with ECC experience or the number of teeth/surfaces with untreated ECC. The studies had to include a measure of monetary poverty, multi-dimensional poverty, or both. Studies of any design or publication date were included.

Publications not written in English and books, book chapters and book reviews were excluded. We also excluded studies where data specific to children under 6 years of age could not be extracted. Studies with indicators exclusively associated with socioeconomic status that were addressed by SDG 4 (education), SDG 15 (migrants and refugees), and SDG 11 (urban/rural differences) were excluded. Publications with no full texts were excluded.

Information sources

Three electronic databases were searched: PubMed, Web of Science, and Scopus. The searches were conducted in March and April 2023 and updated in December 2023. The search terms for poverty were based on the terms for the SDG1 defined in Scopus [ 23 ] and adapted to the two other electronic databases. The search strategies can be seen in Appendix 1 . Expert members of the Early Childhood Caries Advocacy Group (ECCAG) were also consulted for other sources that might have been missed by the search strategy.

Selection of sources of evidence

Retrieved publications were exported to the reference management software Mendeley version 1.19.8. Duplicate publications were removed. Title and abstract screening was performed independently by two researchers (MET and DA) using the pre-defined inclusion and exclusion criteria. Full-text review of screened-in publications from the first level was then performed independently by the same two researchers. Uncertainty regarding whether publications met the inclusion criteria was resolved through consensus. The results were shared with two experts (RJS and MOF) for their review. Publications were retained when there was consensus between the experts and the reviewers. The final list of included publications was shared with members of the ECCAG for further confirmation. No external authors or institutions were contacted to identify additional sources.

Data charting

We extracted details on publication year (grouped into decades) and study location, defined as the country in which the study was conducted. The countries were grouped by continent and by income level (low-income countries (LICs), lower middle-income countries (LMICs), upper middle-income countries (UMICs) and high-income countries (HICs)) based on the World Bank 2022-23 classification [ 24 ].

We also extracted the age of children included in the study in years. The publications were grouped into studies conducted among 0-2-year-olds, 3-5-year-olds, and 0-5-year-olds. The age grouping was done following the recommendation in our previous publication based on differences in data availability, ECC prevalence and age-related determinants of risk [ 25 ]. Information on sample size was also extracted.

The design of each study was identified and classified into primary and non-primary studies. Primary studies were further classified into cross-sectional, ecological, case control, cohort, randomized clinical trial (RCT) and protocols of any of the above. Non-primary studies included reviews, systematic reviews, scoping reviews, guidelines and opinions.

We extracted information on the measures of ECC and classified them into non-clinically assessed (parent reported, years lived with disability (YLDs)) [ 26 ] and clinically assessed including the presence/absence of untreated decay or caries experience, the number of surfaces/teeth with untreated decay or caries experience whether caries was measured using the dmf index, its variants or ICDAS [ 27 ] as well as pufa for caries complications [ 28 ].

In addition, we extracted information on poverty indicators and classified the publications into those using single or multiple indicators. We identified the publications using indicators of monetary poverty (absolute or relative) and multidimensional poverty. We defined absolute poverty as the percentage of the population below national or international poverty lines [ 4 ] and relative poverty as inequalities in ECC due to socio-economic position measured by the Lorenz curve, the Gini index, or both [ 5 , 6 ]. Multidimensional poverty was defined by at least two indicators of poverty exclusive of monetary poverty. Indicators of multidimensional poverty include crowding, occupation, school type, federal assistance, qualification for school meal, ownership of commodities, and food insecurity. We also assessed if poverty was measured at the individual, family, area, community, school, city and country levels by adapting the Fisher Owens conceptualization of the influences on child’s oral health [ 29 ].

In addition, we adapted Locker’s description of MacIntyre’s method [ 30 ] to classify primary studies according to how the relation between poverty and ECC was conceptualized. The method encompasses a view of poverty that includes economic, social, and cultural dimensions, and frames poverty within a broader historical and societal narrative, acknowledging its roots and evolution over time. This allows poverty-related actions to reach beyond the alleviation of material deprivation to the overall well-being and flourishing of individuals and communities. There were eight ways poverty could be conceptualized in relation to ECC: (1) defining a group or community of people; (2) in a causal relation with ECC (dependent variable) where poverty is an independent or a confounding variable; (3) measured by income level, and used to compare ECC among administrative units/countries in ecological studies; (4) assessing trends or changes across time in ECC based on poverty; (5) assessing a gradient of ECC or effect modification of ECC risk by levels of poverty; (6) the effect of poverty on ECC at different ages in life course studies; (7) evaluating the effect of policies or programs, including fluoridation, on ECC at different levels of poverty; and (8) ECC as independent factor and poverty as dependent or confounding factor. We also identified whether these relations were analyzed using bivariate or multivariable methods to adjust for confounders.

The extracted data were presented using descriptive statistics as numbers, frequencies, and sums. Excel was used for graphical presentation to generate bars/clustered bars, tree maps, and a map. A word cloud was created to demonstrate study designs [ 31 ].
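
The kind of descriptive summary described above can be sketched as a simple frequency table over charted records. This is only an illustration: the review used Excel, and the record fields (`design`, `income_level`), the sample rows, and the helper `frequency_table` are hypothetical.

```python
from collections import Counter

# Hypothetical charted records; field names and values are illustrative only.
records = [
    {"design": "cross-sectional", "income_level": "HIC"},
    {"design": "cohort", "income_level": "HIC"},
    {"design": "cross-sectional", "income_level": "UMIC"},
    {"design": "review", "income_level": "LMIC"},
]


def frequency_table(rows, field):
    """Return {category: (count, percent of all rows)} for one charted field."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in counts.most_common()
    }


print(frequency_table(records, "design"))
print(frequency_table(records, "income_level"))
```

Keeping the count next to the percentage mirrors the "n (%)" style used throughout the results.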

The initial search from the three databases yielded 3,377 potentially relevant publications. Thereafter, 614 duplicates were removed, and 2,354 publications were removed after screening the titles and abstracts, leaving 409 for full text screening. Of these, 40 publications could not be retrieved. On reading the full text of the remaining 369 publications, 176 publications did not meet the inclusion criteria and were removed leaving 193 publications to be included in this ScR. Figure  1 is a flowchart of the selection process. The details of the publications included in the ScR are in Appendix 2 .
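
The counts in the selection process can be checked with a few subtractions. The sketch below uses the figures reported in the paragraph above; the function name `selection_flow` and the dictionary keys are assumptions chosen for readability.

```python
def selection_flow(identified, duplicates, excluded_on_screening,
                   not_retrieved, excluded_on_full_text):
    """Trace publication counts through each stage of study selection."""
    after_deduplication = identified - duplicates
    for_full_text = after_deduplication - excluded_on_screening
    full_text_assessed = for_full_text - not_retrieved
    included = full_text_assessed - excluded_on_full_text
    return {
        "after_deduplication": after_deduplication,
        "for_full_text": for_full_text,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


# Counts as reported in the text.
flow = selection_flow(
    identified=3377,
    duplicates=614,
    excluded_on_screening=2354,
    not_retrieved=40,
    excluded_on_full_text=176,
)
print(flow)
```

Each intermediate value matches the text: 409 publications for full-text screening, 369 read in full, and 193 included.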

Figure 1. Flowchart of study selection process [ 32 ]

Fifteen (7.8%) papers were published in the 1990s or before, 54 (28.0%) in the 2000s, 81 (42.0%) in the 2010s and 43 (22.3%) from 2020 to December 2023 (Fig. 2). The number of publications has increased since 2016, the year the SDGs came into effect, except for the year 2020 when the COVID-19 pandemic brought the world to a standstill. The average number of publications per year was 5 up to 2015 and doubled to 10 from 2016 onwards.

Figure 2. Number of publications on ECC and poverty by year

Twenty-two publications were multi-country, while 171 publications were country-specific (Fig.  3 ). The greatest number of publications were from Europe (49, 25.4%), mostly the UK (31 publications), and North America (46, 23.8%), mostly the US (42 publications), followed by South America (35, 18.1%) mainly Brazil (29 publications), Asia (20, 10.4%) mainly India (5 publications) then Australia and New Zealand (12 and 4 publications) and Africa (5, 2.6%) with three publications from South Africa and one each from Nigeria and Tanzania. Of all publications from specific countries, two (1.0%) were from LICs, 11 (5.7%) from LMICs, 40 (20.7%) from UMICs and 118 (61.1%) from HICs.

Figure 3. Number of publications on ECC and poverty by country (darker shade of green indicates more publications from the country)

Twenty (10.4%) of the publications included 0–2-year-old children, 120 (62.2%) included 3–5-year-old children, and 40 (20.7%) included 0-5-year-old children. Also, 13 (6.7%) did not specify the age of the included children and 33 (17.1%) publications did not specify the number of children in the sample. The total number of children included in the remaining 160 publications was 3,404,236 with a median of 892 children per study, a minimum of 24 children and a maximum of 995,003 children. There was a much greater number of 3–5-year-old ( n  = 3,249,261) than 0–2-year-old children ( n  = 21,549).

Figure  4 shows that 162 (83.9%) publications were primary studies and 31 (16.1%) were non-primary studies. Most primary studies were cross-sectional in design (113, 69.8%), and there were 23 (14.2%) cohort studies. Most of the 31 publications about non-primary studies were reviews (14, 45.2%) or systematic reviews (10, 32.3%).

Figure 4. Design of studies on the association between poverty and ECC (RCT: randomized controlled trial, CC: case-control, SR: systematic review, GL: guidelines, ScR: scoping review)

Most publications (153, 79.3%) measured ECC using dmf scores and 10 (5.2%) publications used ICDAS. One recent publication used the pufa index. Six publications assessed ECC non clinically, with five relying on parent reporting, and one using YLDs. Four publications used a mixture of indices and 19 publications did not specify how ECC was measured.

Figure  5 shows the poverty indicators used in the publications. There were 113 (58.5%) publications with a single poverty indicator and 71 (36.8%) with more than one indicator, whereas 9 (4.7%) did not specify the indicators used. The most common indicator was income (89, 46.1%) followed by area-based indicators such as the Jarman deprivation index, the Townsend deprivation index and the Carstairs index of deprivation in the United Kingdom, the Socio-Economic Indexes for Areas (SEIFA) in Australia, the area deprivation index in the United States of America (37, 19.2%) and occupation (36, 18.7%). Four (2.1%) publications explicitly used indicators of multidimensional poverty. Relative poverty was measured in 8 (4.2%) publications, including four (2.1%) using the Gini coefficient. Thirteen (6.7%) publications reported the percentage of the population below the national poverty line.

Most (157, 81.4%) publications used poverty indicators measured at one level and the remaining 18.6% assessed it at two ( n  = 28), three ( n  = 7) or four ( n  = 1) levels. Poverty was measured at the family (94, 48.7%), or individual (58, 30.1%) levels followed by area (34, 17.6%) or community (31, 16.1%) levels. Some publications assessed poverty at the school level (16, 8.3%) and only a few measured it at city or country levels (2 and 3 respectively).

Figure 5. Word cloud of poverty indicators used in the publications included in the ScR (ABI: area-based index, GDP: gross domestic product, GNI: gross national income, GNP: gross national product, SES: socio economic status, SE: socio economic)

Figure 6 shows that more than half of the publications (103, 53.4%) addressed poverty as an exposure or confounder. Fewer studies used poverty as a descriptor of participants (23, 11.9%), reported on policies or programs addressing ECC in disadvantaged communities (22, 11.4%), or assessed effect modification by poverty on ECC (21, 10.9%). Five publications reported on the cost implications of ECC. Adjusted analysis was not applicable in 56 (29.0%) of the included studies. Of the remaining 137, 104 (75.9%) used adjusted analysis, while 33 (24.1%) publications that required adjusted analysis did not have this done.
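
The percentages for adjusted analysis use two different denominators, which the short sketch below makes explicit. The counts are those reported in the paragraph above; the helper `pct` and the variable names are assumptions for illustration.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


total_publications = 193
not_applicable = 56                                     # adjusted analysis not required
needing_adjustment = total_publications - not_applicable
with_adjustment = 104
without_adjustment = needing_adjustment - with_adjustment

# Share of all publications where adjustment did not apply.
print(pct(not_applicable, total_publications))
# Shares among only those publications that needed adjustment.
print(pct(with_adjustment, needing_adjustment))
print(pct(without_adjustment, needing_adjustment))
```

The first percentage is a share of all 193 publications, whereas the last two are shares of the 137 publications for which adjustment was relevant.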

Figure 6. Classification of publications according to conceptualization of poverty and use of adjusted analysis (NA = use of adjusted analysis not applicable)

In addition, 25 (13.0%) publications addressed absolute, relative or multidimensional poverty (SDG targets 1.1 to 1.4). Furthermore, from 2016 to 2023, only 11 publications addressed programs and policies on ECC in disadvantaged populations (SDG1 targets 1.a.2 and 1.b.1).

This ScR showed that most studies on ECC and poverty were conducted after 2010, focused on 3–5-year-old children, used a cross-sectional design, and measured ECC clinically. Poverty was mainly assessed using single indicators, indicators of monetary poverty, or measures of poverty at the family level. Studies aligning with the SDG1, including the impact of social protection systems, access to basic services, governmental spending on essential services and pro-poor public spending, were few. Most publications linked poverty to ECC as a confounder or an independent variable, some evaluated the impact of policies and programs on ECC, and few studies assessed trends or effect modification. Few studies also assessed the cost of ECC or its impact on oral and general health. Most studies were conducted in HICs and UMICs, and more than half of the studies originated from the UK, the US and Brazil. There was minimal evidence on the link between country-level poverty indicators and ECC.

There was a large number of publications that used multiple and diverse indicators of ECC and poverty, with different levels at which they were measured and limited use of adjusted analysis. Because of this, we could not provide a summary of the association between ECC and poverty. Therefore, we opted to present this ScR as a preliminary step to a systematic review that would allow better assessment of the quality of included studies and the impact of poverty on ECC at different units and levels.

The strengths of this study include the comprehensive review of publications in English, searching the three main electronic databases, and identifying knowledge gaps that need to be addressed to improve our understanding of the studied association. One of the limitations was the inclusion of publications in English only which may underestimate the number of identified papers. Also, we included reviews and study protocols, and it is possible that there would be some duplication because of this. However, the aim of the scoping review was to map the literature and not to extract estimates to quantify relationships like in a systematic review where estimate duplication would pose a problem. In addition, we did not include all potential databases where publications on ECC and poverty might be indexed. However, the study provides insights about the research on the link between ECC and poverty and the gaps in this area. There are several important findings.

First, this ScR aimed to align the literature on ECC and poverty with the United Nations’ SDG1 targets and indicators. The dental scientific community seems to be responding to the SDGs, as more research in this area has been published on average since 2016 than before. However, there can be better alignment with how the SDG1 defined poverty. Thus, despite the increasing number of publications in the field, more alignment is needed to ensure that dental research supports the achievement of SDG1 in relation to ECC.

Second, the current review identified that studies from LICs or LMICs were few. This affects our full understanding of the relationship between ECC and poverty. ECC research in LICs and LMICs faces several challenges including the need to build research capacity [ 33 , 34 ] and secure data sources. Few LICs and LMICs routinely collect nationally representative data on ECC [ 25 ]. The 2022 Global Oral Health Status Report [ 35 ] used advanced computational methodologies to infer disease estimates for countries where there are none based on the Global Burden of Diseases (GBD) studies. However, the GBD studies reported only on untreated caries in primary teeth regardless of age, and did not include the sequelae, such as extraction and filling [ 36 ], which are part of the ECC definition [ 13 ]. Thus, global comparisons using the GBD estimates do not adequately assess the ECC burden. However, the World Health Organization included 5-year-olds as one of the index age groups for surveys up to the 2013 edition of their oral health survey manual [ 37 ]. At the same time, data are routinely collected to monitor maternal and child health in LICs and LMICs using global health surveys such as the Demographic and Health Surveys and the Multiple Indicator Cluster Surveys [ 38 ]. If ECC indicators are incorporated into these surveys, ECC can be embedded in the surveillance systems of these countries. This may require the use of non-clinical indicators of ECC such as self-reporting which reduces the necessity for labour-intensive clinical examinations. However, research is needed to improve the accuracy of these non-clinical indicators. This approach may help address the current gap we noticed wherein only few studies used poverty indicators at country level. When data are available, it would be possible to better understand the impact of poverty on the risk of ECC in Africa and Southeast Asia where most LICs and LMICs are located, where the greatest number of people with caries live [ 35 ], and where the largest number of 0–5-year-old children are resident [ 39 ].

Third, only 10% of studies focused on 0–2-year-old children. Our previous research showed that 0–2- and 3–5-year-old children have different ECC profiles and disease determinants, and hence these two age groups need to be differentiated [ 25 ]. In addition, 0–2-year-old children may require different, non-clinical ECC indicators due to the relative difficulty of examining children at this young age. Also, a different level of measuring poverty may be needed since children at this age are less likely to be in schools [ 40 ]. Greater focus on younger age would enable better understanding of the life course of caries at different age groups and allow the design of interventions for ECC prevention tailored to this age.

Fourth, several studies used multiple indicators to assess monetary poverty at individual, family and, to a lesser extent, community, city or country levels. There were, however, no studies focusing on policies about the impact of poverty on ECC at the sub-regional or country levels. This is a gap that limits the implementation of country-specific programs. Policies aiming to control poverty can affect individual, family, and community-level experiences of ECC, and cumulatively impact country-level ECC experience rather than vice versa. Future ECC-poverty research needs to align with the SDG1 and enable the promotion of poverty alleviation programs and policies that have greater potential of inducing large-scale and sustained impact on populations. Greater country-focused economic development and reduced poverty are expected to reduce the burden of poor oral health including ECC [ 16 ].

Though far-reaching progress has been made to eliminate poverty in countries like China and India [ 41 , 42 ], progress has been slow in South Asia and sub-Saharan Africa, where about 80% of those living in extreme poverty reside and where there is a huge burden of ECC [ 43 , 44 ]. New threats brought on by climate change, conflict, food insecurity and COVID-19 mean that more work needs to be done to bring people out of poverty [ 44 ] and reduce the risk of ECC as a public health threat in deprived settings and developing countries. Socioeconomic inequalities drive higher disease burden in disadvantaged populations within and across societies, and over the life course [ 35 ]. The causes of these inequalities are often complex and related to country-specific cultural, economic, historical, social, or political factors, and reflect the inequitable distribution of resources in society including power, money, and agency among others [ 5 , 45 ]. Research disentangling the complexities of cultural, economic, healthcare system and political factors at country level is needed to guide policy formulation, and to set international agendas for oral health.

Fifth, most studies used poverty as an independent factor affecting ECC, as a descriptor for participants, or a modifier for the association between ECC and other factors. More studies are needed to investigate the impact of ECC on the risk for poverty, as untreated ECC can increase the risk of reduced productivity and has high management-associated costs for parents [ 46 ]. Studies on the economic impact of ECC can support advocacy efforts that call for greater investment in poverty-alleviation programs controlling ECC. There is currently limited information on the cost of ECC care and the extent to which it contributes to catastrophic oral healthcare expenditure since ECC has a socioeconomic gradient and is concentrated in the most disadvantaged [ 47 ]. ECC places further stressors on already strained healthcare systems due to the increased demand for treatment under general anaesthesia. In addition, information on dental insurance that covers ECC is scarce. This information is critical to advocate for the inclusion of ECC prevention and management within universal healthcare schemes in settings with high disease burden.

In addition, there has been recent interest in the impact of ECC on oral and general health in children and adults in later life [ 48 , 49 , 50 ]. This line of research sheds light on how ECC affects child growth, nutritional status and wellbeing. Such evidence helps better advocacy for ECC care and bridges the gap separating oral and general healthcare of children. In addition, it directly puts ECC on the agenda of child healthcare.

Sixth, the use of adjusted analysis to control confounders was limited in the studies in this ScR. Confounders obscure the effect of exposures on dependent variables [ 51 ] and if not controlled, produce biased estimates of the relation between ECC and poverty, threatening internal validity and leading to incorrect conclusions either over- or underestimating the effect [ 52 ]. This problem has been previously reported in dental research with differences among dental journals in the use of adjusted analysis [ 53 ]. Also, a ScR of waste in dental research [ 54 ] showed that confounding was ignored in 17% of non-randomized studies and 21% of different types of studies. Journals need to emphasize the use of adjusted analyses.

In conclusion, this ScR suggests that research on the link between poverty and ECC is growing and currently occurs in limited settings and economic contexts. More research is needed using indicators based on a more comprehensive definition of poverty, to assess the social, political and economic determinants and impact of ECC. More studies are also needed in Africa and South Asia where the burden of ECC is high and there are currently very few studies. In addition, country-level studies about poverty and ECC are also needed to support context-specific responses for ECC management.

Data availability

The datasets used and/or analysed for the study are publicly accessible.

MESH, Poverty. (ND). https://www.ncbi.nlm.nih.gov/mesh/68011203 . Accessed 14 April 2024.

Institut National de la Statistique et des Etudes Economique. Monetary Poverty. https://www.insee.fr/en/metadonnees/definition/c1653#:~:text=An%20individual%20(or%20a%20household,Canada)%20adopt%20an%20absolute%20approach . Accessed 14 April 2024.

Eurostat, Glossary. Monetary poverty. https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Glossary:Monetary_poverty . Accessed 16 July 2024.

Eskelinen T. Absolute Poverty. In: Chatterjee DK, editor. Encyclopedia of Global Justice. Springer, Dordrecht. 2011. https://doi.org/10.1007/978-1-4020-9160-5_178 . Accessed 16 July 2024.

Niño-Zarazúa M, Roope L, Tarp F. Global inequality: relatively lower, absolutely higher. Rev Income Wealth. 2017;63(4):661–84.


Key Differences. Difference Between Absolute and Relative Poverty. https://keydifferences.com/difference-between-absolute-and-relative-poverty.html . Accessed 16 July 2024.

World Bank Blogs. Does monetary poverty capture all aspects of poverty? https://blogs.worldbank.org/developmenttalk/does-monetary-poverty-capture-all-aspects-poverty . Accessed 16 July 2024.

Evans M, Nogales R, Robson M. Monetary and multidimensional poverty: correlations, mismatches and joint distributions. Queen Elizabeth House, University of Oxford; 2020.

World Bank. Beyond Monetary Poverty. In: Poverty and shared prosperity 2018: Piecing together the poverty puzzle. http://documents.worldbank.org/curated/en/104451542202552048/Poverty-and-Shared-Prosperity-2018-Piecing-Together-the-Poverty-Puzzle . Accessed 16 July 2024.

Woodbridge M. From MDGs to SDGs: what are the sustainable development goals. Bonn: ICLEI—Local Governments for Sustainability. 2015. https://www.local2030.org/library/251/From-MDGs-to-SDGs-What-are-the-Sustainable-Development-Goals.pdf . Accessed 16 July 2024.

United Nations. The Sustainable Development Agenda. https://www.un.org/sustainabledevelopment/development-agenda-retired/ . Accessed 14 April 2024.

United Nations. Millennium Development Goals (MDGs). https://unis.unvienna.org/unis/topics/related/2015/millennium-development-goals-mdgs.html#:~:text=Eradicate%20extreme%20poverty%20and%20hunger,Reduce%20child%20mortality . Accessed 14 April 2024.

American Academy of Pediatric Dentistry. Policy on early childhood caries (ECC): classifications, consequences, and preventive strategies. The reference Manual of Pediatric Dentistry. Chicago, Ill.: American Academy of Pediatric Dentistry; 2020. pp. 79–81.

Google Scholar  

Folayan MO, Tantawi ME, Virtanen JI, Feldens CA, Rashwan M, Kemoli AM, et al. An ecological study on the association between universal health service coverage index, health expenditures, and early childhood caries. BMC Oral Health. 2021;21:1–7.

Bagramian RA, Garcia-Godoy F, Volpe AR. The global increase in dental caries. A pending public health crisis. Am J Dent. 2009;22(1):3–8.

PubMed   Google Scholar  

Bencze Z, Mahrouseh N, Andrade CAS, Kovács N, Varga O. The Burden of Early Childhood caries in children under 5 Years Old in the European Union and Associated Risk factors: an ecological study. Nutrients. 2021;29(2):455.

Folayan MO, El Tantawi M, Aly NM, et al. Association between early childhood caries and poverty in low and middle income countries. BMC Oral Health. 2020;6(1):8.

Bradshaw TK. Theories of poverty and Anti-poverty Programs in Community Development. Community Dev J. 2009;1(1):7–25.

Kemoli AM. Paediatric oral health and climate change. Edorium J Dent. 2019;6(100034):D01AK2019.

Peters MDJ, Marnie C, Tricco AC, Pollock D, Munn Z, Alexander L, et al. Updated methodological guidance for the conduct of scoping reviews. JBI Evid Synth. 2020;18(10):2119–26.

Article   PubMed   Google Scholar  

Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for scoping reviews (PRISMAScR): Checklist and Explanation. Ann Intern Med. 2018;169:467–73.

Drury TF, Horowitz AM, Ismail AI, Maertens MP, Rozier RG, Selwitz RH. Diagnosing and reporting early childhood caries for research purposes. A report of a workshop sponsored by the National Institute of Dental and Craniofacial Research, the Health Resources and Services Administration, and the Health Care Financing Administration. J Public Health Dent. 1999;59(3):192–7.

Article   PubMed   CAS   Google Scholar  

Scopus. What are Sustainable Development Goals (SDGs). https://service.elsevier.com/app/answers/detail/a_id/31662/supporthub/scopuscontent/ . Accessed 16 July 2024.

World bank. New World Bank country classifications by income level: 2022–2023. https://blogs.worldbank.org/opendata/new-world-bank-country-classifications-income-level-2022-2023 . Accessed 14 April 2024.

El Tantawi M, Folayan MO, Mehaina M, Vukovic A, Castillo JL, Gaffar BO, et al. Prevalence and Data Availability of Early Childhood Caries in 193 United Nations Countries, 2007–2017. Am J Public Health. 2018;108(8):1066–72.

IHME. Global Burden of Disease (GBD). https://www.healthdata.org/research-analysis/about-gbd . Accessed 14 April 2024.

Pitts N, Banerjee A, Mazevet M, et al. From ‘ICDAS’ to ‘CariesCare International’: the 20-year journey building international consensus to take caries evidence into clinical practice. Br Dent J. 2021;231:769–74.

Monse B, Heinrich-Weltzien R, Benzian H, Holmgren C, van Palenstein Helderman W. PUFA–an index of clinical consequences of untreated dental caries. Community Dent Oral Epidemiol. 2010;38(1):77–82.

Fisher-Owens SA, Gansky SA, Platt LJ, Weintraub JA, Soobader MJ, Bramlett MD, et al. Influences on children’s oral health: a conceptual model. Pediatrics. 2007;1(3):e510–20.

Locker D. Deprivation and oral health: a review. Community Dent Oral Epidemiol. 2000;28(3):161–9.

TagCrowd. Create your own word cloud. https://tagcrowd.com/ . Accessed 14 April 2024.

Haddaway NR, Page MJ, Pritchard CC, McGuinness LA. PRISMA2020: an R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and open synthesis. Campbell Syst Rev. 2022;18, e1230.

Franzen SR, Chandler C, Lang T. Health research capacity development in low and middle income countries: reality or rhetoric? A systematic meta-narrative review of the qualitative literature. BMJ Open. 2017;7(1):e012332.

Ahmed A, Daily JP, Lescano AG, Golightly LM, Fasina A. Challenges and strategies for Biomedical Researchers returning to low- and Middle-Income countries after Training. Am J Trop Med Hyg. 2020;102(3):494–6.

World Health Organization. Global oral health status report: towards universal health coverage for oral health by 2030. Geneva: World Health Organization; 2022.

Tsai C, Li A, Brown S, Deveridge C, El Gana R, Kucera A, et al. Early childhood caries sequelae and relapse rates in an Australian public dental hospital. Int J Paediatr Dent. 2023;33(1):1–11.

Petersen PE, Baez, Ramon J, World Health Organization. Oral health surveys: basic methods, 5th ed. World Health Organization. 2013. https://iris.who.int/handle/10665/97035 . Accessed July 15th, 2024.

World Health Organization. Technical notes: Potential for improvement in RMNCH interventions. https://cdn.who.int/media/docs/default-source/gho-documents/health-equity/state-of-inequality/technical-notes/health-equity-rmnch-potential-for-improvement.pdf?sfvrsn=1fba8883_2&download=true . Accessed 15 July 2024.

World Health Organization. As more go hungry and malnutrition persists, achieving Zero Hunger by 2030 in doubt, UN report warns. 2020. https://www.who.int/news/item/13-07-2020-as-more-go-hungry-and-malnutrition-persists-achieving-zero-hunger-by-2030-in-doubt-un-report-warns . Accessed July 15, 2024.

Haque F, Folayan MO, Virtanen JI. Preventive behaviour and attitudes towards early childhood caries amongst mothers of toddlers in Bangladesh. Acta Odontol Scan. 2024;83:76–82.

Li S. Income inequality and economic growth in China in the last three decades. Round Table 2016;105(6):641–65.

Sehrawat M, Giri AK. The impact of financial development, economic growth, income inequality on poverty: evidence from India. Empir. Econ. 2018;55(4):1585–602.

Fosua AK. Economic structure, growth, and evolution of inequality and poverty in Africa: an overview. J Afr Econ. 2018;27(1):1–9.

Mainali B, Luukkanen J, Silveira S, Kaivo-oja J. Evaluating synergies and Trade-Offs among Sustainable Development Goals (SDGs): explorative analyses of Development paths in South Asia and Sub-saharan Africa. Sustainability. 2018;10(3):815.

Watt RG. Social determinants of oral health inequalities: implications for action. Community Dent Oral Epidemiol. 2012;40(Suppl 2):44–8.

Vujicic M, Listl S. April. An Economic perspective of the global burden of dental caries. https://www.acffglobal.org/wp-content/uploads/2021/03/An-economic-perspective-on-the-burden-of-dental-caries.pdf . Accessed 14 2024.

Chen J, Duangthip D, Gao SS, Huang F, Anthonappa R, Oliveira BH, et al. Oral health policies to Tackle the Burden of Early Childhood caries: a review of 14 Countries/Regions. Front Oral Health. 2021;9:2:670154.

Nora ÂD, da Silva Rodrigues C, de Oliveira Rocha R, Soares FZM, Minatel Braga M, Lenzi TL. Is Caries Associated with negative impact on oral health-related quality of life of pre-school children? A systematic review and Meta-analysis. Pediatr Dent. 2018;40(7):403–11.

Pakkhesal M, Riyahi E, Naghavi Alhosseini A, Amdjadi P, Behnampour N. Impact of dental caries on oral health related quality of life among preschool children: perceptions of parents. BMC Oral Health. 2021;15(1):68.

Zaror C, Matamala-Santander A, Ferrer M, Rivera-Mendoza F, Espinoza-Espinoza G, Martínez-Zapata MJ. Impact of early childhood caries on oral health-related quality of life: a systematic review and meta-analysis. Int J Dent Hyg. 2022;20(1):120–35.

Jager KJ, Zoccali C, Macleod A, Dekker FW. Confounding: what it is and how to deal with it. Kidney Int. 2008;73(3):256–60.

Skelly AC, Dettori JR, Brodt ED. Assessing bias: the importance of considering confounding. Evid Based Spine Care J. 2012l;3(1):9–12.

Pandis N, Polychronopoulou A, Madianos P, Makou M, Eliades T. Reporting of research quality characteristics of studies published in 6 major clinical dental specialty journals. J Evid Based Dent Pract. 2011;11(2):75–83.

Pandis N, Fleming PS, Katsaros C, Ioannidis JPA. Dental Research Waste in Design, Analysis, and reporting: a scoping review. J Dent Res. 2021;100(3):245–52.

Download references

Acknowledgements

Not applicable.

Author information

Authors and affiliations.

Early Childhood Caries Advocacy Group, Winnipeg, Canada

Maha El Tantawi, Dina Attia, Jorma I. Virtanen, Carlos Alberto Feldens, Robert J. Schroth, Ola B. Al-Batayneh, Arheiam Arheiam & Morẹ́nikẹ́ Oluwátóyìn Foláyan

Department of Pediatric Dentistry and Dental Public Health, Faculty of Dentistry, Alexandria University, Alexandria, Egypt

Maha El Tantawi & Dina Attia

Department of Clinical Dentistry, Faculty of Medicine, University of Bergen, Bergen, Norway

Jorma I. Virtanen

Department of Pediatric Dentistry, Universidade Luterana do Brasil, Canoas, Brazil

Carlos Alberto Feldens

Department of Preventive Dental Science, Rady Faculty of Health Sciences, Dr. Gerald Niznick College of Dentistry, University of Manitoba, Winnipeg, Canada

Robert J. Schroth

Department of Orthodontics, Pediatric and Community Dentistry, College of Dental Medicine, University of Sharjah, PO Box 27272, Sharjah, United Arab Emirates

Ola B. Al-Batayneh

Department of Preventive Dentistry, Faculty of Dentistry, Jordan University of Science and Technology, Irbid, Jordan

Department of Community and Preventive Dentistry, University of Benghazi, Benghazi, Libya

Arheiam Arheiam

Department of Child Dental Health, Obafemi Awolowo University, Ile-Ife, Nigeria

Morẹ́nikẹ́ Oluwátóyìn Foláyan

Contributions

MOF conceived the study. The project was managed by MET and MOF. Data curation was done by MET. Data analysis was conducted by MET and DY. MOF and MET developed the first draft of the document. MET, DA, JIV, CAF, RJS, OBA, AA, MOF read the draft manuscript, made inputs prior to the final draft and approved the final manuscript for submission.

Corresponding author

Correspondence to Maha El Tantawi.

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Jorma I. Virtanen, Ola B. Al-Batayneh and Arheiam Arheiam are editorial Board members with BMC Oral Health. Morenike Oluwatoyin Folayan and Maha El Tantawi are Senior Editor Board members with BMC Oral Health. All other authors declare no conflict of interest.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

El Tantawi, M., Attia, D., Virtanen, J.I. et al. A scoping review of early childhood caries, poverty and the first sustainable development goal. BMC Oral Health 24, 1029 (2024). https://doi.org/10.1186/s12903-024-04790-w

Received: 19 April 2024

Accepted: 22 August 2024

Published: 03 September 2024

DOI: https://doi.org/10.1186/s12903-024-04790-w

  • Material deprivation
  • Oral health
  • Income inequality
  • Sustainable development
  • Dental caries

BMC Oral Health

ISSN: 1472-6831

Effect of Land Marketization on the High-Quality Development of Industry in Guangdong Province, China

1. Introduction
2. Research Hypotheses
3. Methodology and Data Sources
3.1. Measurement of Land Marketization
3.2. Indicators and Method for Measuring Industrial HQD
3.3. Theil Index and Its Decomposition
3.4. Econometric Model
3.5. Data Sources
4. Temporal Changes and Spatial Heterogeneity in Land Marketization and Industrial HQD
4.1. Land Marketization
4.2. Industrial HQD
5. Empirical Testing and Analysis
5.1. Overall Effect of Land Marketization on Industrial HQD
5.2. Effect of Land Marketization on the Seven Dimensions of Industrial HQD
5.3. Mechanisms for the Effect of Land Marketization on Industrial HQD
6. Discussion
6.1. Financial Risk Management and Social Responsibility Should Be Emphasized in Industrial HQD
6.2. Effect of Land Marketization on China’s Economic Transformation
6.3. Limitations and Future Research
6.4. Policy Implications
7. Conclusions
Author Contributions
Data Availability Statement
Conflicts of Interest

  • Liu, J.; Huang, X. Spatio-temporal evolution and drivers of high-quality utilization of urban land in Chinese cities. Land 2024 , 13 , 1077. [ Google Scholar ] [ CrossRef ]
  • Henderson, J.V.; Su, D.; Zhang, Q.; Zheng, S. Political manipulation of urban land markets: Evidence from China. J. Public Econ. 2022 , 214 , 104730. [ Google Scholar ] [ CrossRef ]
  • Liu, Z.; Ling, Y. Structural transformation, TFP and high-quality development. China Econ. 2022 , 17 , 70–82. [ Google Scholar ]
  • Wang, X.; Wang, M.; Lu, X.; Guo, L.; Zhao, R.; Ji, R. Spatio-temporal evolution and driving factors of the high-quality development of provincial tourism in China. Chin. Geogr. Sci. 2022 , 32 , 896–914. [ Google Scholar ] [ CrossRef ]
  • Qu, L.; Wang, L.; Ji, H. Research on evaluating high-quality industrial development at regional level in China. J. Quant. Tech. Econ. 2021 , 38 , 45–61. [ Google Scholar ]
  • Ruggerio, C.A. Sustainability and sustainable development: A review of principles and definitions. Sci. Total Environ. 2021 , 786 , 147481. [ Google Scholar ] [ CrossRef ]
  • Benítez-Márquez, M.D.; Sánchez-Teba, E.M.; Coronado-Maldonado, I. An alternative index to the global competitiveness index. PLoS ONE 2022 , 17 , e0265045. [ Google Scholar ] [ CrossRef ]
  • Datta, D.K.; Guthrie, J.P.; Wright, P.M. Human resource management and labor productivity: Does industry matter? Acad. Manag. J. 2005 , 48 , 135–145. [ Google Scholar ] [ CrossRef ]
  • He, J.; Liu, H.; Salvo, A. Severe air pollution and labor productivity: Evidence from industrial towns in China. Am. Econ. J. Appl. Econ. 2019 , 11 , 173–201. [ Google Scholar ] [ CrossRef ]
  • Gao, K.; Yuan, Y. Spatiotemporal pattern assessment of China’s industrial green productivity and its spatial drivers: Evidence from city-level data over 2000–2017. Appl. Energy 2022 , 307 , 118248. [ Google Scholar ] [ CrossRef ]
  • Nikolaou, I.E.; Matrakoukas, S.I. A framework to measure eco-efficiency performance of firms through EMAS reports. Sustain. Prod. Consum. 2016 , 8 , 32–44. [ Google Scholar ] [ CrossRef ]
  • Xia, F.; Xu, J. Green total factor productivity: A re-examination of quality of growth for provinces in China. China Econ. Rev. 2020 , 62 , 101454. [ Google Scholar ] [ CrossRef ]
  • Du, J.; Lu, Y.; Tao, Z. Economic institutions and FDI location choice: Evidence from US multinationals in China. J. Comp. Econ. 2008 , 36 , 412–429. [ Google Scholar ] [ CrossRef ]
  • Jiang, X.; He, J.; Fang, L. Measure, regional difference and promotion path of high-quality development level of manufacturing. Shanghai Econ. Rev. 2019 , 7 , 70–78. [ Google Scholar ]
  • Realistic Conditions and Policy Orientation Research Group of Institute of Industrial Economics of CASS. China’s industry in the process of modernization: Logic of development. China Ind. Econ. 2024 , 3 , 5–23. [ Google Scholar ]
  • Mi, Z.; Coffman, D.M. The sharing economy promotes sustainable societies. Nat. Commun. 2019 , 10 , 1214. [ Google Scholar ] [ CrossRef ]
  • Wang, F.; Wang, R.; He, Z. The impact of environmental pollution and green finance on the high-quality development of energy based on spatial Dubin model. Resour. Policy 2021 , 74 , 102451. [ Google Scholar ] [ CrossRef ]
  • Li, B.; Wang, H. Comprehensive evaluation of urban high-quality development: A case study of Liaoning Province. Environ. Dev. Sustain. 2023 , 25 , 1809–1831. [ Google Scholar ] [ CrossRef ]
  • Wang, D.; Zhang, E.; Liao, H. Does fiscal decentralization affect regional high-quality development by changing peoples’ livelihood expenditure preferences: Provincial evidence from China. Land 2022 , 11 , 1407. [ Google Scholar ] [ CrossRef ]
  • Liu, J.; Zhang, L.; Zhang, N. Analyzing the South-North gap in the high-quality development of China’s urbanization. Sustainability 2022 , 14 , 2178. [ Google Scholar ] [ CrossRef ]
  • Lyu, Y.; Wang, W.; Wu, Y.; Zhang, J. How does digital economy affect green total factor productivity? Evidence from China. Sci. Total Environ. 2023 , 857 , 159428. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Shapiro, J.S.; Walker, R. Why is pollution from US manufacturing declining? The roles of environmental regulation, productivity, and trade. Am. Econ. Rev. 2018 , 108 , 3814–3854. [ Google Scholar ] [ CrossRef ]
  • Zhu, H.; Dai, Z.; Jiang, Z. Industrial agglomeration externalities, city size, and regional economic development: Empirical research based on dynamic panel data of 283 cities and GMM method. Chin. Geogr. Sci. 2017 , 27 , 456–470. [ Google Scholar ] [ CrossRef ]
  • Zhang, S.; Wu, Z.; Wang, Y.; Hao, Y. Fostering green development with green finance: An empirical study on the environmental effect of green credit policy in China. J. Environ. Manag. 2021 , 296 , 113159. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Jin, W.; Zhou, C. Impact of administrative level and land marketization level on urban construction land utilization benefits. Sci. Geogr. Sin. 2023 , 43 , 2080–2090. [ Google Scholar ]
  • Chen, Y.; Lee, C.C. Does technological innovation reduce CO 2 emissions? Cross-country evidence. J. Clean. Prod. 2020 , 263 , 121550. [ Google Scholar ] [ CrossRef ]
  • Cheng, J.; Zhao, J.; Zhu, D.; Jiang, X.; Zhang, H.; Zhang, Y. Land marketization and urban innovation capability: Evidence from China. Habitat Int. 2022 , 122 , 102540. [ Google Scholar ] [ CrossRef ]
  • Jiang, X.; Lu, X.; Gong, M. Land leasing marketization, industrial structure optimization and urban green total factor productivity: An empirical study based on Hubei Province. China Land Sci. 2019 , 33 , 50–59. [ Google Scholar ]
  • Zhao, A.; Ma, X.; Qu, F. Does market reform increase industrial land use efficiency in China? China Popul. Resour. Environ. 2016 , 3 , 118–126. [ Google Scholar ]
  • Albouy, D.; Ehrlich, G. Housing productivity and the social cost of land-use restrictions. J. Urban Econ. 2018 , 107 , 101–120. [ Google Scholar ] [ CrossRef ]
  • Zou, T. Technological innovation promotes industrial upgrading: An analytical framework. Struct. Chang. Econ. Dyn. 2024 , 70 , 150–167. [ Google Scholar ] [ CrossRef ]
  • Padhi, S.P. Importance of employment growth: A perspective on technological progress. Indian J. Labor Econ. 2018 , 61 , 401–409. [ Google Scholar ] [ CrossRef ]
  • Jin, W.; Zhou, C. Effect of land marketization level and land prices on foreign direct investment in China. Land 2022 , 11 , 1433. [ Google Scholar ] [ CrossRef ]
  • Xu, G.; Liu, J.; Zhang, M. Land property rights, spatial form, and land performance: A framework of policy performance evaluation on collective-owned construction land and evidence from rural China. Land 2024 , 13 , 956. [ Google Scholar ] [ CrossRef ]
  • Gong, G.; Wu, Q.; Gao, S. Influence and mechanism of land marketization on regional technological innovation. Urban Probl. 2020 , 3 , 68–78. [ Google Scholar ]
  • Cai, H.; Henderson, V.; Zhang, Q. China’s land market auctions: Evidence of corruption. Rand J. Econ. 2013 , 44 , 488–521. [ Google Scholar ] [ CrossRef ]
  • Wang, K.; Xiong, Z.; Gao, W. Research on the relationship between the conveyance mode of industrial land use rights and the output elasticity of land in the development zones. China Land Sci. 2013 , 27 , 73–76. [ Google Scholar ]
  • Zhao, A.; Ploegmakers, H.; Samsura, A.A.; van der Krabben, E.; Ma, X. Price competition and market concentration: Evidence from the land market in China. Cities 2024 , 144 , 104631. [ Google Scholar ] [ CrossRef ]
  • Othman, A. The role of economic freedom, governance, and business environment in attracting foreign direct investment in the Arab region. J. Econ. Bus. 2022 , 2 , 1–19. [ Google Scholar ]
  • Craiut, L.; Bungau, C.; Bungau, T.; Grava, C.; Otrisal, P.; Radu, A.F. Technology transfer, sustainability, and development, worldwide and in Romania. Sustainability 2022 , 14 , 15728. [ Google Scholar ] [ CrossRef ]
  • Liu, T.; Cao, G.; Yan, Y.; Wang, R. Urban land marketization in China: Central policy, local initiative, and market mechanism. Land Use Policy 2016 , 57 , 265–276. [ Google Scholar ] [ CrossRef ]
  • Huang, Z.; He, C.; Li, H. Local government intervention, firm–government connection, and industrial land expansion in China. J. Urban Aff. 2019 , 41 , 206–222. [ Google Scholar ] [ CrossRef ]
  • Jin, B. Study on the “high-quality development” economics. China Political Econ. 2018 , 1 , 163–180. [ Google Scholar ]
  • Rahman, P.; Zhang, Z.; Musa, M. Do technological innovation, foreign investment, trade and human capital have a symmetric effect on economic growth? Novel dynamic ARDL simulation study on Bangladesh. Econ. Change Restruct. 2023 , 56 , 1327–1366. [ Google Scholar ] [ CrossRef ]
  • Pan, W.; Wang, J.; Lu, Z.; Liu, Y.; Li, Y. High-quality development in China: Measurement system, spatial pattern, and improvement paths. Habitat Int. 2021 , 118 , 102458. [ Google Scholar ] [ CrossRef ]
  • Jihadi, M.; Vilantika, E.; Hashemi, S.M.; Arifin, Z.; Bachtiar, Y.; Sholichah, F. The effect of liquidity, leverage, and profitability on firm value: Empirical evidence from Indonesia. J. Asian Financ. Econ. Bus. 2021 , 8 , 423–431. [ Google Scholar ]
  • Fu, R.; Yang, Z. Spatio-temporal differentiation and influencing factors of high-quality development of cities in China. Acta Geogr. Sin. 2024 , 79 , 819–836. [ Google Scholar ]
  • Jermias, J.; Fu, Y.; Fu, C.; Chen, Y. Budgetary control and risk management institutionalization: A field study of three state-owned enterprises in China. J. Account. Organ. Change 2023 , 19 , 63–88. [ Google Scholar ] [ CrossRef ]
  • Brychko, M.; Bilan, Y.; Lyeonov, S.; Streimikiene, D. Do changes in the business environment and sustainable development really matter for enhancing enterprise development. Sustain. Dev. 2023 , 31 , 587–599. [ Google Scholar ] [ CrossRef ]
  • Ma, H.; Xu, X. High-quality development assessment and spatial heterogeneity of urban agglomeration in the Yellow River Basin. Econ. Geogr. 2020 , 40 , 11–18. [ Google Scholar ]
  • Du, Y.; Huang, C.; Wu, C. The temporal and spatial pattern evolution of industrial high-quality development index in the Yangtze River Economic Belt. Econ. Geogr. 2020 , 40 , 96–103. [ Google Scholar ]
  • Le, T.T.; Tran, P.Q.; Lam, N.P.; Tra, M.N.L.; Uyen, P.H.P. Corporate social responsibility, green innovation, environment strategy and corporate sustainable development. Oper. Manag. Res. 2024 , 17 , 114–134. [ Google Scholar ] [ CrossRef ]
  • Hickel, J.; Kallis, G. Is green growth possible? New Political Econ. 2020 , 25 , 469–486. [ Google Scholar ] [ CrossRef ]
  • Chen, L.; Huo, C. The measurement and influencing factors of high-quality economic development in China. Sustainability 2022 , 14 , 9293. [ Google Scholar ] [ CrossRef ]
  • Zheng, X.; Zhu, M.; Shi, Y.; Pei, H.; Nie, W.; Nan, X.; Zhu, X.; Yang, G.; Bao, Z. Equity analysis of the green space allocation in China’s eight urban agglomerations based on the Theil Index and GeoDetector. Land 2023 , 12 , 795. [ Google Scholar ] [ CrossRef ]
  • Persico, N.; Postlewaite, A.; Silverman, D. The effect of adolescent experience on labor market outcomes: The case of height. J. Political Econ. 2004 , 112 , 1019–1053. [ Google Scholar ] [ CrossRef ]
  • Zhou, Z.; Lei, L.; San, Z. Business environment and high-quality development of enterprises: Mechanism analysis based on the perspective of corporate governance. Public Financ. Res. 2022 , 5 , 111–129. [ Google Scholar ]
  • Li, S.M. The Pearl River Delta: The fifth Asian tiger. In Hong Kong, Macau and the Pearl River Delta: A Geographical Survey ; Wong, K.K., Ed.; Hong Kong Educational Publisher: Hong Kong, China, 2009; pp. 178–211. [ Google Scholar ]
  • Brunnermeier, M.K. Deciphering the liquidity and credit crunch 2007–2008. J. Econ. Perspect. 2009 , 23 , 77–100. [ Google Scholar ] [ CrossRef ]
  • Zodrow, G.R.; Mieszkowski, P. Pigovian taxation, benefit taxation, and the fiscal effects of local public goods. J. Public Econ. 1986 , 29 , 279–293. [ Google Scholar ]
  • Chen, X.; Wang, H. Spatial–temporal evolution and driving factors of industrial land marketization in Chengdu–Chongqing Economic Circle. Land 2024 , 13 , 972. [ Google Scholar ] [ CrossRef ]
  • Wu, H.; Guo, K. Market-oriented allocation of factors, structural transformation, and labor productivity growth from the perspective of the “Dual Circulation” development pattern. Econ. Res. J. 2023 , 58 , 61–78. [ Google Scholar ]
  • Deng, P.; Lu, H. Transnational knowledge transfer or indigenous knowledge transfer: Which channel has more benefits for China’s high-tech enterprises? Eur. J. Innov. Manag. 2022 , 25 , 433–453. [ Google Scholar ] [ CrossRef ]
  • Wang, Z.; Ge, X.; He, Y.; Li, S. Has the Reform of Land Reserve Financing Policy Reduced the Local Governments’ Implicit Debt? Land 2023 , 12 , 2057. [ Google Scholar ] [ CrossRef ]
  • Yan, B.; Jia, H. Industrial land cost, local government behavior and foreign divestment. World Econ. Study 2022 , 3 , 92–108. [ Google Scholar ]

Click here to enlarge figure

| Land Transfer Method | Characteristics |
|---|---|
| Agreement | (1) Non-market-oriented method of land transfer with no competitors; (2) Strong government control over land; (3) Restricts land price and development |
| Listing | (1) Transparent process; (2) Extended listing period that allows for multiple offers, thus promoting rational decision-making and competition among investors; (3) Highest bidder wins, with no limit on the number of bidders |
| Bidding | (1) Incomplete competition; (2) High transparency, fairness, and objectivity; (3) Comprehensive evaluation process |
| Auction | (1) Full competitiveness; (2) Purchase price determines the buyer; (3) Involves multiple rounds of bidding |

| Indicator | Symbol | Description | Positive or Negative | Weight |
|---|---|---|---|---|
| Innovation | IN | (1) Expenditure on research and development as a share of industrial value added (%) | Positive | 0.0469 |
| | | (2) Number of authorized patents per capita | Positive | 0.1531 |
| | | (3) Number of people engaged in scientific and technological activities as a percentage of the resident population (%) | Positive | 0.1152 |
| Efficiency | EF | (4) Contribution of total assets (%) | Positive | 0.0344 |
| | | (5) Product sales rate (%) | Positive | 0.0067 |
| | | (6) Total labor productivity (yuan/person) | Positive | 0.0417 |
| | | (7) Cost–expense margin (%) | Positive | 0.0157 |
| Structural optimization | ST | (8) New product sales revenue as a percentage of main operating revenue (%) | Positive | 0.0544 |
| | | (9) Share of industrial value added by high-end manufacturing (%) | Positive | 0.0490 |
| Financial risk control | RI | (10) Debt-to-asset ratio (%) | Negative | 0.0352 |
| | | (11) Current asset turnover (times) | Positive | 0.0608 |
| Openness | OP | (12) Proportion of enterprises with foreign investment to the total number of enterprises (%) | Positive | 0.1078 |
| | | (13) Proportion of total imports and exports with foreign investment to gross domestic product (%) | Positive | 0.1202 |
| Social welfare | SO | (14) Total profits and taxes of industrial enterprises (100 million yuan) | Positive | 0.1327 |
| | | (15) Average number of employees in industrial enterprises | Positive | 0.0097 |
| Greenness | GR | (16) Industrial particulate matter emissions per unit of industrial added value (tons/100 million yuan) | Negative | 0.0058 |
| | | (17) Industrial sulfur dioxide emissions per unit of industrial added value (tons/100 million yuan) | Negative | 0.0034 |
| | | (18) Energy consumption per unit of industrial added value (tons of standard coal/10,000 yuan) | Negative | 0.0071 |

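
The weights and directions in the indicator table above can be combined into one composite score per city. The sketch below is a minimal illustration and not the authors' procedure: this excerpt does not state the normalization or aggregation formula, so min-max normalization across cities and a linear weighted sum are both assumptions. The function names `min_max` and `composite_scores` and the two sample cities are hypothetical. Only the weights and the positive/negative directions come from the table.

```python
# Weights for indicators (1)-(18), copied from the indicator table.
WEIGHTS = [
    0.0469, 0.1531, 0.1152,          # Innovation (IN)
    0.0344, 0.0067, 0.0417, 0.0157,  # Efficiency (EF)
    0.0544, 0.0490,                  # Structural optimization (ST)
    0.0352, 0.0608,                  # Financial risk control (RI)
    0.1078, 0.1202,                  # Openness (OP)
    0.1327, 0.0097,                  # Social welfare (SO)
    0.0058, 0.0034, 0.0071,          # Greenness (GR)
]

# Direction of each indicator from the table: True = positive, False = negative.
# Indicators (10), (16), (17) and (18) are the negative ones.
POSITIVE = [True] * 9 + [False] + [True] * 5 + [False] * 3


def min_max(values, positive):
    """Rescale one indicator to [0, 1]; negative indicators are reversed.

    ASSUMPTION: min-max normalization is used here for illustration only.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # No variation across cities: the indicator carries no information.
        return [0.0 for _ in values]
    if positive:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def composite_scores(rows):
    """Return one weighted score per row (city).

    rows: list of lists, each holding the 18 raw indicator values in table order.
    ASSUMPTION: the score is a linear weighted sum of normalized indicators.
    """
    columns = list(zip(*rows))
    normalized = [min_max(col, pos) for col, pos in zip(columns, POSITIVE)]
    return [
        sum(w * normalized[j][i] for j, w in enumerate(WEIGHTS))
        for i in range(len(rows))
    ]


# Hypothetical data: one city that is best on every indicator, one that is worst.
best_city = [2.0 if pos else 1.0 for pos in POSITIVE]
worst_city = [1.0 if pos else 2.0 for pos in POSITIVE]
demo_scores = composite_scores([best_city, worst_city])
```

Under these assumptions, a city that is best on every indicator scores the sum of the weights and a city that is worst on every indicator scores zero, so the result reads as a share of the attainable maximum.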
Independent Variables | Model 1 (FE) | Model 2 (FE) | Model 3 (FE) | Model 4 (FE) | Model 5 (GMM) | Model 6 (FE) | Model 7 (GMM)
ln (LM) | 0.2540 *** (4.70) | 0.2251 *** (4.21) | 0.2175 *** (4.18) | 0.2095 *** (4.05) | 0.0859 * (1.68) | −0.0691 (−0.44) | −0.2217 (−1.27)
ln (FD) |  | 0.3020 *** (3.61) | 0.2510 *** (3.05) | 0.2676 *** (3.27) | 0.0153 (0.19) | 0.2695 *** (3.31) | 0.0195 (0.26)
ln (CI) |  |  | 0.1813 *** (4.23) | 0.1975 *** (4.59) | 0.0976 ** (2.34) | 0.2027 *** (4.72) | 0.1018 ** (2.59)
ln (FI) |  |  |  | 0.0671 ** (2.38) | 0.0230 (1.25) | 0.0638 ** (2.27) | 0.0184 (0.98)
ln (LM)t−1 |  |  |  |  | 0.7786 *** (15.57) |  | 0.7952 *** (15.06)
ln (LM) × region |  |  |  |  |  | 0.1965 * (1.90) | 0.2549 * (1.67)
overall R2 | 0.2758 | 0.3894 | 0.4265 |  |  | 0.1323 | 
F | 233.96 | 129.67 | 107.39 | 109.18 |  | 43.83 | 
Arellano–Bond test for AR(1) |  |  |  |  | −3.12 *** |  | −3.00 ***
Arellano–Bond test for AR(2) |  |  |  |  | 0.85 |  | 0.88
Sargan test |  |  |  |  | 261.72 |  | 180.49
Hausman | 10.25 *** | 41.79 *** | 88.39 *** | 45.99 *** |  | 41.74 *** | 
Model | Dependent Variable | ln (LM) | ln (FD) | ln (CI) | ln (FI) | ln (LM) × region | overall R2 | F | Hausman
Model 1 (FE) | ln (IN) | 0.8560 *** (4.08) | 3.2489 *** (9.79) | −0.3891 ** (−2.23) | 0.5019 *** (4.40) |  | 0.3859 | 24.36 | 79.88 ***
Model 2 (FE) | ln (IN) | 0.6841 (1.08) | 3.2501 *** (9.78) | −0.3860 ** (−2.20) | 0.4999 *** (4.37) | 0.1212 (0.29) | 0.3404 | 13.77 | 53.24 ***
Model 3 (FE) | ln (EF) | 0.2476 *** (3.68) | −0.0910 (−0.86) | 0.5340 *** (9.56) | 0.1346 *** (3.69) |  | 0.0005 | 67.01 | 8.32 *
Model 4 (FE) | ln (EF) | −0.0999 (−0.49) | −0.0887 (−0.84) | 0.5403 *** (9.69) | 0.1307 *** (3.58) | 0.2451 * (1.82) | 0.0832 | 58.88 | 2.67
Model 5 (FE) | ln (ST) | 0.1640 (1.39) | 1.3753 *** (7.37) | −0.4337 *** (−4.42) | 0.1102 * (1.72) |  | 0.2100 | 51.59 | 11.52 **
Model 6 (FE) | ln (ST) | −0.5656 (−1.60) | 1.3803 *** (7.44) | −0.4203 (−4.30) | 0.1017 (1.59) | 0.5145 ** (2.18) | 0.0291 | 33.17 | 21.98 ***
Model 7 (FE) | ln (RI) | −0.1370 * (−1.67) | −0.6418 *** (−4.95) | 0.7082 *** (10.39) | −0.1280 *** (−2.87) |  | 0.1347 | 42.72 | 14.97 ***
Model 8 (FE) | ln (RI) | 0.1571 (0.64) | −0.6438 *** (−4.97) | 0.7028 *** (10.30) | −0.1246 *** (−2.79) | −0.2074 (−1.26) | 0.0171 | 40.16 | 14.48 **
Model 9 (FE) | ln (OP) | −0.6003 *** (−3.09) | −1.8511 *** (−6.03) | −0.1141 (−0.71) | −0.3963 *** (−3.76) |  | 0.1758 | 46.58 | 60.16 ***
Model 10 (FE) | ln (OP) | −1.1741 ** (−2.00) | −1.8472 *** (−6.02) | −0.1036 (−0.64) | −0.4030 *** (−3.81) | 0.4047 *** (1.04) | 0.3159 | 29.43 | 35.91 ***
Model 11 (FE) | ln (SO) | 0.3308 *** (4.55) | 0.0210 (0.18) | 1.1709 *** (19.38) | 0.1426 *** (3.61) |  | 0.3549 | 270.78 | 30.84 ***
Model 12 (FE) | ln (SO) | 0.4767 ** (2.17) | 0.0200 (0.17) | 1.1682 *** (19.28) | 0.1443 *** (3.64) | −0.1029 (−0.70) | 0.4703 | 161.40 | 22.21 ***
Model 13 (FE) | ln (GR) | 0.4254 *** (5.84) | 0.4826 *** (4.19) | 0.1474 ** (2.43) | 0.0494 (1.25) |  | 0.2091 | 8.79 | 26.58 ***
Model 14 (FE) | ln (GR) | 1.1327 *** (5.24) | 0.4778 *** (4.22) | 0.1344 ** (2.25) | 0.0576 (1.48) | −0.4988 *** (−3.47) | 0.2374 | 9.08 | 43.21 ***
Independent Variables | Model 1 (FE) | Model 2 (GMM) | Model 3 (FE) | Model 4 (FE) | Model 5 (GMM) | Model 6 (FE)
ln (LM) | 0.0613 (1.24) | 0.0255 (0.53) | 0.0750 (1.53) | 0.0537 (1.12) | 0.0180 (0.41) | 0.0271 (0.60)
ln (PR) | 0.1051 *** (8.65) | 0.0338 ** (2.52) | −0.0056 (−0.14) |  |  | 
ln (EV) |  |  |  | 0.3387 *** (9.57) | 0.1051 *** (4.06) | −0.1665 * (−1.92)
ln (FD) | 0.1422 * (1.91) | 0.0238 ** (0.29) | 0.2120 (2.73) | −0.1506 * (−1.80) | −0.0540 (−0.67) | −0.0203 (−0.25)
ln (CI) | 0.1599 *** (4.13) | 0.1143 *** (2.80) | 0.1680 (4.38) | 0.1763 *** (4.68) | 0.1076 *** (3.01) | 0.1711 *** (4.84)
ln (FI) | 0.0190 (0.74) | 0.0326 * (1.86) | 0.0117 (0.46) | 0.0193 (0.77) | 0.0297 (1.85) | 0.0099 (0.42)
ln (LM)t−1 |  | 0.6967 *** (11.10) |  |  | 0.6856 *** (13.16) | 
ln (PR) × region |  |  | 0.0665 *** (2.81) |  |  | 
ln (EV) × region |  |  |  |  |  | 0.3746 *** (6.92)
overall R2 | 0.6343 |  | 0.8476 | 0.1643 |  | 0.6569
F | 92.49 |  | 33.00 | 138.59 |  | 77.64
Arellano–Bond test for AR(1) |  | −3.19 *** |  |  | −3.25 *** | 
Arellano–Bond test for AR(2) |  | 0.97 |  |  | 0.87 | 
Sargan test |  | 282.36 |  |  | 271.12 | 
Hausman | 81.49 *** |  | 76.65 *** | 63.47 *** |  | 26.38 ***
Jin, W.; Zhang, Q.; Liu, T. Effect of Land Marketization on the High-Quality Development of Industry in Guangdong Province, China. Land 2024, 13, 1400. https://doi.org/10.3390/land13091400

  • Study Protocol
  • Open access
  • Published: 26 August 2024

Learning effect of online versus onsite education in health and medical scholarship – protocol for a cluster randomized trial

  • Rie Raffing 1,
  • Lars Konge 2 &
  • Hanne Tønnesen 1

BMC Medical Education volume 24, Article number: 927 (2024)

The disruption of health and medical education by the COVID-19 pandemic made educators question the effect of the online setting on students’ learning, motivation, self-efficacy and preference. In light of the health care staff shortage, scalable online education seemed relevant. Reviews on the effect of online medical education called for high-quality RCTs, which are increasingly relevant given rapid technological development and the widespread adoption of online learning in universities. The objective of this trial is to compare standardized and feasible outcomes of an online and an onsite setting of a research course regarding the efficacy for PhD students within health and medical sciences: primarily on learning of research methodology, and secondarily on preference, motivation and self-efficacy in the short term and academic achievements in the long term. Based on the authors’ experience with conducting courses during the pandemic, the hypothesis is that outcomes of the onsite setting, which students preferred, differ from those of the online setting.

Cluster randomized trial with two parallel groups. Two PhD research training courses at the University of Copenhagen are randomized to an online (Zoom) or onsite (The Parker Institute, Denmark) setting. Enrolled students are invited to participate in the study. The primary outcome is short-term learning. Secondary outcomes are short-term preference, motivation and self-efficacy, and long-term academic achievements. Standardized, reproducible and feasible outcomes will be measured by tailor-made multiple-choice questionnaires, an evaluation survey, the frequently used Intrinsic Motivation Inventory, the Single Item Self-Efficacy Question, and Google Scholar publication data. The sample size is calculated as 20 clusters, and courses are randomized by a computer random number generator. Statistical analyses will be performed blinded by an external statistical expert.

The primary outcome and significant secondary outcomes will be compared and contrasted with relevant literature. Limitations include the geographical setting, biases include the lack of blinding, and strengths are robust assessment methods in a well-established conceptual framework. Generalizability to PhD education in other disciplines is high. Results of this study will have implications both for students and educators involved in research training courses in health and medical education and for the patients who ultimately benefit from this training.

Trial registration

Retrospectively registered at ClinicalTrials.gov: NCT05736627. SPIRIT guidelines are followed.

Medical education was utterly disrupted for two years by the COVID-19 pandemic. In the midst of rearranging courses and adapting to online platforms we, with lecturers and course managers around the globe, wondered what the conversion to an online setting did to students’ learning, motivation and self-efficacy [ 1 , 2 , 3 ]. What the long-term consequences would be [ 4 ] and whether scalable online medical education should play a greater role in the future [ 5 ] seemed relevant and appealing questions at a time when health care professionals are in demand. Our experience of performing research training during the pandemic was that although PhD students were grateful for courses being available, they found it difficult to concentrate because of the long screen hours. We sensed that most students preferred an onsite setting and perceived online courses as a temporary and inferior necessity. The question is whether this affected their learning.

Since the common use of the internet in medical education, systematic reviews have sought to answer if there is a difference in learning effect when taught online compared to onsite. Although authors conclude that online learning may be equivalent to onsite in effect, they agree that studies are heterogeneous and small [ 6 , 7 ], with low quality of the evidence [ 8 , 9 ]. They therefore call for more robust and adequately powered high-quality RCTs to confirm their findings and suggest that students’ preferences in online learning should be investigated [ 7 , 8 , 9 ].

This uncovers two knowledge gaps: I) High-quality RCTs on online versus onsite learning in health and medical education and II) Studies on students’ preferences in online learning.

Recently solid RCTs have been performed on the topic of web-based theoretical learning of research methods among health professionals [ 10 , 11 ]. However, these studies are on asynchronous courses among medical or master students with short term outcomes.

This uncovers three additional knowledge gaps: III) Studies on synchronous online learning IV) among PhD students of health and medical education V) with long term measurement of outcomes.

Rapid technological development, including artificial intelligence (AI), and the widespread adoption and application of online learning forced by the pandemic have made online learning well established. It now offers high-resolution, live, synchronous settings that are available on a variety of platforms with integrated AI, options for interaction with and among students, chat and breakout rooms, and external digital tools for teachers [ 12 , 13 , 14 ]. Thus, investigating online learning today may be quite different from doing so before the pandemic. On the one hand, it seems plausible that this technological development would make a difference in favour of online learning that could not be found in previous reviews of the evidence. On the other hand, the personal face-to-face interaction during onsite learning may still be more beneficial for the learning process. Combined with our experience of students finding it difficult to concentrate when online during the pandemic, this leads us to hypothesize that outcomes of the onsite setting are different from those of the online setting.

To support a robust study, we design it as a cluster randomized trial. Moreover, we use the well-established and widely used Kirkpatrick’s conceptual framework for evaluating learning as a lens to assess our outcomes [ 15 ]. Thus, to fill the above-mentioned knowledge gaps, the objective of this trial is to compare a synchronous online and an in-person onsite setting of a research course regarding the efficacy for PhD students within the health and medical sciences:

Primarily on theoretical learning of research methodology and

Secondly on

◦ Preference, motivation, self-efficacy on short term

◦ Academic achievements on long term

Trial design

This study protocol covers synchronous online and in-person onsite settings of research courses, testing the efficacy for PhD students. It is a cluster randomized trial with two parallel arms (Fig.  1 ).

Fig. 1. CONSORT flow diagram

The study measures baseline and post intervention. Baseline variables and knowledge scores are obtained at the first day of the course, post intervention measurement is obtained the last day of the course (short term) and monthly for 24 months (long term).

Randomization is stratified, giving a 1:1 allocation ratio of the courses. As the number of participants within each course might differ, the allocation ratio of participants in the study may not be exactly 1:1.

Study setting

The study site is The Parker Institute at Bispebjerg and Frederiksberg Hospital, University of Copenhagen, Denmark. From here the courses are organized and run online and onsite. The course programs and time schedules, the learning objectives, the course management, the lecturers, and the delivery are identical in the two settings. The teachers use the same introductory presentations followed by training in breakout groups, feedback and discussions. For the online group, the setting is organized as meetings in the online collaboration tool Zoom® [ 16 ] using the basic available features such as screen sharing, a chat function for comments, breakout rooms and other basic digital tools if preferred. The online version of the course is synchronous with live education and interaction. For the onsite group, the setting is the physical classroom at the learning facilities at the Parker Institute. Coffee and tea as well as simple sandwiches and bottles of water, which facilitate socializing, are available at the onsite setting. The participants in the online setting must provide their own food and drink, but online socializing is made possible by keeping the online room open during the breaks. The research methodology courses included in the study are “Practical Course in Systematic Review Technique in Clinical Research” (see course programme in appendix 1) and “Getting started: Writing your first manuscript for publication” [ 17 ] (see course programme in appendix 2). The two courses both have 12 seats and last either three or three and a half days, resulting in 2.2 and 2.6 ECTS credits, respectively. They are offered by the PhD School of the Faculty of Health and Medical Sciences, University of Copenhagen. Both courses are available to, and covered by the annual tuition fee for, all PhD students enrolled at a Danish university.

Eligibility criteria

Inclusion criteria for participants: All PhD students enrolled in the PhD courses “Practical Course in Systematic Review Technique in Clinical Research” and “Getting started: Writing your first manuscript for publication” at the PhD School of the Faculty of Health and Medical Sciences, University of Copenhagen, Denmark, participate after giving informed consent.

Exclusion criteria for participants: Declining to participate and withdrawal of informed consent.

Informed consent

The PhD students at the PhD School at the Faculty of Health Sciences, University of Copenhagen participate after informed consent, taken by the daily project leader, allowing evaluation data from the course to be used after pseudo-anonymization in the project. They are informed in a welcome letter approximately three weeks prior to the course and again in the introduction the first course day. They register their consent on the first course day (Appendix 3). Declining to participate in the project does not influence their participation in the course.

Interventions

Online course settings will be compared to onsite course settings. We test if the onsite setting is different to online. Online learning is increasing but onsite learning is still the preferred educational setting in a medical context. In this case onsite learning represents “usual care”. The online course setting is meetings in Zoom using the technicalities available such as chat and breakout rooms. The onsite setting is the learning facilities, at the Parker Institute, Bispebjerg and Frederiksberg Hospital, The Capital Region, University of Copenhagen, Denmark.

The course settings are not expected to harm the participants, but should a request be made to discontinue the course or change setting this will be met, and the participant taken out of the study. Course participants are allowed to take part in relevant concomitant courses or other interventions during the trial.

Strategies to improve adherence to interventions

Course participants are motivated to complete the course irrespective of the setting because it carries ECTS points for their PhD education and adds to the mandatory number of ECTS points. Thus, we expect adherence to be the same in both groups. However, we monitor their presence in the course and allocate time during class for testing the short-term outcomes (motivation, self-efficacy, preference and learning). We encourage and, if necessary, repeatedly remind them to register with Google Scholar for our testing of the long-term outcome (academic achievement).

Outcomes are related to the Kirkpatrick model for evaluating learning (Fig.  2 ), which divides outcomes into four levels: Reaction, which includes for example motivation, self-efficacy and preferences; Learning, which includes knowledge acquisition; Behaviour, for practical application of skills when back at the job (not included in our outcomes); and Results, for impact for end-users, which includes for example academic achievements in the form of scientific articles [ 18 , 19 , 20 ].

Fig. 2. The Kirkpatrick model

Primary outcome

The primary outcome is short term learning (Kirkpatrick level 2).

Learning is assessed by a Multiple-Choice Questionnaire (MCQ) developed prior to the RCT specifically for this setting (Appendix 4). First, the lecturers of the two courses were contacted and asked to provide five multiple-choice questions, each presented as a stem with three answer options: one correct answer and two distractors. The questions should be related to core elements of their teaching under the heading of research training. The questions were set up to test the cognition of the students at the levels of "Knows" or "Knows how" according to Miller's Pyramid of Competence, and not their behaviour [ 21 ]. Six of the course lecturers responded, and from this material all the questions that covered the curriculum of both courses were selected. The MCQ was tested on 10 PhD students and within the lecturer group, revised after an item analysis, and revised for English language. The final MCQ contains 25 questions. It is filled in at baseline and repeated at the end of the course. The primary outcome based on the MCQ is estimated as the learning score, calculated as the number of correct answers out of 25 after the course. A decrease of points on the MCQ in the intervention groups denotes a deterioration of learning. The minimum MCQ score is 0 and the maximum is 25, where 19 indicates passing the course.

Furthermore, as a secondary outcome, the MCQ score will be dichotomized into passed/failed for the course, with a pass defined as at least 75% correct answers (19/25).
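As an illustration of the scoring rule described above, the following is a minimal sketch of how a learning score and the passed/failed outcome could be derived. The function names and the answer key are hypothetical and not taken from the trial materials; the only inputs taken from the text are the 25 questions and the 75% threshold.

```python
import math

TOTAL_QUESTIONS = 25   # from the protocol
PASS_FRACTION = 0.75   # from the protocol: 75% correct answers


def pass_mark(total=TOTAL_QUESTIONS, fraction=PASS_FRACTION):
    """Smallest whole number of correct answers that reaches the threshold."""
    return math.ceil(total * fraction)


def score_mcq(answers, answer_key):
    """Count correct answers; both sequences must have the same length."""
    if len(answers) != len(answer_key):
        raise ValueError("answers and answer_key must have the same length")
    return sum(1 for given, correct in zip(answers, answer_key) if given == correct)


def passed(score, total=TOTAL_QUESTIONS):
    """Binary outcome: True if the score reaches the pass mark."""
    return score >= pass_mark(total)


# Hypothetical answer key with three options per question (A, B, C).
key = ["A", "B", "C", "A", "B"] * 5
student = key[:19] + ["C"] * 6   # hypothetical student with 19 correct answers
print(score_mcq(student, key), passed(score_mcq(student, key)))
```

Because 75% of 25 is 18.75, rounding up gives the pass mark of 19 stated in the text.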

The learning score will be computed on group and individual level. As a continuous outcome, the learning scores of the online and onsite groups will be compared with the Mann–Whitney test. The binary outcome of learning (passed/failed) will be compared between the online and onsite groups with Fisher’s exact test on an intention-to-treat basis. The results will be presented as median and range and as mean and standard deviation, for possible future use in meta-analyses.
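The two comparisons named above can be sketched in plain Python for illustration. This is not the trial's analysis code (the protocol states that R is used): the sketch computes only the Mann–Whitney U statistics, with average ranks for ties, and a two-sided Fisher's exact p-value for a 2×2 table. All scores and counts are made up.

```python
from math import comb


def mann_whitney_u(x, y):
    """Return (U_x, U_y) using average ranks for tied values."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2            # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    return u_x, len(x) * len(y) - u_x


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the table [[a, b], [c, d]] with fixed margins."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical learning scores and passed/failed counts (online vs onsite).
online_scores = [17, 19, 20, 18]
onsite_scores = [21, 22, 19, 23]
print(mann_whitney_u(online_scores, onsite_scores))
print(fisher_exact_two_sided(3, 0, 0, 3))
```

In practice an established statistics package would also supply the p-value for the Mann–Whitney test; the statistic alone is shown here to keep the sketch dependency-free.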

Secondary outcomes

Motivation assessment post course: Motivation level is measured by the Intrinsic Motivation Inventory (IMI) Scale [ 22 ] (Appendix 5). The order of the IMI items was randomized with random.org on the 4th of August 2022. The scale contains 12 items to be assessed by the students on a 7-point Likert scale where 1 is “Not at all true”, 4 is “Somewhat true” and 7 is “Very true”. The motivation score will be computed on group and individual level and compared between the online and onsite groups with the Mann–Whitney test.

Self-efficacy assessment post course: Self-efficacy level is measured by a single-item measure developed and validated by Williams and Smith [ 23 ] (Appendix 6). It is assessed by the students on a scale from 1–10 where 1 is “Strongly disagree” and 10 is “Strongly agree”. The self-efficacy score will be computed on group and individual level and tested by a Mann–Whitney test to compare the self-efficacy score of the online and onsite group.

Preference assessment post course: Preference is measured as part of the general course satisfaction evaluation with the question “If you had the option to choose, which form would you prefer this course to have?” with the options “onsite form” and “online form”.

Academic achievement assessment is based on 24 monthly measurements post course of the number of publications, number of citations, h-index and i10-index. These data are collected through the Google Scholar Profiles [ 24 ] of the students, as this database covers most scientific journals. Associations between onsite/online setting and long-term academic achievement will be examined with Kaplan–Meier analysis and the log-rank test with a significance level of 0.05.
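For readers unfamiliar with the two bibliometric indices, here is a small sketch of their standard definitions: the h-index is the largest h such that h publications have at least h citations each, and the i10-index is the number of publications with at least 10 citations. The citation counts below are invented, and the trial itself reads these values from Google Scholar rather than computing them.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for cites in citations if cites >= 10)


# Hypothetical citation counts for one student's publications.
example = [25, 12, 10, 4, 1]
print(h_index(example), i10_index(example))
```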

Participant timeline

Enrolment for the course at the Faculty of Health Sciences, University of Copenhagen, Denmark, becomes available when it is published in the course catalogue. In the course description the course location is “To be announced”. Approximately 3–4 weeks before the course begins, the participant list is finalized, and students receive a welcome letter containing course details, including their allocation to either the online or onsite setting. On the first day of the course, oral information is provided, and participants provide informed consent, baseline variables, and base line knowledge scores.

On the last day of scheduled activities, the following scores are collected: knowledge, motivation, self-efficacy, setting preference, and academic achievement. To track students' long-term academic achievements, follow-ups are conducted monthly for a period of 24 months, with assessments occurring within one week of the last course day (Table  1 ).

Sample size

The power calculation is based on the main outcome, theoretical learning on short term. For the sample size determination, we considered 12 available seats for participants in each course. To achieve statistical power, we aimed for 8 clusters in both online and onsite arms (in total 16 clusters) to detect an increase in learning outcome of 20% (learning outcome increase of 5 points). We considered an intraclass correlation coefficient of 0.02, a standard deviation of 10, a power of 80%, and a two-sided alpha level of 5%. The Allocation Ratio was set at 1, implying an equal number of subjects in both online and onsite group.

Considering a dropout up to 2 students per course, equivalent to 17%, we determined that a total of 112 participants would be needed. This calculation factored in 10 clusters of 12 participants per study arm, which we deemed sufficient to assess any changes in learning outcome.
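To make the role of the intraclass correlation visible, the following sketch applies the common normal-approximation formula for comparing two means, inflated by the design effect 1 + (m − 1) × ICC. This is an assumption-laden illustration, not the `n4means` routine from CRTSize that the authors used, and it applies no small-sample or dropout correction, so its result need not match the reported numbers.

```python
from math import ceil
from statistics import NormalDist


def clusters_per_arm(delta, sd, icc, cluster_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size for two means in a cluster trial.

    Returns (design_effect, individuals_per_arm, clusters_per_arm).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)
    n_individual = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    design_effect = 1 + (cluster_size - 1) * icc     # inflation for clustering
    n_clustered = n_individual * design_effect
    return design_effect, n_clustered, ceil(n_clustered / cluster_size)


# Inputs stated in the protocol: 5-point difference, SD 10, ICC 0.02, 12 seats.
design_effect, n_per_arm, k_per_arm = clusters_per_arm(5, 10, 0.02, 12)
print(round(design_effect, 2), round(n_per_arm, 1), k_per_arm)
```

With the protocol's inputs this approximation gives a design effect of 1.22 and 7 clusters per arm, close to but not identical with the 8 clusters per arm reported above.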

The sample size was estimated using the function n4means from the R package CRTSize [ 25 ].

Recruitment

Participants are PhD students enrolled in 10 courses of “Practical Course in Systematic Review Technique in Clinical Research” and 10 courses of “Getting started: Writing your first manuscript for publication” at the PhD School of the Faculty of Health Sciences, University of Copenhagen, Denmark.

Assignment of interventions: allocation

Randomization will be performed on course level. The courses are randomized by a computer random number generator [ 26 ]. To get a balanced randomization per year, 2 sets with 2 unique random integers in each, taken from the 1–4 range, are requested.

The setting is not included in the course catalogue of the PhD School and thus allocation to online or onsite is concealed until 3–4 weeks before course commencement when a welcome letter with course information including allocation to online or onsite setting is distributed to the students. The lecturers are also informed of the course setting at this time point. If students withdraw from the course after being informed of the setting, a letter is sent to them enquiring of the reason for withdrawal and reason is recorded (Appendix 7).

The allocation sequence is generated by a computer random number generator (random.org). The participants and the lecturers sign up for the course without knowing the course setting (online or onsite) until 3–4 weeks before the course.
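The balanced draw described above can be sketched as follows. This is one possible reading only: it assumes four runs of each course per year, numbered 1–4, with the two drawn integers marking the runs held online. It also uses Python's `random` module as a stand-in for random.org, and the course labels are hypothetical.

```python
import random


def allocate_year(rng, course_names=("systematic_review", "first_manuscript"),
                  runs_per_course=4, online_per_course=2):
    """Draw unique run numbers per course; drawn runs go online, the rest onsite."""
    allocation = {}
    for course in course_names:
        # One "set" of unique integers from 1..runs_per_course for this course.
        online_runs = set(rng.sample(range(1, runs_per_course + 1), online_per_course))
        allocation[course] = {
            run: ("online" if run in online_runs else "onsite")
            for run in range(1, runs_per_course + 1)
        }
    return allocation


# A fixed seed only makes this illustration reproducible; it is not part of the trial.
allocation = allocate_year(random.Random(2022))
for course, runs in allocation.items():
    print(course, runs)
```

Drawing without replacement guarantees a 1:1 split of courses within each year under these assumptions, whichever runs are selected.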

Assignment of interventions: blinding

Due to the nature of the study, it is not possible to blind trial participants or lecturers. The outcomes are reported by the participants directly in an online form, thus being blinded for the outcome assessor, but not for the individual participant. The data collection for the long-term follow-up regarding academic achievements is conducted without blinding. However, the external researcher analysing the data will be blinded.

Data collection and management

Data will be collected by the project leader (Table  1 ). Baseline variables and post course knowledge, motivation, and self-efficacy are self-reported through questionnaires in SurveyXact® [ 27 ]. Academic achievements are collected through Google Scholar profiles of the participants.

Given that we are using participant assessments and evaluations for research purposes, all data collection – except for monthly follow-up of academic achievements after the course – takes place either in the immediate beginning or ending of the course and therefore we expect participant retention to be high.

Data will be downloaded from SurveyXact and stored in a locked and logged drive on a computer belonging to the Capital Region of Denmark. Only the project leader has access to the data.

The project follows the Danish Data Protection Agency's guidelines under the European GDPR throughout the trial. Following the end of the trial, data will be stored at the Danish National Data Archive, which fulfils Danish and European guidelines for data protection and management.

Statistical methods

Data is anonymized and blinded before the analyses. Analyses are performed by a researcher not otherwise involved in the inclusion or randomization, data collection or handling. All statistical tests will be testing the null hypotheses assuming the two arms of the trial being equal based on corresponding estimates. Analysis of primary outcome on short-term learning will be started once all data has been collected for all individuals in the last included course. Analyses of long-term academic achievement will be started at end of follow-up.

Baseline characteristics including both course- and individual level information will be presented. Table 2 presents the available data on baseline.

We will use multivariate analysis for identification of the most important predictors (motivation, self-efficacy, sex, educational background, and knowledge) for best effect on short and long term. The results will be presented as risk ratio (RR) with 95% confidence interval (CI). The results will be considered significant if CI does not include the value one.
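The decision rule above (significant if the confidence interval excludes one) can be illustrated with an unadjusted risk ratio for a single binary outcome. The sketch uses the usual log-scale standard error and made-up pass counts; the trial's multivariate analysis would produce adjusted estimates, which this example does not attempt.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Unadjusted risk ratio of group A vs group B with a log-scale Wald CI."""
    if min(events_a, events_b) == 0:
        raise ValueError("log-scale CI is undefined when a group has zero events")
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


def is_significant(lower, upper):
    """True if the confidence interval does not include the value one."""
    return not (lower <= 1.0 <= upper)


# Hypothetical counts: 45 of 56 onsite and 36 of 56 online students passed.
rr, lower, upper = risk_ratio_ci(45, 56, 36, 56)
print(round(rr, 2), round(lower, 2), round(upper, 2), is_significant(lower, upper))
```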

All data processing and analyses will be conducted using R statistical software version 4.1.0, 2021-05-18 (R Foundation for Statistical Computing, Vienna, Austria).

If possible, all analysis will be performed for “Practical Course in Systematic Review Technique in Clinical Research” and for “Getting started: Writing your first manuscript for publication” separately.

Primary analyses will be handled with the intention-to-treat approach. The analyses will include all individuals with valid data, regardless of whether they attended the complete course. Missing data will be handled with multiple imputation [ 28 ].

Upon reasonable request, public access will be granted to the protocol, the datasets analysed during the current study, and the statistical code (Table 3).

Oversight, monitoring, and adverse events

This project is coordinated in collaboration between the WHO CC (DEN-62) at the Parker Institute, CAMES, and the PhD School at the Faculty of Health and Medical Sciences, University of Copenhagen. The project leader runs the day-to-day support of the trial. The steering committee of the trial includes principal investigators from WHO CC (DEN-62) and CAMES and the project leader and meets approximately three times a year.

Data monitoring is done on a daily basis by the project leader and controlled by an external independent researcher.

An adverse event is “a harmful and negative outcome that happens when a patient has been provided with medical care” [ 29 ]. Since this trial does not involve patients in medical care, we do not expect adverse events. If participants decline to take part in the course after receiving the information about the course setting, the reason for declining is sought. If the reason is the setting, this can be considered an unintended effect. Information on unintended effects of the online setting (the intervention) will be recorded. Participants are encouraged to contact the project leader with any response to the course in general, both during and after the course.

The trial description has been sent to the Scientific Ethical Committee of the Capital Region of Denmark (VEK) (21041907), which assessed that notification was not necessary and that the trial could proceed without permission from VEK according to Danish law and regulation of scientific research. The trial is registered with the Danish Data Protection Agency (Privacy) (P-2022–158). Important protocol modifications will be communicated to relevant parties as well as VEK, the Joint Regional Information Security and Clinicaltrials.gov as quickly as possible.

Dissemination plans

The results (positive, negative, or inconclusive) will be disseminated in educational, scientific, and clinical fora, in international scientific peer-reviewed journals, and clinicaltrials.gov will be updated upon completion of the trial. After scientific publication, the results will be disseminated to the public by the press, social media including the website of the hospital and other organizations – as well as internationally via WHO CC (DEN-62) at the Parker Institute and WHO Europe.

All authors will fulfil the ICMJE recommendations for authorship, and RR will be first author of the articles as a part of her PhD dissertation. Contributors who do not fulfil these recommendations will be offered acknowledgement in the article.

This cluster randomized trial investigates if an onsite setting of a research course for PhD students within the health and medical sciences is different from an online setting. The outcomes measured are learning of research methodology (primary), preference, motivation, and self-efficacy (secondary) on short term and academic achievements (secondary) on long term.

The results of this study will be discussed as follows:

Discussion of primary outcome

Primary outcome will be compared and contrasted with similar studies including recent RCTs and mixed-method studies on online and onsite research methodology courses within health and medical education [ 10 , 11 , 30 ] and for inspiration outside the field [ 31 , 32 ]: Tokalic finds similar outcomes for online and onsite, Martinic finds that the web-based educational intervention improves knowledge, Cheung concludes that the evidence is insufficient to say that the two modes have different learning outcomes, Kofoed finds online setting to have negative impact on learning and Rahimi-Ardabili presents positive self-reported student knowledge. These conflicting results will be discussed in the context of the result on the learning outcome of this study. The literature may change if more relevant studies are published.

Discussion of secondary outcomes

Secondary significant outcomes are compared and contrasted with similar studies.

Limitations, generalizability, bias and strengths

It is a limitation to this study, that an onsite curriculum for a full day is delivered identically online, as this may favour the onsite course due to screen fatigue [ 33 ]. At the same time, it is also a strength that the time schedules are similar in both settings. The offer of coffee, tea, water, and a plain sandwich in the onsite course may better facilitate the possibility for socializing. Another limitation is that the study is performed in Denmark within a specific educational culture, with institutional policies and resources which might affect the outcome and limit generalization to other geographical settings. However, international students are welcome in the class.

In educational interventions it is generally difficult to blind participants and this inherent limitation also applies to this trial [ 11 ]. Thus, the participants are not blinded to their assigned intervention, and neither are the lecturers in the courses. However, the external statistical expert will be blinded when doing the analyses.

We chose to compare an in-person onsite setting with a synchronous online setting. Therefore, the online setting cannot be expected to generalize to an asynchronous online setting. Asynchronous delivery has in some cases shown positive results, possibly because students could go back and forth through the modules in the interface without a time limit [ 11 ].

We will report on all the outcomes defined prior to conducting the study to avoid selective reporting bias.

It is a strength of the study that it seeks to report outcomes at levels 1, 2 and 4 of the Kirkpatrick conceptual framework, and not solely at level 1. It is also a strength that the study is cluster randomized, which will reduce “infections” between the two settings, that it has an adequately powered sample size, and that it looks for a relevant educational difference of 20% between the online and onsite settings.

Perspectives with implications for practice

The results of this study may have implications for which educational setting students choose. Learning and preference results have implications for lecturers, course managers and curriculum developers when deciding which setting to plan for in health and medical education. The results may also inspire teaching and training in other disciplines. From a societal perspective, the study also has implications because we will know the effect of, and preferences for, online learning in case of a future lockdown.

Future research could investigate academic achievements in online and onsite research training in the long run (Kirkpatrick 4); the effect of blended learning versus online or onsite (Kirkpatrick 2); lecturers’ preferences for online and onsite settings within health and medical education (Kirkpatrick 1); and resource use in synchronous and asynchronous online learning (Kirkpatrick 5).

Trial status

This trial collected pilot data from August to September 2021 and opened for inclusion in January 2022. Completion of recruitment is expected in April 2024 and long-term follow-up in April 2026. Protocol version number 1 03.06.2022 with amendments 30.11.2023.

Availability of data and materials

The project leader will have access to the final trial dataset which will be available upon reasonable request. Exception to this is the qualitative raw data that might contain information leading to personal identification.

Abbreviations

Artificial Intelligence

Copenhagen academy for medical education and simulation

Confidence interval

Coronavirus disease

European credit transfer and accumulation system

International committee of medical journal editors

Intrinsic motivation inventory

Multiple choice questionnaire

Doctor of medicine

Masters of sciences

Randomized controlled trial

Scientific ethical committee of the Capital Region of Denmark

WHO Collaborating centre for evidence-based clinical health promotion

Samara M, Algdah A, Nassar Y, Zahra SA, Halim M, Barsom RMM. How did online learning impact the academic. J Technol Sci Educ. 2023;13(3):869–85.

Nejadghaderi SA, Khoshgoftar Z, Fazlollahi A, Nasiri MJ. Medical education during the coronavirus disease 2019 pandemic: an umbrella review. Front Med (Lausanne). 2024;11:1358084. https://doi.org/10.3389/fmed.2024.1358084 .

Madi M, Hamzeh H, Abujaber S, Nawasreh ZH. Have we failed them? Online learning self-efficacy of physiotherapy students during COVID-19 pandemic. Physiother Res Int. 2023;5:e1992. https://doi.org/10.1002/pri.1992 .

Torda A. How COVID-19 has pushed us into a medical education revolution. Intern Med J. 2020;50(9):1150–3.

Alhat S. Virtual Classroom: A Future of Education Post-COVID-19. Shanlax Int J Educ. 2020;8(4):101–4.

Cook DA, Levinson AJ, Garside S, Dupras DM, Erwin PJ, Montori VM. Internet-based learning in the health professions: A meta-analysis. JAMA. 2008;300(10):1181–96. https://doi.org/10.1001/jama.300.10.1181 .

Pei L, Wu H. Does online learning work better than offline learning in undergraduate medical education? A systematic review and meta-analysis. Med Educ Online. 2019;24(1):1666538. https://doi.org/10.1080/10872981.2019.1666538 .

Richmond H, Copsey B, Hall AM, Davies D, Lamb SE. A systematic review and meta-analysis of online versus alternative methods for training licensed health care professionals to deliver clinical interventions. BMC Med Educ. 2017;17(1):227. https://doi.org/10.1186/s12909-017-1047-4 .

George PP, Zhabenko O, Kyaw BM, Antoniou P, Posadzki P, Saxena N, Semwal M, Tudor Car L, Zary N, Lockwood C, Car J. Online Digital Education for Postregistration Training of Medical Doctors: Systematic Review by the Digital Health Education Collaboration. J Med Internet Res. 2019;21(2):e13269. https://doi.org/10.2196/13269 .

Tokalić R, Poklepović Peričić T, Marušić A. Similar Outcomes of Web-Based and Face-to-Face Training of the GRADE Approach for the Certainty of Evidence: Randomized Controlled Trial. J Med Internet Res. 2023;25:e43928. https://doi.org/10.2196/43928 .

Krnic Martinic M, Čivljak M, Marušić A, Sapunar D, Poklepović Peričić T, Buljan I, et al. Web-Based Educational Intervention to Improve Knowledge of Systematic Reviews Among Health Science Professionals: Randomized Controlled Trial. J Med Internet Res. 2022;24(8):e37000.

https://www.mentimeter.com/ . Accessed 4 Dec 2023.

https://www.sendsteps.com/en/ . Accessed 4 Dec 2023.

https://da.padlet.com/ . Accessed 4 Dec 2023.

Zackoff MW, Real FJ, Abramson EL, Li STT, Klein MD, Gusic ME. Enhancing Educational Scholarship Through Conceptual Frameworks: A Challenge and Roadmap for Medical Educators. Acad Pediatr. 2019;19(2):135–41. https://doi.org/10.1016/j.acap.2018.08.003 .

https://zoom.us/ . Accessed 20 Aug 2024.

Raffing R, Larsen S, Konge L, Tønnesen H. From Targeted Needs Assessment to Course Ready for Implementation-A Model for Curriculum Development and the Course Results. Int J Environ Res Public Health. 2023;20(3):2529. https://doi.org/10.3390/ijerph20032529 .

https://www.kirkpatrickpartners.com/the-kirkpatrick-model/ . Accessed 12 Dec 2023.

Smidt A, Balandin S, Sigafoos J, Reed VA. The Kirkpatrick model: A useful tool for evaluating training outcomes. J Intellect Dev Disabil. 2009;34(3):266–74.

Campbell K, Taylor V, Douglas S. Effectiveness of online cancer education for nurses and allied health professionals; a systematic review using kirkpatrick evaluation framework. J Cancer Educ. 2019;34(2):339–56.

Miller GE. The assessment of clinical skills/competence/performance. Acad Med. 1990;65(9 Suppl):S63–7.

Ryan RM, Deci EL. Self-Determination Theory and the Facilitation of Intrinsic Motivation, Social Development, and Well-Being. Am Psychol. 2000;55(1):68–78. https://doi.org/10.1037//0003-066X.55.1.68 .

Williams GM, Smith AP. Using single-item measures to examine the relationships between work, personality, and well-being in the workplace. Psychology. 2016;07(06):753–67.

https://scholar.google.com/intl/en/scholar/citations.html . Accessed 4 Dec 2023.

Rotondi MA. CRTSize: sample size estimation functions for cluster randomized trials. R package version 1.0. 2015. Available from: https://cran.r-project.org/package=CRTSize .

Random.org. Available from: https://www.random.org/

https://rambollxact.dk/surveyxact . Accessed 4 Dec 2023.

Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ (Online). 2009;339:157–60.

Skelly C, Cassagnol M, Munakomi S. Adverse Events. StatPearls Treasure Island: StatPearls Publishing. 2023. Available from: https://www.ncbi.nlm.nih.gov/books/NBK558963/ .

Rahimi-Ardabili H, Spooner C, Harris MF, Magin P, Tam CWM, Liaw ST, et al. Online training in evidence-based medicine and research methods for GP registrars: a mixed-methods evaluation of engagement and impact. BMC Med Educ. 2021;21(1):1–14. Available from:  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8439372/pdf/12909_2021_Article_2916.pdf .

Cheung YYH, Lam KF, Zhang H, Kwan CW, Wat KP, Zhang Z, et al. A randomized controlled experiment for comparing face-to-face and online teaching during COVID-19 pandemic. Front Educ. 2023;8. https://doi.org/10.3389/feduc.2023.1160430 .

Kofoed M, Gebhart L, Gilmore D, Moschitto R. Zooming to Class?: Experimental Evidence on College Students' Online Learning During Covid-19. SSRN Electron J. 2021;IZA Discussion Paper No. 14356.

Mutlu Aİ, Yüksel M. Listening effort, fatigue, and streamed voice quality during online university courses. Logop Phoniatr Vocol :1–8. Available from: https://doi.org/10.1080/14015439.2024.2317789

Acknowledgements

We thank the students who make their evaluations available for this trial and MSc (Public Health) Mie Sylow Liljendahl for statistical support.

Open access funding provided by Copenhagen University. The Parker Institute, which hosts the WHO CC (DEN-62), receives a core grant from the Oak Foundation (OCAY-18–774-OFIL). The Oak Foundation had no role in the design of the study or in the collection, analysis, and interpretation of the data or in writing the manuscript.

Author information

Authors and affiliations.

WHO Collaborating Centre (DEN-62), Clinical Health Promotion Centre, The Parker Institute, Bispebjerg & Frederiksberg Hospital, University of Copenhagen, Copenhagen, 2400, Denmark

Rie Raffing & Hanne Tønnesen

Copenhagen Academy for Medical Education and Simulation (CAMES), Centre for HR and Education, The Capital Region of Denmark, Copenhagen, 2100, Denmark

Contributions

RR, LK and HT have made substantial contributions to the conception and design of the work; RR to the acquisition of data; and RR, LK and HT to the interpretation of data. RR has drafted the work, and RR, LK and HT have substantively revised it, approved the submitted version, and agreed to be personally accountable for their own contributions and to ensure that any questions relating to the accuracy or integrity of the work are adequately investigated, resolved and documented.

Corresponding author

Correspondence to Rie Raffing .

Ethics declarations

Ethics approval and consent to participate.

The Danish National Committee on Health Research Ethics has assessed the study Journal-nr.:21041907 (Date: 21–09-2021) without objections or comments. The study has been approved by The Danish Data Protection Agency Journal-nr.: P-2022–158 (Date: 04.05.2022).

All PhD students participate after informed consent. They can withdraw from the study at any time without explanations or consequences for their education. They will be offered information of the results at study completion. There are no risks for the course participants as the measurements in the course follow routine procedure and they are not affected by the follow up in Google Scholar. However, the 15 min of filling in the forms may be considered inconvenient.

The project will follow the GDPR and the Joint Regional Information Security Policy. Names and ID numbers are stored on a secure and logged server at the Capital Region Denmark to avoid risk of data leak. All outcomes are part of the routine evaluation at the courses, except the follow up for academic achievement by publications and related indexes. However, the publications are publicly available per se.

Competing interests

The authors declare no competing interests

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary materials 1 to 7.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Raffing, R., Konge, L. & Tønnesen, H. Learning effect of online versus onsite education in health and medical scholarship – protocol for a cluster randomized trial. BMC Med Educ 24 , 927 (2024). https://doi.org/10.1186/s12909-024-05915-z

Received : 25 March 2024

Accepted : 14 August 2024

Published : 26 August 2024

DOI : https://doi.org/10.1186/s12909-024-05915-z

  • Self-efficacy
  • Achievements
  • Health and Medical education

BMC Medical Education

ISSN: 1472-6920

  • Open access
  • Published: 04 September 2024

Readiness for non-communicable disease service delivery in Ethiopia: an empirical analysis

  • Azeb Gebresilassie Tesema   ORCID: orcid.org/0000-0003-0618-4499 1 , 2 ,
  • Rohina Joshi 1 , 3 ,
  • Seye Abimbola 2 , 4 ,
  • Alemnesh H. Mirkuzie 5 ,
  • Daria Berlina 6 ,
  • Tea Collins 6 &
  • David Peiris 2  

BMC Health Services Research volume 24, Article number: 1021 (2024)

Ethiopia’s health system is overwhelmed by the growing burden of non-communicable diseases (NCDs). In this study, we assessed the availability of and readiness for NCD services and the interaction of NCD services with other essential and non-NCD services.

The analysis focused on four main NCD services: diabetes mellitus, cardiovascular diseases, chronic respiratory diseases, and cancer screening. We used data from the 2018 Ethiopian Service Availability and Readiness Assessment (SARA) survey. As defined by the World Health Organization, readiness, both general and service-specific, was measured based on the mean percentage availability of the tracer indicators, such as trained staff and guidelines, equipment, diagnostic capacity, and essential medicines and commodities needed for delivering essential health services and NCD-specific services, respectively. The survey comprised 632 nationally representative healthcare facilities, and we applied mixed-effects linear and ordered logit models to identify factors affecting NCD service availability and readiness.

Only 8% of facilities provided all four NCD services. Availability varied for specific services, with cervical cancer screening being the least available service in the country: less than 10% of facilities, primarily higher-level hospitals, provided cervical cancer screening. General service readiness was a strong predictor of NCD service availability. Differences in NCD service availability and readiness between regions and facility types were significant. Increased readiness for specific NCD services was significantly associated with increased readiness for communicable disease services and interacted with the readiness for other NCD services.

NCD service availability has considerable regional variation and is positively associated with general and communicable disease services readiness. Readiness for specific NCD services interacted with one another. The findings suggest an integrated approach to service delivery, focussing holistically on all disease services, is needed. There also needs to be increased attention to reducing resource allocation variation between facility types and locations.

Health systems worldwide are faced with increasing burdens from noncommunicable diseases (NCDs), aggravated by underinvestment in health, shortage of skilled healthcare workers, and the COVID-19 pandemic [ 1 ]. Ethiopia, like many low-middle-income countries (LMICs), has been experiencing a significant rise in NCD-related disability and premature mortality in recent years [ 1 , 2 , 3 , 4 ].

Over the past three decades, NCD-related deaths have doubled and NCDs accounted for around one-third of all diseases in Ethiopia in 2019 [ 5 ], while projections suggest this figure may reach two-thirds of all deaths by 2040 [ 5 ]. Cardiovascular diseases, neoplasms, diabetes, chronic respiratory diseases, and chronic kidney diseases are the leading causes of NCD-related deaths, collectively accounting for over half of the top 15 causes of mortality in the country [ 6 ]. With nearly 30% of these deaths occurring among individuals under 50 years old, the social, economic, and familial implications are substantial [ 7 ].

To tackle the burden of NCDs and its consequences, Ethiopia implemented a comprehensive National Strategy (2014–2016) in alignment with the World Health Organization’s (WHO’s) global framework for addressing NCDs [ 3 ]. The national strategic plan primarily focused on improving health promotion and disease prevention to reduce behavioural risk factors and on strengthening the primary health care system to address NCDs [ 3 ].

To assess progress and demonstrate results, it is crucial to have information on service availability and health system readiness at country and global levels. The WHO’s Service Availability and Readiness Assessment (SARA) survey is one such tool, designed to assess the availability and readiness of health facilities to provide essential and disease-specific services. The Ethiopian SARA survey assessed the availability and readiness of facilities for critical healthcare interventions, such as HIV/AIDS, tuberculosis, and NCDs [ 8 ].

As defined by WHO, a service is deemed available if a facility manages, diagnoses, treats, or prescribes a patient who is coming to the facility for disease-specific service. At the same time, general and disease-specific readiness refers to the capability of health facilities to deliver service on a specific condition. Accordingly, General Service Readiness (GSR) assesses the overall capacity of health facilities to provide essential health services, which requires the availability and readiness of basic amenities, core equipment, infection prevention measures, diagnostics, and medicines. This capacity is measured by considering tracer items, including trained staff, guidelines, equipment, diagnostic capacity, medicines and commodities that are specific to and necessary for providing services for that particular disease [ 8 , 9 ]. Detailed descriptions of each indicator can be found in the method section.

Although Ethiopia’s SARA report and a few other studies descriptively assessed the NCD service availability and readiness in the last few years [ 6 , 9 , 10 , 11 , 12 , 13 , 14 ], there is a lack of detailed empirical studies on factors affecting NCD service availability and readiness in the country. In the current study, we used novel approaches to answer three specific questions: (1) does general service readiness predict NCD service availability? (2) is a facility’s readiness to provide NCD services affected by its management and geographic location within the country? and (3) is NCD service readiness associated with the availability and readiness of non-NCD services? The study’s findings will contribute to informing national and subnational initiatives to strengthen the health system to deliver NCD services in the country. In addition, the novel analytical approach used in this study offers valuable insights to investigate health system readiness in Ethiopia and beyond.

Data source and study setting

The study is a secondary analysis of the 2018 Ethiopian SARA survey [ 9 ]. The 2018 Ethiopian survey was undertaken by the Ethiopian Public Health Institute (EPHI) with financial support from the World Bank. Based on the WHO SARA survey tool, implemented in over 20 countries across LMICs, the Ethiopian version of SARA included all facility types and rural and urban areas across the country [ 8 ].

Data were collected using a facility inventory questionnaire, which obtained information on the availability of specific items (including location and functional status) and how the facilities are prepared to provide essential health services. The survey used a nationally representative sample stratified by health facility type and managing authority. All hospitals and selected health centres, clinics, and health posts were included. The sample allocation for this survey took into account the skewed distribution of health facilities at the regional level.

The SARA survey included 31 referral hospitals, 116 general hospitals, 156 primary hospitals, 164 health centres, 19 higher clinics, 74 medium clinics, 72 lower clinics and 132 health posts. Health posts, which are not mandated to provide NCD care in the country [ 7 , 9 ], were excluded from the analysis. Data were collected from October to December 2017. Trained health providers collected the data from the selected facilities using computer-assisted data entry on tablets. Interviewers regularly sent the information entered on the tablets to the EPHI central server for data management and analysis. Detailed sample size calculation and sampling technique can be found in the Ethiopian SARA report [ 8 , 10 ].
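The facility counts quoted above can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the figures in this paragraph; the dictionary keys are illustrative labels, not variable names from the SARA dataset.

```python
# Facility counts as quoted in the text; keys are illustrative labels only.
surveyed = {
    "referral hospital": 31,
    "general hospital": 116,
    "primary hospital": 156,
    "health centre": 164,
    "higher clinic": 19,
    "medium clinic": 74,
    "lower clinic": 72,
    "health post": 132,
}

# Health posts are not mandated to provide NCD care, so they are dropped.
analysed = {k: n for k, n in surveyed.items() if k != "health post"}

total_surveyed = sum(surveyed.values())
total_analysed = sum(analysed.values())
print(total_surveyed, total_analysed)  # 764 632
```

The analysed total matches the 632 facilities reported for the analysis sample.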

Ethiopia had an estimated population of 117 million in 2021, the majority of whom (78%) are rural residents. The burden of NCDs and their associated risk factors in Ethiopia varies according to geography and socioeconomic status [ 6 ]. Ethiopia’s health care system follows a three-tiered structure (primary, secondary, and tertiary levels of health care) with federal, regional, zone (administrative structure between region and district) and district administration, operating under a unified planning, financing, and reporting framework. The primary level of care includes primary hospitals, health centres and health posts (the lowest-level health system facility, at village level). The government plays a prominent role in delivering health services, including NCD services, especially in rural communities [ 10 ]. A detailed description of the country’s health system, including the delivery of NCD services, has been comprehensively reported in previous studies [ 10 , 15 ].

Statistical analysis

Outcome measures.

NCD service availability was based on a response to whether the facility offered diagnosis and management of each of the four NCDs. The responses were coded as either 1 (if the facility provided the service) or 0 (if it did not) [ 8 , 9 ].

NCD service readiness was based on a dichotomous response (1 = available and 0 otherwise) for a list of five core tracer items (availability of trained staff and guidelines; equipment; diagnostic capacity; and NCD medicines and commodities) required to provide the specific service and aggregated into a mean score (Table 1 provides details on the tracer items). Readiness was a composite measure and was restricted to the subset of facilities that offered the service. The service readiness score for each NCD service was calculated based on the mean percentage availability of the tracer indicators for delivering NCD services.
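As a rough illustration of the scoring rule described above, the following sketch computes a readiness score as the mean percentage availability of 0/1 tracer items, and returns no score for a facility that does not offer the service. The function name and the tracer item names are hypothetical placeholders, not the Table 1 list, and the sketch does not apply survey weights.

```python
from typing import Mapping, Optional


def readiness_score(tracers: Mapping[str, int], offers_service: bool) -> Optional[float]:
    """Mean percentage availability of tracer items, on a 0-100 scale.

    Returns None when the facility does not offer the service, mirroring
    the restriction of readiness to facilities that provide it.
    """
    if not offers_service:
        return None
    if not tracers:
        raise ValueError("at least one tracer item is required")
    if any(value not in (0, 1) for value in tracers.values()):
        raise ValueError("tracer items must be coded 0 or 1")
    return 100.0 * sum(tracers.values()) / len(tracers)


# Hypothetical facility with three of five placeholder items available.
example = {
    "trained_staff": 1,
    "guidelines": 0,
    "equipment": 1,
    "diagnostics": 1,
    "medicines": 0,
}
score = readiness_score(example, offers_service=True)
print(score)  # 60.0
```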

We examined three groups of exposure variables to assess associations with NCD service availability and readiness:

facility characteristics including location (rural or urban), managing authority (public or others), and facility type (hospitals, higher and medium clinics, health centres, and lower clinics).

GSR defined as the mean percentage availability of the following services: basic amenities, basic equipment, standard precautions for infection prevention, diagnostic capacity, and essential medicines (Table 1 ).

The availability of non-NCD services (communicable disease services), measured based on a response to whether the facility offered diagnosis and management of each non-NCD service. In this study, we included five non-NCD services: maternal health, child health services, tuberculosis, malaria and prevention, care, and management of HIV/AIDS.

Non-NCD service readiness, which measures the facility’s capacity to provide the service. This was assessed in our study based on the availability of tracer items specific to the disease, including maternal health, child health services, tuberculosis, malaria and prevention, care, and management of HIV/AIDS [ 8 , 9 ].

Table 1 provides a list of tracer items used in the WHO survey. A detailed description of how these indices were constructed and the tracer items for the specific non-NCD service—maternal health, child health services, tuberculosis, malaria and prevention, care, and management of HIV/AIDS—can be found in reports published elsewhere [ 8 , 9 ]. The scores for the various indicators range from zero to 100, with 100 indicating that the facility possesses all the tracer items and zero suggesting the absence of any tracer items.

To analyse the data, we first used descriptive statistics to summarize the availability and readiness of NCD services. We then employed multivariate models to answer the three core research questions outlined earlier.

Our multivariate analysis consisted of two broad sets of models [ 16 , 17 , 18 ]. First, we modelled the probability of NCD service availability as a function of GSR and selected facility characteristics. We included three facility characteristics, location (urban/rural), managing authority and facility type as predictors of NCD availability. The outcome variable consisted of three mutually exclusive response categories—i.e., those providing all four NCD services, those offering some (i.e., at least one but not all) NCD services and those offering none. Since the outcome variable had an ordinal nature and local conditions may influence facility service, we used a mixed-effect ordered-logit model with a random effect at a zonal level.
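The three-category outcome and the ordered-logit link can be sketched as follows. This is a minimal illustration, not the authors' STATA specification: the function names, the cutpoints and the linear predictor value are made up, and in a mixed model the zone-level random intercept would simply be added to the linear predictor `eta`.

```python
import math
from typing import List, Sequence


def availability_category(flags: Sequence[int]) -> int:
    """Code 0/1 service flags as 0 = none, 1 = some, 2 = all services."""
    offered = sum(flags)
    if offered == 0:
        return 0
    return 2 if offered == len(flags) else 1


def ordered_logit_probs(eta: float, cutpoints: Sequence[float]) -> List[float]:
    """Category probabilities under a cumulative (proportional-odds) logit.

    P(y <= k) = logistic(cutpoint_k - eta), with cutpoints in increasing
    order. eta is the linear predictor (fixed effects plus any random
    intercept).
    """
    cdf = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]


# Illustrative values only: two cutpoints give three outcome categories.
probs = ordered_logit_probs(eta=0.5, cutpoints=[-1.0, 2.0])
print(availability_category([1, 0, 0, 1]), [round(p, 3) for p in probs])
```

A larger linear predictor shifts probability mass from the "none" category towards the "all" category, which is how a positive GSR coefficient would be read in this kind of model.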

Subsequently, we modelled the probability of readiness for providing NCD services as a function of two key variables: readiness for communicable disease services (cross-program association) and the readiness for NCD services other than the service attributed to the disease under consideration (between program association, i.e. interaction between different NCD services). Considering the effect of having an extended list of variables on the degrees of freedom of the model in a modest sample like ours, we decided to reduce the number of parameters by having just one readiness score for all five communicable diseases (instead of generating a readiness score for each of the five communicable diseases captured in our study). Hence, we combined the readiness measure for these diseases with a simple averaging of the respective readiness scores and called the resulting estimate the communicable disease service readiness score. In this study, we included the following five non-NCD services: maternal health, child health services, tuberculosis, malaria and prevention, care, and management of HIV/AIDS in our analysis.

We included readiness for NCDs to test the difference between program or reciprocal association, excluding the service being analysed as the outcome variable. For example, in the CVD readiness model, we included the combined readiness score for diabetes, chronic respiratory disease, and cancer screening services. We employed mixed-effect linear models with a random effect at a zonal level, considering the continuous nature of the outcome variables and potential area-level factors influencing the availability and readiness of services at a facility level. All models included urban-rural location, managing authority, and facility type as covariates. We applied sample weights to address survey response variations and the unequal probability of facility selection across different geographic locations. The models were estimated using STATA version 18.0.
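The two predictor scores described above can be sketched as simple averages. The text states simple averaging for the communicable disease score; using the same rule for the "other NCD" score is an assumption made here for illustration, and all names and numbers below are hypothetical.

```python
from statistics import mean
from typing import Mapping

# Illustrative service labels, not variable names from the survey data.
NCD_SERVICES = ("diabetes", "cvd", "crd", "cancer_screening")


def other_ncd_readiness(scores: Mapping[str, float], outcome: str) -> float:
    """Average readiness over NCD services other than the outcome service.

    Assumes a simple mean; the source does not spell out the combination rule.
    """
    if outcome not in NCD_SERVICES:
        raise ValueError(f"unknown NCD service: {outcome}")
    return mean(scores[s] for s in NCD_SERVICES if s != outcome)


def communicable_readiness(scores: Mapping[str, float]) -> float:
    """Simple average of the service-specific communicable disease scores."""
    return mean(scores.values())


# Made-up readiness scores (0-100) for one facility.
ncd = {"diabetes": 40.0, "cvd": 30.0, "crd": 20.0, "cancer_screening": 0.0}
non_ncd = {"maternal": 60.0, "child": 70.0, "tb": 50.0, "malaria": 40.0, "hiv": 30.0}

cvd_predictor = other_ncd_readiness(ncd, "cvd")       # mean of 40, 20, 0
communicable_score = communicable_readiness(non_ncd)  # mean of five scores
print(cvd_predictor, communicable_score)  # 20.0 50.0
```

Leaving the outcome service out of its own predictor avoids regressing a score partly on itself.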

Characteristics of surveyed health facilities and general service readiness

Out of the 632 health facilities (excluding the 132 health posts), the majority (65%) were public, and 78% were in urban areas. General hospitals represented 18%, and primary hospitals, health centres and clinics each accounted for a quarter of the sample facilities (Table 2 and Appendix I).

Limited NCD services

As shown in Table 2 , only 8% [95% CI: (6.2, 10.6)] of facilities provided all four essential NCD services. Availability varied for specific services, with cervical cancer screening being the least available service in the country: less than 10% of facilities, primarily higher-level hospitals, provided the service. On the other hand, chronic respiratory disease services were available in 53% [95% CI: (49.3, 57.1)] of facilities, cardiovascular disease services in 49% [95% CI: (44.8, 52.6)], and diabetes mellitus services in 37% [95% CI: (32.7, 40.2)] of facilities. None of the lower clinics and only 5% [95% CI: (1.3, 7.7)] of health centres provided all four NCD services.

General service readiness

Nationally, all facilities combined had, on average, only about half of the required GSR tracer items, 55% [95% CI (54.3, 56.7)]. GSR scores were higher in hospitals (such as referral hospitals at 89.3% [95% CI (86.5, 92.1)]) and urban facilities at 58% [95% CI (6.6, 59.4)] than in health centres at 59% [95% CI (57.6, 61.2)] and rural areas at 51.5% [95% CI (49.2, 53.7)]. Even referral hospitals (89.3% [95% CI (86.5, 92.1)]) and general hospitals (86.2% [95% CI (84.9, 87.5)]) did not have all the tracer items needed to provide basic services, as their individual scores were well below 100% (see Table 2 and Appendix II). Lower clinics had the lowest general readiness of all facility types, 40.6% [95% CI (39.3, 42.0)] (see Table 2 and Appendix I).

Generally, the proportion of facilities providing services on all four NCDs was higher in more urbanized regions, such as Dire Dawa at 86% [95% CI: (85.6, 100)] and Addis Ababa at 94.5% [95% CI: (77.6, 98.6)], than in other jurisdictions. NCD services were also generally more available in hospitals through primary health care units (health centres) and concentrated in facilities with high GSR scores (see Appendix II).

Table 3 describes the associations between GSR, facility type, and NCD service availability. After controlling for facility characteristics, for every 10-unit increase in GSR score, there was an increase in offering some and all NCD services by 0.11% and 0.66%, respectively, and a 0.43% decrease in having no NCD service. After controlling for GSR, non-public facilities were associated with higher NCD service availability, which differs from what was observed in Table 2 . The significant random effect coefficient indicates the importance of unmeasured locational factors (captured at the zonal level) on NCD service availability. The intra-class correlation (ICC) of 0.74 [CI: (0.61, 0.84)] further suggests that about 75% of the variation in NCD service availability in the country can be attributed to location, while GSR and other facility characteristics account for the remainder.
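To make the ICC interpretation concrete, the sketch below uses the latent-variable convention that is common for random-intercept logit models, in which the level-1 residual variance is fixed at π²/3. The source does not state how its ICC was computed, so this formula is an assumption, and the implied zonal variance is a back-calculation for illustration rather than a reported estimate.

```python
import math

# Variance of the standard logistic distribution (latent-variable convention).
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def icc_from_variance(zone_var: float) -> float:
    """Share of latent-scale variance attributable to the zone-level intercept."""
    if zone_var < 0:
        raise ValueError("variance must be non-negative")
    return zone_var / (zone_var + LOGISTIC_RESIDUAL_VAR)


def variance_from_icc(icc: float) -> float:
    """Zone-level variance implied by a given ICC under the same convention."""
    if not 0 <= icc < 1:
        raise ValueError("ICC must be in [0, 1)")
    return icc * LOGISTIC_RESIDUAL_VAR / (1 - icc)


# Back out the zonal variance that an ICC of 0.74 would imply (assumption).
implied_var = variance_from_icc(0.74)
print(round(implied_var, 2), round(icc_from_variance(implied_var), 2))
```

Under this convention, an ICC of 0.74 corresponds to a zone-level variance nearly three times the fixed residual variance, which is why location dominates the unexplained variation.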

Low readiness for NCD services

Table 4 presents NCD service readiness by selected facility characteristics. Nationally, half of the facilities had a readiness score of 26% or less, ranging from 46% for diabetes diagnosis and/or management (DDM) to 33% for cardiovascular disease (CVD) service and less than 27% for chronic respiratory disease (CRD) service. Similarly, half of the facilities had none of the tracer items for cervical cancer screening (CSS), and even the top 10% had just 50% of the tracer items for CSS.
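The distributional summaries used here (the median facility, the top 10%) can be sketched as follows. This is a minimal illustration, assuming that a facility's readiness score is the percentage of tracer items it has available; the function names and the sample scores are hypothetical and are not taken from the SARA data.

```python
from statistics import median, quantiles


def readiness_score(items_available: list[bool]) -> float:
    """Percentage of tracer items a facility has on hand."""
    if not items_available:
        raise ValueError("at least one tracer item is required")
    return 100.0 * sum(items_available) / len(items_available)


def summarize(scores: list[float]) -> dict[str, float]:
    """Return the 10th percentile, median, and 90th percentile of the scores."""
    deciles = quantiles(scores, n=10, method="inclusive")  # 9 cut points
    return {"p10": deciles[0], "median": median(scores), "p90": deciles[8]}


# Hypothetical facility scores for a service where half of the facilities
# have none of the tracer items.
scores = [0, 0, 0, 0, 0, 25, 25, 50, 50, 75]
print(summarize(scores))
```

Reporting percentiles alongside the mean matters when, as in this hypothetical sample, many facilities score zero and a few score high: the mean alone would hide how skewed the distribution is.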

Generally, referral hospitals had the highest mean readiness score (72%), followed by other hospital types and higher clinics (47%). Among the top 10% of health facilities, NCD service readiness was as high as 80% or more in referral and general hospitals but less than 40% and 20% in health centres and lower clinics, respectively. Service-specific differences were negligible within the hospital setting, except for CRD, where referral hospitals had a slight edge over other types.

The largest difference in service readiness between locations and management types was at the higher end of the distribution, particularly for cervical cancer screening. The average readiness for CSS in urban areas was about 16% compared to less than 5% in rural areas. However, up to 50% of facilities in both rural and urban areas had no cervical cancer screening. In both non-public and publicly managed facilities, readiness for CRD and CVD among the bottom 10% was about 10% and 25%, respectively, and 50% had no CSS services.

Interaction between NCD and non-NCD service readiness

Table 5 shows the associations between NCD and non-NCD service readiness. Overall, readiness for specific NCD services was significantly associated with readiness for communicable diseases [0.155, 95% CI: (0.046, 0.75)]. The readiness of DDM, CRD, and CSS services was also positively associated with communicable disease service readiness, suggesting complementary between-program effects. Similarly, within-program reciprocal effects (i.e., interaction effects of readiness for a given NCD service on other NCD service readiness) were significant for DDM, CVD and CRD.

Also, location and facility type, rather than managing authority or urban versus rural setting, were significant predictors of overall NCD service readiness and readiness for specific NCD services. As suggested by the inter-class correlations (i.e., the expected correlation in service readiness between two randomly drawn facilities in the same zone), over a third of the variation in service readiness for diabetes, cervical cancer screening and overall NCD, and up to 43 and 51 per cent of the readiness variations for CVD and CRD services, were attributable to differences between locations. The relatively high and significant ICC for all models suggests that including the location (i.e., zone) level random effect variable was imperative as it improved our model estimates. Appendix III shows unadjusted estimates of the associations between NCD and non-NCD service readiness.
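The interpretation of the ICC as the expected correlation between two facilities in the same zone can be checked with a small simulation. The sketch below assumes a simple linear random-intercept model (readiness = zone effect + facility noise), which is a simplification rather than the authors' specification; all names and variance values are hypothetical.

```python
import random


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_pair_correlation(zone_var: float, facility_var: float,
                              n_zones: int = 20000, seed: int = 0) -> float:
    """Draw two facilities per zone and correlate their readiness values."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_zones):
        zone_effect = rng.gauss(0.0, zone_var ** 0.5)  # shared within a zone
        first.append(zone_effect + rng.gauss(0.0, facility_var ** 0.5))
        second.append(zone_effect + rng.gauss(0.0, facility_var ** 0.5))
    return pearson(first, second)


# Hypothetical variances giving a theoretical ICC of 0.43 / (0.43 + 0.57) = 0.43.
print(simulate_pair_correlation(zone_var=0.43, facility_var=0.57))
```

With enough simulated zones, the printed correlation should sit close to the theoretical ICC, because the shared zone effect is the only source of covariance between the two facilities.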

The burden of NCDs in Ethiopia continues to negatively affect health outcomes and strain the healthcare system. To effectively address this burden, it is crucial to integrate services across primary, secondary, and tertiary care and between different programs. Our study examined the association of NCD service availability with GSR and identified the predictors of NCD service readiness. We also examined the interactions between NCD service readiness and the availability and readiness of various non-NCD services in Ethiopia. There are three key findings from our analysis.

First, we found that GSR strongly predicts the availability of NCD services. Second, the readiness of NCD services has significant interactions with each other and with the readiness of communicable disease programs such as HIV/AIDS, malaria, and tuberculosis. Third, we found that between-locality differences in NCD service availability and readiness were large and significant, as were differences between facility types, especially for service readiness.

Access to care is crucial for addressing inequalities and reducing the burden of NCDs [ 11 , 12 , 19 , 20 ]. However, our results showed that NCD services are limited in Ethiopia. While there has been some improvement compared to previous years—where availability was at 22%, 41%, 45%, and 2% in 2016—the current findings still highlight the limited availability of NCD services within the country’s health system, which is consistent with studies conducted in Ethiopia and other LMICs [ 11 , 14 , 19 , 20 ].

Our study also found that the availability of NCD services differed across facility types, managing authorities, and geographical settings. Hospitals, public facilities, and urbanized regions generally had a higher proportion of facilities providing all four NCD services. Medium and higher-level clinics had a higher probability of delivering all four services than hospitals. Similar findings were reported in other studies conducted in Ethiopia and elsewhere, highlighting the absence of NCD services in primary healthcare facilities, public sectors, and rural settings [ 6 , 11 , 13 , 14 , 20 , 21 ].

However, our analysis showed that GSR strongly predicts the availability of NCD services, regardless of facility attributes. In Ethiopia, facilities had only half the required items to provide essential health services, and the shortage was not limited to lower-level facilities. This could be due to the gradual decline in the availability of resources, including funding, required to provide NCD services in the country [ 14 ]. The national GSR reported for Ethiopia was similar to that of Bangladesh and Nepal for comparable periods [ 6 , 12 , 22 ]. GSR may, however, be a proxy for other factors, such as workforce density, which our analysis did not capture. Moreover, averages mask the realities beneath, especially if the distributions are skewed, as we have shown to be the case in Ethiopia.

Our results showed a ‘duality gap’ (i.e., wide geographic and facility-type variations) in the Ethiopian health system, as resources were skewed toward higher-level facilities. Large and significant locational (in this case, zonal level) disparities in the availability and readiness of NCD services also exist, as shown by the results of the random effect coefficient. The ‘duality gap’ has important policy implications because it signifies unequal distribution of health resources between locations and care levels and requires an all-encompassing response that addresses both locational disadvantage and bias in favour of hospital-based care. The drive to address such inequalities requires focusing on the primary health care system and public and rural facilities since these facilities are more accessible to most of the population and can ensure universal health coverage [ 10 , 11 , 15 , 22 ]. This is relevant to halting the burden of undiagnosed NCD conditions and chronic morbidity that come with a lack of timely access to services [ 11 , 22 ].

The findings suggest that an integrated approach is needed to improve the availability and readiness of NCD services. This includes optimizing resources such as medicine, guidelines, workforce training, and strong leadership for sectoral coordination and engagement with the private sector [ 11 , 13 , 21 , 23 , 24 ]. Moreover, there is a need to engage the private sector with solid leadership and sectoral coordination of programs [ 11 , 13 , 21 , 23 , 24 ], as our results suggested a strong association between facility managing authority and NCD service availability. Furthermore, where relevant, there is a need to leverage the country’s prior success in primary health care reforms, primarily in the areas of MCH, HIV and malaria, as these were driven by strong political leadership, good governance, resource commitments, effective stakeholder engagement, and the existence of a well-defined program of action [ 6 , 10 , 25 , 26 , 27 , 28 ].

Improving service availability should go hand in hand with increasing readiness, as many facilities providing NCD services had low readiness scores. A previous report from Ethiopia has shown that access to quality health services is significantly below appropriate minimum standards and the estimated population in need of those services [ 6 ]. This indicates that despite the existence of an NCD strategy and NCDs being four of Ethiopia’s top 15 causes of death, service availability and readiness for NCDs are still low [ 6 , 14 , 29 ]. Studies in various settings also suggested sub-optimal service readiness for NCDs [ 11 , 12 , 23 ]. In our research, readiness to provide DDM, CRD and CSS services strongly and positively correlated with overall service readiness for communicable diseases (HIV, malaria, and TB), signifying a strong complementary between-program effect for these services. Similarly, within-program reciprocal effects were found to be substantial for DDM, CVD and CRD. An integrated people-centred care system has the potential to generate significant health and health system benefits, leading to improvements in access and efficiency of services, reduction in personal and health system costs, and optimisation of often scarce human resources [ 15 , 30 ]. However, despite the benefits, integration needs careful planning as resource scarcity and poor coordination can pose challenges and reverse gains [ 31 , 32 ].

While our study utilized a novel analytical strategy to answer a complex research question, the study has limitations. First, given the cross-sectional nature of the data used for analysis, the findings reported in the study should be viewed as indicators of associations rather than pointing to a direct cause-and-effect relationship. Particularly, the interaction between the readiness of NCD services and the readiness of communicable disease programs such as HIV/AIDS, malaria, and tuberculosis may require alternative study designs. These interactions may be influenced by other factors and require a comprehensive understanding of the dynamics within the health sector. Second, the sampling frame for the SARA survey included health centers and health posts listed in the country’s master health facility list, which might be incomplete and possibly excludes health facilities, especially newer constructions and private facilities [ 9 ].

In addition, the data used for the analysis come from a pre-COVID-19 pandemic period. The pandemic, as in all other countries, disrupted and re-configured healthcare services, as has the ongoing conflict in the country, and these effects are not captured in our analysis. However, the results still fill a significant knowledge gap and serve as a useful benchmark in rebuilding a resilient health system in the country.

If there was any lesson to be learned from COVID-19, it was the realisation that all health systems are fragile and vulnerable to pandemics, and the weakest link mattered most. As established by the present study, NCD service availability and readiness are Ethiopia’s weakest link, and knowing the gaps, as attempted in the present study, helps focus attention on the effort to build better for future similar eventualities. In addition, the approaches we used for analysing the SARA in the present study are novel and can serve as valuable tools for future researchers using SARA surveys to investigate health system readiness in Ethiopia and beyond.

Despite efforts to enhance the diagnosis and management of NCDs in Ethiopia, the availability and readiness of NCD services remain limited. Furthermore, the readiness of NCD services significantly interacts with each other and with the readiness of communicable disease programs such as HIV/AIDS, malaria, and tuberculosis. Strengthening the healthcare system and integrating services across different levels and programs is essential to address the NCD burden. Additionally, addressing geographic and facility-type variations and improving general service readiness, such as infrastructure, essential medicines, and diagnostic and laboratory tests, are important for reducing disparities, achieving universal health coverage, and mitigating inequities in healthcare access.

Availability of data and materials

All relevant data contributing to the findings are within the paper and in Appendices I, II, and III. As secondary data users, we are restricted by data sharing policy and ethical clearance from sharing additional data. All requests for the original data should be directed to the data custodian, the Ethiopian Public Health Institute.

Abbreviations

CVD: Cardiovascular disease

CSS: Cervical cancer screening

CRD: Chronic respiratory disease

DDM: Diabetes diagnosis and/or management

GSR: General Service Readiness

LMICs: Low-middle-income countries

NCD: Non-communicable disease

SARA: Service Availability and Readiness Assessment

WHO: World Health Organization

1. World Health Organization. Noncommunicable diseases. Key Facts. Geneva: World Health Organization; 2021. Available from: https://www.who.int/news-room/fact-sheets/detail/noncommunicable-diseases.

2. World Health Organization. Noncommunicable Diseases Progress Monitor 2020. Geneva. 2020. Licence: CC BY-NC-SA 3.0 IGO.

3. Global Burden of Diseases (GBD) COMPARE. Analyze updated data about the world’s health levels and trends from 1990 to 2019. Interactive tool using estimates from the Global Burden of Disease (GBD) study. 2019. https://vizhub.healthdata.org/gbd-compare/

4. World Health Organization. Integrating health services. Technical series on primary health care. World Health Organization; 2018.

5. Global Burden of Diseases (GBD) COMPARE. 2023. https://vizhub.healthdata.org/gbd-foresight/.

6. Federal Democratic Republic of Ethiopia, Ministry of Health. The Ethiopia Noncommunicable diseases and injuries (NCDI) Commission Report Summary. Addressing the Impact of Noncommunicable Diseases and Injuries in Ethiopia; 2018.

7. Vladislav Dombrovskiy A, Workneh F, Shiferaw R, Small N, Banatvala. Prevention and control of noncommunicable diseases in Ethiopia. The case for investment. In: World Health Organization, UNDP, editors. Addis Ababa; 2019.

8. World Health Organization. Service Availability and Readiness Assessment (SARA). An annual monitoring system for service delivery. Reference Manual. Version 2.2. 2015. https://www.who.int/publications/i/item/WHO-HISHSI-2014.5-Rev.1.

9. Ethiopian Public Health Institute, Ethiopia. Services Availability and Readiness Assessment (SARA). 2018.

10. Tesema AG, Abimbola S, Mulugeta A, Ajisegiri WS, Narasimhan P, Joshi R, et al. Health system capacity and readiness for delivery of integrated non-communicable disease services in primary health care: a qualitative analysis of the Ethiopian experience. PLOS Global Public Health. 2021;1(10):e0000026.

11. Ammoun R, Wami WM, Otieno P, Schultsz C, Kyobutungi C, Asiki G. Readiness of health facilities to deliver non-communicable diseases services in Kenya: a national cross-sectional survey. BMC Health Serv Res. 2022;22(1):985.

12. Ghimire U, Shrestha N, Adhikari B, Mehata S, Pokharel Y, Mishra SR. Health system’s readiness to provide cardiovascular, diabetes and chronic respiratory disease related services in Nepal: analysis using 2015 health facility survey. BMC Public Health. 2020;20(1):1163.

13. Mulugeta TK, Kassa DH. Readiness of the primary health care units and associated factors for the management of hypertension and type II diabetes mellitus in Sidama, Ethiopia. PeerJ. 2022;10:e13797.

14. Defar A, Zeleke GT, Berhanu D, Lemango ET, Bekele A, Alemu K, et al. Health system’s availability and readiness of health facilities for chronic non-communicable diseases: evidence from the Ethiopian national surveys. PLoS ONE. 2024;19(2):e0297622.

15. Azeb Gebresilassie Tesema. Improving the prevention and management of non-communicable diseases through primary health care in Ethiopia [Thesis]. Sydney, Australia: University of New South Wales; 2023.

16. Carlson A, Joshi R. Sample selection in linear panel data models with heterogeneous coefficients. J Appl Econom. 2024;39(2):237–55.

17. Greene WH. Econometric Analysis. 7th ed. Upper Saddle River, NJ: Prentice Hall; 2012.

18. Heckman J. Sample selection bias as a specification error. Econometrica. 1979;47:153–61.

19. Getachew T, Bekele A, Amenu K, Defar A, Teklie H, Taye G, Taddele T, Gonfa G, Getnet M, Gelibo T, Assefa Y. Service availability and readiness for major non-communicable diseases at health facilities in Ethiopia. Ethiop J Health Dev. 2017;31(1):384–90.

20. Bintabara D, Ngajilo D. Readiness of health facilities for the outpatient management of non-communicable diseases in a low-resource setting: an example from a facility-based cross-sectional survey in Tanzania. BMJ Open. 2020;10(11):e040908.

21. Bintabara D, Shayo FK. Disparities in availability of services and prediction of the readiness of primary healthcare to manage diabetes in Tanzania. Prim Care Diabetes. 2021;15(2):365–71.

22. Biswas T, Haider MM, Gupta RD, Uddin J. Assessing the readiness of health facilities for diabetes and cardiovascular services in Bangladesh: a cross-sectional survey. BMJ Open. 2018;8(10):e022817.

23. Moucheraud C. Service readiness for noncommunicable diseases was low in five countries in 2013–15. Health Aff (Millwood). 2018;37(8):1321–30.

24. Orji IA, Baldridge AS, Omitiran K, Guo M, Ajisegiri WS, Ojo TM, et al. Capacity and site readiness for hypertension control program implementation in the Federal Capital Territory of Nigeria: a cross-sectional study. BMC Health Serv Res. 2021;21(1):322.

25. Croke K. The origins of Ethiopia’s primary health care expansion: the politics of state building and health system strengthening. Health Policy Plan. 2020;35(10):1318–27.

26. Assefa Y, Hill PS, Gilks CF, Admassu M, Tesfaye D, Van Damme W. Primary health care contributions to universal health coverage, Ethiopia. Bull World Health Organ. 2020;98(12):894–905A.

27. Federal Democratic Republic of Ethiopia, Ministry of Health. Health sector transformation plan, 2015/16 - 2019/20. 2015. https://faolex.fao.org/docs/pdf/eth208347.pdf.

28. Pamela A, Juma C, Mapa-tassou, Shukri F, Mohamed, Beatrice L, Matanje Mwagomba C, Ndinda M, Oluwasanu, et al. Multi-sectoral action in non-communicable disease prevention policy development in five African countries. BMC Public Health. 2018;18(1):953.

29. Girum T, Mesfin D, Bedewi J, Shewangizaw M. The burden of noncommunicable diseases in Ethiopia, 2000–2016: analysis of evidence from Global Burden of Disease Study 2016 and Global Health Estimates 2016. Int J Chronic Dis. 2020;2020:3679528.

30. World Health Organization. Framework on integrated, people-centred health services: report by the Secretariat. Sixty-Ninth World Health Assembly; 2016.

31. Olukemi Adeyemi M, Lyons T, Njim J, Okebe J, Birungi K, Nana, et al. Integration of non-communicable disease and HIV/AIDS management: a review of healthcare policies and plans in East Africa. BMJ Global Health. 2021;6(5):e004669.

32. Tesema AG, Peiris D, Joshi R, Abimbola S, Fentaye FW, Teklu AM, et al. Exploring complementary and competitive relations between non-communicable disease services and other health extension programme services in Ethiopia: a multilevel analysis. BMJ Global Health. 2022;7(6):e009025.

Acknowledgements

This work benefitted from an incentive grant from the World Health Organization (WHO) as part of the WHO Global Noncommunicable Disease (NCD) Platform for Young Researchers Programme to AGT (Grant Reg 2022/1249356). AGT would like to thank Associate Professor Yohannes Kinfu for guidance on data analysis. All authors would also like to thank the Ethiopian Public Health Institute, Ministry of Health, Ethiopia for providing the survey data.

Funding

AGT received an incentive grant from the World Health Organization (WHO) as part of the WHO Global Noncommunicable Disease (NCD) Platform for Young Researchers Programme.

Author information

Authors and affiliations.

School of Population Health, University of New South Wales, Sydney, Australia

Azeb Gebresilassie Tesema & Rohina Joshi

The George Institute for Global Health, University of New South Wales (UNSW), Sydney, Sydney, Australia

Azeb Gebresilassie Tesema, Seye Abimbola & David Peiris

The George Institute for Global Health, New Delhi, India

Rohina Joshi

School of Public Health, University of Sydney, Sydney, Australia

Seye Abimbola

Ethiopia Public Health Institute, Addis Ababa, Ethiopia

Alemnesh H. Mirkuzie

Global Noncommunicable Diseases Platform, World Health Organization, Geneva, Switzerland

Daria Berlina & Tea Collins

Contributions

AGT, DP, RJ and SA contributed to the conception of the study. AGT and AHM contributed to data curation. AGT conducted formal analysis and visualization. AGT and DP contributed to data interpretation. Writing (original draft preparation): AGT. Writing (review and editing): AGT, DP, RJ, SA, AHM, DB and TC. All authors provided critical intellectual input and revised the final draft manuscript for submission.

Corresponding author

Correspondence to Azeb Gebresilassie Tesema.

Ethics declarations

Ethics approval and consent to participate.

The study has ethics approval from the University of New South Wales (UNSW) Human Research Ethics Committee (HC210066), Sydney, Australia, to conduct a secondary data analysis of the SARA data.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Tesema, A.G., Joshi, R., Abimbola, S. et al. Readiness for non-communicable disease service delivery in Ethiopia: an empirical analysis. BMC Health Serv Res 24, 1021 (2024). https://doi.org/10.1186/s12913-024-11455-5

Received: 03 March 2024

Accepted: 19 August 2024

Published: 04 September 2024

DOI : https://doi.org/10.1186/s12913-024-11455-5

  • Health service readiness
  • Non-communicable diseases

BMC Health Services Research

ISSN: 1472-6963

Dynamic Bayesian networks for spatiotemporal modeling and its uncertainty in tradeoffs and synergies of ecosystem services: a case study in the Tarim River Basin, China

  • ORIGINAL PAPER
  • Published: 02 September 2024

  • Yang Hu 1 , 2 ,
  • Jie Xue 2 , 3 , 4 ,
  • Jianping Zhao 1 ,
  • Xinlong Feng 1 ,
  • Huaiwei Sun 5 ,
  • Junhu Tang 6 &
  • Jingjing Chang 2  

Ecosystem services (ESs) refer to the benefits that humans obtain from ecosystems. These services are subject to environmental changes and human interventions, which introduce a significant level of uncertainty. Traditional ES modeling approaches often employ Bayesian networks, but they fall short in capturing spatiotemporal dynamic change processes. To address this limitation, dynamic Bayesian networks (DBNs) have emerged as stochastic models capable of incorporating uncertainty and capturing dynamic changes. Consequently, DBNs have found increasing application in ES modeling. However, the structure and parameter learning of DBNs present complexities within the field of ES modeling. To mitigate the reliance on expert knowledge, this study proposes an algorithm for structure and parameter learning, integrating the InVEST (Integrated Valuation of Ecosystem Services and Trade-Offs) model with DBNs to develop a comprehensive understanding of the spatiotemporal dynamics and uncertainty of ESs in the Tarim River Basin, China from 2000 to 2020. The study further evaluates the tradeoffs and synergies among four key ecosystem services: water yield, habitat quality, sediment delivery ratio, and carbon storage and sequestration. 
The findings show that (1) the proposed structure learning and parameter learning algorithm for DBNs, including the hill-climb algorithm, linear analysis, the Markov blanket, and the EM algorithm, effectively address subjective factors that can influence model learning when dealing with uncertainty; (2) significant spatial heterogeneity is observed in the supply of ESs within the Tarim River Basin, with notable changes in habitat quality, water yield, and sediment delivery ratios occurring between 2000–2005, 2010–2015, and 2015–2020, respectively; (3) tradeoffs exist between water yield and habitat quality, as well as between soil conservation and carbon sequestration, while synergies are found among habitat quality, soil retention, and carbon sequestration. The land-use type emerges as the most influential factor affecting the tradeoffs and synergies of ESs. This study serves to validate the capacity of DBNs in addressing spatiotemporal dynamic changes and establishes an improved research methodology for ES modeling that considers uncertainty.

Data availability

The datasets used for this study are available from the corresponding author on reasonable request. No datasets were generated or analysed during the current study.

Aguilera PA, Fernández A, Fernández R, Rumí R, Salmerón A (2011) Bayesian networks in environmental modelling. Environ Model Softw 26:1376–1388. https://doi.org/10.1016/j.envsoft.2011.06.004

Barbier EB, Koch EW, Silliman BR et al (2008) Coastal ecosystem-based management with nonlinear ecological functions and values. Science 319(5861):321–323

Bennett EM, Peterson GD, Gordon LJ (2009) Understanding relationships among multiple ecosystem services. Ecol Lett 12(12):1394–1404

Bicking S, Burkhard B, Kruse M, Müller F (2019) Bayesian belief network-based assessment of nutrient regulating ecosystem services in Northern Germany. PLoS ONE 14:e0216053. https://doi.org/10.1371/journal.pone.0216053

Chang JJ, Bai YX, Xue J, Gong L, Zeng FJ, Sun HW, Hu Y, Huang H, Ma YT (2023) Dynamic Bayesian networks with application in environmental modeling and management: a review. Environ Model Softw 170:105835

Costanza R, d’Arge R, De Groot R, Farber S, Grasso M, Hannon B, Limburg K, Naeem S, O’Neill RV, Paruelo J, Raskin RG, Sutton P, Van Den Belt M (1998) The value of the world’s ecosystem services and natural capital. Ecol Econ 25:3–15. https://doi.org/10.1016/S0921-8009(98)00020-2

Costanza R, De Groot R, Sutton P, Van Der Ploeg S, Anderson SJ, Kubiszewski I, Farber S, Turner RK (2014) Changes in the global value of ecosystem services. Glob Environ Change 26:152–158. https://doi.org/10.1016/j.gloenvcha.2014.04.002

Daily GC (ed) (1997) Nature’s services: societal dependence on natural ecosystems. Island Press, Washington, DC

Dang KB, Windhorst W, Burkhard B, Müller F (2019) A Bayesian belief network—based approach to link ecosystem functions with rice provisioning ecosystem services. Ecol Ind 100:30–44. https://doi.org/10.1016/j.ecolind.2018.04.055

Das A, Das M, Houqe R, Pereira P (2023) Mapping ecosystem services for ecological planning and management: a case from a tropical planning region, Eastern India. Environ Sci Pollut Res 30:7543–7560. https://doi.org/10.1007/s11356-022-22732-3

Domínguez-Tejo E, Metternicht G (2019) An ecosystem-based approach and Bayesian modelling to inform coastal planning: a case study of Manly, Australia. Environ Sci Policy 101:72–86. https://doi.org/10.1016/j.envsci.2019.07.019

Feng Z, Jin X, Chen T, Wu J (2021) Understanding trade-offs and synergies of ecosystem services to support the decision-making in the Beijing–Tianjin–Hebei region. Land Use Policy 106:105446

Forio MAE, Villa-Cox G, Van Echelpoel W, Ryckebusch H, Lock K, Spanoghe P, Deknock A, De Troyer N, Nolivos-Alvarez I, Dominguez-Granda L, Speelman S, Goethals PLM (2020) Bayesian Belief Network models as trade-off tools of ecosystem services in the Guayas River Basin in Ecuador. Ecosyst Serv 44:101124. https://doi.org/10.1016/j.ecoser.2020.101124

Fox WE, Medina-Cetina Z, Angerer J, Varela P, Ryang Chung J (2017) Water quality & natural resource management on military training lands in Central Texas: improved decision support via Bayesian Networks. Sustain Water Qual Ecol 9–10:39–52. https://doi.org/10.1016/j.swaqe.2017.03.001

Fu Q, Hou Y, Wang B, Bi X, Li B, Zhang X (2018) Scenario analysis of ecosystem service changes and interactions in a mountain-oasis-desert system: a case study in Altay Prefecture, China. Sci Rep 8:12939. https://doi.org/10.1038/s41598-018-31043-y

Furlan E, Slanzi D, Torresan S, Critto A, Marcomini A (2020) Multi-scenario analysis in the Adriatic Sea: a GIS-based Bayesian network to support maritime spatial planning. Sci Total Environ 703:134972. https://doi.org/10.1016/j.scitotenv.2019.134972

Gao J, Li F, Gao H, Zhou C, Zhang X (2017) The impact of land-use change on water-related ecosystem services: a study of the Guishui River Basin, Beijing, China. J Clean Prod 163:S148–S155. https://doi.org/10.1016/j.jclepro.2016.01.049

Grömping U (2006) Relative importance for linear regression in R: the package relaimpo. J Stat Soft. https://doi.org/10.18637/jss.v017.i01

Han H, Zhang J, Ma G, Zhang X, Bai Y (2018) Advances on impact of climate change on ecosystem services. J Nanjing for Univ (Nat Sci Ed) 61(02):184–190. https://doi.org/10.3969/j.issn.1000-2006.201706007

Hao R, Yu D, Sun Y, Shi M (2019) The features and influential factors of interactions among ecosystem services. Ecol Ind 101:770–779. https://doi.org/10.1016/j.ecolind.2019.01.080

Hernández-Guzmán R, Ruiz-Luna A, González C (2019) Assessing and modeling the impact of land use and changes in land cover related to carbon storage in a western basin in Mexico. Remote Sens Appl Soc Environ 13:318–327. https://doi.org/10.1016/j.rsase.2018.12.005

Hou Y, Chen Y, Ding J, Li Z, Li Y, Sun F (2022) Ecological impacts of land use change in the arid Tarim River Basin of China. Remote Sens 14:1894. https://doi.org/10.3390/rs14081894

Hough RL, Towers W, Aalders I (2010) The risk of peat erosion from climate change: land management combinations—an assessment with Bayesian Belief Networks. Hum Ecol Risk Assess Int J 16:962–976. https://doi.org/10.1080/10807039.2010.511964

Huang H, Xue J, Feng X, Zhao J, Sun H, Hu Y, Ma Y (2024) Thriving arid oasis urban agglomerations: optimizing ecosystem services pattern under future climate change scenarios using dynamic Bayesian network. J Environ Manag 350:119612

Jäger WS, Christie EK, Hanea AM, Den Heijer C, Spencer T (2018) A Bayesian network approach for coastal risk analysis and decision making. Coast Eng 134:48–61. https://doi.org/10.1016/j.coastaleng.2017.05.004

Jia X, Fu B, Feng X, Hou G, Liu Y, Wang X (2014) The tradeoff and synergy between ecosystem services in the Grain-for-Green areas in Northern Shaanxi, China. Ecol Ind 43:103–113. https://doi.org/10.1016/j.ecolind.2014.02.028

Jiang C, Li D, Wang D, Zhang L (2016) Quantification and assessment of changes in ecosystem service in the Three-River Headwaters Region, China as a result of climate variability and land cover change. Ecol Ind 66:199–211. https://doi.org/10.1016/j.ecolind.2016.01.051

Jing L (2021) Research on optimization of spatial pattern of ecosystem service in Qinhuangdao. Hebei Agricultural University. https://doi.org/10.27109/d.cnki.ghbnu.2021.000524

Kragt ME (2009) A beginners guide to Bayesian network modelling for integrated catchment management. Technical Report No. 9. https://api.semanticscholar.org/CorpusID:14470543

Landuyt D, Broekx S, D’hondt R, Engelen G, Aertsens J, Goethals PLM (2013) A review of Bayesian belief networks in ecosystem service modelling. Environ Model Softw 46:1–11. https://doi.org/10.1016/j.envsoft.2013.03.011

Landuyt D, Lemmens P, D’hondt R, Broekx S, Liekens I, De Bie T, Declerck SAJ, De Meester L, Goethals PLM (2014) An ecosystem service approach to support integrated pond management: a case study using Bayesian belief networks—highlighting opportunities and risks. J Environ Manag 145:79–87. https://doi.org/10.1016/j.jenvman.2014.06.015

Landuyt D, Van Der Biest K, Broekx S, Staes J, Meire P, Goethals PLM (2015) A GIS plug-in for Bayesian belief networks: towards a transparent software framework to assess and visualise uncertainties in ecosystem service mapping. Environ Model Softw 71:30–38. https://doi.org/10.1016/j.envsoft.2015.05.002

Landuyt D, Broekx S, Goethals PLM (2016) Bayesian belief networks to analyse trade-offs among ecosystem services at the regional scale. Ecol Ind 71:327–335. https://doi.org/10.1016/j.ecolind.2016.07.015

Lang Y, Song W (2018) Trade-off analysis of ecosystem services in a mountainous karst area, China. Water 10:300. https://doi.org/10.3390/w10030300

Liang J, Li S, Li X, Li X, Liu Q, Meng Q, Lin A, Li J (2021) Trade-off analyses and optimization of water-related ecosystem services (WRESs) based on land use change in a typical agricultural watershed, southern China. J Clean Prod 279:123851. https://doi.org/10.1016/j.jclepro.2020.123851

Liu S, Crossman ND, Nolan M, Ghirmay H (2013) Bringing ecosystem services into integrated water resources management. J Environ Manag 129:92–102. https://doi.org/10.1016/j.jenvman.2013.06.047

Liu L, Feng Q (2015) Advances in research of function and valuation of ecosystem services. Sci Cold Arid Reg 7(2):194–198. https://doi.org/10.3724/SP.J.1226.2015.00194

Liu SY, Hu NK, Zhang J, Lv ZC (2018) Spatiotemporal change of carbon storage in the Loess Plateau of northern Shaanxi, based on the InVEST model. Sci Cold Arid Reg 10(3):240–250. https://doi.org/10.3724/SP.J.1226.2018.00240

Ma YT, Xue J, Feng XL, Zhao JP, Tang JH, Sun HW, Chang JJ, Yan LK (2024) Crop water productivity assessment and planting structure optimization in typical arid irrigation district using dynamic Bayesian network. Sci Rep 14:17695. https://doi.org/10.1038/s41598-024-68523-3

Maes J, Egoh B, Willemen L, Liquete C, Vihervaara P, Schägner JP, Grizzetti B, Drakou EG, Notte AL, Zulian G, Bouraoui F, Luisa Paracchini M, Braat L, Bidoglio G (2012) Mapping ecosystem services for policy support and decision making in the European Union. Ecosyst Serv 1:31–39. https://doi.org/10.1016/j.ecoser.2012.06.004

Marcot BG (2012) Metrics for evaluating performance and uncertainty of Bayesian network models. Ecol Model 230:50–62. https://doi.org/10.1016/j.ecolmodel.2012.01.013

MEA (2005) Ecosystems and human well-being: synthesis. Island Press, Washington, DC

Molina J-L, Pulido-Velázquez D, García-Aróstegui JL, Pulido-Velázquez M (2013) Dynamic Bayesian networks as a decision support tool for assessing climate change impacts on highly stressed groundwater systems. J Hydrol 479:113–129. https://doi.org/10.1016/j.jhydrol.2012.11.038

Nadkarni S, Shenoy PP (2004) A causal mapping approach to constructing Bayesian networks. Decis Support Syst 38:259–281. https://doi.org/10.1016/S0167-9236(03)00095-2

Pham HV, Sperotto A, Torresan S, Acuña V, Jorda-Capdevila D, Rianna G, Marcomini A, Critto A (2019) Coupling scenarios of climate and land-use change with assessments of potential ecosystem services at the river basin scale. Ecosyst Serv 40:101045. https://doi.org/10.1016/j.ecoser.2019.101045

Pham HV, Sperotto A, Furlan E, Torresan S, Marcomini A, Critto A (2021) Integrating Bayesian Networks into ecosystem services assessment to support water management at the river basin scale. Ecosyst Serv 50:101300. https://doi.org/10.1016/j.ecoser.2021.101300

Renard D, Rhemtulla JM, Bennett EM (2015) Historical dynamics in ecosystem service bundles. Proc Natl Acad Sci USA 112:13411–13416. https://doi.org/10.1073/pnas.1502565112

Ronquist F (2004) Bayesian inference of character evolution. Trends Ecol Evol 19:475–481. https://doi.org/10.1016/j.tree.2004.07.002

Scutari M (2017) Understanding Bayesian networks with examples in R. University of Oxford

Scutari M (2010) Learning Bayesian Networks with the bnlearn R package

Scutari M, Graafland CE, Gutiérrez JM (2019) Who learns better Bayesian network structures: accuracy and speed of structure learning algorithms. Int J Approx Reason 115:235–253. https://doi.org/10.1016/j.ijar.2019.10.003

Sharp R, Chaplin-Kramer R, Wood S, Guerry A, Tallis H, Ricketts T, Nelson E, Ennaanay D, Wolny S, Olwero N, Vigerstol K, Pennington D, Mendoza G, Aukema J, Foster J, Forrest J, Cameron DR, Arkema K, Lonsdorf E, Douglass J (2018) InVEST User’s Guide. https://doi.org/10.13140/RG.2.2.32693.78567

Sheikholeslami R, Razavi S (2020) A fresh look at variography: measuring dependence and possible sensitivities across geophysical systems from any given data. Geophys Res Lett 47(20):e2020GL089829

Shen J, Li S, Liang Z, Liu L, Li D, Wu S (2020) Exploring the heterogeneity and nonlinearity of trade-offs and synergies among ecosystem services bundles in the Beijing-Tianjin-Hebei urban agglomeration. Ecosyst Serv 43:101103. https://doi.org/10.1016/j.ecoser.2020.101103

Sperotto A, Molina J-L, Torresan S, Critto A, Marcomini A (2017) Reviewing Bayesian Networks potentials for climate change impacts assessment and management: a multi-risk perspective. J Environ Manag 202:320–331. https://doi.org/10.1016/j.jenvman.2017.07.044

Sun Z, Müller D (2013) A framework for modeling payments for ecosystem services with agent-based models, Bayesian belief networks and opinion dynamics models. Environ Model Softw 45:15–28. https://doi.org/10.1016/j.envsoft.2012.06.007

Sun F, Wang Y, Chen Y, Li Y, Zhang Q, Qin J, Kayumba PM (2021) Historic and simulated desert-oasis ecotone changes in the arid Tarim River Basin, China. Remote Sens 13:647. https://doi.org/10.3390/rs13040647

Tolessa T, Senbeta F, Kidane M (2017) The impact of land use/land cover change on ecosystem services in the central highlands of Ethiopia. Ecosyst Serv 23:47–54. https://doi.org/10.1016/j.ecoser.2016.11.010

Vallet A, Locatelli B, Levrel H, Wunder S, Seppelt R, Scholes RJ, Oszwald J (2018) Relationships between ecosystem services: comparing methods for assessing tradeoffs and synergies. Ecol Econ 150:96–106. https://doi.org/10.1016/j.ecolecon.2018.04.002

Van Jaarsveld AS, Biggs R, Scholes RJ (2005) Measuring conditions and trends in ecosystem services at multiple scales: the Southern African Millennium Ecosystem Assessment (SAfMA) experience. Philos Trans R Soc Lond B Biol Sci 360(1454):425–441

Voinov A, Bousquet F (2010) Modelling with stakeholders. Environ Model Softw 25:1268–1281. https://doi.org/10.1016/j.envsoft.2010.03.007

Wang Y, Dai E (2020) Spatial-temporal changes in ecosystem services and the trade-off relationship in mountain regions: a case study of Hengduan Mountain region in Southwest China. J Clean Prod 264:121573. https://doi.org/10.1016/j.jclepro.2020.121573

Wang C, Zhan J, Chu X, Liu W, Zhang F (2019) Variation in ecosystem services with rapid urbanization: a study of carbon sequestration in the Beijing–Tianjin–Hebei region, China. Phys Chem Earth Parts A/B/C 110:195–202. https://doi.org/10.1016/j.pce.2018.09.001

Wu L (2018) Tarim River Basin boundary dataset. National Glacier and Desert Science Data Center (www.ncdc.ac.cn). https://cstr.cn/CSTR:11738.11.ncdc.Westdc.2020.338

Xue J, Gui D, Zhao Y, Lei J, Zeng F, Feng X, Mao D, Shareef M (2016) A decision-making framework to model environmental flow requirements in oasis areas using Bayesian networks. J Hydrol 540:1209–1222

Xue J, Gui D, Lei J, Zeng F, Mao D, Zhang Z (2017) Model development of a participatory Bayesian network for coupling ecosystem services into integrated water resources management. J Hydrol 554:50–65

Xue J, Lei JQ, Chang JJ, Zeng FJ, Zhang ZW, Sun HW (2022) A causal structure-based multiple-criteria decision framework for evaluating the water-related ecosystem service tradeoffs in a desert oasis region. J Hydrol Reg Stud 44:101226

Yang J, Huang X (2021) The 30 m annual land cover dataset and its dynamics in China from 1990 to 2019. Earth Syst Sci Data 13:3907–3925. https://doi.org/10.5194/essd-13-3907-2021

Yang S, Zhao W, Liu Y, Wang S, Wang J, Zhai R (2018) Influence of land use change on the ecosystem service trade-offs in the ecological restoration area: dynamics and scenarios in the Yanhe watershed, China. Sci Total Environ 644:556–566. https://doi.org/10.1016/j.scitotenv.2018.06.348

YuLiZhou YJZ, Tang C (2022) Spatial pattern optimization of ecosystem services based on Bayesian networks: a case of the Jing River Basin. Arid Land Geogr 45(4):1268–1280

Zeng L, Li J, Li T, Yang XN, Wang YZ (2018) Optimizing spatial patterns of water conservation ecosystem service based on Bayesian belief networks. Acta Geogr Sin 73(9):1809–1822. https://doi.org/10.11821/dlxb201809015

Acknowledgements

This work was financially supported by the National Natural Science Foundation of China (42071259), the Tianshan Talents Program of Xinjiang Uygur Autonomous Region (2022TSYCJU0002), the original innovation project of the basic frontier scientific research program, Chinese Academy of Sciences (ZDBS-LY-DQC031), the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01E01), the water system evolution and risk assessment in arid regions for original innovation project of institute (2023–2025), and the Outstanding Member of the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2019430) (2024–2026). We are also grateful to three anonymous referees for their constructive comments on this manuscript.

This work was supported by the National Natural Science Foundation of China (Grant number: 42071259).

Author information

Authors and affiliations.

College of Mathematics and System Science, Xinjiang University, Urumqi, 830046, China

Yang Hu, Jianping Zhao & Xinlong Feng

State Key Laboratory of Desert and Oasis Ecology, Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, 830011, Xinjiang, China

Yang Hu, Jie Xue & Jingjing Chang

Cele National Station of Observation and Research for Desert-Grassland Ecosystems, Cele, 848300, Xinjiang, China

University of Chinese Academy of Sciences, Beijing, 100049, China

School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China

Huaiwei Sun

College of Ecology and Environment, Xinjiang University, Urumqi, 830046, China

Contributions

Yang Hu: conceptualization, methodology, software, validation, formal analysis, writing—original draft. Jie Xue, Jianping Zhao, Xinlong Feng, and Huaiwei Sun: conceptualization, methodology, supervision, writing—review & editing. Junhu Tang and Jingjing Chang: data curation, visualization.

Corresponding authors

Correspondence to Jie Xue or Jianping Zhao.

Ethics declarations

Conflict of interest.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 2561 KB)

Rights and permissions.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Hu, Y., Xue, J., Zhao, J. et al. Dynamic Bayesian networks for spatiotemporal modeling and its uncertainty in tradeoffs and synergies of ecosystem services: a case study in the Tarim River Basin, China. Stoch Environ Res Risk Assess (2024). https://doi.org/10.1007/s00477-024-02805-0

Accepted: 19 August 2024

Published: 02 September 2024

DOI: https://doi.org/10.1007/s00477-024-02805-0

  • Ecosystem services
  • Dynamic Bayesian networks
  • Hill-climb algorithm
  • Markov blanket
  • EM algorithm

HBR’s Most-Read Articles of 2024 (So Far)

  • Kelsey Hansen

The five stories that have resonated most with our readers this year.

HBR’s top five most popular articles of 2024 (so far) present an opportunity to reflect on the work you’ve done in the preceding months and chart any necessary course changes. The list includes a case study of how Starbucks lost its way (and how it could pivot); a guide to shifting your leadership style based on the situation; and a playbook for assessing the quality of the questions you ask at work.

The waning days of summer present a prime opportunity to step back and reflect on the paths you’ve taken so far this year, whether they’re personal or professional, and ask yourself: Am I growing in the right direction? What are my blind spots? Where could I be doing better?

  • Kelsey Hansen is the senior associate editor for audience engagement at Harvard Business Review.

COMMENTS

  1. 641 Quiz 3 Flashcards

    Study with Quizlet and memorize flashcards containing terms like The credibility of the findings of a research study is primarily determined by which of the following? a. The significance of the statistical results b. The clarity of the study purposes or questions c. The sample size and appropriateness of the statistics used d. The appropriateness and soundness of the study methods ...

  2. In brief: How is the quality of studies assessed?

    So it is important to critically evaluate every study. This can be done in a systematic review that analyzes all the available studies on a specific medical issue. In order to assess whether the results of a study are reliable, you first have to find out why the study was done in the first place and which questions it tried to answer. This may ...

  3. Levels of Evidence, Quality Assessment, and Risk of Bias: Evaluating

    The initial methods guidelines, published in 1997, recommended that a quality assessment be performed on each included study, with each item in the quality assessment tool scored based on whether the authors reported their use. Updated methods guidelines were published in 2003. The framework for levels of evidence in this guidance was ...

  4. A Review of the Quality Indicators of Rigor in Qualitative Research

    Quality criteria for assessing a study's problem statement, conceptual framework, and research question include the following: introduction builds a logical case and provides context for the problem statement; problem statement is clear and well-articulated; conceptual framework is explicit and justified; research purpose and/or question is ...

  5. Systematic Reviews: Step 6: Assess Quality of Included Studies

    Quality Assessment tools are questionnaires created to help you assess the quality of a variety of study designs. Depending on the types of studies you are analyzing, the questionnaire will be tailored to ask specific questions about the methodology of the study. There are appraisal tools for most kinds of study designs.

  6. Criteria for Good Qualitative Research: A Comprehensive Review

    Fundamental Criteria: General Research Quality. Various researchers have put forward criteria for evaluating qualitative research, which have been summarized in Table 3. Also, the criteria outlined in Table 4 effectively deliver the various approaches to evaluate and assess the quality of qualitative work. The entries in Table 4 are based on Tracy's "Eight big‐tent criteria for excellent ...

  7. Research quality: What it is, and how to achieve it

    2) Initiating research stream: The researcher(s) must be able to assemble a research team that can achieve the identified research potential. The team should be motivated to identify research opportunities and insights, as well as to produce top-quality articles, which can reach the highest-level journals.

  8. (PDF) What Is Quality in Research? Building a Framework of Design

    A literature-derived framework of research quality attributes is, thus, obtained, which is subject to an expert feedback process, involving scholars and practitioners in the fields of research ...

  9. What Are the Standards for Quality Research?

    In this manner, standards for quality research, whether primarily designed to gather quantitative or qualitative data, typically emphasize the traits of objectivity, internal validity, external validity, reliability, rigor, open-mindedness, and honest and thorough reporting (Ragin et al., July 2003; Shavelson & Towne, 2002; Wooding & Grant, 2003).

  10. Assessing Research Quality

    Assessing Research Quality. Using research is one method to identify effective strategies to help inform practice and policymaking decisions. Effective use of research in decision-making can help agencies and organizations: Early care and education leaders are responsible for developing policy and making practice decisions that impact providers ...

  11. Evaluating research: A multidisciplinary approach to assessing research

    Our intention in this study has been to formulate a framework for the assessment of the quality of research practice. We argue that this is a useful approach for discussing research practice and its quality from many perspectives, and can help to advance discussions on research quality within and across disciplines. ... primarily for legitimacy ...

  12. Defining and assessing research quality in a transdisciplinary context

    2.2 Search terms. Search terms were designed to identify publications that discuss the evaluation or assessment of quality or excellence of research that is done in a TDR context. Search terms are listed online in Supplementary Appendices 2 and 3. The search strategy favored sensitivity over specificity to ensure that we captured the relevant information.

  13. What makes a good-quality research study?

    When trying to judge a study, you are ultimately trying to assess whether it has been well designed, well conducted and well reported. It should be based on a clear research question, and it should use a sound research methodology, with limitations fully acknowledged. The data should be collected and analysed carefully, and all the analyses ...

  14. How to assess the quality of research?

    Firstly, the research must be peer reviewed, as evaluation by experienced researchers in the field ensures that no poor-quality, fictitious, or plagiarized data is published. Also, the aim of the research must be clearly defined. Further, you must find out if the methods used in the study are appropriate for the research topic or question.

  15. How to Assess Quality of Primary Research Studies in the Medical

    In an accuracy study, patients with a clinical suspicion of disease undergo both the new test that is being evaluated and the reference test or "gold-standard" test for the disease. ... The question of how to assess quality of primary diagnostic research studies in the medical literature has wide applicability across all clinical domains ...

  16. PDF Chapter 12: Doing Small-scale Exploratory Research Projects

    Research Projects. 1. The quality of a research study is primarily assessed on: (a) the place of publication; (b) the ways in which the recommendations are implemented; (c) the rigour with which it was conducted; (d) the number of times it is replicated. 2. Which of the following is not an appropriate source for academic research? (a) An online encyclopaedia

  17. Assessing Quality in Systematic Literature Reviews: A Study of Novice

    Given that assessing study quality is a core principle of systematic reviews (Petticrew, 2015), checklists and/or rating scales such as the MQQ are useful for the following: (a) diagnosing and assessing potential bias in the original study and (b) minimizing systematic reviewer bias during the coding and rating processes. The first item relates to the internal validity of the appraisal tool itself.

  18. Assessing the quality of research

    Systematic reviews of research are always preferred. Level alone should not be used to grade evidence. Other design elements, such as the validity of measurements and blinding of outcome assessments. Quality of the conduct of the study, such as loss to follow up and success of blinding.

  19. The Quality of A Research Study Is Primarily Assessed On ...

    The document discusses how the quality of a research study is primarily assessed based on the rigor with which it was conducted, rather than factors like where it was published, how recommendations are implemented, or how many times it is replicated. It provides examples of multiple choice questions about research methodology, with the correct answers being things like the chain of association ...

  20. Using a Quality Management System and Risk-based Approach in ...

    A risk-based approach is a common quality management system used in interventional studies. We used a quality management system and risk-based approach in an observational study on a designated intractable disease. Our multidisciplinary team assessed the risks of the real-world data study comprehensively and systematically.

  21. Assessing the quality of research

    Systematic reviews of research are always preferred. With rare exceptions, no study, whatever the type, should be interpreted in isolation. Systematic reviews are required of the best available type of study for answering the clinical question posed.6 A systematic review does not necessarily involve quantitative pooling in a meta-analysis. Although case reports are a less than perfect source ...

  22. Knowledge mapping and evolution of research on older adults ...

    This study conducted a systematic review of the literature on older adults' technology acceptance over the past decade through bibliometric analysis, focusing on the distribution power, research ...

  23. Evaluating coupling coordination between urban smart ...

    This study employs mixed-method research, combining qualitative and quantitative analyses, to investigate the coupling coordination between urban smart performance (SCP) and low-carbon level (LCL ...

  24. A scoping review of early childhood caries, poverty and the first

    Poverty is a well-known risk factor for poor health. This scoping review (ScR) mapped research linking early childhood caries (ECC) and poverty using the targets and indicators of the Sustainable Development Goal 1 (SDG1). We searched PubMed, Web of Science, and Scopus in December 2023 using search terms derived from SDG1. Studies were included if they addressed clinically assessed or reported ...

  25. Effect of Indoor Air Quality on Respiratory Health of ...

    Indoor air quality (IAQ) in classrooms is a crucial factor in the growing health of children as they spend significant amounts of time in school. The present work examines indoor air quality in school classrooms, the relationship between indoor air and outdoor air, and a possible risk to children's learning. Four IAQ parameters, particulate matter (PM2.5 and PM10), carbon dioxide (CO2 ...

  26. Effect of Land Marketization on the High-Quality Development of ...

    This study developed a theoretical framework on the relationship between land marketization and industrial high-quality development (HQD) to guide the formulation of policies for advancing new industrialization and high-level manufacturing capabilities. An evaluation system was constructed that can assess regional industrial HQD in seven dimensions: innovation, efficiency, structural ...

  27. Learning effect of online versus onsite education in health and medical

    The disruption of health and medical education by the COVID-19 pandemic made educators question the effect of online setting on students' learning, motivation, self-efficacy and preference. In light of the health care staff shortage online scalable education seemed relevant. Reviews on the effect of online medical education called for high quality RCTs, which are increasingly relevant with ...

  28. Readiness for non-communicable disease service delivery in Ethiopia: an

    Ethiopia's health system is overwhelmed by the growing burden of non-communicable diseases (NCDs). In this study, we assessed the availability of and readiness for NCD services and the interaction of NCD services with other essential and non-NCD services. The analysis focused on four main NCD services: diabetes mellitus, cardiovascular diseases, chronic respiratory diseases, and cancer ...

  29. Dynamic Bayesian networks for spatiotemporal modeling and ...

    2.1 Study area. The Tarim River Basin (Wu 2018) is an inland river basin that encompasses a significant portion of the Tarim Basin in southern Xinjiang (Fig. 1). Characterized by an arid continental climate and surrounded by high mountains, the basin's key monitoring area spans 538,200 km². It is primarily fed by four main surface water sources: the Hetian River, Yerqiang River, Aksu River, and ...

  30. HBR's Most-Read Articles of 2024 (So Far)

    The list includes a case study of how Starbucks lost its way (and how it could pivot); a guide to how to shift your leadership style based on situation; and a playbook for assessing the quality of ...