
Data Analysis in Research: Types & Methods

What is data analysis in research?

Definition of data analysis in research: According to LeCompte and Schensul, research data analysis is a process researchers use to reduce data to a story and interpret it to derive insights. The process reduces a large chunk of data into smaller fragments that make sense.

Three essential things occur during the data analysis process. The first is data organization. The second is data reduction through summarization and categorization, which helps find patterns and themes in the data for easy identification and linking. The third is the analysis itself, which researchers carry out in both top-down and bottom-up fashion.


On the other hand, Marshall and Rossman describe data analysis as a messy, ambiguous, and time-consuming but creative and fascinating process through which a mass of collected data is brought to order, structure and meaning.

We can say that “the data analysis and data interpretation is a process representing the application of deductive and inductive logic to the research and data analysis.”

Why analyze data in research?

Researchers rely heavily on data because they have a story to tell or research problems to solve. Research starts with a question, and data is nothing but an answer to that question. But what if there is no question to ask? It is still possible to explore data without a defined problem. This is called ‘data mining’, and it often reveals interesting patterns within the data that are worth exploring.

Irrespective of the type of data researchers explore, their mission and their audience’s vision guide them toward the patterns that shape the story they want to tell. One of the essential things expected from researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Remember that data analysis sometimes tells the most unforeseen yet exciting stories, ones that were not expected when the analysis began. Therefore, rely on the data you have at hand and enjoy the journey of exploratory research.


Types of data in research

Every kind of data describes something once a specific value has been assigned to it. For analysis, you need to organize these values, then process and present them in a given context, to make them useful. Data can take different forms; here are the primary data types.

  • Qualitative data: When the data presented consists of words and descriptions, we call it qualitative data. Although you can observe this data, it is subjective and harder to analyze in research, especially for comparison. Example: anything describing taste, experience, texture, or an opinion is considered qualitative data. This type of data is usually collected through focus groups, personal qualitative interviews, qualitative observation, or open-ended questions in surveys.
  • Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data. This type of data can be distinguished into categories, grouped, measured, calculated, or ranked. Example: answers to questions about age, rank, cost, length, weight, scores, and so on all come under this type of data. You can present such data in graphs and charts, or apply statistical analysis methods to it. Outcomes Measurement Systems (OMS) questionnaires in surveys are a significant source of numeric data.
  • Categorical data: This is data presented in groups. An item included in categorical data cannot belong to more than one group. Example: a person responding to a survey by describing their lifestyle, marital status, smoking habit, or drinking habit provides categorical data. A chi-square test is a standard method used to analyze this data.
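
As a rough illustration of the chi-square test mentioned for categorical data, the sketch below computes the chi-square statistic for a small two-way table. The counts (marital status by smoking habit) are made up for illustration, and `chi_square_statistic` is a hypothetical helper written for this example, not a function from any survey tool.

```python
# Hypothetical counts: rows = marital status, columns = smoking habit.
observed = {
    "married": {"smoker": 20, "non-smoker": 80},
    "single": {"smoker": 30, "non-smoker": 70},
}


def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a two-way table of counts."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())

    statistic = 0.0
    for r in rows:
        for c in cols:
            # Expected count if the two variables were independent.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (table[r][c] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return statistic, dof


statistic, dof = chi_square_statistic(observed)
print(f"chi-square = {statistic:.3f}, degrees of freedom = {dof}")
```

A larger statistic means the observed counts depart further from what independence between the two variables would predict. To obtain a p-value, the statistic is compared against a chi-square distribution with the given degrees of freedom.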


Data analysis in qualitative research

Data analysis in qualitative research works a little differently from the analysis of numerical data, because qualitative data is made up of words, descriptions, images, objects, and sometimes symbols. Getting insight from such complicated information is a complicated process, so it is typically used for exploratory research and data analysis.

Finding patterns in the qualitative data

Although there are several ways to find patterns in textual information, the word-based method is the most relied-upon and widely used technique for research and data analysis. Notably, the data analysis process in qualitative research is manual. Researchers usually read the available data and look for repetitive or commonly used words.

For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find that “food” and “hunger” are the most commonly used words and highlight them for further analysis.
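
The word-based method is described above as manual, but the counting step can be sketched in a few lines. The responses and the stop-word list below are invented for illustration, and `word_frequencies` is a hypothetical helper; a real project would use a fuller stop-word list and still rely on a human reader to interpret the results.

```python
import re
from collections import Counter

# Hypothetical open-ended survey responses.
responses = [
    "Food prices keep rising and hunger is common",
    "We need food and clean water",
    "Hunger affects the children most",
]

# A tiny, made-up stop-word list; real projects use a fuller one.
STOP_WORDS = {"and", "is", "the", "we", "keep", "most"}


def word_frequencies(texts):
    """Count lower-cased words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


frequencies = word_frequencies(responses)
# The most frequent words are candidates for closer reading.
print(frequencies.most_common(2))
```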


The keyword-in-context technique is another widely used word-based technique. In this method, the researcher tries to understand a concept by analyzing the context in which participants use a particular keyword.

For example, researchers studying the concept of ‘diabetes’ amongst respondents might analyze when and how each respondent has used or referred to the word ‘diabetes.’
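
A minimal sketch of the keyword-in-context idea follows, assuming a plain-text transcript. The interview excerpt is invented, and `keyword_in_context` with its `window` parameter is a hypothetical helper that simply lists the words surrounding each occurrence so a researcher can read them side by side.

```python
def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of `keyword` with `window` words on either side."""
    words = text.split()
    hits = []
    for i, word in enumerate(words):
        # Strip surrounding punctuation before comparing.
        if word.strip(".,;:!?\"'").lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits


# Hypothetical interview excerpt.
transcript = (
    "My father had diabetes for years. I worry that diabetes "
    "runs in the family, so I watch what I eat."
)

contexts = keyword_in_context(transcript, "diabetes")
for left, word, right in contexts:
    print(f"... {left} [{word}] {right} ...")
```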

The scrutiny-based technique is also one of the highly recommended text analysis methods used to identify patterns in qualitative data. Compare and contrast is the most widely used method under this technique; it examines how pieces of text are similar to or different from each other.

For example: to find out the “importance of a resident doctor in a company,” the collected data is divided into people who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare and contrast is well suited to analyzing polls that use single-answer question types.

Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.

Variable Partitioning is another technique used to split variables so that researchers can find more coherent descriptions and explanations from the enormous data.


Methods used for data analysis in qualitative research

There are several techniques for analyzing data in qualitative research. Here are some commonly used methods:

  • Content Analysis: This is widely accepted and the most frequently employed technique for data analysis in research methodology. It can be used to analyze documented information from text, images, and sometimes physical items. When and where to use this method depends on the research questions.
  • Narrative Analysis: This method is used to analyze content gathered from various sources such as personal interviews, field observation, and surveys. Most of the time, it focuses on the stories or opinions people share in order to find answers to the research questions.
  • Discourse Analysis: Similar to narrative analysis, discourse analysis is used to analyze interactions with people. However, this method considers the social context in which the communication between the researcher and respondent takes place. In addition, discourse analysis takes the respondent's lifestyle and day-to-day environment into account when deriving any conclusion.
  • Grounded Theory: When you want to explain why a particular phenomenon happened, grounded theory is a good choice for analyzing qualitative data. It is applied to study data about a host of similar cases occurring in different settings. Researchers using this method might alter explanations or produce new ones until they arrive at a conclusion.


Data analysis in quantitative research

Preparing data for analysis

The first stage in research and data analysis is to prepare the data for analysis so that the raw data can be converted into something meaningful. Data preparation consists of the phases below.

Phase I: Data Validation

Data validation is done to check whether the collected data sample meets the pre-set standards or is a biased sample. It is divided into four stages:

  • Fraud: To ensure that an actual human being recorded each response to the survey or questionnaire.
  • Screening: To make sure each participant or respondent was selected in compliance with the research criteria.
  • Procedure: To ensure ethical standards were maintained while collecting the data sample.
  • Completeness: To ensure that the respondent answered all the questions in an online survey, or that the interviewer asked all the questions devised in the questionnaire.
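
Two of the four stages above, screening and completeness, lend themselves to simple automated checks. The sketch below assumes a made-up survey with three required questions and a minimum-age screening rule; the field names, the rule, and the `validate` helper are all hypothetical. Fraud and procedure checks usually need information beyond the response itself and are not covered here.

```python
# Hypothetical survey: three required questions and a screening rule
# that respondents must be at least 18 years old.
REQUIRED_QUESTIONS = ["age", "city", "satisfaction"]
MINIMUM_AGE = 18

responses = [
    {"id": 1, "age": 34, "city": "Austin", "satisfaction": 4},
    {"id": 2, "age": 16, "city": "Dallas", "satisfaction": 5},
    {"id": 3, "age": 41, "city": "", "satisfaction": None},
]


def validate(response):
    """Return a list of validation problems for one response."""
    problems = []
    # Completeness: every required question must have an answer.
    for question in REQUIRED_QUESTIONS:
        if response.get(question) in (None, ""):
            problems.append(f"missing answer: {question}")
    # Screening: respondent must meet the research criteria.
    age = response.get("age")
    if isinstance(age, (int, float)) and age < MINIMUM_AGE:
        problems.append("fails screening: under minimum age")
    return problems


report = {r["id"]: validate(r) for r in responses}
valid_ids = [rid for rid, problems in report.items() if not problems]
print(report)
```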

Phase II: Data Editing

An extensive research data sample often comes loaded with errors. Respondents sometimes fill in fields incorrectly or skip them accidentally. Data editing is the process in which researchers confirm that the provided data is free of such errors. They need to conduct the necessary checks, including outlier checks, to edit the raw data and make it ready for analysis.

Phase III: Data Coding

Out of all three, this is the most critical phase of data preparation. It involves grouping survey responses and assigning values to them. For example, if a survey is completed with a sample size of 1,000, the researcher might create age brackets to distinguish the respondents based on their age. It is easier to analyze small data buckets than to deal with one massive data pile.
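
The age-bracket example can be sketched as follows. The bracket boundaries, the sample ages, and the `code_age` helper are assumptions made for illustration; the right brackets depend on the research design.

```python
from collections import Counter

# Hypothetical brackets: (lower bound, upper bound, label), both bounds inclusive.
AGE_BRACKETS = [
    (18, 24, "18-24"),
    (25, 34, "25-34"),
    (35, 49, "35-49"),
    (50, 120, "50+"),
]


def code_age(age):
    """Map a raw age to its bracket label, or 'out of range'."""
    for low, high, label in AGE_BRACKETS:
        if low <= age <= high:
            return label
    return "out of range"


ages = [19, 23, 31, 35, 47, 52, 68, 29]
coded = [code_age(a) for a in ages]

# Size of each data bucket after coding.
bucket_sizes = Counter(coded)
print(bucket_sizes)
```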


Methods used for data analysis in quantitative research

After the data is prepared for analysis, researchers can use different research and data analysis methods to derive meaningful insights. Statistical analysis is the most favored approach for numerical data. In statistical analysis, distinguishing between categorical data and numerical data is essential: categorical data involves distinct categories or labels, while numerical data consists of measurable quantities. Statistical methods are classified into two groups: ‘descriptive statistics’, used to describe data, and ‘inferential statistics’, which help in comparing the data.

Descriptive statistics

This method is used to describe the basic features of various types of data in research. It presents the data in a meaningful way so that patterns in the data start making sense. However, descriptive analysis does not support conclusions beyond the data at hand; any conclusions still rest on the hypotheses researchers have formulated so far. Here are a few major types of descriptive analysis methods.

Measures of Frequency

  • Count, Percent, Frequency
  • It is used to denote how often a particular event occurs.
  • Researchers use it when they want to showcase how often a response is given.
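
A small sketch of count and percent for a single-choice question, using invented answers:

```python
from collections import Counter

# Hypothetical answers to a single-choice question.
answers = ["yes", "no", "yes", "yes", "undecided", "no", "yes", "yes"]

counts = Counter(answers)
total = len(answers)

# Frequency table: count and percent for each response option.
frequency_table = {
    option: {"count": n, "percent": 100 * n / total}
    for option, n in counts.most_common()
}

for option, row in frequency_table.items():
    print(f"{option:<10} {row['count']:>3} {row['percent']:>6.1f}%")
```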

Measures of Central Tendency

  • Mean, Median, Mode
  • These measures are widely used to summarize a distribution with a single representative point.
  • Researchers use this method when they want to showcase the most common or the average response.
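
The three measures can be computed with Python's standard `statistics` module. The ratings below are invented for illustration.

```python
import statistics

# Hypothetical ratings on a 1-5 scale.
ratings = [4, 5, 3, 4, 2, 4, 5, 1, 4, 3]

mean_rating = statistics.mean(ratings)      # arithmetic average
median_rating = statistics.median(ratings)  # middle value when sorted
mode_rating = statistics.mode(ratings)      # most frequent value

print(mean_rating, median_rating, mode_rating)
```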

Measures of Dispersion or Variation

  • Range, Variance, Standard deviation
  • Range = the difference between the highest and lowest points.
  • Variance and standard deviation describe how far the observed scores differ from the mean.
  • These measures are used to identify the spread of scores by stating intervals.
  • Researchers use this method to show how spread out the data is. It helps them identify the extent to which the data is spread out and whether that spread directly affects the mean.
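
A brief sketch of the three dispersion measures on invented scores. It uses the population forms (`pvariance`, `pstdev`), which treat the scores as the whole group of interest; when the scores are a sample from a larger population, `statistics.variance` and `statistics.stdev` are the usual choice.

```python
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: difference between the highest and lowest points.
score_range = max(scores) - min(scores)
# Population variance: mean of squared deviations from the mean.
variance = statistics.pvariance(scores)
# Standard deviation: square root of the variance.
std_dev = statistics.pstdev(scores)

print(score_range, variance, std_dev)
```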

Measures of Position

  • Percentile ranks, Quartile ranks
  • It relies on standardized scores helping researchers to identify the relationship between different scores.
  • It is often used when researchers want to compare scores with the average count.
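
The sketch below shows one way to compute a percentile rank and quartile cut points. Several definitions of percentile rank are in use; this one counts scores at or below the value, which is an assumption of the example. The scores and the `percentile_rank` helper are hypothetical.

```python
import statistics


def percentile_rank(scores, value):
    """Percentage of scores that are less than or equal to `value`."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)


# Hypothetical test scores.
scores = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]

rank_of_80 = percentile_rank(scores, 80)
# Three cut points that split the data into four equal-sized groups.
quartiles = statistics.quantiles(scores, n=4)

print(rank_of_80, quartiles)
```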

In quantitative research, descriptive analysis often gives absolute numbers, but those numbers alone are never sufficient to demonstrate the rationale behind them. It is therefore necessary to think about which method of research and data analysis best suits your survey questionnaire and the story researchers want to tell. For example, the mean is the best way to demonstrate students’ average scores in schools. It is better to rely on descriptive statistics when researchers intend to keep the research or outcome limited to the provided sample without generalizing it. For example, when you want to compare the average votes cast in two different cities, descriptive statistics are enough.

Descriptive analysis is also called a ‘univariate analysis’ since it is commonly used to analyze a single variable.

Inferential statistics

Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample collected from that population. For example, you can ask about 100 audience members at a movie theater whether they like the movie they are watching. Researchers then use inferential statistics on the collected sample to reason that about 80-90% of people like the movie.

Here are two significant areas of inferential statistics.

  • Estimating parameters: It takes statistics from the sample research data and uses them to say something about the population parameter.
  • Hypothesis test: It’s about sampling research data to answer the survey research questions. For example, researchers might want to understand whether a recently launched shade of lipstick is well received, or whether multivitamin capsules help children perform better at games.
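
To make parameter estimation concrete, the sketch below builds a confidence interval for a population proportion from the movie-theater example. The figure of 85 out of 100 is invented, `proportion_confidence_interval` is a hypothetical helper, and the normal approximation it uses is only one of several interval methods.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a population proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 85 of 100 moviegoers say they like the movie.
low, high = proportion_confidence_interval(85, 100)
print(f"Estimated share who like the movie: {low:.1%} to {high:.1%}")
```

With these made-up numbers the interval runs from about 78% to about 92%, which is the kind of reasoning behind a statement such as "about 80-90% of people like the movie."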

These are sophisticated analysis methods used to showcase the relationship between different variables instead of describing a single variable. They are often used when researchers want something beyond absolute numbers to understand the relationship between variables.

Here are some of the commonly used methods for data analysis in research.

  • Correlation: When researchers are not conducting experimental or quasi-experimental research but want to understand the relationship between two or more variables, they opt for correlational research methods.
  • Cross-tabulation: Also called contingency tables, cross-tabulation is used to analyze the relationship between multiple variables. Suppose the provided data has age and gender categories presented in rows and columns. A two-dimensional cross-tabulation supports data analysis and research by showing the number of males and females in each age category.
  • Regression analysis: To understand the relationship between variables, researchers commonly turn to regression analysis, which is also a type of predictive analysis. In this method, you have an essential factor called the dependent variable and one or more independent variables. You try to find out the impact of the independent variables on the dependent variable. The values of both independent and dependent variables are assumed to have been ascertained in an error-free, random manner.
  • Frequency tables: A frequency table lists each value or category of a variable together with the number of times it occurs, showing how responses are distributed.
  • Analysis of variance: This statistical procedure is used for testing the degree to which two or more groups vary or differ in an experiment. A considerable degree of variation means the research findings were significant. In many contexts, the terms ANOVA testing and variance analysis are used interchangeably.
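
Three of the methods listed above (cross-tabulation, correlation, and regression) can be sketched with the standard library alone. The respondent records, the hours-versus-score data, and the helper names `pearson_correlation` and `simple_linear_regression` are assumptions made for illustration. The regression shown is the simplest case, with a single independent variable.

```python
import math
from collections import Counter

# Hypothetical respondents: (age group, gender).
respondents = [
    ("18-34", "female"), ("18-34", "male"), ("18-34", "female"),
    ("35-54", "male"), ("35-54", "male"), ("55+", "female"),
]
# Cross-tabulation: count of respondents in each (age group, gender) cell.
crosstab = Counter(respondents)


def pearson_correlation(xs, ys):
    """Pearson's r: strength of the linear relationship, from -1 to 1."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def simple_linear_regression(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


# Hypothetical data: hours of training (independent) vs. test score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pearson_correlation(hours, scores)
slope, intercept = simple_linear_regression(hours, scores)
print(f"r = {r:.3f}, score = {intercept:.1f} + {slope:.1f} * hours")
```
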

Considerations in research data analysis

  • Researchers must have the necessary research skills to analyze and manipulate the data, and should be trained to demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
  • Research and data analytics projects usually differ by scientific discipline; therefore, getting statistical advice at the beginning of analysis helps in designing a survey questionnaire, selecting data collection methods, and choosing samples.


  • The primary aim of data research and analysis is to derive insights that are unbiased. Any mistake or bias in collecting data, selecting an analysis method, or choosing the audience sample will lead to a biased inference.
  • No amount of sophistication in research data analysis is enough to rectify poorly defined objective outcome measurements. Whether the design is at fault or the intentions are unclear, a lack of clarity might mislead readers, so avoid the practice.
  • The motive behind data analysis in research is to present accurate and reliable data. As far as possible, avoid statistical errors, and find a way to deal with everyday challenges like outliers, missing data, data altering, data mining, or developing graphical representations.

The sheer amount of data generated daily is frightening, especially now that data analysis has taken center stage. In 2018, the total data supply amounted to 2.8 trillion gigabytes. Hence, it is clear that enterprises willing to survive in the hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights, and adapt to new market needs.


QuestionPro is an online survey platform that empowers organizations in data analysis and research and provides them a medium to collect data by creating appealing surveys.


Data Analysis


What is Data Analysis?

According to the federal government, data analysis is "the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data" ( Responsible Conduct in Data Management ). Important components of data analysis include searching for patterns, remaining unbiased in drawing inference from data, practicing responsible  data management , and maintaining "honest and accurate analysis" ( Responsible Conduct in Data Management ). 

In order to understand data analysis further, it can be helpful to take a step back and ask the question "What is data?" Many of us associate data with spreadsheets of numbers and values; however, data can encompass much more than that. According to the federal government, data is "The recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (OMB Circular 110). This broad definition can include information in many formats.

Some examples of types of data are as follows:

  • Photographs 
  • Hand-written notes from field observation
  • Machine learning training data sets
  • Ethnographic interview transcripts
  • Sheet music
  • Scripts for plays and musicals 
  • Observations from laboratory experiments ( CMU Data 101 )

Thus, data analysis includes the processing and manipulation of these data sources in order to gain additional insight from data, answer a research question, or confirm a research hypothesis. 

Data analysis falls within the larger research data lifecycle (University of Virginia).

Why Analyze Data?

Through data analysis, a researcher can gain additional insight from data and draw conclusions to address the research question or hypothesis. Use of data analysis tools helps researchers understand and interpret data. 

What are the Types of Data Analysis?

Data analysis can be quantitative, qualitative, or mixed methods. 

Quantitative research typically involves numbers and "close-ended questions and responses" ( Creswell & Creswell, 2018 , p. 3). Quantitative research tests variables against objective theories, usually measured and collected on instruments and analyzed using statistical procedures ( Creswell & Creswell, 2018 , p. 4). Quantitative analysis usually uses deductive reasoning. 

Qualitative  research typically involves words and "open-ended questions and responses" ( Creswell & Creswell, 2018 , p. 3). According to Creswell & Creswell, "qualitative research is an approach for exploring and understanding the meaning individuals or groups ascribe to a social or human problem" ( 2018 , p. 4). Thus, qualitative analysis usually invokes inductive reasoning. 

Mixed methods  research uses methods from both quantitative and qualitative research approaches. Mixed methods research works under the "core assumption... that the integration of qualitative and quantitative data yields additional insight beyond the information provided by either the quantitative or qualitative data alone" ( Creswell & Creswell, 2018 , p. 4). 

  • Last Updated: Jun 25, 2024 10:23 AM
  • URL: https://guides.library.georgetown.edu/data-analysis


Research Method


Data Analysis – Process, Methods and Types


Data Analysis

Definition:

Data analysis refers to the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, drawing conclusions, and supporting decision-making. It involves applying various statistical and computational techniques to interpret and derive insights from large datasets. The ultimate aim of data analysis is to convert raw data into actionable insights that can inform business decisions, scientific research, and other endeavors.

Data Analysis Process

The data analysis process follows these steps:

Define the Problem

The first step in data analysis is to clearly define the problem or question that needs to be answered. This involves identifying the purpose of the analysis, the data required, and the intended outcome.

Collect the Data

The next step is to collect the relevant data from various sources. This may involve collecting data from surveys, databases, or other sources. It is important to ensure that the data collected is accurate, complete, and relevant to the problem being analyzed.

Clean and Organize the Data

Once the data has been collected, it needs to be cleaned and organized. This involves removing any errors or inconsistencies in the data, filling in missing values, and ensuring that the data is in a format that can be easily analyzed.
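
The cleaning step described above can be sketched as follows. The records are invented, `clean` is a hypothetical helper, and filling missing values with the median is only one common choice among several; the right strategy depends on the data and the research question.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
raw_records = [
    {"id": 1, "country": " Kenya ", "income": 1200},
    {"id": 2, "country": "kenya", "income": None},
    {"id": 2, "country": "kenya", "income": None},  # duplicate entry
    {"id": 3, "country": "GHANA", "income": 900},
    {"id": 4, "country": "Ghana", "income": 1500},
]


def clean(records):
    """Drop duplicate ids, normalize text, and fill missing incomes."""
    seen_ids = set()
    cleaned = []
    for record in records:
        if record["id"] in seen_ids:
            continue  # remove duplicates
        seen_ids.add(record["id"])
        cleaned.append({
            "id": record["id"],
            # Make inconsistent spellings comparable.
            "country": record["country"].strip().title(),
            "income": record["income"],
        })
    # Fill missing incomes with the median of the known values
    # (an assumption of this sketch, not the only option).
    known = [r["income"] for r in cleaned if r["income"] is not None]
    fill_value = statistics.median(known)
    for r in cleaned:
        if r["income"] is None:
            r["income"] = fill_value
    return cleaned


clean_records = clean(raw_records)
print(clean_records)
```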

Analyze the Data

The next step is to analyze the data using various statistical and analytical techniques. This may involve identifying patterns in the data, conducting statistical tests, or using machine learning algorithms to identify trends and insights.

Interpret the Results

After analyzing the data, the next step is to interpret the results. This involves drawing conclusions based on the analysis and identifying any significant findings or trends.

Communicate the Findings

Once the results have been interpreted, they need to be communicated to stakeholders. This may involve creating reports, visualizations, or presentations to effectively communicate the findings and recommendations.

Take Action

The final step in the data analysis process is to take action based on the findings. This may involve implementing new policies or procedures, making strategic decisions, or taking other actions based on the insights gained from the analysis.

Types of Data Analysis

Types of Data Analysis are as follows:

Descriptive Analysis

This type of analysis involves summarizing and describing the main characteristics of a dataset, such as the mean, median, mode, standard deviation, and range.

Inferential Analysis

This type of analysis involves making inferences about a population based on a sample. Inferential analysis can help determine whether a certain relationship or pattern observed in a sample is likely to be present in the entire population.

Diagnostic Analysis

This type of analysis involves identifying and diagnosing problems or issues within a dataset. Diagnostic analysis can help identify outliers, errors, missing data, or other anomalies in the dataset.
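
One common way to flag outliers is the interquartile range (IQR) rule, sketched below. The source does not prescribe a particular rule, so the 1.5 × IQR threshold, the `find_outliers` helper, and the response times are all assumptions of this example.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical survey completion times in seconds; 480 looks suspicious.
response_times = [31, 35, 36, 38, 40, 41, 43, 45, 47, 480]

outliers = find_outliers(response_times)
print(outliers)
```

A flagged value is a prompt for investigation, not proof of an error; whether to correct, keep, or drop it remains a judgment call.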

Predictive Analysis

This type of analysis involves using statistical models and algorithms to predict future outcomes or trends based on historical data. Predictive analysis can help businesses and organizations make informed decisions about the future.

Prescriptive Analysis

This type of analysis involves recommending a course of action based on the results of previous analyses. Prescriptive analysis can help organizations make data-driven decisions about how to optimize their operations, products, or services.

Exploratory Analysis

This type of analysis involves exploring the relationships and patterns within a dataset to identify new insights and trends. Exploratory analysis is often used in the early stages of research or data analysis to generate hypotheses and identify areas for further investigation.

Data Analysis Methods

Data Analysis Methods are as follows:

Statistical Analysis

This method involves the use of mathematical models and statistical tools to analyze and interpret data. It includes measures of central tendency, correlation analysis, regression analysis, hypothesis testing, and more.

Machine Learning

This method involves the use of algorithms to identify patterns and relationships in data. It includes supervised and unsupervised learning, classification, clustering, and predictive modeling.

Data Mining

This method involves using statistical and machine learning techniques to extract information and insights from large and complex datasets.

Text Analysis

This method involves using natural language processing (NLP) techniques to analyze and interpret text data. It includes sentiment analysis, topic modeling, and entity recognition.

Network Analysis

This method involves analyzing the relationships and connections between entities in a network, such as social networks or computer networks. It includes social network analysis and graph theory.
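
As a small illustration, the sketch below counts each node's direct connections (its degree), one of the simplest measures in social network analysis. The people and connections are invented.

```python
from collections import defaultdict

# Hypothetical undirected network: each pair is a connection between two people.
edges = [
    ("Ana", "Ben"), ("Ana", "Cy"), ("Ana", "Dee"),
    ("Ben", "Cy"), ("Dee", "Eli"),
]

# Build an adjacency list: node -> set of neighbors.
neighbors = defaultdict(set)
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

# Degree = number of direct connections a node has.
degree = {node: len(adjacent) for node, adjacent in neighbors.items()}
most_connected = max(degree, key=degree.get)
print(degree, most_connected)
```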

Time Series Analysis

This method involves analyzing data collected over time to identify patterns and trends. It includes forecasting, decomposition, and smoothing techniques.
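
A simple moving average is one of the smoothing techniques mentioned above. In this sketch the monthly figures are invented and `moving_average` is a hypothetical helper; the window size is a choice the analyst makes.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical monthly survey response counts.
monthly_responses = [120, 135, 128, 150, 160, 155, 170]

smoothed = moving_average(monthly_responses, window=3)
print([round(value, 1) for value in smoothed])
```

A larger window gives a smoother series but reacts more slowly to real changes in the data.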

Spatial Analysis

This method involves analyzing geographic data to identify spatial patterns and relationships. It includes spatial statistics, spatial regression, and geospatial data visualization.

Data Visualization

This method involves using graphs, charts, and other visual representations to help communicate the findings of the analysis. It includes scatter plots, bar charts, heat maps, and interactive dashboards.

Qualitative Analysis

This method involves analyzing non-numeric data such as interviews, observations, and open-ended survey responses. It includes thematic analysis, content analysis, and grounded theory.

Multi-criteria Decision Analysis

This method involves analyzing multiple criteria and objectives to support decision-making. It includes techniques such as the analytical hierarchy process, TOPSIS, and ELECTRE.

Data Analysis Tools

There are various data analysis tools available that can help with different aspects of data analysis. Below is a list of some commonly used data analysis tools:

  • Microsoft Excel: A widely used spreadsheet program that allows for data organization, analysis, and visualization.
  • SQL : A programming language used to manage and manipulate relational databases.
  • R : An open-source programming language and software environment for statistical computing and graphics.
  • Python : A general-purpose programming language that is widely used in data analysis and machine learning.
  • Tableau : A data visualization software that allows for interactive and dynamic visualizations of data.
  • SAS : A statistical analysis software used for data management, analysis, and reporting.
  • SPSS : A statistical analysis software used for data analysis, reporting, and modeling.
  • Matlab : A numerical computing software that is widely used in scientific research and engineering.
  • RapidMiner : A data science platform that offers a wide range of data analysis and machine learning tools.

Applications of Data Analysis

Data analysis has numerous applications across various fields. Below are some examples of how data analysis is used in different fields:

  • Business : Data analysis is used to gain insights into customer behavior, market trends, and financial performance. This includes customer segmentation, sales forecasting, and market research.
  • Healthcare : Data analysis is used to identify patterns and trends in patient data, improve patient outcomes, and optimize healthcare operations. This includes clinical decision support, disease surveillance, and healthcare cost analysis.
  • Education : Data analysis is used to measure student performance, evaluate teaching effectiveness, and improve educational programs. This includes assessment analytics, learning analytics, and program evaluation.
  • Finance : Data analysis is used to monitor and evaluate financial performance, identify risks, and make investment decisions. This includes risk management, portfolio optimization, and fraud detection.
  • Government : Data analysis is used to inform policy-making, improve public services, and enhance public safety. This includes crime analysis, disaster response planning, and social welfare program evaluation.
  • Sports : Data analysis is used to gain insights into athlete performance, improve team strategy, and enhance fan engagement. This includes player evaluation, scouting analysis, and game strategy optimization.
  • Marketing : Data analysis is used to measure the effectiveness of marketing campaigns, understand customer behavior, and develop targeted marketing strategies. This includes customer segmentation, marketing attribution analysis, and social media analytics.
  • Environmental science : Data analysis is used to monitor and evaluate environmental conditions, assess the impact of human activities on the environment, and develop environmental policies. This includes climate modeling, ecological forecasting, and pollution monitoring.

When to Use Data Analysis

Data analysis is useful when you need to extract meaningful insights and information from large and complex datasets. It is a crucial step in the decision-making process, as it helps you understand the underlying patterns and relationships within the data, and identify potential areas for improvement or opportunities for growth.

Here are some specific scenarios where data analysis can be particularly helpful:

  • Problem-solving : When you encounter a problem or challenge, data analysis can help you identify the root cause and develop effective solutions.
  • Optimization : Data analysis can help you optimize processes, products, or services to increase efficiency, reduce costs, and improve overall performance.
  • Prediction: Data analysis can help you make predictions about future trends or outcomes, which can inform strategic planning and decision-making.
  • Performance evaluation : Data analysis can help you evaluate the performance of a process, product, or service to identify areas for improvement and potential opportunities for growth.
  • Risk assessment : Data analysis can help you assess and mitigate risks, whether it is financial, operational, or related to safety.
  • Market research : Data analysis can help you understand customer behavior and preferences, identify market trends, and develop effective marketing strategies.
  • Quality control: Data analysis can help you ensure product quality and customer satisfaction by identifying and addressing quality issues.

Purpose of Data Analysis

The primary purposes of data analysis can be summarized as follows:

  • To gain insights: Data analysis allows you to identify patterns and trends in data, which can provide valuable insights into the underlying factors that influence a particular phenomenon or process.
  • To inform decision-making: Data analysis can help you make informed decisions based on the information that is available. By analyzing data, you can identify potential risks, opportunities, and solutions to problems.
  • To improve performance: Data analysis can help you optimize processes, products, or services by identifying areas for improvement and potential opportunities for growth.
  • To measure progress: Data analysis can help you measure progress towards a specific goal or objective, allowing you to track performance over time and adjust your strategies accordingly.
  • To identify new opportunities: Data analysis can help you identify new opportunities for growth and innovation by identifying patterns and trends that may not have been visible before.

Examples of Data Analysis

Some examples of data analysis are as follows:

  • Social Media Monitoring: Companies use data analysis to monitor social media activity in real-time to understand their brand reputation, identify potential customer issues, and track competitors. By analyzing social media data, businesses can make informed decisions on product development, marketing strategies, and customer service.
  • Financial Trading: Financial traders use data analysis to make real-time decisions about buying and selling stocks, bonds, and other financial instruments. By analyzing real-time market data, traders can identify trends and patterns that help them make informed investment decisions.
  • Traffic Monitoring : Cities use data analysis to monitor traffic patterns and make real-time decisions about traffic management. By analyzing data from traffic cameras, sensors, and other sources, cities can identify congestion hotspots and make changes to improve traffic flow.
  • Healthcare Monitoring: Healthcare providers use data analysis to monitor patient health in real-time. By analyzing data from wearable devices, electronic health records, and other sources, healthcare providers can identify potential health issues and provide timely interventions.
  • Online Advertising: Online advertisers use data analysis to make real-time decisions about advertising campaigns. By analyzing data on user behavior and ad performance, advertisers can make adjustments to their campaigns to improve their effectiveness.
  • Sports Analysis : Sports teams use data analysis to make real-time decisions about strategy and player performance. By analyzing data on player movement, ball position, and other variables, coaches can make informed decisions about substitutions, game strategy, and training regimens.
  • Energy Management : Energy companies use data analysis to monitor energy consumption in real-time. By analyzing data on energy usage patterns, companies can identify opportunities to reduce energy consumption and improve efficiency.

Characteristics of Data Analysis

Characteristics of Data Analysis are as follows:

  • Objective : Data analysis should be objective and based on empirical evidence, rather than subjective assumptions or opinions.
  • Systematic : Data analysis should follow a systematic approach, using established methods and procedures for collecting, cleaning, and analyzing data.
  • Accurate : Data analysis should produce accurate results, free from errors and bias. Data should be validated and verified to ensure its quality.
  • Relevant : Data analysis should be relevant to the research question or problem being addressed. It should focus on the data that is most useful for answering the research question or solving the problem.
  • Comprehensive : Data analysis should be comprehensive and consider all relevant factors that may affect the research question or problem.
  • Timely : Data analysis should be conducted in a timely manner, so that the results are available when they are needed.
  • Reproducible : Data analysis should be reproducible, meaning that other researchers should be able to replicate the analysis using the same data and methods.
  • Communicable : Data analysis should be communicated clearly and effectively to stakeholders and other interested parties. The results should be presented in a way that is understandable and useful for decision-making.

Advantages of Data Analysis

Advantages of Data Analysis are as follows:

  • Better decision-making: Data analysis helps in making informed decisions based on facts and evidence, rather than intuition or guesswork.
  • Improved efficiency: Data analysis can identify inefficiencies and bottlenecks in business processes, allowing organizations to optimize their operations and reduce costs.
  • Increased accuracy: Data analysis helps to reduce errors and bias, providing more accurate and reliable information.
  • Better customer service: Data analysis can help organizations understand their customers better, allowing them to provide better customer service and improve customer satisfaction.
  • Competitive advantage: Data analysis can provide organizations with insights into their competitors, allowing them to identify areas where they can gain a competitive advantage.
  • Identification of trends and patterns : Data analysis can identify trends and patterns in data that may not be immediately apparent, helping organizations to make predictions and plan for the future.
  • Improved risk management : Data analysis can help organizations identify potential risks and take proactive steps to mitigate them.
  • Innovation: Data analysis can inspire innovation and new ideas by revealing new opportunities or previously unknown correlations in data.

Limitations of Data Analysis

  • Data quality: The quality of data can impact the accuracy and reliability of analysis results. If data is incomplete, inconsistent, or outdated, the analysis may not provide meaningful insights.
  • Limited scope: Data analysis is limited by the scope of the data available. If data is incomplete or does not capture all relevant factors, the analysis may not provide a complete picture.
  • Human error : Data analysis is often conducted by humans, and errors can occur in data collection, cleaning, and analysis.
  • Cost : Data analysis can be expensive, requiring specialized tools, software, and expertise.
  • Time-consuming : Data analysis can be time-consuming, especially when working with large datasets or conducting complex analyses.
  • Overreliance on data: Data analysis should be complemented with human intuition and expertise. Overreliance on data can lead to a lack of creativity and innovation.
  • Privacy concerns: Data analysis can raise privacy concerns if personal or sensitive information is used without proper consent or security measures.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Research Guide: Data analysis and reporting findings

Data analysis and findings

Data analysis is the most crucial part of any research. Data analysis summarizes collected data. It involves the interpretation of data gathered through the use of analytical and logical reasoning to determine patterns, relationships or trends. 

Data Analysis Checklist

Cleaning data

* Did you capture and code your data in the right manner?

* Is your data complete, or are there missing values?

* Do you have enough observations?

* Do you have any outliers? If yes, how will you handle them?

* Does your data have the potential to answer your questions?
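The cleaning questions above can be turned into quick programmatic checks. The sketch below is only illustrative: the sample scores, the `check_data` helper, the minimum of five observations, and the 1.5 × IQR outlier rule are all assumptions chosen for the example, not part of the checklist itself.

```python
import statistics

def check_data(values, min_observations=5):
    """Report missing values, sample size, and IQR-based outliers."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # Quartiles from the standard library (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]

    return {
        "missing": missing,
        "enough_observations": len(present) >= min_observations,
        "outliers": outliers,
    }

# Hypothetical survey scores; None marks a missing response.
scores = [12, 15, 14, None, 13, 16, 15, 14, 98, None]
report = check_data(scores)
print(report)
```

Whether a flagged value such as 98 is an error or a genuine observation is still a judgment call for the researcher; the check only points to where to look.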

Analyzing data

* Visualize your data using charts, tables, and graphs, to mention a few.

* Identify patterns, correlations, and trends

* Test your hypotheses

* Let your data tell a story

Reporting the results

* Communicate and interpret the results

* Conclude and recommend

* Your targeted audience must understand your results

* Use more datasets and samples

* Use an accessible and understandable data analysis tool

* Do not delegate your data analysis

* Clean data to confirm that they are complete and free from errors

* Analyze cleaned data

* Understand your results

* Keep in mind who will be reading your results and present them in a way that they will understand

* Share the results with your supervisor often

Past presentations

  • PhD Writing Retreat - Analysing_Fieldwork_Data by Cori Wielenga A clear and concise presentation on the ‘now what’ and ‘so what’ of data collection and analysis - compiled and originally presented by Cori Wielenga.

Online Resources

  • Qualitative analysis of interview data: A step-by-step guide
  • Qualitative Data Analysis - Coding & Developing Themes

  • Last Updated: Jul 2, 2024 7:20 AM
  • URL: https://library.up.ac.za/c.php?g=485435

Your Modern Business Guide To Data Analysis Methods And Techniques

Table of Contents

1) What Is Data Analysis?

2) Why Is Data Analysis Important?

3) What Is The Data Analysis Process?

4) Types Of Data Analysis Methods

5) Top Data Analysis Techniques To Apply

6) Quality Criteria For Data Analysis

7) Data Analysis Limitations & Barriers

8) Data Analysis Skills

9) Data Analysis In The Big Data Environment

In our data-rich age, understanding how to analyze and extract true meaning from our business’s digital insights is one of the primary drivers of success.

Despite the colossal volume of data we create every day, a mere 0.5% is actually analyzed and used for data discovery , improvement, and intelligence. While that may not seem like much, considering the amount of digital information we have at our fingertips, half a percent still accounts for a vast amount of data.

With so much data and so little time, knowing how to collect, curate, organize, and make sense of all of this potentially business-boosting information can be a minefield – but online data analysis is the solution.

In science, data analysis uses a more complex approach with advanced techniques to explore and experiment with data. On the other hand, in a business context, data is used to make data-driven decisions that will enable the company to improve its overall performance. In this post, we will cover the analysis of data from an organizational point of view while still going through the scientific and statistical foundations that are fundamental to understanding the basics of data analysis. 

To put all of that into perspective, we will answer a host of important analytical questions, explore analytical methods and techniques, while demonstrating how to perform analysis in the real world with a 17-step blueprint for success.

What Is Data Analysis?

Data analysis is the process of collecting, modeling, and analyzing data using various statistical and logical methods and techniques. Businesses rely on analytics processes and tools to extract insights that support strategic and operational decision-making.

All these various methods are largely based on two core areas: quantitative and qualitative research.

Gaining a better understanding of different techniques and methods in quantitative research as well as qualitative insights will give your analyzing efforts a more clearly defined direction, so it’s worth taking the time to allow this particular knowledge to sink in. Additionally, you will be able to create a comprehensive analytical report that will skyrocket your analysis.

Apart from qualitative and quantitative categories, there are also other types of data that you should be aware of before diving into complex data analysis processes. These categories include: 

  • Big data: Refers to massive data sets that need to be analyzed using advanced software to reveal patterns and trends. It is considered to be one of the best analytical assets as it provides larger volumes of data at a faster rate. 
  • Metadata: Putting it simply, metadata is data that provides insights about other data. It summarizes key information about specific data that makes it easier to find and reuse for later purposes. 
  • Real time data: As its name suggests, real time data is presented as soon as it is acquired. From an organizational perspective, this is the most valuable data as it can help you make important decisions based on the latest developments. Our guide on real time analytics will tell you more about the topic. 
  • Machine data: This is more complex data that is generated solely by a machine such as phones, computers, or even websites and embedded systems, without previous human interaction.

Why Is Data Analysis Important?

Before we go into detail about the categories of analysis along with its methods and techniques, you must understand the potential that analyzing data can bring to your organization.

  • Informed decision-making : From a management perspective, you can benefit from analyzing your data as it helps you make decisions based on facts and not simple intuition. For instance, you can understand where to invest your capital, detect growth opportunities, predict your income, or tackle uncommon situations before they become problems. Through this, you can extract relevant insights from all areas in your organization, and with the help of dashboard software , present the data in a professional and interactive way to different stakeholders.
  • Reduce costs : Another great benefit is to reduce costs. With the help of advanced technologies such as predictive analytics, businesses can spot improvement opportunities, trends, and patterns in their data and plan their strategies accordingly. In time, this will help you save money and resources on implementing the wrong strategies. And not just that, by predicting different scenarios such as sales and demand you can also anticipate production and supply. 
  • Target customers better : Customers are arguably the most crucial element in any business. By using analytics to get a 360° view of all aspects related to your customers, you can understand which channels they use to communicate with you, their demographics, interests, habits, purchasing behaviors, and more. In the long run, this will make your marketing strategies more successful, allow you to identify new potential customers, and help you avoid wasting resources on targeting the wrong people or sending the wrong message. You can also track customer satisfaction by analyzing your clients’ reviews or your customer service department’s performance.

What Is The Data Analysis Process?

When we talk about analyzing data, there is an order to follow to extract the needed conclusions. The analysis process consists of 5 key stages. We will cover each of them in more detail later in the post, but to provide the context needed to understand what comes next, here is a rundown of the 5 essential steps of data analysis. 

  • Identify: Before you get your hands dirty with data, you first need to identify why you need it in the first place. The identification is the stage in which you establish the questions you will need to answer. For example, what is the customer's perception of our brand? Or what type of packaging is more engaging to our potential customers? Once the questions are outlined you are ready for the next step. 
  • Collect: As its name suggests, this is the stage where you start collecting the needed data. Here, you define which sources of data you will use and how you will use them. The collection of data can come in different forms such as internal or external sources, surveys, interviews, questionnaires, and focus groups, among others.  An important note here is that the way you collect the data will be different in a quantitative and qualitative scenario. 
  • Clean: Once you have the necessary data, it is time to clean it and leave it ready for analysis. Not all the data you collect will be useful: when collecting large amounts of data in different formats, it is very likely that you will find yourself with duplicate or badly formatted data. To avoid this, before you start working with your data, make sure to remove any stray white spaces, duplicate records, and formatting errors. This way you avoid hurting your analysis with bad-quality data. 
  • Analyze : With the help of various techniques such as statistical analysis, regressions, neural networks, text analysis, and more, you can start analyzing and manipulating your data to extract relevant conclusions. At this stage, you find trends, correlations, variations, and patterns that can help you answer the questions you first thought of in the identify stage. Various technologies in the market assist researchers and average users with the management of their data. Some of them include business intelligence and visualization software, predictive analytics, and data mining, among others. 
  • Interpret: Last but not least you have one of the most important steps: it is time to interpret your results. This stage is where the researcher comes up with courses of action based on the findings. For example, here you would understand if your clients prefer packaging that is red or green, plastic or paper, etc. Additionally, at this stage, you can also find some limitations and work on them. 
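The "Clean" stage described above (removing stray white spaces, duplicate records, and formatting errors) can be sketched in a few lines. This is a minimal illustration under assumed inputs: the `clean_records` helper and the sample name and email records are hypothetical, and real datasets usually need more rules than these.

```python
def clean_records(records):
    """Trim whitespace, normalize email case, and drop duplicate records."""
    seen = set()
    cleaned = []
    for name, email in records:
        # Collapse repeated internal spaces and trim both ends.
        name = " ".join(name.split())
        email = email.strip().lower()
        key = (name.lower(), email)
        if key not in seen:  # keep only the first occurrence
            seen.add(key)
            cleaned.append((name, email))
    return cleaned

# Hypothetical raw records with spacing, casing, and duplication problems.
raw = [
    ("  Ana  Silva ", "ANA@example.com "),
    ("Ana Silva", "ana@example.com"),
    ("Ben Okoro", " ben@example.com"),
]
cleaned = clean_records(raw)
print(cleaned)
```

Normalizing before comparing is the key design choice here: the first two records only become recognizable as duplicates once spacing and letter case are made consistent.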

Now that you have a basic understanding of the key data analysis steps, let’s look at the top 17 essential methods.

17 Essential Types Of Data Analysis Methods

Before diving into the 17 essential types of methods, it is important to quickly go over the main analysis categories. From descriptive up to prescriptive analysis, the complexity and effort of data evaluation increase, but so does the added value for the company.

a) Descriptive analysis - What happened.

The descriptive analysis method is the starting point for any analytic reflection, and it aims to answer the question of what happened? It does this by ordering, manipulating, and interpreting raw data from various sources to turn it into valuable insights for your organization.

Performing descriptive analysis is essential, as it enables us to present our insights in a meaningful way. Although it is relevant to mention that this analysis on its own will not allow you to predict future outcomes or tell you the answer to questions like why something happened, it will leave your data organized and ready to conduct further investigations.
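As a rough illustration of descriptive analysis, the sketch below summarizes a small set of figures with Python's standard `statistics` module. The monthly sales numbers are invented for the example; the point is only that the output describes what happened, without explaining why or predicting what comes next.

```python
import statistics

# Hypothetical monthly sales figures (units sold).
monthly_sales = [120, 135, 128, 150, 142, 160]

summary = {
    "count": len(monthly_sales),
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "stdev": round(statistics.stdev(monthly_sales), 2),  # sample standard deviation
    "min": min(monthly_sales),
    "max": max(monthly_sales),
}
print(summary)
```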

b) Exploratory analysis - How to explore data relationships.

As its name suggests, the main aim of the exploratory analysis is to explore. Prior to it, there is still no notion of the relationship between the data and the variables. Once the data is investigated, exploratory analysis helps you to find connections and generate hypotheses and solutions for specific problems. A typical area of application for it is data mining.

c) Diagnostic analysis - Why it happened.

Diagnostic data analytics empowers analysts and executives by helping them gain a firm contextual understanding of why something happened. If you know why something happened as well as how it happened, you will be able to pinpoint the exact ways of tackling the issue or challenge.

Designed to provide direct and actionable answers to specific questions, this is one of the world’s most important methods in research, alongside its other key organizational uses such as retail analytics.

d) Predictive analysis - What will happen.

The predictive method allows you to look into the future to answer the question: what will happen? In order to do this, it uses the results of the previously mentioned descriptive, exploratory, and diagnostic analysis, in addition to machine learning (ML) and artificial intelligence (AI). Through this, you can uncover future trends, potential problems or inefficiencies, connections, and causalities in your data.

With predictive analysis, you can unfold and develop initiatives that will not only enhance your various operational processes but also help you gain an all-important edge over the competition. If you understand why a trend, pattern, or event happened through data, you will be able to develop an informed projection of how things may unfold in particular areas of the business.

e) Prescriptive analysis - How will it happen.

Prescriptive analysis is another of the most effective types of analysis methods in research. Prescriptive data techniques cross over from predictive analysis in that they revolve around using patterns or trends to develop responsive, practical business strategies.

By drilling down into prescriptive analysis, you will play an active role in the data consumption process by taking well-arranged sets of visual data and using it as a powerful fix to emerging issues in a number of key areas, including marketing, sales, customer experience, HR, fulfillment, finance, logistics analytics , and others.

Top 17 data analysis methods

As mentioned at the beginning of the post, data analysis methods can be divided into two big categories: quantitative and qualitative. Each of these categories holds a powerful analytical value that changes depending on the scenario and type of data you are working with. Below, we will discuss 17 methods that are divided into qualitative and quantitative approaches. 

Without further ado, here are the 17 essential types of data analysis methods with some use cases in the business world: 

A. Quantitative Methods 

To put it simply, quantitative analysis refers to all methods that use numerical data or data that can be turned into numbers (e.g. categorical variables like gender, age, etc.) to extract valuable insights. It is used to draw conclusions about relationships and differences and to test hypotheses. Below we discuss some of the key quantitative methods. 

1. Cluster analysis

The action of grouping a set of data elements in a way that said elements are more similar (in a particular sense) to each other than to those in other groups – hence the term ‘cluster.’ Since there is no target variable when clustering, the method is often used to find hidden patterns in the data. The approach is also used to provide additional context to a trend or dataset.

Let's look at it from an organizational perspective. In a perfect world, marketers would be able to analyze each customer separately and give them the best-personalized service, but let's face it, with a large customer base, it is practically impossible to do that in a reasonable amount of time. That's where clustering comes in. By grouping customers into clusters based on demographics, purchasing behaviors, monetary value, or any other factor that might be relevant for your company, you will be able to immediately optimize your efforts and give your customers the best experience based on their needs.
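To make the idea of grouping customers concrete, here is a minimal sketch of k-means, one common clustering algorithm (the text above does not name a specific one, so this choice is an assumption). The customer data, the two features, and the starting centroids are all hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then update."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (orders per year, average order value).
customers = [(2, 20), (3, 25), (1, 22), (20, 200), (22, 210), (19, 190)]
centroids, clusters = kmeans(customers, centroids=[(2, 20), (20, 200)])
print(clusters)
```

With this toy data the two groups separate cleanly into occasional low-value buyers and frequent high-value buyers. In practice, features on very different scales would normally be standardized first, and the number of clusters has to be chosen by the analyst.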

2. Cohort analysis

This type of data analysis approach uses historical data to examine and compare a determined segment of users' behavior, which can then be grouped with others with similar characteristics. By using this methodology, it's possible to gain a wealth of insight into consumer needs or a firm understanding of a broader target group.

Cohort analysis can be really useful for performing analysis in marketing as it will allow you to understand the impact of your campaigns on specific groups of customers. To exemplify, imagine you send an email campaign encouraging customers to sign up for your site. For this, you create two versions of the campaign with different designs, CTAs, and ad content. Later on, you can use cohort analysis to track the performance of the campaign for a longer period of time and understand which type of content is driving your customers to sign up, repurchase, or engage in other ways.  

A useful tool to start performing cohort analysis is Google Analytics. You can learn more about the benefits and limitations of using cohorts in GA in this useful guide. In the image below, you can see an example of how a cohort is visualized in this tool. The segments (devices traffic) are divided into date cohorts (usage of devices) and then analyzed week by week to extract insights into performance.

Cohort analysis chart example from google analytics
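The email campaign example can also be sketched without any analytics tool. The code below is an illustration only: the event log, the cohort labels `version_a` and `version_b`, and the `retention_by_cohort` helper are hypothetical, and it is not how Google Analytics computes its cohort reports.

```python
from collections import defaultdict

# Hypothetical activity log: (customer_id, cohort, weeks_since_signup).
events = [
    ("c1", "version_a", 0), ("c1", "version_a", 1), ("c1", "version_a", 2),
    ("c2", "version_a", 0), ("c2", "version_a", 1),
    ("c3", "version_b", 0),
    ("c4", "version_b", 0), ("c4", "version_b", 2),
]

def retention_by_cohort(events):
    """Return {cohort: {week: share of the cohort active in that week}}."""
    members = defaultdict(set)                      # cohort -> all customers
    active = defaultdict(lambda: defaultdict(set))  # cohort -> week -> customers
    for customer, cohort, week in events:
        members[cohort].add(customer)
        active[cohort][week].add(customer)
    return {
        cohort: {week: len(ids) / len(members[cohort])
                 for week, ids in sorted(weeks.items())}
        for cohort, weeks in active.items()
    }

retention = retention_by_cohort(events)
print(retention)
```

Reading the result row by row shows how each campaign version holds on to its customers week after week, which is the comparison the cohort view is meant to support.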

3. Regression analysis

Regression uses historical data to understand how a dependent variable's value is affected when one independent variable (simple linear regression) or several independent variables (multiple regression) change or stay the same. By understanding each variable's relationship and how it developed in the past, you can anticipate possible outcomes and make better decisions in the future.

Let's bring it down with an example. Imagine you did a regression analysis of your sales in 2019 and discovered that variables like product quality, store design, customer service, marketing campaigns, and sales channels affected the overall result. Now you want to use regression to analyze which of these variables changed or if any new ones appeared during 2020. For example, you couldn’t sell as much in your physical store due to COVID lockdowns. Therefore, your sales could’ve either dropped in general or increased in your online channels. Through this, you can understand which independent variables affected the overall performance of your dependent variable, annual sales.
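For the simplest case, one independent variable, the relationship can be estimated with ordinary least squares. The sketch below is a minimal version under assumed data: the marketing spend and sales figures are invented and deliberately lie on a straight line so the result is easy to check by hand.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: marketing spend vs. sales (both in thousands).
spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

slope, intercept = fit_line(spend, sales)
forecast = intercept + slope * 6  # predicted sales for a spend of 6
print(slope, intercept, forecast)
```

Here each extra unit of spend is associated with two extra units of sales. Real data will not fit this neatly, and an association found this way does not by itself show that the spend caused the sales.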

If you want to go deeper into this type of analysis, check out this article and learn more about how you can benefit from regression.

4. Neural networks

The neural network forms the basis for the intelligent algorithms of machine learning. It is a form of analytics that attempts, with minimal intervention, to understand how the human brain would generate insights and predict values. Neural networks learn from each and every data transaction, meaning that they evolve and advance over time.

A typical area of application for neural networks is predictive analytics. There are BI reporting tools that have this feature implemented within them, such as the Predictive Analytics Tool from datapine. This tool enables users to quickly and easily generate all kinds of predictions. All you have to do is select the data to be processed based on your KPIs, and the software automatically calculates forecasts based on historical and current data. Thanks to its user-friendly interface, anyone in your organization can manage it; there’s no need to be an advanced scientist. 

Here is an example of how you can use the predictive analysis tool from datapine:

Example on how to use predictive analytics tool from datapine

5. Factor analysis

Factor analysis, a form of “dimension reduction,” is a type of data analysis used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. The aim here is to uncover independent latent variables, which makes it an ideal method for streamlining specific segments.

A good way to understand this data analysis method is a customer evaluation of a product. The initial assessment is based on different variables like color, shape, wearability, current trends, materials, comfort, the place where they bought the product, and frequency of usage. Like this, the list can be endless, depending on what you want to track. In this case, factor analysis comes into the picture by summarizing all of these variables into homogeneous groups, for example, by grouping the variables color, materials, quality, and trends into a broader latent variable of design.

If you want to start analyzing data using factor analysis we recommend you take a look at this practical guide from UCLA.

6. Data mining

A method of data analysis that is the umbrella term for engineering metrics and insights for additional value, direction, and context. By using exploratory statistical evaluation, data mining aims to identify dependencies, relations, patterns, and trends to generate advanced knowledge.  When considering how to analyze data, adopting a data mining mindset is essential to success - as such, it’s an area that is worth exploring in greater detail.

An excellent use case of data mining is datapine intelligent data alerts . With the help of artificial intelligence and machine learning, they provide automated signals based on particular commands or occurrences within a dataset. For example, if you’re monitoring supply chain KPIs , you could set an intelligent alarm to trigger when invalid or low-quality data appears. By doing so, you will be able to drill down deep into the issue and fix it swiftly and effectively.

In the following picture, you can see how the intelligent alarms from datapine work. By setting up ranges on daily orders, sessions, and revenues, the alarms will notify you if the goal was not completed or if it exceeded expectations.

Example on how to use intelligent alerts from datapine

7. Time series analysis

As its name suggests, time series analysis is used to analyze a set of data points collected over a specified period of time. Analysts use this method to monitor data points at regular intervals rather than intermittently, but time series analysis is not used only to collect data over time. It also allows researchers to understand whether variables changed during the study, how the different variables depend on one another, and how the end result was reached.

In a business context, this method is used to understand the causes of different trends and patterns to extract valuable insights. Another way of using this method is with the help of time series forecasting. Powered by predictive technologies, businesses can analyze various data sets over a period of time and forecast different future events. 

A great use case to put time series analysis into perspective is seasonality effects on sales. By using time series forecasting to analyze sales data of a specific product over time, you can understand if sales rise over a specific period of time (e.g. swimwear during summertime, or candy during Halloween). These insights allow you to predict demand and prepare production accordingly.  
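As a rough illustration of the seasonality idea, the sketch below computes a simple seasonal index per month (average sales in that month divided by the overall average). The sales figures and the `seasonal_indices` helper are hypothetical, and this is only one basic way to expose a seasonal pattern, not a full forecasting model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical unit sales for a swimwear product: (year, month, units).
sales = [
    (2022, 1, 40), (2022, 4, 80), (2022, 7, 200), (2022, 10, 80),
    (2023, 1, 50), (2023, 4, 100), (2023, 7, 250), (2023, 10, 100),
]

def seasonal_indices(records):
    """Return {month: index}, where 1.0 corresponds to an average month."""
    by_month = defaultdict(list)
    for _year, month, units in records:
        by_month[month].append(units)
    overall = mean([units for _, _, units in records])
    return {month: mean(values) / overall
            for month, values in sorted(by_month.items())}

indices = seasonal_indices(sales)
peak_month = max(indices, key=indices.get)  # month with the strongest sales

for month, index in indices.items():
    print(f"month {month:2d}: seasonal index {index:.2f}")
```

An index well above 1.0 (July in this made-up data) flags a month where demand is consistently higher than average, which is the signal you would use to plan production ahead of time.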

8. Decision Trees 

The decision tree analysis aims to act as a support tool to make smart and strategic decisions. By visually displaying potential outcomes, consequences, and costs in a tree-like model, researchers and company users can easily evaluate all factors involved and choose the best course of action. Decision trees are helpful to analyze quantitative data and they allow for an improved decision-making process by helping you spot improvement opportunities, reduce costs, and enhance operational efficiency and production.

But how does a decision tree actually work? This method works like a flowchart that starts with the main decision that you need to make and branches out based on the different outcomes and consequences of each decision. Each outcome will outline its own consequences, costs, and gains and, at the end of the analysis, you can compare each of them and make the smartest decision.

Businesses can use them to understand which project is more cost-effective and will bring more earnings in the long run. For example, imagine you need to decide if you want to update your software app or build a new app entirely.  Here you would compare the total costs, the time needed to be invested, potential revenue, and any other factor that might affect your decision.  In the end, you would be able to see which of these two options is more realistic and attainable for your company or research.
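One common way to compare the branches of such a tree is to weight each possible outcome by its probability and subtract the upfront cost. The sketch below applies that idea to the app example; every cost, probability, and revenue figure is invented for illustration, and `expected_net_value` is a hypothetical helper.

```python
# Each option branches into possible outcomes: (probability, revenue).
# All figures are hypothetical.
options = {
    "update existing app": {
        "cost": 50_000,
        "outcomes": [(0.7, 120_000), (0.3, 60_000)],
    },
    "build new app": {
        "cost": 150_000,
        "outcomes": [(0.5, 400_000), (0.5, 100_000)],
    },
}

def expected_net_value(option):
    """Probability-weighted revenue minus the upfront cost."""
    expected_revenue = sum(p * revenue for p, revenue in option["outcomes"])
    return expected_revenue - option["cost"]

values = {name: expected_net_value(opt) for name, opt in options.items()}
best = max(values, key=values.get)

for name, value in values.items():
    print(f"{name}: expected net value {value:,.0f}")
print("Preferred option:", best)
```

In practice you would add further criteria the text mentions, such as the time to be invested, as extra branches or as adjustments to the cost figures.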

9. Conjoint analysis 

Last but not least, we have the conjoint analysis. This approach is usually used in surveys to understand how individuals value different attributes of a product or service and it is one of the most effective methods to extract consumer preferences. When it comes to purchasing, some clients might be more price-focused, others more features-focused, and others might have a sustainable focus. Whatever your customer's preferences are, you can find them with conjoint analysis. Through this, companies can define pricing strategies, packaging options, subscription packages, and more. 

A great example of conjoint analysis is in marketing and sales. For instance, a cupcake brand might use conjoint analysis and find that its clients prefer gluten-free options and cupcakes with healthier toppings over super sugary ones. Thus, the cupcake brand can turn these insights into advertisements and promotions to increase sales of this particular type of product. And not just that, conjoint analysis can also help businesses segment their customers based on their interests. This allows them to send different messaging that will bring value to each of the segments. 
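To make the cupcake example more concrete, here is a minimal sketch of how preference scores (often called part-worths) can be derived from survey ratings. It assumes a balanced design in which respondents rated every combination of attributes, in which case a level's part-worth can be estimated as its average rating minus the overall average. The ratings and attribute names are hypothetical, and real studies typically rely on regression or dedicated conjoint software.

```python
from statistics import mean

# Hypothetical average ratings (1-10) for every combination of two attributes.
profiles = [
    ({"base": "regular", "topping": "sugary"}, 4),
    ({"base": "regular", "topping": "healthy"}, 6),
    ({"base": "gluten-free", "topping": "sugary"}, 6),
    ({"base": "gluten-free", "topping": "healthy"}, 8),
]

def part_worths(rated_profiles):
    """Estimate each level's part-worth as its mean rating minus the grand mean.

    This shortcut is only valid for a balanced, full-factorial design.
    """
    grand_mean = mean([rating for _, rating in rated_profiles])
    worths = {}
    for attribute in rated_profiles[0][0]:
        levels = sorted({profile[attribute] for profile, _ in rated_profiles})
        for level in levels:
            level_mean = mean(
                [r for p, r in rated_profiles if p[attribute] == level]
            )
            worths[(attribute, level)] = level_mean - grand_mean
    return worths

worths = part_worths(profiles)
for (attribute, level), worth in worths.items():
    print(f"{attribute} = {level}: {worth:+.1f}")
```

Positive values mark the levels that raise a product's appeal (gluten-free bases and healthier toppings in this made-up data), and negative values mark the levels that lower it.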

10. Correspondence Analysis

Also known as reciprocal averaging, correspondence analysis is a method used to analyze the relationship between categorical variables presented within a contingency table. A contingency table is a table that displays two (simple correspondence analysis) or more (multiple correspondence analysis) categorical variables across rows and columns that show the distribution of the data, which is usually answers to a survey or questionnaire on a specific topic. 

This method starts by calculating an “expected value” for each cell, which is done by multiplying the cell's row total by its column total and dividing the result by the grand total of the table. The “expected value” is then subtracted from the original value, resulting in a “residual number,” which is what allows you to draw conclusions about relationships and distribution. The results of this analysis are later displayed using a map that represents the relationship between the different values. The closer two values are on the map, the stronger the relationship. Let’s put it into perspective with an example.

Imagine you are carrying out a market research analysis about outdoor clothing brands and how they are perceived by the public. For this analysis, you ask a group of people to match each brand with a certain attribute which can be durability, innovation, quality materials, etc. When calculating the residual numbers, you can see that brand A has a positive residual for innovation but a negative one for durability. This means that brand A is not positioned as a durable brand in the market, something that competitors could take advantage of. 
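The residual step can be sketched in a few lines. The example below uses the standard expected count under independence (row total × column total ÷ grand total) and subtracts it from each observed count. The brand counts are hypothetical, and a full correspondence analysis would go on to standardize these residuals and decompose them to build the map, which this sketch does not do.

```python
# Hypothetical survey counts: how often each brand was matched with each attribute.
attributes = ["durability", "innovation", "quality materials"]
counts = {
    "Brand A": [10, 30, 20],
    "Brand B": [30, 10, 20],
}

def residuals(table):
    """Return {row: [observed - expected, ...]} for a contingency table."""
    row_totals = {row: sum(cells) for row, cells in table.items()}
    col_totals = [sum(col) for col in zip(*table.values())]
    grand_total = sum(row_totals.values())
    result = {}
    for row, cells in table.items():
        # Expected count if brand and attribute were unrelated.
        expected = [row_totals[row] * col / grand_total for col in col_totals]
        result[row] = [obs - exp for obs, exp in zip(cells, expected)]
    return result

res = residuals(counts)
brand_a = dict(zip(attributes, res["Brand A"]))
for attribute, value in brand_a.items():
    print(f"Brand A, {attribute}: residual {value:+.1f}")
```

With these made-up counts, Brand A shows a positive residual for innovation and a negative one for durability, mirroring the situation described in the example.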

11. Multidimensional Scaling (MDS)

MDS is a method used to observe the similarities or disparities between objects, which can be colors, brands, people, geographical coordinates, and more. The objects are plotted using an “MDS map” that positions similar objects together and disparate ones far apart. The (dis)similarities between objects are represented using one or more dimensions that can be observed using a numerical scale. For example, if you want to know how people feel about the COVID-19 vaccine, you can use 1 for “don’t believe in the vaccine at all” and 10 for “firmly believe in the vaccine” and a scale of 2 to 9 for in-between responses. When analyzing an MDS map, the only thing that matters is the distance between the objects; the orientation of the dimensions is arbitrary and has no meaning at all.

Multidimensional scaling is a valuable technique for market research, especially when it comes to evaluating product or brand positioning. For instance, if a cupcake brand wants to know how they are positioned compared to competitors, it can define 2-3 dimensions such as taste, ingredients, shopping experience, or more, and do a multidimensional scaling analysis to find improvement opportunities as well as areas in which competitors are currently leading. 

Another business example is in procurement when deciding on different suppliers. Decision makers can generate an MDS map to see how the different prices, delivery times, technical services, and more of the different suppliers differ and pick the one that suits their needs the best. 

A final example comes from a research paper titled "An Improved Study of Multilevel Semantic Network Visualization for Analyzing Sentiment Word of Movie Review Data". The researchers picked a two-dimensional MDS map to display the distances and relationships between different sentiments in movie reviews. They used 36 sentiment words and distributed them based on their emotional distance, as we can see in the image below, where the words "outraged" and "sweet" are on opposite sides of the map, marking the distance between the two emotions very clearly.

Example of multidimensional scaling analysis

Aside from being a valuable technique to analyze dissimilarities, MDS also serves as a dimension-reduction technique for large dimensional data. 

B. Qualitative Methods

Qualitative data analysis methods are defined as the observation of non-numerical data that is gathered and produced using methods of observation such as interviews, focus groups, questionnaires, and more. As opposed to quantitative methods, qualitative data is more subjective and highly valuable in analyzing customer retention and product development.

12. Text analysis

Text analysis, also known in the industry as text mining, works by taking large sets of textual data and arranging them in a way that makes it easier to manage. By working through this cleansing process in stringent detail, you will be able to extract the data that is truly relevant to your organization and use it to develop actionable insights that will propel you forward.

Modern software accelerates the application of text analytics. Thanks to the combination of machine learning and intelligent algorithms, you can perform advanced analytical processes such as sentiment analysis. This technique allows you to understand the intentions and emotions of a text, for example, if it's positive, negative, or neutral, and then give it a score depending on certain factors and categories that are relevant to your brand. Sentiment analysis is often used to monitor brand and product reputation and to understand how successful your customer experience is. To learn more about the topic, check out this insightful article.

By analyzing data from various word-based sources, including product reviews, articles, social media communications, and survey responses, you will gain invaluable insights into your audience, as well as their needs, preferences, and pain points. This will allow you to create campaigns, services, and communications that meet your prospects’ needs on a personal level, growing your audience while boosting customer retention. There are various other “sub-methods” that are an extension of text analysis. Each of them serves a more specific purpose and we will look at them in detail next. 

13. Content Analysis

This is a straightforward and very popular method that examines the presence and frequency of certain words, concepts, and subjects in different content formats such as text, image, audio, or video. For example, it can count the number of times the name of a celebrity is mentioned on social media or in online tabloids. It does this by coding text data that is later categorized and tabulated in a way that can provide valuable insights, making it the perfect mix of quantitative and qualitative analysis.

There are two types of content analysis. The first one is the conceptual analysis which focuses on explicit data, for instance, the number of times a concept or word is mentioned in a piece of content. The second one is relational analysis, which focuses on the relationship between different concepts or words and how they are connected within a specific context. 

Content analysis is often used by marketers to measure brand reputation and customer behavior, for example, by analyzing customer reviews. It can also be used to analyze customer interviews and find directions for new product development. It is also important to note that, in order to extract the maximum potential out of this analysis method, it is necessary to have a clearly defined research question.
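Conceptual analysis, the first type described above, boils down to counting how often chosen concepts appear. The sketch below shows one simple way to do that on a handful of customer reviews; the reviews, the concept list, and the `concept_frequencies` helper are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical customer reviews and the concepts we decided to track.
reviews = [
    "Great quality, but the delivery was slow.",
    "Slow delivery. The quality is fine.",
    "Love the price and the quality!",
]
concepts = ["quality", "delivery", "price"]

def concept_frequencies(texts, terms):
    """Count how many times each term appears across all texts (case-insensitive)."""
    words = Counter()
    for text in texts:
        # Lowercase the text and split it into simple word tokens.
        words.update(re.findall(r"[a-z']+", text.lower()))
    return {term: words[term] for term in terms}

freq = concept_frequencies(reviews, concepts)
for term, count in freq.items():
    print(f"{term}: {count}")
```

A real project would also handle synonyms and word variations (for example, "deliver" and "delivered"), which this exact-match count ignores.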

14. Thematic Analysis

Very similar to content analysis, thematic analysis also helps in identifying and interpreting patterns in qualitative data, with the main difference being that the first one can also be applied to quantitative analysis. The thematic method analyzes large pieces of text data such as focus group transcripts or interviews and groups them into themes or categories that come up frequently within the text. It is a great method when trying to figure out people’s views and opinions about a certain topic. For example, if you are a brand that cares about sustainability, you can do a survey of your customers to analyze their views and opinions about sustainability and how they apply it to their lives. You can also analyze customer service call transcripts to find common issues and improve your service.

Thematic analysis is a very subjective technique that relies on the researcher’s judgment. Therefore,  to avoid biases, it has 6 steps that include familiarization, coding, generating themes, reviewing themes, defining and naming themes, and writing up. It is also important to note that, because it is a flexible approach, the data can be interpreted in multiple ways and it can be hard to select what data is more important to emphasize. 

15. Narrative Analysis 

A bit more complex in nature than the two previous ones, narrative analysis is used to explore the meaning behind the stories that people tell and most importantly, how they tell them. By looking into the words that people use to describe a situation you can extract valuable conclusions about their perspective on a specific topic. Common sources for narrative data include autobiographies, family stories, opinion pieces, and testimonials, among others. 

From a business perspective, narrative analysis can be useful to analyze customer behaviors and feelings towards a specific product, service, feature, or others. It provides unique and deep insights that can be extremely valuable. However, it has some drawbacks.  

The biggest weakness of this method is that the sample sizes are usually very small due to the complexity and time-consuming nature of the collection of narrative data. Plus, the way a subject tells a story will be significantly influenced by his or her specific experiences, making it very hard to replicate in a subsequent study. 

16. Discourse Analysis

Discourse analysis is used to understand the meaning behind any type of written, verbal, or symbolic discourse based on its political, social, or cultural context. It mixes the analysis of languages and situations together. This means that the way the content is constructed and the meaning behind it is significantly influenced by the culture and society it takes place in. For example, if you are analyzing political speeches you need to consider different context elements such as the politician's background, the current political context of the country, the audience to which the speech is directed, and so on. 

From a business point of view, discourse analysis is a great market research tool. It allows marketers to understand how the norms and ideas of the specific market work and how their customers relate to those ideas. It can be very useful to build a brand mission or develop a unique tone of voice. 

17. Grounded Theory Analysis

Traditionally, researchers decide on a method and hypothesis and start to collect the data to prove that hypothesis. The grounded theory is the only method that doesn’t require an initial research question or hypothesis as its value lies in the generation of new theories. With the grounded theory method, you can go into the analysis process with an open mind and explore the data to generate new theories through tests and revisions. In fact, it is not necessary to collect the data and then start to analyze it. Researchers usually start to find valuable insights as they are gathering the data. 

All of these elements make grounded theory a very valuable method as theories are fully backed by data instead of initial assumptions. It is a great technique to analyze poorly researched topics or find the causes behind specific company outcomes. For example, product managers and marketers might use the grounded theory to find the causes of high levels of customer churn and look into customer surveys and reviews to develop new theories about the causes. 

How To Analyze Data? Top 17 Data Analysis Techniques To Apply

17 top data analysis techniques by datapine

Now that we’ve answered the question “what is data analysis?”, explained why it is important, and covered the different data analysis types, it’s time to dig deeper into how to perform your analysis by working through these 17 essential techniques.

1. Collaborate on your needs

Before you begin analyzing or drilling down into any techniques, it’s crucial to sit down collaboratively with all key stakeholders within your organization, decide on your primary campaign or strategic goals, and gain a fundamental understanding of the types of insights that will best benefit your progress or provide you with the level of vision you need to evolve your organization.

2. Establish your questions

Once you’ve outlined your core objectives, you should consider which questions will need answering to help you achieve your mission. This is one of the most important techniques as it will shape the very foundations of your success.

To help you ask the right things and ensure your data works for you, you have to ask the right data analysis questions.

3. Data democratization

After giving your data analytics methodology some real direction, and knowing which questions need answering to extract optimum value from the information available to your organization, you should continue with democratization.

Data democratization is an action that aims to connect data from various sources efficiently and quickly so that anyone in your organization can access it at any given moment. You can extract data in text, images, videos, numbers, or any other format. And then perform cross-database analysis to achieve more advanced insights to share with the rest of the company interactively.  

Once you have decided on your most valuable sources, you need to take all of this into a structured format to start collecting your insights. For this purpose, datapine offers an easy all-in-one data connectors feature to integrate all your internal and external sources and manage them at your will. Additionally, datapine’s end-to-end solution automatically updates your data, allowing you to save time and focus on performing the right analysis to grow your company.

data connectors from datapine

4. Think of governance 

When collecting data in a business or research context you always need to think about security and privacy. With data breaches becoming a topic of concern for businesses, the need to protect your client's or subject’s sensitive information becomes critical. 

To ensure that all this is taken care of, you need to think of a data governance strategy. According to Gartner, this concept refers to “the specification of decision rights and an accountability framework to ensure the appropriate behavior in the valuation, creation, consumption, and control of data and analytics.” In simpler words, data governance is a collection of processes, roles, and policies that ensure the efficient use of data while still achieving the main company goals. It ensures that clear roles are in place for who can access the information and how they can access it. In time, this not only ensures that sensitive information is protected but also allows for an efficient analysis as a whole.

5. Clean your data

After harvesting from so many sources you will be left with a vast amount of information that can be overwhelming to deal with. At the same time, you can be faced with incorrect data that can be misleading to your analysis. The smartest thing you can do to avoid dealing with this in the future is to clean the data. This is fundamental before visualizing it, as it will ensure that the insights you extract from it are correct.

There are many things that you need to look for in the cleaning process. The most important one is to eliminate any duplicate observations; these usually appear when using multiple internal and external sources of information. You can also add any missing codes, fix empty fields, and eliminate incorrectly formatted data.

Another usual form of cleaning is done with text data. As we mentioned earlier, most companies today analyze customer reviews, social media comments, questionnaires, and several other text inputs. In order for algorithms to detect patterns, text data needs to be revised to avoid invalid characters or any syntax or spelling errors. 
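The cleaning steps described above can be sketched as follows. The records, the field names, and the `clean` helper are hypothetical, and the format check is deliberately crude; the point is only to show duplicates being removed, empty fields being filled, and badly formatted entries being dropped.

```python
# Hypothetical customer records merged from two sources.
raw = [
    {"email": "Ann@Example.com ", "country": "DE"},
    {"email": "ann@example.com", "country": "DE"},   # duplicate once normalized
    {"email": "bob@example.com", "country": ""},     # empty field
    {"email": "not-an-email", "country": "FR"},      # incorrectly formatted
]

def clean(records, default_country="UNKNOWN"):
    """Normalize emails, drop malformed rows, fill empty fields, remove duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        email = record["email"].strip().lower()
        if "@" not in email:      # crude format check, for illustration only
            continue
        if email in seen:         # duplicate observation
            continue
        seen.add(email)
        cleaned.append({
            "email": email,
            "country": record["country"] or default_country,
        })
    return cleaned

cleaned = clean(raw)
for row in cleaned:
    print(row)
```

Normalizing values before comparing them matters here: without the `strip()` and `lower()` calls, the first two records would not be recognized as duplicates.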

Most importantly, the aim of cleaning is to prevent you from arriving at false conclusions that can damage your company in the long run. By using clean data, you will also help BI solutions to interact better with your information and create better reports for your organization.

6. Set your KPIs

Once you’ve set your sources, cleaned your data, and established clear-cut questions you want your insights to answer, you need to set a host of key performance indicators (KPIs) that will help you track, measure, and shape your progress in a number of key areas.

KPIs are critical to both qualitative and quantitative analysis research. This is one of the primary methods of data analysis you certainly shouldn’t overlook.

To help you set the best possible KPIs for your initiatives and activities, here is an example of a relevant logistics KPI: transportation-related costs. If you want to see more, explore our collection of key performance indicator examples.

Transportation costs logistics KPIs

7. Omit useless data

Having bestowed your data analysis tools and techniques with true purpose and defined your mission, you should explore the raw data you’ve collected from all sources and use your KPIs as a reference for chopping out any information you deem to be useless.

Trimming the informational fat is one of the most crucial methods of analysis as it will allow you to focus your analytical efforts and squeeze every drop of value from the remaining ‘lean’ information.

Any stats, facts, figures, or metrics that don’t align with your business goals or fit with your KPI management strategies should be eliminated from the equation.

8. Build a data management roadmap

While, at this point, this particular step is optional (you will have already gained a wealth of insight and formed a fairly sound strategy by now), creating a data governance roadmap will help your data analysis methods and techniques become successful on a more sustainable basis. These roadmaps, if developed properly, are also built so they can be tweaked and scaled over time.

Invest ample time in developing a roadmap that will help you store, manage, and handle your data internally, and you will make your analysis techniques all the more fluid and functional – one of the most powerful types of data analysis methods available today.

9. Integrate technology

There are many ways to analyze data, but one of the most vital aspects of analytical success in a business context is integrating the right decision support software and technology.

Robust analysis platforms will not only allow you to pull critical data from your most valuable sources while working with dynamic KPIs that will offer you actionable insights; they will also present them in a digestible, visual, interactive format from one central, live dashboard. A data methodology you can count on.

By integrating the right technology within your data analysis methodology, you’ll avoid fragmenting your insights, saving you time and effort while allowing you to enjoy the maximum value from your business’s most valuable insights.

For a look at the power of software for the purpose of analysis and to enhance your methods of analyzing, glance over our selection of dashboard examples.

10. Answer your questions

By considering each of the above efforts, working with the right technology, and fostering a cohesive internal culture where everyone buys into the different ways to analyze data as well as the power of digital intelligence, you will swiftly start to answer your most burning business questions. Arguably, the best way to make your data concepts accessible across the organization is through data visualization.

11. Visualize your data

Online data visualization is a powerful tool as it lets you tell a story with your metrics, allowing users across the organization to extract meaningful insights that aid business evolution – and it covers all the different ways to analyze data.

The purpose of analyzing is to make your entire organization more informed and intelligent, and with the right platform or dashboard, this is simpler than you think, as demonstrated by our marketing dashboard.

An executive dashboard example showcasing high-level marketing KPIs such as cost per lead, MQL, SQL, and cost per customer.

This visual, dynamic, and interactive online dashboard is a data analysis example designed to give Chief Marketing Officers (CMO) an overview of relevant metrics to help them understand if they achieved their monthly goals.

In detail, this example, generated with a modern dashboard creator, displays interactive charts for monthly revenues, costs, net income, and net income per customer; all of them are compared with the previous month so that you can understand how the data fluctuated. In addition, it shows a detailed summary of the number of users, customers, SQLs, and MQLs per month to visualize the whole picture and extract relevant insights or trends for your marketing reports.

The CMO dashboard is perfect for c-level management as it can help them monitor the strategic outcome of their marketing efforts and make data-driven decisions that can benefit the company exponentially.

12. Be careful with the interpretation

We already dedicated an entire post to data interpretation as it is a fundamental part of the process of data analysis. It gives meaning to the analytical information and aims to draw a concise conclusion from the analysis results. Since most of the time companies are dealing with data from many different sources, the interpretation stage needs to be done carefully and properly in order to avoid misinterpretations.

To help you through the process, here we list three common practices that you need to avoid at all costs when looking at your data:

  • Correlation vs. causation: The human brain is wired to find patterns. This behavior leads to one of the most common mistakes when performing interpretation: confusing correlation with causation. Although these two aspects can exist simultaneously, it is not correct to assume that because two things happened together, one provoked the other. A piece of advice to avoid falling into this mistake is never to trust intuition alone; trust the data. If there is no objective evidence of causation, then always stick to correlation.
  • Confirmation bias: This phenomenon describes the tendency to select and interpret only the data necessary to prove one hypothesis, often ignoring the elements that might disprove it. Even if it's not done on purpose, confirmation bias can represent a real problem, as excluding relevant information can lead to false conclusions and, therefore, bad business decisions. To avoid it, always try to disprove your hypothesis instead of proving it, share your analysis with other team members, and avoid drawing any conclusions before the entire analytical project is finalized.
  • Statistical significance: In short, statistical significance helps analysts understand if a result is actually accurate or if it happened because of a sampling error or pure chance. The level of statistical significance needed might depend on the sample size and the industry being analyzed. In any case, ignoring the significance of a result when it might influence decision-making can be a huge mistake.
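To illustrate the last point, the sketch below runs a two-proportion z-test, one common way to check whether a difference between two conversion rates could be due to chance. The campaign numbers are hypothetical, the helper name is ours, and the 0.05 threshold mentioned afterward is a widespread convention rather than a rule.

```python
from math import erf, sqrt

def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return 2 * (1 - cdf)

# Hypothetical campaigns: 12% vs. 8% conversion at two different sample sizes.
small = two_proportion_p_value(12, 100, 8, 100)
large = two_proportion_p_value(1200, 10_000, 800, 10_000)

print(f"100 visitors per group:    p-value = {small:.3f}")
print(f"10,000 visitors per group: p-value = {large:.3g}")
```

The observed rates are identical in both cases, yet only the larger sample yields a p-value below 0.05. With 100 visitors per group, the same gap could easily be a product of chance, which is why sample size matters when judging a result.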

13. Build a narrative

Now, we’re going to look at how you can bring all of these elements together in a way that will benefit your business - starting with a little something called data storytelling.

The human brain responds incredibly well to strong stories or narratives. Once you’ve cleansed, shaped, and visualized your most invaluable data using various BI dashboard tools, you should strive to tell a story - one with a clear-cut beginning, middle, and end.

By doing so, you will make your analytical efforts more accessible, digestible, and universal, empowering more people within your organization to use your discoveries to their actionable advantage.

14. Consider autonomous technology

Autonomous technologies, such as artificial intelligence (AI) and machine learning (ML), play a significant role in the advancement of understanding how to analyze data more effectively.

Gartner predicts that by the end of this year, 80% of emerging technologies will be developed with AI foundations. This is a testament to the ever-growing power and value of autonomous technologies.

At the moment, these technologies are revolutionizing the analysis industry. Some examples that we mentioned earlier are neural networks, intelligent alarms, and sentiment analysis.

15. Share the load

If you work with the right tools and dashboards, you will be able to present your metrics in a digestible, value-driven format, allowing almost everyone in the organization to connect with and use relevant data to their advantage.

Modern dashboards consolidate data from various sources, providing access to a wealth of insights in one centralized location, no matter if you need to monitor recruitment metrics or generate reports that need to be sent across numerous departments. Moreover, these cutting-edge tools offer access to dashboards from a multitude of devices, meaning that everyone within the business can connect with practical insights remotely - and share the load.

Once everyone is able to work with a data-driven mindset, you will catalyze the success of your business in ways you never thought possible. And when it comes to knowing how to analyze data, this kind of collaborative approach is essential.

16. Data analysis tools

In order to perform high-quality analysis of data, it is fundamental to use tools and software that will ensure the best results. Here we leave you a small summary of four fundamental categories of data analysis tools for your organization.

  • Business Intelligence: BI tools allow you to process significant amounts of data from several sources in any format. Through this, you can not only analyze and monitor your data to extract relevant insights but also create interactive reports and dashboards to visualize your KPIs and use them for your company's good. datapine is an amazing online BI software that is focused on delivering powerful online analysis features that are accessible to beginner and advanced users. Like this, it offers a full-service solution that includes cutting-edge analysis of data, KPIs visualization, live dashboards, reporting, and artificial intelligence technologies to predict trends and minimize risk.
  • Statistical analysis: These tools are usually designed for scientists, statisticians, market researchers, and mathematicians, as they allow them to perform complex statistical analyses with methods like regression analysis, predictive analysis, and statistical modeling. A good tool to perform this type of analysis is RStudio as it offers powerful data modeling and hypothesis testing features that can cover both academic and general data analysis. This tool is one of the industry favorites, due to its capability for data cleaning, data reduction, and performing advanced analysis with several statistical methods. Another relevant tool to mention is SPSS from IBM. The software offers advanced statistical analysis for users of all skill levels. Thanks to a vast library of machine learning algorithms, text analysis, and a hypothesis testing approach, it can help your company find relevant insights to drive better decisions. SPSS also works as a cloud service that enables you to run it anywhere.
  • SQL Consoles: SQL is a programming language often used to handle structured data in relational databases. Tools like these are popular among data scientists as they are extremely effective in unlocking these databases' value. Undoubtedly, one of the most widely used SQL tools on the market is MySQL Workbench. This tool offers several features such as a visual tool for database modeling and monitoring, complete SQL optimization, administration tools, and visual performance dashboards to keep track of KPIs.
  • Data Visualization: These tools are used to represent your data through charts, graphs, and maps that allow you to find patterns and trends in the data. datapine's already mentioned BI platform also offers a wealth of powerful online data visualization tools with several benefits. Some of them include: delivering compelling data-driven presentations to share with your entire company, the ability to see your data online with any device wherever you are, an interactive dashboard design feature that enables you to showcase your results in an interactive and understandable way, and to perform online self-service reports that can be used simultaneously with several other people to enhance team productivity.

17. Refine your process constantly 

Last is a step that might seem obvious to some people, but it can be easily ignored if you think you are done. Once you have extracted the needed results, you should always take a retrospective look at your project and think about what you can improve. As you saw throughout this long list of techniques, data analysis is a complex process that requires constant refinement. For this reason, you should always go one step further and keep improving. 

Quality Criteria For Data Analysis

So far we’ve covered a list of methods and techniques that should help you perform efficient data analysis. But how do you measure the quality and validity of your results? This is done with the help of some science quality criteria. Here we will go into a more theoretical area that is critical to understanding the fundamentals of statistical analysis in science. However, you should also be aware of these steps in a business context, as they will allow you to assess the quality of your results in the correct way. Let’s dig in. 

  • Internal validity: The results of a survey are internally valid if they measure what they are supposed to measure and thus provide credible results. In other words, internal validity measures the trustworthiness of the results and how they can be affected by factors such as the research design, operational definitions, how the variables are measured, and more. For instance, imagine you are doing an interview to ask people if they brush their teeth two times a day. While most of them will answer yes, their answers may simply reflect what is socially acceptable, which is to brush your teeth at least twice a day. In this case, you can’t be sure whether respondents actually brush their teeth twice a day or just say that they do. Therefore, the internal validity of this interview is very low. 
  • External validity: Essentially, external validity refers to the extent to which the results of your research can be applied to a broader context. It basically aims to prove that the findings of a study can be applied in the real world. If the research can be applied to other settings, individuals, and times, then the external validity is high. 
  • Reliability : If your research is reliable, it means that it can be reproduced. If your measurement were repeated under the same conditions, it would produce similar results. This means that your measuring instrument consistently produces reliable results. For example, imagine a doctor building a symptoms questionnaire to detect a specific disease in a patient. Then, various other doctors use this questionnaire but end up diagnosing the same patient with a different condition. This means the questionnaire is not reliable in detecting the initial disease. Another important note here is that in order for your research to be reliable, it also needs to be objective. If the results of a study are the same, independent of who assesses them or interprets them, the study can be considered reliable. Let’s see the objectivity criteria in more detail now. 
  • Objectivity: In data science, objectivity means that researchers need to stay fully objective when it comes to their analysis. The results of a study need to be determined by objective criteria and not by the beliefs, personality, or values of the researcher. Objectivity needs to be ensured when you are gathering the data. For example, when interviewing individuals, the questions need to be asked in a way that doesn't influence the results. Objectivity also matters when interpreting the data. If different researchers reach the same conclusions, then the study is objective. For this last point, you can set predefined criteria to interpret the results to ensure all researchers follow the same steps. 

The discussed quality criteria cover mostly potential influences in a quantitative context. Analysis in qualitative research has by default additional subjective influences that must be controlled in a different way. Therefore, there are other quality criteria for this kind of research such as credibility, transferability, dependability, and confirmability. You can see each of them more in detail on this resource . 

Data Analysis Limitations & Barriers

Analyzing data is not an easy task. As you’ve seen throughout this post, there are many steps and techniques that you need to apply in order to extract useful information from your research. While a well-performed analysis can bring various benefits to your organization it doesn't come without limitations. In this section, we will discuss some of the main barriers you might encounter when conducting an analysis. Let’s see them more in detail. 

  • Lack of clear goals: No matter how good your data or analysis might be if you don’t have clear goals or a hypothesis the process might be worthless. While we mentioned some methods that don’t require a predefined hypothesis, it is always better to enter the analytical process with some clear guidelines of what you are expecting to get out of it, especially in a business context in which data is utilized to support important strategic decisions. 
  • Objectivity: Arguably one of the biggest barriers when it comes to data analysis in research is to stay objective. When trying to prove a hypothesis, researchers might find themselves, intentionally or unintentionally, directing the results toward an outcome that they want. To avoid this, always question your assumptions and avoid confusing facts with opinions. You can also show your findings to a research partner or external person to confirm that your results are objective. 
  • Data representation: A fundamental part of the analytical procedure is the way you represent your data. You can use various graphs and charts to represent your findings, but not all of them will work for all purposes. Choosing the wrong visual can not only damage your analysis but also mislead your audience. Therefore, it is important to understand when to use each type of visual depending on your analytical goals. Our complete guide on the types of graphs and charts lists 20 different visuals with examples of when to use them. 
  • Flawed correlation: Misleading statistics can significantly damage your research. We’ve already pointed out a few interpretation issues previously in the post, but it is an important barrier that we can't avoid addressing here as well. Flawed correlations occur when two variables appear related to each other but are not. Confusing correlation with causation can lead to a wrong interpretation of results, which in turn can lead to wrong strategies and a loss of resources. Therefore, it is very important to identify the different interpretation mistakes and avoid them. 
  • Sample size: A very common barrier to a reliable and efficient analysis process is the sample size. In order for the results to be trustworthy, the sample should be large enough to be representative of what you are analyzing. For example, imagine you have a company of 1000 employees and you ask the question “do you like working here?” to 50 employees, of which 49 say yes, which means 98%. Now, imagine you ask the same question to all 1000 employees and 980 say yes, which also means 98%. Saying that 98% of employees like working in the company when the sample size was only 50 is a much less trustworthy conclusion, because a handful of different answers would shift the percentage noticeably. The larger the sample, the more precise the estimate.
  • Privacy concerns: In some cases, data collection can be subjected to privacy regulations. Businesses gather all kinds of information from their customers from purchasing behaviors to addresses and phone numbers. If this falls into the wrong hands due to a breach, it can affect the security and confidentiality of your clients. To avoid this issue, you need to collect only the data that is needed for your research and, if you are using sensitive facts, make it anonymous so customers are protected. The misuse of customer data can severely damage a business's reputation, so it is important to keep an eye on privacy. 
  • Lack of communication between teams : When it comes to performing data analysis on a business level, it is very likely that each department and team will have different goals and strategies. However, they are all working for the same common goal of helping the business run smoothly and keep growing. When teams are not connected and communicating with each other, it can directly affect the way general strategies are built. To avoid these issues, tools such as data dashboards enable teams to stay connected through data in a visually appealing way. 
  • Innumeracy : Businesses are working with data more and more every day. While there are many BI tools available to perform effective analysis, data literacy is still a constant barrier. Not all employees know how to apply analysis techniques or extract insights from them. To prevent this from happening, you can implement different training opportunities that will prepare every relevant user to deal with data. 
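The sample size point above can be made concrete with a confidence interval. The following is a minimal sketch, assuming both groups are treated as random samples from a larger population and using the Wilson score interval (a standard textbook method, not one named in this post). The counts are illustrative only.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95% confidence)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Hypothetical counts: the same observed "yes" share from a small and a large sample.
small = wilson_interval(49, 50)
large = wilson_interval(980, 1000)
print(f"n=50:   {small[0]:.3f} to {small[1]:.3f}")
print(f"n=1000: {large[0]:.3f} to {large[1]:.3f}")
```

The interval for the small sample is several times wider, which is the practical meaning of "less trustworthy" here.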

Key Data Analysis Skills

As you've learned throughout this lengthy guide, analyzing data is a complex task that requires a lot of knowledge and skills. That said, thanks to the rise of self-service tools the process is way more accessible and agile than it once was. Regardless, there are still some key skills that are valuable to have when working with data, we list the most important ones below.

  • Critical and statistical thinking: To successfully analyze data you need to be creative and think outside the box. Yes, that might sound like a weird statement considering that data is often tied to facts. However, a great level of critical thinking is required to uncover connections, come up with a valuable hypothesis, and extract conclusions that go a step beyond the surface. This, of course, needs to be complemented by statistical thinking and an understanding of numbers. 
  • Data cleaning: Anyone who has ever worked with data will tell you that cleaning and preparation take up a large share of a data analyst's work (a figure of 80% is often quoted), so the skill is fundamental. Beyond that, not cleaning the data adequately can significantly damage the analysis, which can lead to poor decision-making in a business scenario. While there are multiple tools that automate the cleaning process and reduce the possibility of human error, it is still a valuable skill to master. 
  • Data visualization: Visuals make the information easier to understand and analyze, not only for professional users but especially for non-technical ones. Having the necessary skills to not only choose the right chart type but know when to apply it correctly is key. This also means being able to design visually compelling charts that make the data exploration process more efficient. 
  • SQL: The Structured Query Language or SQL is a programming language used to communicate with databases. It is fundamental knowledge as it enables you to update, manipulate, and organize data from relational databases which are the most common databases used by companies. It is fairly easy to learn and one of the most valuable skills when it comes to data analysis. 
  • Communication skills: This is a skill that is especially valuable in a business environment. Being able to clearly communicate analytical outcomes to colleagues is incredibly important, especially when the information you are trying to convey is complex for non-technical people. This applies to in-person communication as well as written format, for example, when generating a dashboard or report. While this might be considered a “soft” skill compared to the other ones we mentioned, it should not be ignored as you most likely will need to share analytical findings with others no matter the context. 
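To give a feel for the SQL skill mentioned above, here is a small sketch that runs a few SQL statements from Python through the standard-library sqlite3 module. The table name, column names, and values are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 20.0), ("ana", 35.0), ("ben", 15.0)],
)
# Update one customer's rows, then aggregate per customer.
conn.execute("UPDATE orders SET amount = amount + 5 WHERE customer = 'ben'")
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()
print(totals)
```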

Data Analysis In The Big Data Environment

Big data is invaluable to today’s businesses, and by using different methods for data analysis, it’s possible to view your data in a way that can help you turn insight into positive action.

To inspire your efforts and put the importance of big data into context, here are some insights that you should know:

  • By 2026 the industry of big data is expected to be worth approximately $273.4 billion.
  • 94% of enterprises say that analyzing data is important for their growth and digital transformation. 
  • Companies that exploit the full potential of their data can increase their operating margins by 60% .
  • We already told you the benefits of Artificial Intelligence through this article. This industry's financial impact is expected to grow up to $40 billion by 2025.

Data analysis concepts may come in many forms, but fundamentally, any solid methodology will help to make your business more streamlined, cohesive, insightful, and successful than ever before.

Key Takeaways From Data Analysis 

As we reach the end of our data analysis journey, we leave a small summary of the main methods and techniques to perform excellent analysis and grow your business.

17 Essential Types of Data Analysis Methods:

  • Cluster analysis
  • Cohort analysis
  • Regression analysis
  • Factor analysis
  • Neural Networks
  • Data Mining
  • Text analysis
  • Time series analysis
  • Decision trees
  • Conjoint analysis 
  • Correspondence Analysis
  • Multidimensional Scaling 
  • Content analysis 
  • Thematic analysis
  • Narrative analysis 
  • Grounded theory analysis
  • Discourse analysis 

Top 17 Data Analysis Techniques:

  • Collaborate your needs
  • Establish your questions
  • Data democratization
  • Think of data governance 
  • Clean your data
  • Set your KPIs
  • Omit useless data
  • Build a data management roadmap
  • Integrate technology
  • Answer your questions
  • Visualize your data
  • Interpretation of data
  • Consider autonomous technology
  • Build a narrative
  • Share the load
  • Data Analysis tools
  • Refine your process constantly 

We’ve pondered the data analysis definition and drilled down into the practical applications of data-centric analytics, and one thing is clear: by taking measures to arrange your data and making your metrics work for you, it’s possible to transform raw information into action, the kind that will push your business to the next level.

Yes, good data analytics techniques result in enhanced business intelligence (BI). To help you understand this notion in more detail, read our exploration of business intelligence reporting .


  • Indian J Anaesth
  • v.60(9); 2016 Sep

Basic statistical tools in research and data analysis

Zulfiqar Ali

Department of Anaesthesiology, Division of Neuroanaesthesiology, Sheri Kashmir Institute of Medical Sciences, Soura, Srinagar, Jammu and Kashmir, India

S Bala Bhaskar

1 Department of Anaesthesiology and Critical Care, Vijayanagar Institute of Medical Sciences, Bellary, Karnataka, India

Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

INTRODUCTION

Statistics is a branch of science that deals with the collection, organisation, analysis of data and drawing of inferences from the samples to the whole population.[ 1 ] This requires a proper design of the study, an appropriate selection of the study sample and choice of a suitable statistical test. An adequate knowledge of statistics is necessary for proper designing of an epidemiological study or a clinical trial. Improper statistical methods may result in erroneous conclusions which may lead to unethical practice.[ 2 ]

A variable is a characteristic that varies from one individual member of a population to another.[ 3 ] Variables such as height and weight are measured by some type of scale, convey quantitative information and are called quantitative variables. Sex and eye colour give qualitative information and are called qualitative variables[ 3 ] [ Figure 1 ].

Figure 1: Classification of variables

Quantitative variables

Quantitative or numerical data are subdivided into discrete and continuous measurements. Discrete numerical data are recorded as a whole number such as 0, 1, 2, 3,… (integer), whereas continuous data can assume any value. Observations that can be counted constitute the discrete data and observations that can be measured constitute the continuous data. Examples of discrete data are number of episodes of respiratory arrests or the number of re-intubations in an intensive care unit. Similarly, examples of continuous data are the serial serum glucose levels, partial pressure of oxygen in arterial blood and the oesophageal temperature.

A hierarchical scale of increasing precision can be used for observing and recording the data which is based on categorical, ordinal, interval and ratio scales [ Figure 1 ].

Categorical or nominal variables are unordered. The data are merely classified into categories and cannot be arranged in any particular order. If only two categories exist (as in gender: male and female), the data are called dichotomous (or binary). The various causes of re-intubation in an intensive care unit, such as upper airway obstruction, impaired clearance of secretions, hypoxemia, hypercapnia, pulmonary oedema and neurological impairment, are examples of categorical variables.

Ordinal variables have a clear ordering between the categories. However, the ordered data may not have equal intervals. Examples are the American Society of Anesthesiologists status or Richmond agitation-sedation scale.

Interval variables are similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. A good example of an interval scale is the Fahrenheit degree scale used to measure temperature. With the Fahrenheit scale, the difference between 70° and 75° is equal to the difference between 80° and 85°: The units of measurement are equal throughout the full range of the scale.

Ratio scales are similar to interval scales, in that equal differences between scale values have equal quantitative meaning. However, ratio scales also have a true zero point, which gives them an additional property: ratios of values are meaningful. For example, length measured in centimetres is on a ratio scale. There is a true zero point and the value of 0 cm means a complete absence of length. The thyromental distance of 6 cm in an adult may be twice that of a child in whom it may be 3 cm.

STATISTICS: DESCRIPTIVE AND INFERENTIAL STATISTICS

Descriptive statistics[ 4 ] try to describe the relationship between variables in a sample or population. Descriptive statistics provide a summary of data in the form of mean, median and mode. Inferential statistics[ 4 ] use a random sample of data taken from a population to describe and make inferences about the whole population. They are valuable when it is not possible to examine each member of an entire population. Examples of descriptive and inferential statistics are illustrated in Table 1 .

Table 1: Example of descriptive and inferential statistics

Descriptive statistics

The extent to which the observations cluster around a central location is described by the central tendency and the spread towards the extremes is described by the degree of dispersion.

Measures of central tendency

The measures of central tendency are mean, median and mode.[ 6 ] Mean (or the arithmetic average) is the sum of all the scores divided by the number of scores. The mean may be influenced profoundly by extreme values. For example, the average stay of organophosphorus poisoning patients in ICU may be influenced by a single patient who stays in ICU for around 5 months because of septicaemia. The extreme values are called outliers. The formula for the mean is

$$ \bar{x} = \frac{\sum x}{n} $$

where x = each observation and n = number of observations. Median[ 6 ] is defined as the middle of a distribution in ranked data (with half of the values in the sample above and half below the median value), while mode is the most frequently occurring value in a distribution.

Range defines the spread, or variability, of a sample.[ 7 ] It is described by the minimum and maximum values of the variable. If we rank the data and then group the observations into percentiles, we get better information about the pattern of spread. In percentiles, we divide the ranked observations into 100 equal parts. We can then describe the 25th, 50th, 75th or any other percentile. The median is the 50th percentile. The interquartile range covers the middle 50% of the observations about the median (25th to 75th percentile).

Variance[ 7 ] is a measure of how spread out the distribution is. It indicates how closely the individual observations cluster about the mean value. The variance of a population is defined by the following formula:

$$ \sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N} $$

where $\sigma^2$ is the population variance, $\bar{X}$ is the population mean, $X_i$ is the i-th element from the population and N is the number of elements in the population. The variance of a sample is defined by a slightly different formula:

$$ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} $$

where $s^2$ is the sample variance, $\bar{x}$ is the sample mean, $x_i$ is the i-th element from the sample and n is the number of elements in the sample. The formula for the variance of a population has N as the denominator, whereas the sample formula uses n − 1. The expression n − 1 is known as the degrees of freedom and is one less than the number of observations. Once the sample mean is fixed, each observation is free to vary except the last one, which must take a defined value. The variance is measured in squared units. To make the interpretation of the data simple and to retain the basic unit of observation, the square root of variance is used. The square root of the variance is the standard deviation (SD).[ 8 ] The SD of a population is defined by the following formula:

$$ \sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}} $$

where $\sigma$ is the population SD, $\bar{X}$ is the population mean, $X_i$ is the i-th element from the population and N is the number of elements in the population. The SD of a sample is defined by a slightly different formula:

$$ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} $$

where s is the sample SD, $\bar{x}$ is the sample mean, $x_i$ is the i-th element from the sample and n is the number of elements in the sample. An example of the calculation of variance and SD is illustrated in Table 2 .

Table 2: Example of mean, variance, standard deviation
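The descriptive measures defined above can be computed with Python's standard statistics module. This is a minimal sketch on hypothetical length-of-stay values (not the data from Table 2), with one deliberate outlier to show how the mean and the median react differently.

```python
import statistics

# Hypothetical ICU length-of-stay values in days; the last one is an outlier.
stay_days = [3, 4, 4, 5, 6, 7, 150]

mean = statistics.mean(stay_days)        # pulled upwards by the outlier
median = statistics.median(stay_days)    # middle value of the ranked data
mode = statistics.mode(stay_days)        # most frequent value
value_range = (min(stay_days), max(stay_days))

# Quartile cut points: 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(stay_days, n=4)

pop_var = statistics.pvariance(stay_days)   # denominator N
samp_var = statistics.variance(stay_days)   # denominator n - 1
pop_sd = statistics.pstdev(stay_days)       # square root of pop_var
samp_sd = statistics.stdev(stay_days)       # square root of samp_var

print(f"mean={mean:.2f} median={median} mode={mode} range={value_range}")
print(f"quartiles={q1}, {q2}, {q3}")
print(f"population variance={pop_var:.2f}, sample variance={samp_var:.2f}")
```

Note that `statistics.quantiles` uses one of several common percentile conventions by default, so other software may report slightly different quartiles for small samples.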

Normal distribution or Gaussian distribution

Most biological variables cluster around a central value, with symmetrical positive and negative deviations about this point.[ 1 ] The standard normal distribution curve is symmetrical and bell-shaped. In a normal distribution curve, about 68% of the scores are within 1 SD of the mean, around 95% are within 2 SDs of the mean and about 99.7% are within 3 SDs of the mean [ Figure 2 ].

Figure 2: Normal distribution curve
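The shares of a normal distribution that lie within 1, 2 and 3 SDs of the mean can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is introduced here only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k: float) -> float:
    """Probability that a normally distributed value lies within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
# prints 0.6827, 0.9545 and 0.9973
```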

Skewed distribution

A skewed distribution is one in which the values are distributed asymmetrically about the mean. In a negatively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the right of the figure, leading to a longer left tail. In a positively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the left of the figure, leading to a longer right tail.

Figure 3: Curves showing negatively skewed and positively skewed distribution
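The direction of skew can be summarised with a single number. The following sketch computes the moment-based skewness coefficient (the mean of the cubed standardised deviations), a common textbook definition that this article does not itself give. The data are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: positive means a longer right tail, negative a longer left tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]    # hypothetical data with a long right tail
left_skewed = [-v for v in right_skewed]    # mirror image with a long left tail

print(f"positively skewed: {skewness(right_skewed):.2f}")
print(f"negatively skewed: {skewness(left_skewed):.2f}")
```

In the positively skewed sample the mean (4.75) lies above the median (3), which is the usual pattern when a long right tail pulls the mean upwards.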

Inferential statistics

In inferential statistics, data from a sample are analysed to make inferences about the larger population. The purpose is to answer or test hypotheses. A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. Hypothesis tests are thus procedures for making rational decisions about the reality of observed effects.

Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty).

In inferential statistics, the term ‘null hypothesis’ (H0, read ‘H-naught’ or ‘H-null’) denotes that there is no relationship (difference) between the population variables in question.[ 9 ]

The alternative hypothesis (H1 or Ha) denotes that a relationship (difference) between the variables is expected to exist.[ 9 ]

The P value (or the calculated probability) is the probability of obtaining a result at least as extreme as the one observed, by chance alone, if the null hypothesis is true. The P value is a number between 0 and 1 and is interpreted by researchers in deciding whether to reject or retain the null hypothesis [ Table 3 ].

Table 3: P values with interpretation

If the P value is less than the arbitrarily chosen value (known as α or the significance level), the null hypothesis (H0) is rejected [ Table 4 ]. However, if the null hypothesis (H0) is incorrectly rejected, this is known as a Type I error.[ 11 ] Further details regarding alpha error, beta error and sample size calculation and factors influencing them are dealt with in another section of this issue by Das S et al .[ 12 ]

Table 4: Illustration for null hypothesis
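One way to see what a P value is, and how it is compared with α, is an exact permutation test. This is only an illustrative sketch: the permutation approach is not discussed in this article, and the two small groups below are hypothetical. Under H0 the group labels are interchangeable, so the P value is the share of all possible relabellings whose difference in means is at least as large as the observed one.

```python
from itertools import combinations
from statistics import fmean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation P value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(fmean(group_a) - fmean(group_b))
    n_a = len(group_a)
    extreme = total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(fmean(a) - fmean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

alpha = 0.05                                     # conventional, arbitrarily chosen level
p = permutation_p_value([1, 2, 3], [4, 5, 6])    # hypothetical measurements
decision = "reject H0" if p < alpha else "retain H0"
print(p, decision)
```

With only three observations per group, just 2 of the 20 relabellings are as extreme as the observed split, so P = 0.1 and H0 is retained at α = 0.05 even though the groups do not overlap. This also illustrates why very small samples rarely reach significance.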

PARAMETRIC AND NON-PARAMETRIC TESTS

Numerical data (quantitative variables) that are normally distributed are analysed with parametric tests.[ 13 ]

Two most basic prerequisites for parametric statistical analysis are:

  • The assumption of normality which specifies that the means of the sample group are normally distributed
  • The assumption of equal variance which specifies that the variances of the samples and of their corresponding population are equal.

However, if the distribution of the sample is skewed towards one side or the distribution is unknown due to the small sample size, non-parametric[ 14 ] statistical techniques are used. Non-parametric tests are used to analyse ordinal and categorical data.

Parametric tests

The parametric tests assume that the data are on a quantitative (numerical) scale, with a normal distribution of the underlying population. The samples have the same variance (homogeneity of variances). The samples are randomly drawn from the population, and the observations within a group are independent of each other. The commonly used parametric tests are the Student's t-test, analysis of variance (ANOVA) and repeated measures ANOVA.

Student's t-test

Student's t-test is used to test the null hypothesis that there is no difference between the means of the two groups. It is used in three circumstances:

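  • To test if a sample mean (as an estimate of a population mean) differs significantly from a given population mean (the one-sample t-test).

The formula for the one-sample t-test is:

$$ t = \frac{\bar{X} - \mu}{SE} $$

where $\bar{X}$ = sample mean, $\mu$ = population mean and SE = standard error of the mean.

  • To test if the population means estimated by two independent samples differ significantly (the unpaired t-test).

The formula for the unpaired t-test is:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{SE} $$

where $\bar{X}_1 - \bar{X}_2$ is the difference between the means of the two groups and SE denotes the standard error of the difference.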

  • To test if the population means estimated by two dependent samples differ significantly (the paired t-test). A usual setting for the paired t-test is when measurements are made on the same subjects before and after a treatment.

The formula for the paired t-test is:

$$ t = \frac{\bar{d}}{SE} $$

where $\bar{d}$ is the mean difference and SE denotes the standard error of this difference.
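The three t statistics described above can be sketched in a few lines of Python. The standard error formulas used here are assumptions taken from the usual textbook definitions (SE = s/√n for one sample, and a pooled-variance SE for two independent samples with equal variances), since the article does not spell them out. The function names and data are hypothetical, and the sketch stops at the t statistic rather than looking up a P value.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """t = (sample mean - mu) / SE, with SE = s / sqrt(n)."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - mu) / se

def unpaired_t(a, b):
    """Two independent samples, using a pooled variance (assumes equal variances)."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

def paired_t(before, after):
    """Paired samples: a one-sample t-test on the within-subject differences."""
    diffs = [y - x for x, y in zip(before, after)]
    return one_sample_t(diffs, 0.0)

# Hypothetical systolic pressures for the same four subjects before and after treatment.
before = [120, 130, 125, 140]
after = [115, 128, 120, 133]
print(f"paired t = {paired_t(before, after):.3f}")
```

The resulting t value would then be compared with the t distribution for the appropriate degrees of freedom (n − 1 for the one-sample and paired tests, n1 + n2 − 2 for the pooled unpaired test).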

The group variances can be compared using the F-test. The F statistic is the ratio of variances (var 1/var 2). If F differs significantly from 1.0, then it is concluded that the group variances differ significantly.

Analysis of variance

The Student's t-test cannot be used for comparison of three or more groups. The purpose of ANOVA is to test if there is any significant difference between the means of two or more groups.

In ANOVA, we study two variances – (a) between-group variability and (b) within-group variability. The within-group variability (error variance) is the variation that cannot be accounted for in the study design. It is based on random differences present in our samples.

However, the between-group variability (or effect variance) is the result of our treatment. These two estimates of variances are compared using the F-test.

A simplified formula for the F statistic is:

$$ F = \frac{MS_b}{MS_w} $$

where $MS_b$ is the mean squares between the groups and $MS_w$ is the mean squares within groups.
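The F statistic above can be computed directly from its definition. The sketch below assumes the standard one-way ANOVA decomposition (between-group sum of squares divided by k − 1, within-group sum of squares divided by N − k); the function name and the three small groups are hypothetical.

```python
from statistics import fmean

def one_way_anova_f(*groups):
    """F = MS_b / MS_w for a one-way ANOVA over two or more groups."""
    k = len(groups)
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = fmean(all_values)
    # Between-group variability: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: how far each value is from its own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

f_stat = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F = {f_stat:.2f}")  # prints F = 13.00
```

Here the between-group sum of squares is 26 (2 degrees of freedom) and the within-group sum of squares is 6 (6 degrees of freedom), giving F = 13 / 1 = 13.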

Repeated measures analysis of variance

As with ANOVA, repeated measures ANOVA analyses the equality of means of three or more groups. However, a repeated measures ANOVA is used when all members of a sample are measured under different conditions or at different points in time.

As the variables are measured from a sample at different points of time, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: The data violate the ANOVA assumption of independence. Hence, in the measurement of repeated dependent variables, repeated measures ANOVA should be used.

Non-parametric tests

When the assumptions of normality are not met and the sample means are not normally distributed, parametric tests can lead to erroneous results. Non-parametric tests (distribution-free tests) are used in such situations as they do not require the normality assumption.[ 15 ] Non-parametric tests may fail to detect a significant difference when compared with a parametric test. That is, they usually have less power.

As is done for the parametric tests, the test statistic is compared with known values for the sampling distribution of that statistic and the null hypothesis is accepted or rejected. The types of non-parametric analysis techniques and the corresponding parametric analysis techniques are delineated in Table 5 .

Analogue of parametric and non-parametric tests


Median test for one sample: The sign test and Wilcoxon's signed rank test

The sign test and Wilcoxon's signed rank test are used for median tests of one sample. These tests examine whether one instance of sample data is greater or smaller than the median reference value.

This test examines the hypothesis about the median θ0 of a population. It tests the null hypothesis H0: θ = θ0. When the observed value (Xi) is greater than the reference value (θ0), it is marked as +. If the observed value is smaller than the reference value, it is marked as −. If the observed value is equal to the reference value (θ0), it is eliminated from the sample.

If the null hypothesis is true, the numbers of + signs and − signs are expected to be approximately equal.

The sign test ignores the actual values of the data and only uses + or − signs. Therefore, it is useful when it is difficult to measure the values.
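The marking procedure above can be sketched in a few lines. The sample values and reference median are invented for illustration. The p-value shown is the exact two-sided binomial probability under the assumption that + and − are equally likely, which is one common way to complete the test.

```python
from math import comb

def sign_test(values, theta0):
    """Count + and - signs relative to theta0 and return a two-sided exact p-value."""
    plus = sum(1 for x in values if x > theta0)
    minus = sum(1 for x in values if x < theta0)
    n = plus + minus  # observations equal to theta0 are dropped
    k = min(plus, minus)
    # Probability of k or fewer of the rarer sign when P(+) = P(-) = 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

# Hypothetical observations tested against a reference median of 10.
values = [12, 15, 9, 20, 18, 10, 17, 16, 14, 19]
plus, minus, p_value = sign_test(values, theta0=10)
print(plus, minus, round(p_value, 4))
```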

Wilcoxon's signed rank test

There is a major limitation of sign test as we lose the quantitative information of the given data and merely use the + or – signs. Wilcoxon's signed rank test not only examines the observed values in comparison with θ0 but also takes into consideration the relative sizes, adding more statistical power to the test. As in the sign test, if there is an observed value that is equal to the reference value θ0, this observed value is eliminated from the sample.

Wilcoxon's rank sum test ranks all data points in order, calculates the rank sum of each sample and compares the difference in the rank sums.
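A minimal sketch of the signed rank idea, assuming a one-sample setting with a reference value θ0: absolute differences are ranked (tied values share an average rank) and the ranks are summed separately for positive and negative differences. The function name and data are hypothetical, and the sketch returns only the two rank sums, not a p-value.

```python
def signed_rank_sums(values, theta0):
    """Return (W+, W-): rank sums of positive and negative differences from theta0."""
    # Drop observations equal to the reference value, as in the sign test.
    diffs = [x - theta0 for x in values if x != theta0]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical sample compared with a reference median of 10.
w_plus, w_minus = signed_rank_sums([12, 7, 15, 9, 14, 10], theta0=10)
print(w_plus, w_minus)
```

Unlike the sign test, a large positive difference contributes a larger rank than a small one, which is where the extra power comes from.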

Mann-Whitney test

The Mann-Whitney test is used to test the null hypothesis that two samples have the same median or, alternatively, whether observations in one sample tend to be larger than observations in the other.

Mann–Whitney test compares all data (xi) belonging to the X group and all data (yi) belonging to the Y group and calculates the probability of xi being greater than yi: P (xi > yi). The null hypothesis states that P (xi > yi) = P (xi < yi) =1/2 while the alternative hypothesis states that P (xi > yi) ≠1/2.
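The pairwise comparison described above can be written directly. This sketch counts the pairs in which xi exceeds yi (ties counted as one half, a common convention assumed here) and divides by the number of pairs to estimate P (xi > yi). The groups are illustrative and no p-value is computed.

```python
def mann_whitney_u(x, y):
    """Return (U for group X, estimated P(x > y)) by comparing every pair."""
    greater = sum(1 for xi in x for yi in y if xi > yi)
    ties = sum(1 for xi in x for yi in y if xi == yi)
    u_x = greater + 0.5 * ties
    return u_x, u_x / (len(x) * len(y))

# Hypothetical measurements from two independent groups.
x = [7, 9, 11, 13]
y = [6, 8, 10]
u_x, prob_x_greater = mann_whitney_u(x, y)
print(u_x, prob_x_greater)
```

Under the null hypothesis the estimated probability should be close to 1/2.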

Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov (KS) test was designed as a generic method to test whether two random samples are drawn from the same distribution. The null hypothesis of the KS test is that both distributions are identical. The statistic of the KS test is a distance between the two empirical distributions, computed as the maximum absolute difference between their cumulative curves.
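The KS statistic itself is simple to sketch: build the empirical cumulative distribution of each sample and take the largest absolute gap between them. The samples and function name below are assumptions for illustration, and the sketch returns only the distance, not its significance.

```python
def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    def ecdf(sample, t):
        # Fraction of observations less than or equal to t.
        return sum(1 for v in sample if v <= t) / len(sample)

    # The gap can only change at observed values, so checking those is enough.
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, t) - ecdf(sample_b, t)) for t in points)

# Hypothetical samples: the second is shifted to the right.
d_stat = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(d_stat)
```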

Kruskal-Wallis test

The Kruskal–Wallis test is a non-parametric test to analyse the variance.[ 14 ] It analyses if there is any difference in the median values of three or more independent samples. The data values are ranked in an increasing order, and the rank sums calculated followed by calculation of the test statistic.
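The ranking and rank-sum steps can be sketched as below, using the standard form H = 12 / (N (N + 1)) · Σ R_i² / n_i − 3 (N + 1). The groups are made up, the helper names are hypothetical, and the sketch omits the correction that is usually applied when many values are tied.

```python
def average_ranks(values):
    """Rank values in increasing order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic without tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        # Rank sum of this group within the pooled ranking.
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical, fully separated groups.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h_stat, 2))
```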

Jonckheere test

In contrast to the Kruskal–Wallis test, the Jonckheere test assumes an a priori ordering of the groups, which gives it more statistical power than the Kruskal–Wallis test.[ 14 ]

Friedman test

The Friedman test is a non-parametric test for testing the difference between several related samples. It is an alternative to repeated measures ANOVA and is used when the same parameter has been measured under different conditions on the same subjects.[ 13 ]
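A rough sketch of the Friedman statistic, assuming each row holds one subject's measurements under k conditions and that no subject has tied values (ties would need average ranks). It uses the usual form 12 / (n k (k + 1)) · Σ R_j² − 3 n (k + 1). The data and function name are illustrative.

```python
def friedman_statistic(rows):
    """Friedman chi-square for rows = subjects, columns = conditions (no ties)."""
    n = len(rows)      # number of subjects
    k = len(rows[0])   # number of conditions
    rank_sums = [0.0] * k
    for row in rows:
        # Rank the conditions within this subject, smallest value gets rank 1.
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical scores for four subjects under three conditions.
rows = [[1, 2, 3], [2, 3, 1], [1, 3, 2], [1, 2, 3]]
q_stat = friedman_statistic(rows)
print(q_stat)
```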

Tests to analyse the categorical data

Chi-square test, Fisher's exact test and McNemar's test are used to analyse the categorical or nominal variables. The Chi-square test compares the frequencies and tests whether the observed data differ significantly from the data that would be expected if there were no differences between groups (i.e., the null hypothesis). It is calculated as the sum, over all cells, of the squared difference between the observed ( O ) and the expected ( E ) data (or the deviation, d ) divided by the expected data:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
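To make the calculation concrete, the sketch below derives the expected count of each cell from the row and column totals and sums (O − E)² / E over the table. The 2 × 2 counts and the function name are hypothetical, and no continuity correction is applied.

```python
def chi_square_statistic(table):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Hypothetical counts: outcome (columns) by group (rows).
chi2, dof = chi_square_statistic([[10, 20], [20, 10]])
print(round(chi2, 3), dof)
```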

A Yates correction factor is used when the sample size is small. Fisher's exact test is used to determine if there are non-random associations between two categorical variables. It does not assume random sampling, and instead of referring a calculated statistic to a sampling distribution, it calculates an exact probability. McNemar's test is used for paired nominal data. It is applied to a 2 × 2 table with paired-dependent samples. It is used to determine whether the row and column frequencies are equal (that is, whether there is ‘marginal homogeneity’). The null hypothesis is that the paired proportions are equal. The Mantel-Haenszel Chi-square test is a multivariate test as it analyses multiple grouping variables. It stratifies according to the nominated confounding variables and identifies any that affect the primary outcome variable. If the outcome variable is dichotomous, then logistic regression is used.
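The "exact probability" in Fisher's test comes from the hypergeometric distribution. The sketch below, for a 2 × 2 table with cells a, b, c, d, computes the probability of each table that shares the observed margins and sums those that are no more likely than the observed one. That summation rule is one common two-sided convention and is an assumption here, as are the function names and counts.

```python
from math import comb

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2 x 2 table given its margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables at most as likely as the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    p_value = 0.0
    # x is the top-left cell; the margins fix the other three cells.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):  # small tolerance for float comparison
            p_value += p
    return p_value

# Hypothetical small-sample table.
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_fisher, 4))
```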

SOFTWARE AVAILABLE FOR STATISTICS, SAMPLE SIZE CALCULATION AND POWER ANALYSIS

Numerous statistical software systems are available currently. The commonly used software systems are Statistical Package for the Social Sciences (SPSS – manufactured by IBM corporation), Statistical Analysis System (SAS – developed by SAS Institute North Carolina, United States of America), R (designed by Ross Ihaka and Robert Gentleman from R core team), Minitab (developed by Minitab Inc), Stata (developed by StataCorp) and MS Excel (developed by Microsoft).

There are a number of web resources which are related to statistical power analyses. A few are:

  • StatPages.net – provides links to a number of online power calculators
  • G-Power – provides a downloadable power analysis program that runs under DOS
  • Power analysis for ANOVA designs – an interactive site that calculates power or sample size needed to attain a given power for one effect in a factorial ANOVA design
  • SPSS makes a program called SamplePower. It gives an output of a complete report on the computer screen which can be cut and pasted into another document.

It is important that a researcher knows the concepts of the basic statistical methods used for conduct of a research study. This will help to conduct an appropriately well-designed study leading to valid and reliable results. Inappropriate use of statistical techniques may lead to faulty conclusions, inducing errors and undermining the significance of the article. Bad statistics may lead to bad research, and bad research may lead to unethical practice. Hence, an adequate knowledge of statistics and the appropriate use of statistical tests are important. An appropriate knowledge about the basic statistical methods will go a long way in improving the research designs and producing quality medical research which can be utilised for formulating the evidence-based guidelines.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.


What is Data Analysis? (Types, Methods, and Tools)


  • Couchbase Product Marketing December 17, 2023

Data analysis is the process of cleaning, transforming, and interpreting data to uncover insights, patterns, and trends. It plays a crucial role in decision making, problem solving, and driving innovation across various domains. 

In addition to further exploring the role data analysis plays, this blog post will discuss common data analysis techniques, delve into the distinction between quantitative and qualitative data, explore popular data analysis tools, and discuss the steps involved in the data analysis process. 

By the end, you should have a deeper understanding of data analysis and its applications, empowering you to harness the power of data to make informed decisions and gain actionable insights.

Why is Data Analysis Important?

Data analysis is important across various domains and industries. It helps with:

  • Decision Making : Data analysis provides valuable insights that support informed decision making, enabling organizations to make data-driven choices for better outcomes.
  • Problem Solving : Data analysis helps identify and solve problems by uncovering root causes, detecting anomalies, and optimizing processes for increased efficiency.
  • Performance Evaluation : Data analysis allows organizations to evaluate performance, track progress, and measure success by analyzing key performance indicators (KPIs) and other relevant metrics.
  • Gathering Insights : Data analysis uncovers valuable insights that drive innovation, enabling businesses to develop new products, services, and strategies aligned with customer needs and market demand.
  • Risk Management : Data analysis helps mitigate risks by identifying risk factors and enabling proactive measures to minimize potential negative impacts.

By leveraging data analysis, organizations can gain a competitive advantage, improve operational efficiency, and make smarter decisions that positively impact the bottom line.

Quantitative vs. Qualitative Data

In data analysis, you’ll commonly encounter two types of data: quantitative and qualitative. Understanding the differences between these two types of data is essential for selecting appropriate analysis methods and drawing meaningful insights. Here’s an overview of quantitative and qualitative data:

Quantitative Data

Quantitative data is numerical and represents quantities or measurements. It’s typically collected through surveys, experiments, and direct measurements. This type of data is characterized by its ability to be counted, measured, and subjected to mathematical calculations. Examples of quantitative data include age, height, sales figures, test scores, and the number of website users.

Quantitative data has the following characteristics:

  • Numerical : Quantitative data is expressed in numerical values that can be analyzed and manipulated mathematically.
  • Objective : Quantitative data is objective and can be measured and verified independently of individual interpretations.
  • Statistical Analysis : Quantitative data lends itself well to statistical analysis. It allows for applying various statistical techniques, such as descriptive statistics, correlation analysis, regression analysis, and hypothesis testing.
  • Generalizability : Quantitative data often aims to generalize findings to a larger population. It allows for making predictions, estimating probabilities, and drawing statistical inferences.

Qualitative Data

Qualitative data, on the other hand, is non-numerical and is collected through interviews, observations, and open-ended survey questions. It focuses on capturing rich, descriptive, and subjective information to gain insights into people’s opinions, attitudes, experiences, and behaviors. Examples of qualitative data include interview transcripts, field notes, survey responses, and customer feedback.

Qualitative data has the following characteristics:

  • Descriptive : Qualitative data provides detailed descriptions, narratives, or interpretations of phenomena, often capturing context, emotions, and nuances.
  • Subjective : Qualitative data is subjective and influenced by the individuals’ perspectives, experiences, and interpretations.
  • Interpretive Analysis : Qualitative data requires interpretive techniques, such as thematic analysis, content analysis, and discourse analysis, to uncover themes, patterns, and underlying meanings.
  • Contextual Understanding : Qualitative data emphasizes understanding the social, cultural, and contextual factors that shape individuals’ experiences and behaviors.
  • Rich Insights : Qualitative data enables researchers to gain in-depth insights into complex phenomena and explore research questions in greater depth.

In summary, quantitative data represents numerical quantities and lends itself well to statistical analysis, while qualitative data provides rich, descriptive insights into subjective experiences and requires interpretive analysis techniques. Understanding the differences between quantitative and qualitative data is crucial for selecting appropriate analysis methods and drawing meaningful conclusions in research and data analysis.

Types of Data Analysis

Different types of data analysis techniques serve different purposes. In this section, we’ll explore four types of data analysis: descriptive, diagnostic, predictive, and prescriptive, and go over how you can use them.

Descriptive Analysis

Descriptive analysis involves summarizing and describing the main characteristics of a dataset. It focuses on gaining a comprehensive understanding of the data through measures such as central tendency (mean, median, mode), dispersion (variance, standard deviation), and graphical representations (histograms, bar charts). For example, in a retail business, descriptive analysis may involve analyzing sales data to identify average monthly sales, popular products, or sales distribution across different regions.
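For instance, the central tendency and dispersion measures mentioned above are available in Python's standard `statistics` module. The monthly sales figures here are invented for the retail example.

```python
import statistics

# Hypothetical monthly sales (units) for a retail business.
monthly_sales = [120, 135, 150, 135, 160, 175, 135, 190]

summary = {
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "mode": statistics.mode(monthly_sales),
    "variance": statistics.variance(monthly_sales),  # sample variance
    "stdev": statistics.stdev(monthly_sales),        # sample standard deviation
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```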

Diagnostic Analysis

Diagnostic analysis aims to understand the causes or factors influencing specific outcomes or events. It involves investigating relationships between variables and identifying patterns or anomalies in the data. Diagnostic analysis often uses regression analysis, correlation analysis, and hypothesis testing to uncover the underlying reasons behind observed phenomena. For example, in healthcare, diagnostic analysis could help determine factors contributing to patient readmissions and identify potential improvements in the care process.

Predictive Analysis

Predictive analysis focuses on making predictions or forecasts about future outcomes based on historical data. It utilizes statistical models, machine learning algorithms, and time series analysis to identify patterns and trends in the data. By applying predictive analysis, businesses can anticipate customer behavior, market trends, or demand for products and services. For example, an e-commerce company might use predictive analysis to forecast customer churn and take proactive measures to retain customers.

Prescriptive Analysis

Prescriptive analysis takes predictive analysis a step further by providing recommendations or optimal solutions based on the predicted outcomes. It combines historical and real-time data with optimization techniques, simulation models, and decision-making algorithms to suggest the best course of action. Prescriptive analysis helps organizations make data-driven decisions and optimize their strategies. For example, a logistics company can use prescriptive analysis to determine the most efficient delivery routes, considering factors like traffic conditions, fuel costs, and customer preferences.

In summary, data analysis plays a vital role in extracting insights and enabling informed decision making. Descriptive analysis helps understand the data, diagnostic analysis uncovers the underlying causes, predictive analysis forecasts future outcomes, and prescriptive analysis provides recommendations for optimal actions. These different data analysis techniques are valuable tools for businesses and organizations across various industries.

Data Analysis Methods

In addition to the data analysis types discussed earlier, you can use various methods to analyze data effectively. These methods provide a structured approach to extract insights, detect patterns, and derive meaningful conclusions from the available data. Here are some commonly used data analysis methods:

Statistical Analysis 

Statistical analysis involves applying statistical techniques to data to uncover patterns, relationships, and trends. It includes methods such as hypothesis testing, regression analysis, analysis of variance (ANOVA), and chi-square tests. Statistical analysis helps organizations understand the significance of relationships between variables and make inferences about the population based on sample data. For example, a market research company could conduct a survey to analyze the relationship between customer satisfaction and product price. They can use regression analysis to determine whether there is a significant correlation between these variables.
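As a sketch of the price and satisfaction example, a simple least-squares line and the Pearson correlation coefficient can be computed by hand. The data points and function name are hypothetical, and testing whether the relationship is statistically significant would need an additional step not shown here.

```python
import math

def simple_linear_regression(x, y):
    """Return (slope, intercept, pearson_r) for a least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical survey data: product price and satisfaction score (1 to 10).
price = [10, 20, 30, 40, 50]
satisfaction = [9, 8, 6, 5, 2]
slope, intercept, r = simple_linear_regression(price, satisfaction)
print(f"satisfaction = {intercept:.2f} + {slope:.2f} * price, r = {r:.3f}")
```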

Data Mining

Data mining refers to the process of discovering patterns and relationships in large datasets using techniques such as clustering, classification, association analysis, and anomaly detection. It involves exploring data to identify hidden patterns and gain valuable insights. For example, a telecommunications company could analyze customer call records to identify calling patterns and segment customers into groups based on their calling behavior. 

Text Mining

Text mining involves analyzing unstructured data , such as customer reviews, social media posts, or emails, to extract valuable information and insights. It utilizes techniques like natural language processing (NLP), sentiment analysis, and topic modeling to analyze and understand textual data. For example, consider how a hotel chain might analyze customer reviews from various online platforms to identify common themes and sentiment patterns to improve customer satisfaction.

Time Series Analysis

Time series analysis focuses on analyzing data collected over time to identify trends, seasonality, and patterns. It involves techniques such as forecasting, decomposition, and autocorrelation analysis to make predictions and understand the underlying patterns in the data.

For example, an energy company could analyze historical electricity consumption data to forecast future demand and optimize energy generation and distribution.
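Two of the simplest building blocks, a moving average to smooth out short-term noise and an autocorrelation coefficient to measure how strongly a series resembles a lagged copy of itself, can be sketched as follows. The demand figures and function names are illustrative, and real forecasting would typically use a dedicated library.

```python
def moving_average(series, window):
    """Mean of each consecutive block of `window` observations."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t + lag] - mean)
                for t in range(n - lag))
    return numer / denom

# Hypothetical electricity demand over six periods.
demand = [10, 12, 14, 16, 18, 20]
trend = moving_average(demand, window=3)
lag1 = autocorrelation(demand, lag=1)
print(trend, lag1)
```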

Data Visualization

Data visualization is the graphical representation of data to communicate patterns, trends, and insights visually. It uses charts, graphs, maps, and other visual elements to present data in a visually appealing and easily understandable format. For example, a sales team might use a line chart to visualize monthly sales trends and identify seasonal patterns in their sales data.

These are just a few examples of the data analysis methods you can use. Your choice should depend on the nature of the data, the research question or problem, and the desired outcome.

How to Analyze Data

Analyzing data involves following a systematic approach to extract insights and derive meaningful conclusions. Here are some steps to guide you through the process of analyzing data effectively:

Define the Objective : Clearly define the purpose and objective of your data analysis. Identify the specific question or problem you want to address through analysis.

Prepare and Explore the Data : Gather the relevant data and ensure its quality. Clean and preprocess the data by handling missing values, duplicates, and formatting issues. Explore the data using descriptive statistics and visualizations to identify patterns, outliers, and relationships.

Apply Analysis Techniques : Choose the appropriate analysis techniques based on your data and research question. Apply statistical methods, machine learning algorithms, and other analytical tools to derive insights and answer your research question.

Interpret the Results : Analyze the output of your analysis and interpret the findings in the context of your objective. Identify significant patterns, trends, and relationships in the data. Consider the implications and practical relevance of the results.

Communicate and Take Action : Communicate your findings effectively to stakeholders or intended audiences. Present the results clearly and concisely, using visualizations and reports. Use the insights from the analysis to inform decision making.

Remember, data analysis is an iterative process, and you may need to revisit and refine your analysis as you progress. These steps provide a general framework to guide you through the data analysis process and help you derive meaningful insights from your data.
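As one small illustration of the "Prepare and Explore the Data" step, the sketch below removes duplicate records and fills a missing value with the mean of the known values. The record layout, the `id` and `score` field names, and the choice of mean imputation are all assumptions for the example. Other strategies may suit a given dataset better.

```python
import statistics

# Hypothetical raw records with one duplicate and one missing score.
raw = [
    {"id": 1, "score": 80},
    {"id": 2, "score": None},
    {"id": 2, "score": None},  # duplicate of the previous record
    {"id": 3, "score": 90},
]

def clean(records):
    """Drop repeated ids, then fill missing scores with the mean of known scores."""
    seen = set()
    unique = []
    for rec in records:
        if rec["id"] not in seen:
            seen.add(rec["id"])
            unique.append(dict(rec))  # copy so the raw data stays untouched
    known = [r["score"] for r in unique if r["score"] is not None]
    fill_value = statistics.mean(known)
    for r in unique:
        if r["score"] is None:
            r["score"] = fill_value
    return unique

cleaned = clean(raw)
print(cleaned)
```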

Data Analysis Tools

Data analysis tools are software applications and platforms designed to facilitate the process of analyzing and interpreting data . These tools provide a range of functionalities to handle data manipulation, visualization, statistical analysis, and machine learning. Here are some commonly used data analysis tools:

Spreadsheet Software

Tools like Microsoft Excel, Google Sheets, and Apple Numbers are used for basic data analysis tasks. They offer features for data entry, manipulation, basic statistical functions, and simple visualizations.

Business Intelligence (BI) Platforms

BI platforms like Microsoft Power BI, Tableau, and Looker integrate data from multiple sources, providing comprehensive views of business performance through interactive dashboards, reports, and ad hoc queries.

Programming Languages and Libraries

Programming languages like R and Python, along with their associated libraries (e.g., NumPy, SciPy, scikit-learn), offer extensive capabilities for data analysis. They provide flexibility, customizability, and access to a wide range of statistical and machine-learning algorithms.

Cloud-Based Analytics Platforms

Cloud-based platforms like Google Cloud Platform (BigQuery, Data Studio), Microsoft Azure (Azure Analytics, Power BI), and Amazon Web Services (AWS Analytics, QuickSight) provide scalable and collaborative environments for data storage, processing, and analysis. They have a wide range of analytical capabilities for handling large datasets.

Data Mining and Machine Learning Tools

Tools like RapidMiner, KNIME, and Weka automate the process of data preprocessing, feature selection, model training, and evaluation. They’re designed to extract insights and build predictive models from complex datasets.

Text Analytics Tools

Text analytics tools, such as Natural Language Processing (NLP) libraries in Python (NLTK, spaCy) or platforms like RapidMiner Text Mining Extension, enable the analysis of unstructured text data . They help extract information, sentiment, and themes from sources like customer reviews or social media.

Choosing the right data analysis tool depends on analysis complexity, dataset size, required functionalities, and user expertise. You might need to use a combination of tools to leverage their combined strengths and address specific analysis needs. 

By understanding the power of data analysis, you can leverage it to make informed decisions, identify opportunities for improvement, and drive innovation within your organization. Whether you’re working with quantitative data for statistical analysis or qualitative data for in-depth insights, it’s important to select the right analysis techniques and tools for your objectives.

To continue learning about data analysis, review the following resources:

  • What is Big Data Analytics?
  • Operational Analytics
  • JSON Analytics + Real-Time Insights
  • Database vs. Data Warehouse: Differences, Use Cases, Examples
  • Couchbase Capella Columnar Product Blog

PW Skills | Blog

Data Analysis Techniques in Research – Methods, Tools & Examples


Varun Saharawat is a seasoned professional in the fields of SEO and content writing. With a profound knowledge of the intricate aspects of these disciplines, Varun has established himself as a valuable asset in the world of digital marketing and online content creation.


Data analysis techniques in research are essential because they allow researchers to derive meaningful insights from data sets to support their hypotheses or research objectives.

Data Analysis Techniques in Research : While various groups, institutions, and professionals may have diverse approaches to data analysis, a universal definition captures its essence. Data analysis involves refining, transforming, and interpreting raw data to derive actionable insights that guide informed decision-making for businesses.


A straightforward illustration of data analysis emerges when we make everyday decisions, basing our choices on past experiences or predictions of potential outcomes.

If you want to learn more about this topic and acquire valuable skills that will set you apart in today’s data-driven world, we highly recommend enrolling in the Data Analytics Course by Physics Wallah . And as a special offer for our readers, use the coupon code “READER” to get a discount on this course.


What is Data Analysis?

Data analysis is the systematic process of inspecting, cleaning, transforming, and interpreting data with the objective of discovering valuable insights and drawing meaningful conclusions. This process involves several steps:

  • Inspecting : Initial examination of data to understand its structure, quality, and completeness.
  • Cleaning : Removing errors, inconsistencies, or irrelevant information to ensure accurate analysis.
  • Transforming : Converting data into a format suitable for analysis, such as normalization or aggregation.
  • Interpreting : Analyzing the transformed data to identify patterns, trends, and relationships.

Types of Data Analysis Techniques in Research

Data analysis techniques in research are categorized into qualitative and quantitative methods, each with its specific approaches and tools. These techniques are instrumental in extracting meaningful insights, patterns, and relationships from data to support informed decision-making, validate hypotheses, and derive actionable recommendations. Below is an in-depth exploration of the various types of data analysis techniques commonly employed in research:

1) Qualitative Analysis:

Definition: Qualitative analysis focuses on understanding non-numerical data, such as opinions, concepts, or experiences, to derive insights into human behavior, attitudes, and perceptions.

  • Content Analysis: Examines textual data, such as interview transcripts, articles, or open-ended survey responses, to identify themes, patterns, or trends.
  • Narrative Analysis: Analyzes personal stories or narratives to understand individuals’ experiences, emotions, or perspectives.
  • Ethnographic Studies: Involves observing and analyzing cultural practices, behaviors, and norms within specific communities or settings.

2) Quantitative Analysis:

Quantitative analysis emphasizes numerical data and employs statistical methods to explore relationships, patterns, and trends. It encompasses several approaches:

Descriptive Analysis:

  • Frequency Distribution: Represents the number of occurrences of distinct values within a dataset.
  • Central Tendency: Measures such as mean, median, and mode provide insights into the central values of a dataset.
  • Dispersion: Techniques like variance and standard deviation indicate the spread or variability of data.
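A frequency distribution, the first item in the list above, can be sketched with `collections.Counter`. The grade data is made up for illustration.

```python
from collections import Counter

# Hypothetical letter grades for a small class.
grades = ["A", "B", "B", "C", "A", "B", "D", "C", "B"]

frequency = Counter(grades)
total = len(grades)
# Print each distinct value with its count and relative frequency.
for grade, count in sorted(frequency.items()):
    print(f"{grade}: {count} ({count / total:.0%})")
```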

Diagnostic Analysis:

  • Regression Analysis: Assesses the relationship between dependent and independent variables, enabling prediction or understanding causality.
  • ANOVA (Analysis of Variance): Examines differences between groups to identify significant variations or effects.

Predictive Analysis:

  • Time Series Forecasting: Uses historical data points to predict future trends or outcomes.
  • Machine Learning Algorithms: Techniques like decision trees, random forests, and neural networks predict outcomes based on patterns in data.

Prescriptive Analysis:

  • Optimization Models: Utilizes linear programming, integer programming, or other optimization techniques to identify the best solutions or strategies.
  • Simulation: Mimics real-world scenarios to evaluate various strategies or decisions and determine optimal outcomes.

Specific Techniques:

  • Monte Carlo Simulation: Models probabilistic outcomes to assess risk and uncertainty.
  • Factor Analysis: Reduces the dimensionality of data by identifying underlying factors or components.
  • Cohort Analysis: Studies specific groups or cohorts over time to understand trends, behaviors, or patterns within these groups.
  • Cluster Analysis: Classifies objects or individuals into homogeneous groups or clusters based on similarities or attributes.
  • Sentiment Analysis: Uses natural language processing and machine learning techniques to determine sentiment, emotions, or opinions from textual data.
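To illustrate the Monte Carlo item in the list above, the sketch below estimates the risk that a two-task project overruns its deadline by simulating many random outcomes. The task durations, their uniform distributions, the deadline, and the function name are all assumptions chosen for the example.

```python
import random

def estimate_overrun_probability(deadline_days, trials=100_000, seed=42):
    """Fraction of simulated projects whose total duration exceeds the deadline."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    overruns = 0
    for _ in range(trials):
        task_a = rng.uniform(1, 5)  # assumed duration range in days
        task_b = rng.uniform(2, 6)  # assumed duration range in days
        if task_a + task_b > deadline_days:
            overruns += 1
    return overruns / trials

risk = estimate_overrun_probability(deadline_days=8)
print(f"Estimated overrun risk: {risk:.3f}")
```

For these particular distributions the exact probability is 9/32 (about 0.281), so the estimate can be checked against a known answer. In realistic models no such closed form is available, which is the main reason to simulate.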


Data Analysis Techniques in Research Examples

To provide a clearer understanding of how data analysis techniques are applied in research, let’s consider a hypothetical research study focused on evaluating the impact of online learning platforms on students’ academic performance.

Research Objective:

Determine if students using online learning platforms achieve higher academic performance compared to those relying solely on traditional classroom instruction.

Data Collection:

  • Quantitative Data: Academic scores (grades) of students using online platforms and those using traditional classroom methods.
  • Qualitative Data: Feedback from students regarding their learning experiences, challenges faced, and preferences.

Data Analysis Techniques Applied:

1) Descriptive Analysis:

  • Calculate the mean, median, and mode of academic scores for both groups.
  • Create frequency distributions to represent the distribution of grades in each group.

2) Diagnostic Analysis:

  • Conduct an Analysis of Variance (ANOVA) to determine if there’s a statistically significant difference in academic scores between the two groups.
  • Perform Regression Analysis to assess the relationship between the time spent on online platforms and academic performance.

3) Predictive Analysis:

  • Utilize Time Series Forecasting to predict future academic performance trends based on historical data.
  • Implement Machine Learning algorithms to develop a predictive model that identifies factors contributing to academic success on online platforms.

4) Prescriptive Analysis:

  • Apply Optimization Models to identify the optimal combination of online learning resources (e.g., video lectures, interactive quizzes) that maximize academic performance.
  • Use Simulation Techniques to evaluate different scenarios, such as varying student engagement levels with online resources, to determine the most effective strategies for improving learning outcomes.

5) Specific Techniques:

  • Conduct Factor Analysis on numerically coded feedback (e.g., survey ratings) to identify common underlying factors influencing students’ perceptions and experiences with online learning.
  • Perform Cluster Analysis to segment students based on their engagement levels, preferences, or academic outcomes, enabling targeted interventions or personalized learning strategies.
  • Apply Sentiment Analysis on textual feedback to categorize students’ sentiments as positive, negative, or neutral regarding online learning experiences.

By applying a combination of qualitative and quantitative data analysis techniques, this research example aims to provide comprehensive insights into the effectiveness of online learning platforms.

Data Analysis Techniques in Quantitative Research

Quantitative research involves collecting numerical data to examine relationships, test hypotheses, and make predictions. Various data analysis techniques are employed to interpret and draw conclusions from quantitative data. Here are some key data analysis techniques commonly used in quantitative research:

1) Descriptive Statistics:

  • Description: Descriptive statistics are used to summarize and describe the main aspects of a dataset, such as central tendency (mean, median, mode), variability (range, variance, standard deviation), and distribution (skewness, kurtosis).
  • Applications: Summarizing data, identifying patterns, and providing initial insights into the dataset.
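
As a minimal sketch of the measures named above, the following uses Python's standard `statistics` module. The `scores` list is hypothetical data invented for illustration, not taken from any study.

```python
import statistics

# Hypothetical exam scores, invented for illustration only.
scores = [62, 71, 71, 75, 80, 84, 90]

summary = {
    "mean": statistics.mean(scores),          # central tendency
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "range": max(scores) - min(scores),       # variability
    "variance": statistics.variance(scores),  # sample variance (n - 1)
    "std_dev": statistics.stdev(scores),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The sample (n - 1) versions of variance and standard deviation are used here on the assumption that the scores are a sample rather than a full population; `statistics.pvariance` and `statistics.pstdev` would apply otherwise.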

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. This technique includes hypothesis testing, confidence intervals, t-tests, chi-square tests, analysis of variance (ANOVA), regression analysis, and correlation analysis.
  • Applications: Testing hypotheses, making predictions, and generalizing findings from a sample to a larger population.
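
One way to illustrate hypothesis testing without any distributional tables is an exact permutation test on the difference between two group means. The sketch below is only one of many possible inferential procedures; the two groups and the helper name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical scores for two small groups (illustrative data only).
group_a = [78, 82, 85, 88, 90]
group_b = [65, 70, 72, 74, 79]

def permutation_p_value(a, b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        left = [pooled[i] for i in chosen]
        right = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(left) - mean(right)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

p_value = permutation_p_value(group_a, group_b)
print(f"p-value: {p_value:.4f}")
```

Full enumeration is feasible only for small samples (252 relabellings here); for larger samples a random subset of relabellings, or a t-test under its usual assumptions, would be the more common choice.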

3) Regression Analysis:

  • Description: Regression analysis is a statistical technique used to model and examine the relationship between a dependent variable and one or more independent variables. Linear regression, multiple regression, logistic regression, and nonlinear regression are common types of regression analysis.
  • Applications: Predicting outcomes, identifying relationships between variables, and understanding the impact of independent variables on the dependent variable.
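
For the simplest case, one independent variable and a straight line, the least-squares fit can be sketched in a few lines. The data (`hours`, `scores`) and the function name `fit_line` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: weekly study hours and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # predicted score for 6 hours of study
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={predicted:.2f}")
```

Multiple, logistic, and nonlinear regression require more machinery and are usually handled by a statistics package rather than by hand.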

4) Correlation Analysis:

  • Description: Correlation analysis is used to measure and assess the strength and direction of the relationship between two or more variables. The Pearson correlation coefficient, Spearman rank correlation coefficient, and Kendall’s tau are commonly used measures of correlation.
  • Applications: Identifying associations between variables and assessing the degree and nature of the relationship.
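
The difference between the Pearson and Spearman coefficients can be seen on a relationship that is monotonic but not linear. The sketch below uses hypothetical data and hand-written helpers (`pearson`, `ranks`, `spearman`); it assumes Spearman's coefficient is computed as the Pearson correlation of average ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y grows with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson: {r_pearson:.3f}, Spearman: {r_spearman:.3f}")
```

Because the ordering of `y` matches the ordering of `x` exactly, the rank-based coefficient is 1, while the Pearson coefficient falls slightly below 1 since the points do not lie on a straight line.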

5) Factor Analysis:

  • Description: Factor analysis is a multivariate statistical technique used to identify and analyze underlying relationships or factors among a set of observed variables. It helps in reducing the dimensionality of data and identifying latent variables or constructs.
  • Applications: Identifying underlying factors or constructs, simplifying data structures, and understanding the underlying relationships among variables.

6) Time Series Analysis:

  • Description: Time series analysis involves analyzing data collected or recorded over a specific period at regular intervals to identify patterns, trends, and seasonality. Techniques such as moving averages, exponential smoothing, autoregressive integrated moving average (ARIMA), and Fourier analysis are used.
  • Applications: Forecasting future trends, analyzing seasonal patterns, and understanding time-dependent relationships in data.
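
Two of the techniques named above, moving averages and simple exponential smoothing, are short enough to sketch directly. The series, the window size, and the smoothing factor `alpha` below are hypothetical choices.

```python
def moving_average(series, window):
    """Trailing moving average; the first value covers the first full window."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, seeded with the first observation."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical monthly observations.
series = [10, 12, 14, 16, 18]

ma = moving_average(series, window=3)
es = exponential_smoothing(series, alpha=0.5)
print(ma)
print(es)
```

A larger `alpha` makes the smoothed series track recent observations more closely; a smaller one gives more weight to history. ARIMA and Fourier methods are normally run through a dedicated library.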

7) ANOVA (Analysis of Variance):

  • Description: Analysis of variance (ANOVA) is a statistical technique used to analyze and compare the means of two or more groups or treatments to determine if they are statistically different from each other. One-way ANOVA, two-way ANOVA, and MANOVA (Multivariate Analysis of Variance) are common types of ANOVA.
  • Applications: Comparing group means, testing hypotheses, and determining the effects of categorical independent variables on a continuous dependent variable.
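
The F statistic behind a one-way ANOVA is the ratio of between-group to within-group variance. A minimal sketch, with hypothetical groups and a hand-written helper `one_way_anova_f`:

```python
def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores under three treatments.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]

f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

Turning the F statistic into a p-value requires the F distribution, which statistical packages provide; the sketch stops at the statistic itself.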

8) Chi-Square Tests:

  • Description: Chi-square tests are non-parametric statistical tests used to assess the association between categorical variables in a contingency table. The Chi-square test of independence, goodness-of-fit test, and test of homogeneity are common chi-square tests.
  • Applications: Testing relationships between categorical variables, assessing goodness-of-fit, and evaluating independence.
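
The chi-square test of independence compares each observed cell count with the count expected if the two variables were unrelated. A minimal sketch, using a hypothetical 2×2 table and a hand-written helper `chi_square_independence`:

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows = learning mode, columns = pass / fail.
table = [[30, 20],
         [20, 30]]

stat, dof = chi_square_independence(table)
print(f"chi-square = {stat:.2f}, dof = {dof}")
```

With one degree of freedom, the critical value at the 0.05 level is about 3.841, so a statistic above that would lead to rejecting independence at that level.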

These quantitative data analysis techniques provide researchers with valuable tools and methods to analyze, interpret, and derive meaningful insights from numerical data. The selection of a specific technique often depends on the research objectives, the nature of the data, and the underlying assumptions of the statistical methods being used.

Data Analysis Methods

Data analysis methods refer to the techniques and procedures used to analyze, interpret, and draw conclusions from data. These methods are essential for transforming raw data into meaningful insights, facilitating decision-making processes, and driving strategies across various fields. Here are some common data analysis methods:

1) Descriptive Statistics:

  • Description: Descriptive statistics summarize and organize data to provide a clear and concise overview of the dataset. Measures such as mean, median, mode, range, variance, and standard deviation are commonly used.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. Techniques such as hypothesis testing, confidence intervals, and regression analysis are used.

3) Exploratory Data Analysis (EDA):

  • Description: EDA techniques involve visually exploring and analyzing data to discover patterns, relationships, anomalies, and insights. Methods such as scatter plots, histograms, box plots, and correlation matrices are utilized.
  • Applications: Identifying trends, patterns, outliers, and relationships within the dataset.
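
Box plots flag outliers using the interquartile range (IQR), and the same rule can be applied numerically during exploration. The sketch below uses hypothetical values and the conventional 1.5 × IQR fences; the quartiles come from `statistics.quantiles` with its default method.

```python
import statistics

# Hypothetical measurements with one suspicious value.
values = [12, 14, 15, 15, 16, 17, 18, 19, 95]

# Quartiles split the sorted data into four equal parts.
q1, _median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Conventional box-plot fences: 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(f"fences: ({lower_fence}, {upper_fence}), outliers: {outliers}")
```

A flagged value is a prompt to investigate, not proof of an error; whether to keep, correct, or exclude it depends on the study.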

4) Predictive Analytics:

  • Description: Predictive analytics use statistical algorithms and machine learning techniques to analyze historical data and make predictions about future events or outcomes. Techniques such as regression analysis, time series forecasting, and machine learning algorithms (e.g., decision trees, random forests, neural networks) are employed.
  • Applications: Forecasting future trends, predicting outcomes, and identifying potential risks or opportunities.

5) Prescriptive Analytics:

  • Description: Prescriptive analytics involve analyzing data to recommend actions or strategies that optimize specific objectives or outcomes. Optimization techniques, simulation models, and decision-making algorithms are utilized.
  • Applications: Recommending optimal strategies, decision-making support, and resource allocation.

6) Qualitative Data Analysis:

  • Description: Qualitative data analysis involves analyzing non-numerical data, such as text, images, videos, or audio, to identify themes, patterns, and insights. Methods such as content analysis, thematic analysis, and narrative analysis are used.
  • Applications: Understanding human behavior, attitudes, perceptions, and experiences.

7) Big Data Analytics:

  • Description: Big data analytics methods are designed to analyze large volumes of structured and unstructured data to extract valuable insights. Technologies such as Hadoop, Spark, and NoSQL databases are used to process and analyze big data.
  • Applications: Analyzing large datasets, identifying trends, patterns, and insights from big data sources.

8) Text Analytics:

  • Description: Text analytics methods involve analyzing textual data, such as customer reviews, social media posts, emails, and documents, to extract meaningful information and insights. Techniques such as sentiment analysis, text mining, and natural language processing (NLP) are used.
  • Applications: Analyzing customer feedback, monitoring brand reputation, and extracting insights from textual data sources.
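
Sentiment analysis can be illustrated in its most basic form with a word-list (lexicon) approach. The word lists and the `classify` function below are hypothetical and far smaller than any real lexicon; production systems typically rely on trained NLP models instead.

```python
import re

# Tiny illustrative lexicons; real ones contain thousands of entries.
POSITIVE = {"good", "great", "helpful", "clear", "love"}
NEGATIVE = {"bad", "confusing", "slow", "boring", "hate"}

def classify(text):
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

feedback = [
    "The video lectures were clear and helpful",
    "The quizzes were confusing and slow",
    "I attended every session",
]
labels = [classify(text) for text in feedback]
print(labels)
```

This approach ignores negation ("not helpful") and context, which is one reason trained models are usually preferred for real feedback data.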

These data analysis methods are instrumental in transforming data into actionable insights, informing decision-making processes, and driving organizational success across various sectors, including business, healthcare, finance, marketing, and research. The selection of a specific method often depends on the nature of the data, the research objectives, and the analytical requirements of the project or organization.

Data Analysis Tools

Data analysis tools are essential instruments that facilitate the process of examining, cleaning, transforming, and modeling data to uncover useful information, make informed decisions, and drive strategies. Here are some prominent data analysis tools widely used across various industries:

1) Microsoft Excel:

  • Description: A spreadsheet software that offers basic to advanced data analysis features, including pivot tables, data visualization tools, and statistical functions.
  • Applications: Data cleaning, basic statistical analysis, visualization, and reporting.

2) R Programming Language:

  • Description: An open-source programming language specifically designed for statistical computing and data visualization.
  • Applications: Advanced statistical analysis, data manipulation, visualization, and machine learning.

3) Python (with Libraries like Pandas, NumPy, Matplotlib, and Seaborn):

  • Description: A versatile programming language with libraries that support data manipulation, analysis, and visualization.
  • Applications: Data cleaning, statistical analysis, machine learning, and data visualization.

4) SPSS (Statistical Package for the Social Sciences):

  • Description: A comprehensive statistical software suite used for data analysis, data mining, and predictive analytics.
  • Applications: Descriptive statistics, hypothesis testing, regression analysis, and advanced analytics.

5) SAS (Statistical Analysis System):

  • Description: A software suite used for advanced analytics, multivariate analysis, and predictive modeling.
  • Applications: Data management, statistical analysis, predictive modeling, and business intelligence.

6) Tableau:

  • Description: A data visualization tool that allows users to create interactive and shareable dashboards and reports.
  • Applications: Data visualization, business intelligence, and interactive dashboard creation.

7) Power BI:

  • Description: A business analytics tool developed by Microsoft that provides interactive visualizations and business intelligence capabilities.
  • Applications: Data visualization, business intelligence, reporting, and dashboard creation.

8) SQL (Structured Query Language) Databases (e.g., MySQL, PostgreSQL, Microsoft SQL Server):

  • Description: Database management systems that support data storage, retrieval, and manipulation using SQL queries.
  • Applications: Data retrieval, data cleaning, data transformation, and database management.
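
As a small illustration of retrieval and cleaning with SQL, the sketch below uses SQLite through Python's built-in `sqlite3` module rather than one of the server databases named above. The table `responses` and its columns are hypothetical.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE responses (respondent_id INTEGER, grp TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?)",
    [
        (1, "online", 82.0),
        (2, "online", 90.0),
        (3, "classroom", 75.0),
        (4, "classroom", None),  # missing score, excluded below
    ],
)

# Drop rows with missing scores, then summarize per group.
rows = conn.execute(
    """
    SELECT grp, COUNT(score), AVG(score)
    FROM responses
    WHERE score IS NOT NULL
    GROUP BY grp
    ORDER BY grp
    """
).fetchall()
conn.close()

for grp, n, avg in rows:
    print(f"{grp}: n={n}, mean={avg:.1f}")
```

The same `SELECT ... GROUP BY` pattern carries over to MySQL, PostgreSQL, and SQL Server, although connection handling differs for each.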

9) Apache Spark:

  • Description: A fast and general-purpose distributed computing system designed for big data processing and analytics.
  • Applications: Big data processing, machine learning, data streaming, and real-time analytics.

10) IBM SPSS Modeler:

  • Description: A data mining software application used for building predictive models and conducting advanced analytics.
  • Applications: Predictive modeling, data mining, statistical analysis, and decision optimization.

These tools serve various purposes and cater to different data analysis needs, from basic statistical analysis and data visualization to advanced analytics, machine learning, and big data processing. The choice of a specific tool often depends on the nature of the data, the complexity of the analysis, and the specific requirements of the project or organization.

Importance of Data Analysis in Research

The importance of data analysis in research cannot be overstated; it serves as the backbone of any scientific investigation or study. Here are several key reasons why data analysis is crucial in the research process:

  • Data analysis helps ensure that the results obtained are valid and reliable. By systematically examining the data, researchers can identify any inconsistencies or anomalies that may affect the credibility of the findings.
  • Effective data analysis provides researchers with the necessary information to make informed decisions. By interpreting the collected data, researchers can draw conclusions, make predictions, or formulate recommendations based on evidence rather than intuition or guesswork.
  • Data analysis allows researchers to identify patterns, trends, and relationships within the data. This can lead to a deeper understanding of the research topic, enabling researchers to uncover insights that may not be immediately apparent.
  • In empirical research, data analysis plays a critical role in testing hypotheses. Researchers collect data to either support or refute their hypotheses, and data analysis provides the tools and techniques to evaluate these hypotheses rigorously.
  • Transparent and well-executed data analysis enhances the credibility of research findings. By clearly documenting the data analysis methods and procedures, researchers allow others to replicate the study, thereby contributing to the reproducibility of research findings.
  • In fields such as business or healthcare, data analysis helps organizations allocate resources more efficiently. By analyzing data on consumer behavior, market trends, or patient outcomes, organizations can make strategic decisions about resource allocation, budgeting, and planning.
  • In public policy and social sciences, data analysis is instrumental in developing and evaluating policies and interventions. By analyzing data on social, economic, or environmental factors, policymakers can assess the effectiveness of existing policies and inform the development of new ones.
  • Data analysis allows for continuous improvement in research methods and practices. By analyzing past research projects, identifying areas for improvement, and implementing changes based on data-driven insights, researchers can refine their approaches and enhance the quality of future research endeavors.

Data Analysis Techniques in Research FAQs

What are the 5 techniques for data analysis?

The five techniques for data analysis are Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, Prescriptive Analysis, and Qualitative Analysis.

What are techniques of data analysis in research?

Techniques of data analysis in research encompass both qualitative and quantitative methods. These techniques involve processes like summarizing raw data, investigating causes of events, forecasting future outcomes, offering recommendations based on predictions, and examining non-numerical data to understand concepts or experiences.

What are the 3 methods of data analysis?

The three primary methods of data analysis are Qualitative Analysis, Quantitative Analysis, and Mixed-Methods Analysis.

What are the four types of data analysis techniques?

The four types of data analysis techniques are Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, and Prescriptive Analysis.


How to Write the Results/Findings Section in Research

What is the research paper Results section and what does it do?

The Results section of a scientific research paper represents the core findings of a study derived from the methods applied to gather and analyze information. It presents these findings in a logical sequence without bias or interpretation from the author, setting up the reader for later interpretation and evaluation in the Discussion section. A major purpose of the Results section is to break down the data into sentences that show its significance to the research question(s).

The Results section appears third in the section sequence in most scientific papers. It follows the presentation of the Methods and Materials and is presented before the Discussion section —although the Results and Discussion are presented together in many journals. This section answers the basic question “What did you find in your research?”

What is included in the Results section?

The Results section should include the findings of your study and ONLY the findings of your study. The findings include:

  • Data presented in tables, charts, graphs, and other figures (may be placed into the text or on separate pages at the end of the manuscript)
  • A contextual analysis of this data explaining its meaning in sentence form
  • All data that corresponds to the central research question(s)
  • All secondary findings (secondary outcomes, subgroup analyses, etc.)

If the scope of the study is broad, or if you studied a variety of variables, or if the methodology used yields a wide range of different results, the author should present only those results that are most relevant to the research question stated in the Introduction section.

As a general rule, any information that does not present the direct findings or outcome of the study should be left out of this section. Unless the journal requests that authors combine the Results and Discussion sections, explanations and interpretations should be omitted from the Results.

How are the results organized?

The best way to organize your Results section is “logically.” One logical and clear method of organizing research results is to provide them alongside the research questions—within each research question, present the type of data that addresses that research question.

Let’s look at an example. Your research question is based on a survey among patients who were treated at a hospital and received postoperative care. Let’s say your first research question is:

“What do hospital patients over age 55 think about postoperative care?”

This can actually be represented as a heading within your Results section, though it might be presented as a statement rather than a question:

Attitudes towards postoperative care in patients over the age of 55

Now present the results that address this specific research question first. In this case, perhaps a table illustrating data from a survey. Likert items can be included in this example. Tables can also present standard deviations, probabilities, correlation matrices, etc.

Following this, present a content analysis, in words, of one end of the spectrum of the survey or data table. In our example case, start with the POSITIVE survey responses regarding postoperative care, using descriptive phrases. For example:

“Sixty-five percent of patients over 55 responded positively to the question ‘Are you satisfied with your hospital’s postoperative care?’ (Fig. 2).”

Include other results such as subcategory analyses. The amount of textual description used will depend on how much interpretation of tables and figures is necessary and how many examples the reader needs in order to understand the significance of your research findings.

Next, present a content analysis of another part of the spectrum of the same research question, perhaps the NEGATIVE or NEUTRAL responses to the survey. For instance:

  “As Figure 1 shows, 15 out of 60 patients in Group A responded negatively to Question 2.”

After you have assessed the data in one figure and explained it sufficiently, move on to your next research question. For example:

  “How does patient satisfaction correspond to in-hospital improvements made to postoperative care?”

This kind of data may be presented through a figure or set of figures (for instance, a paired t-test table).

Explain the data you present, here in a table, with a concise content analysis:

“The p-value for the comparison between the before and after groups of patients was .03 (Fig. 2), indicating a statistically significant difference in patient satisfaction before and after the improvements made to postoperative care.”

Let’s examine another example of a Results section from a study on plant tolerance to heavy metal stress. In the Introduction section, the aims of the study are presented as “determining the physiological and morphological responses of Allium cepa L. towards increased cadmium toxicity” and “evaluating its potential to accumulate the metal and its associated environmental consequences.” The Results section presents data showing how these aims are achieved in tables alongside a content analysis, beginning with an overview of the findings:

“Cadmium caused inhibition of root and leave elongation, with increasing effects at higher exposure doses (Fig. 1a-c).”

The figure containing this data is cited in parentheses. Note that this author has combined three graphs into one single figure. Separating the data into separate graphs focusing on specific aspects makes it easier for the reader to assess the findings, and consolidating this information into one figure saves space and makes it easy to locate the most relevant results.

Following this overall summary, the relevant data in the tables is broken down into greater detail in text form in the Results section.

  • “Results on the bio-accumulation of cadmium were found to be the highest (17.5 mg kg⁻¹) in the bulb, when the concentration of cadmium in the solution was 1×10⁻² M and lowest (0.11 mg kg⁻¹) in the leaves when the concentration was 1×10⁻³ M.”

Captioning and Referencing Tables and Figures

Tables and figures are central components of your Results section and you need to carefully think about the most effective way to use graphs and tables to present your findings. Therefore, it is crucial to know how to write strong figure captions and to refer to them within the text of the Results section.

The most important advice one can give here as well as throughout the paper is to check the requirements and standards of the journal to which you are submitting your work. Every journal has its own design and layout standards, which you can find in the author instructions on the target journal’s website. Perusing a journal’s published articles will also give you an idea of the proper number, size, and complexity of your figures.

Regardless of which format you use, the figures should be placed in the order they are referenced in the Results section and be as clear and easy to understand as possible. If there are multiple variables being considered (within one or more research questions), it can be a good idea to split these up into separate figures. Subsequently, these can be referenced and analyzed under separate headings and paragraphs in the text.

To create a caption, consider the research question being asked and change it into a phrase. For instance, if one question is “Which color did participants choose?”, the caption might be “Color choice by participant group.” Or in our last research paper example, where the question was “What is the concentration of cadmium in different parts of the onion after 14 days?” the caption reads:

 “Fig. 1(a-c): Mean concentration of Cd determined in (a) bulbs, (b) leaves, and (c) roots of onions after a 14-day period.”

Steps for Composing the Results Section

Because each study is unique, there is no one-size-fits-all approach when it comes to designing a strategy for structuring and writing the section of a research paper where findings are presented. The content and layout of this section will be determined by the specific area of research, the design of the study and its particular methodologies, and the guidelines of the target journal and its editors. However, the following steps can be used to compose the results of most scientific research studies and are essential for researchers who are new to preparing a manuscript for publication or who need a reminder of how to construct the Results section.

Step 1 : Consult the guidelines or instructions that the target journal or publisher provides authors and read research papers it has published, especially those with similar topics, methods, or results to your study.

  • The guidelines will generally outline specific requirements for the results or findings section, and the published articles will provide sound examples of successful approaches.
  • Note length limitations and restrictions on content. For instance, while many journals require the Results and Discussion sections to be separate, others do not; qualitative research papers often include results and interpretations in the same section (“Results and Discussion”).
  • Reading the aims and scope in the journal’s “guide for authors” section and understanding the interests of its readers will be invaluable in preparing to write the Results section.

Step 2 : Consider your research results in relation to the journal’s requirements and catalogue your results.

  • Focus on experimental results and other findings that are especially relevant to your research questions and objectives and include them even if they are unexpected or do not support your ideas and hypotheses.
  • Catalogue your findings—use subheadings to streamline and clarify your report. This will help you avoid excessive and peripheral details as you write and also help your reader understand and remember your findings. Create appendices that might interest specialists but prove too long or distracting for other readers.
  • Decide how you will structure your results. You might match the order of the research questions and hypotheses to your results, or you could arrange them according to the order presented in the Methods section. A chronological order or even a hierarchy of importance or meaningful grouping of main themes or categories might prove effective. Consider your audience, evidence, and most importantly, the objectives of your research when choosing a structure for presenting your findings.

Step 3 : Design figures and tables to present and illustrate your data.

  • Tables and figures should be numbered according to the order in which they are mentioned in the main text of the paper.
  • Information in figures should be relatively self-explanatory (with the aid of captions), and their design should include all definitions and other information necessary for readers to understand the findings without reading all of the text.
  • Use tables and figures as a focal point to tell a clear and informative story about your research and avoid repeating information. But remember that while figures clarify and enhance the text, they cannot replace it.

Step 4 : Draft your Results section using the findings and figures you have organized.

  • The goal is to communicate this complex information as clearly and precisely as possible; precise and compact phrases and sentences are most effective.
  • In the opening paragraph of this section, restate your research questions or aims to focus the reader’s attention to what the results are trying to show. It is also a good idea to summarize key findings at the end of this section to create a logical transition to the interpretation and discussion that follows.
  • Try to write in the past tense and the active voice to relay the findings since the research has already been done and the agent is usually clear. This will ensure that your explanations are also clear and logical.
  • Make sure that any specialized terminology or abbreviation you have used here has been defined and clarified in the Introduction section.

Step 5 : Review your draft; edit and revise until it reports results exactly as you would like to have them reported to your readers.

  • Double-check the accuracy and consistency of all the data, as well as all of the visual elements included.
  • Read your draft aloud to catch language errors (grammar, spelling, and mechanics), awkward phrases, and missing transitions.
  • Ensure that your results are presented in the best order to focus on objectives and prepare readers for interpretations, evaluations, and recommendations in the Discussion section. Look back over the paper’s Introduction and background while anticipating the Discussion and Conclusion sections to ensure that the presentation of your results is consistent and effective.
  • Consider seeking additional guidance on your paper. Find additional readers to look over your Results section and see if it can be improved in any way. Peers, professors, or qualified experts can provide valuable insights.

One excellent option is to use a professional English proofreading and editing service such as Wordvice, including our paper editing service. With hundreds of qualified editors from dozens of scientific fields, Wordvice has helped thousands of authors revise their manuscripts and get accepted into their target journals. Read more about the proofreading and editing process before proceeding with getting academic editing services and manuscript editing services for your manuscript.

As the representation of your study’s data output, the Results section presents the core information in your research paper. By writing with clarity and conciseness and by highlighting and explaining the crucial findings of their study, authors increase the impact and effectiveness of their research manuscripts.

For more articles and videos on writing your research manuscript, visit Wordvice’s Resources page.



Qualitative Data Analysis

23 Presenting the Results of Qualitative Analysis

Mikaila Mariel Lemonik Arthur

Qualitative research is not finished just because you have determined the main findings or conclusions of your study. Indeed, disseminating the results is an essential part of the research process. By sharing your results with others, whether in written form as a scholarly paper or an applied report or in some alternative format like an oral presentation, an infographic, or a video, you ensure that your findings become part of the ongoing conversation of scholarship in your field, forming part of the foundation for future researchers. This chapter provides an introduction to writing about qualitative research findings. It will outline how writing continues to contribute to the analysis process, what concerns researchers should keep in mind as they draft their presentations of findings, and how best to organize qualitative research writing.

As you move through the research process, it is essential to keep yourself organized. Organizing your data, memos, and notes aids both the analytical and the writing processes. Whether you use electronic or physical, real-world filing and organizational systems, these systems help you make sense of the mountains of data you have and ensure that you focus your attention on the themes and ideas you have determined are important (Warren and Karner 2015). Be sure that you have kept detailed notes on all of the decisions you have made and procedures you have followed in carrying out research design, data collection, and analysis, as these will guide your ultimate write-up.

First and foremost, researchers should keep in mind that writing is in fact a form of thinking. Writing is an excellent way to discover ideas and arguments and to further develop an analysis. As you write, more ideas will occur to you, things that were previously confusing will start to make sense, and arguments will take a clear shape rather than being amorphous and poorly-organized. However, writing-as-thinking cannot be the final version that you share with others. Good-quality writing does not display the workings of your thought process. It is reorganized and revised (more on that later) to present the data and arguments important in a particular piece. And revision is totally normal! No one expects the first draft of a piece of writing to be ready for prime time. So write rough drafts and memos and notes to yourself and use them to think, and then revise them until the piece is the way you want it to be for sharing.

Bergin (2018) lays out a set of key concerns for appropriate writing about research. First, present your results accurately, without exaggerating or misrepresenting. It is very easy to overstate your findings by accident if you are enthusiastic about what you have found, so it is important to take care and use appropriate cautions about the limitations of the research. You also need to work to ensure that you communicate your findings in a way people can understand, using clear and appropriate language that is adjusted to the level of those you are communicating with. And you must be clear and transparent about the methodological strategies employed in the research. Remember, the goal is, as much as possible, to describe your research in a way that would permit others to replicate the study. There are a variety of other concerns and decision points that qualitative researchers must keep in mind, including the extent to which to include quantification in their presentation of results, ethics, considerations of audience and voice, and how to bring the richness of qualitative data to life.

Quantification, as you have learned, refers to the process of turning data into numbers. It can indeed be very useful to count and tabulate quantitative data drawn from qualitative research. For instance, if you were doing a study of dual-earner households and wanted to know how many had an equal division of household labor and how many did not, you might want to count those numbers up and include them as part of the final write-up. However, researchers need to take care when they are writing about quantified qualitative data. Qualitative data is not as generalizable as quantitative data, so quantification can be very misleading. Thus, qualitative researchers should strive to use raw numbers instead of the percentages that are more appropriate for quantitative research. Writing, for instance, “15 of the 20 people I interviewed prefer pancakes to waffles” is a simple description of the data; writing “75% of people prefer pancakes” suggests a generalizable claim that is not likely supported by the data. Note that mixing numbers with qualitative data is really a type of mixed-methods approach. Mixed-methods approaches are good, but sometimes they seduce researchers into focusing on the persuasive power of numbers and tables rather than capitalizing on the inherent richness of their qualitative data.
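The pancake example above can be made concrete with a short sketch. The Python snippet below is illustrative only: the helper names `describe_count` and `as_percentage` and the wording of the output are assumptions introduced here, not part of any reporting standard. It shows that the same data can be phrased either way, and it produces the raw-count wording recommended for qualitative write-ups.

```python
def describe_count(count: int, total: int, group: str, claim: str) -> str:
    """Return a raw-count description of a finding.

    Hypothetical helper for illustration; it simply formats the numbers
    without converting them to a percentage.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must be between 0 and total")
    return f"{count} of the {total} {group} {claim}"


def as_percentage(count: int, total: int) -> float:
    """Compute the percentage form, shown only for comparison."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


# Raw-count wording: a simple description of the data.
sentence = describe_count(15, 20, "people I interviewed", "prefer pancakes to waffles")
print(sentence)

# The same data as a percentage, which can read as a generalizable claim.
print(as_percentage(15, 20))
```

The design choice here is to keep the sample size visible in the sentence itself, so readers can judge for themselves how far the finding might extend.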

A variety of issues of scholarly ethics and research integrity are raised by the writing process. Some of these are unique to qualitative research, while others are more universal concerns for all academic and professional writing. For example, it is essential to avoid plagiarism and misuse of sources. All quotations that appear in a text must be properly cited, whether with in-text and bibliographic citations to the source or with an attribution to the research participant (or the participant’s pseudonym or description in order to protect confidentiality) who said those words. Where writers will paraphrase a text or a participant’s words, they need to make sure that the paraphrase they develop accurately reflects the meaning of the original words. Thus, some scholars suggest that participants should have the opportunity to read (or to have read to them, if they cannot read the text themselves) all sections of the text in which they, their words, or their ideas are presented to ensure accuracy and enable participants to maintain control over their lives.

Audience and Voice

When writing, researchers must consider their audience(s) and the effects they want their writing to have on these audiences. The designated audience will dictate the voice used in the writing, or the individual style and personality of a piece of text. Keep in mind that the potential audience for qualitative research is often much more diverse than that for quantitative research because of the accessibility of the data and the extent to which the writing can be accessible and interesting. Yet individual pieces of writing are typically pitched to a more specific subset of the audience.

Let us consider one potential research study, an ethnography involving participant-observation of the same children both when they are at a daycare facility and when they are at home with their families to try to understand how daycare might impact behavior and social development. The findings of this study might be of interest to a wide variety of potential audiences: academic peers, whether at your own academic institution, in your broader discipline, or multidisciplinary; people responsible for creating laws and policies; practitioners who run or teach at daycare centers; and the general public, including both people who are interested in child development more generally and those who are themselves parents making decisions about child care for their own children. And the way you write for each of these audiences will be somewhat different. Take a moment and think through what some of these differences might look like.

If you are writing to academic audiences, using specialized academic language and working within the typical constraints of scholarly genres, as will be discussed below, can be an important part of convincing others that your work is legitimate and should be taken seriously. Your writing will be formal. Even if you are writing for students and faculty you already know (your classmates, for instance), you are often asked to imitate the style of academic writing that is used in publications, as this is part of learning to become part of the scholarly conversation. When speaking to academic audiences outside your discipline, you may need to be more careful about jargon and specialized language, as disciplines do not always share the same key terms. For instance, in sociology, scholars use the term diffusion to refer to the way new ideas or practices spread from organization to organization. In the field of international relations, scholars often use the term cascade to refer to the way ideas or practices spread from nation to nation. These terms describe what is fundamentally the same concept, but they are different terms, and a scholar from one field might have no idea what a scholar from a different field is talking about! Therefore, while the formality and academic structure of the text would stay the same, a writer with a multidisciplinary audience might need to pay more attention to defining their terms in the body of the text.

It is not only other academic scholars who expect to see formal writing. Policymakers tend to expect formality when ideas are presented to them, as well. However, the content and style of the writing will be different. Much less academic jargon should be used, and the most important findings and policy implications should be emphasized right from the start rather than initially focusing on prior literature and theoretical models as you might for an academic audience. Long discussions of research methods should also be minimized. Similarly, when you write for practitioners, the findings and implications for practice should be highlighted. The reading level of the text will vary depending on the typical background of the practitioners to whom you are writing—you can make very different assumptions about the general knowledge and reading abilities of a group of hospital medical directors with MDs than you can about a group of case workers who have a post-high-school certificate. Consider the primary language of your audience as well. The fact that someone can get by in spoken English does not mean they have the vocabulary or English reading skills to digest a complex report. But the fact that someone’s vocabulary is limited says little about their intellectual abilities, so try your best to convey the important complexity of the ideas and findings from your research without dumbing them down—even if you must limit your vocabulary usage.

When writing for the general public, you will want to move even further towards emphasizing key findings and policy implications, but you also want to draw on the most interesting aspects of your data. General readers will read sociological texts that are rich with ethnographic or other kinds of detail; it is almost like reality television on a page! And this is a contrast to busy policymakers and practitioners, who probably want to learn the main findings as quickly as possible so they can go about their busy lives. But also keep in mind that there is a wide variation in reading levels. Journalists at publications pegged to the general public are often advised to write at about a tenth-grade reading level, which would leave most of the specialized terminology we develop in our research fields out of reach. If you want to be accessible to even more people, your vocabulary must be even more limited. The excellent exercise of trying to write using the 1,000 most common English words, available at the Up-Goer Five website (https://www.splasho.com/upgoer5/), does a good job of illustrating this challenge (Sanderson n.d.).

Another element of voice is whether to write in the first person. While many students are instructed to avoid the use of the first person in academic writing, this advice needs to be taken with a grain of salt. There are indeed many contexts in which the first person is best avoided, at least as long as writers can find ways to build strong, comprehensible sentences without its use, including most quantitative research writing. However, if the alternative to using the first person is crafting a sentence like “it is proposed that the researcher will conduct interviews,” it is preferable to write “I propose to conduct interviews.” In qualitative research, in fact, the use of the first person is far more common. This is because the researcher is central to the research project. Qualitative researchers can themselves be understood as research instruments, and thus eliminating the use of the first person in writing is in a sense eliminating information about the conduct of the researchers themselves.

But the question really extends beyond the issue of first-person or third-person. Qualitative researchers have choices about how and whether to foreground themselves in their writing, not just in terms of using the first person, but also in terms of whether to emphasize their own subjectivity and reflexivity, their impressions and ideas, and their role in the setting. In contrast, conventional quantitative research in the positivist tradition really tries to eliminate the author from the study—which indeed is exactly why typical quantitative research avoids the use of the first person. Keep in mind that emphasizing researchers’ roles and reflexivity and using the first person does not mean crafting articles that provide overwhelming detail about the author’s thoughts and practices. Readers do not need to hear, and should not be told, which database you used to search for journal articles, how many hours you spent transcribing, or whether the research process was stressful—save these things for the memos you write to yourself. Rather, readers need to hear how you interacted with research participants, how your standpoint may have shaped the findings, and what analytical procedures you carried out.

Making Data Come Alive

One of the most important parts of writing about qualitative research is presenting the data in a way that makes its richness and value accessible to readers. As the discussion of analysis in the prior chapter suggests, there are a variety of ways to do this. Researchers may select key quotes or images to illustrate points, write up specific case studies that exemplify their argument, or develop vignettes (little stories) that illustrate ideas and themes, all drawing directly on the research data. Researchers can also write more lengthy summaries, narratives, and thick descriptions.

Nearly all qualitative work includes quotes from research participants or documents to some extent, though ethnographic work may focus more on thick description than on relaying participants’ own words. When quotes are presented, they must be explained and interpreted—they cannot stand on their own. This is one of the ways in which qualitative research can be distinguished from journalism. Journalism presents what happened, but social science needs to present the “why,” and the why is best explained by the researcher.

So how do authors go about integrating quotes into their written work? Julie Posselt (2017), a sociologist who studies graduate education, provides a set of instructions. First of all, authors need to remain focused on the core questions of their research, and avoid getting distracted by quotes that are interesting or attention-grabbing but not so relevant to the research question. Selecting the right quotes, those that illustrate the ideas and arguments of the paper, is an important part of the writing process. Second, not all quotes should be the same length (just like not all sentences or paragraphs in a paper should be the same length). Include some quotes that are just phrases, others that are a sentence or so, and others that are longer. We call longer quotes, generally those more than about three lines long, block quotes, and they are typically indented on both sides to set them off from the surrounding text. For all quotes, be sure to summarize what the quote should be telling or showing the reader, connect this quote to other quotes that are similar or different, and provide transitions in the discussion to move from quote to quote and from topic to topic. Especially for longer quotes, it is helpful to do some of this writing before the quote to preview what is coming and other writing after the quote to make clear what readers should have come to understand. Remember, it is always the author’s job to interpret the data. Presenting excerpts of the data, like quotes, in a form the reader can access does not minimize the importance of this job. Be sure that you are explaining the meaning of the data you present.

A few more notes about writing with quotes: avoid patchwriting, whether in your literature review or the section of your paper in which quotes from respondents are presented. Patchwriting is a writing practice wherein the author lightly paraphrases original texts but stays so close to those texts that there is little the author has added. Sometimes, this even takes the form of presenting a series of quotes, properly documented, with nothing much in the way of text generated by the author. A patchwriting approach does not move the scholarly conversation forward, as it does not represent any kind of new contribution on the part of the author. It is of course fine to paraphrase quotes, as long as the meaning is not changed. But if you use direct quotes, do not edit the text of the quotes unless the edits leave the meaning unchanged and you have made clear through the use of ellipses (…) and brackets ([]) what kinds of edits have been made. For example, consider this exchange from Matthew Desmond’s (2012:1317) research on evictions:

The thing was, I wasn’t never gonna let Crystal come and stay with me from the get go. I just told her that to throw her off. And she wasn’t fittin’ to come stay with me with no money…No. Nope. You might as well stay in that shelter.

A paraphrase of this exchange might read “She said that she was not going to let Crystal stay with her if Crystal did not have any money.” Paraphrases like that are fine. What is not fine is rewording the statement but treating it like a quote, for instance writing:

The thing was, I was not going to let Crystal come and stay with me from beginning. I just told her that to throw her off. And it was not proper for her to come stay with me without any money…No. Nope. You might as well stay in that shelter.

But as you can see, the change in language and style removes some of the distinct meaning of the original quote. Instead, writers should leave as much of the original language as possible. If some text in the middle of the quote needs to be removed, as in this example, ellipses are used to show that this has occurred. And if a word needs to be added to clarify, it is placed in square brackets to show that it was not part of the original quote.

Data can also be presented through the use of data displays like tables, charts, graphs, diagrams, and infographics created for publication or presentation, as well as through the use of visual material collected during the research process. Note that if visuals are used, the author must have the legal right to use them. Photographs or diagrams created by the author themselves, or by research participants who have signed consent forms for their work to be used, are fine. But photographs, and sometimes even excerpts from archival documents, may be owned by others from whom researchers must get permission in order to use them.

A large percentage of qualitative research does not include any data displays or visualizations. Therefore, researchers should carefully consider whether the use of data displays will help the reader understand the data. One of the most common types of data display used by qualitative researchers is the simple table. These might include tables summarizing key data about cases included in the study; tables laying out the characteristics of different taxonomic elements or types developed as part of the analysis; tables counting the incidence of various elements; and 2×2 tables (two columns and two rows) illuminating a theory. Basic network or process diagrams are also commonly included. If data displays are used, it is essential that researchers include context and analysis alongside data displays rather than letting them stand by themselves, and it is preferable to continue to present excerpts and examples from the data rather than just relying on summaries in the tables.
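As one illustration of a table counting the incidence of various elements, the following Python sketch tallies how many cases each code appears in. Everything in it is hypothetical: the interview labels, the code names, and the helper functions `incidence_table` and `format_table` are invented for this example and do not come from any particular study or software package.

```python
from collections import Counter

# Hypothetical coded data: each case maps to the set of codes applied to it.
coded_cases = {
    "Interview 1": {"equal division", "paid help"},
    "Interview 2": {"unequal division"},
    "Interview 3": {"equal division"},
    "Interview 4": {"unequal division", "paid help"},
}


def incidence_table(cases):
    """Count the number of cases in which each code appears at least once."""
    counts = Counter()
    for codes in cases.values():
        counts.update(set(codes))
    return counts


def format_table(counts, total_cases):
    """Render the counts as a plain-text table using raw numbers."""
    lines = [f"{'Code':<20}Cases (of {total_cases})"]
    # Sort by descending count, then alphabetically for a stable order.
    for code, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        lines.append(f"{code:<20}{n}")
    return "\n".join(lines)


counts = incidence_table(coded_cases)
print(format_table(counts, len(coded_cases)))
```

A table like this would still need surrounding context and interpretation in the text, along with excerpts from the data, rather than standing by itself.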

If you will be using graphs, infographics, or other data visualizations, it is important that you attend to making them useful and accurate (Bergin 2018). Think about the viewer or user as your audience and ensure the data visualizations will be comprehensible. You may need to include more detail or labels than you might think. Ensure that data visualizations are laid out and labeled clearly and that you make visual choices that enhance viewers’ ability to understand the points you intend to communicate using the visual in question. Finally, given the ease with which it is possible to design visuals that are deceptive or misleading, it is essential to make ethical and responsible choices in the construction of visualizations so that viewers will interpret them in accurate ways.

The Genre of Research Writing

As discussed above, the style and format in which results are presented depend on the audience they are intended for. These differences in style and format are part of the genre of writing. Genre is a term referring to the rules of a specific form of creative or productive work. Thus, the academic journal article (and student papers based on this form) is one genre. A report or policy paper is another. The discussion below will focus on the academic journal article, but note that reports and policy papers follow somewhat different formats. They might begin with an executive summary of one or a few pages, include minimal background, focus on key findings, and conclude with policy implications, shifting methods and details about the data to an appendix. But academic journal articles and policy papers share some things in common, for instance the necessity for clear writing, a well-organized structure, and the use of headings.

So what factors make up the genre of the academic journal article in sociology? While there is some flexibility, particularly for ethnographic work, academic journal articles tend to follow a fairly standard format. They begin with a “title page” that includes the article title (often witty and involving scholarly inside jokes, but more importantly clearly describing the content of the article); the authors’ names and institutional affiliations; an abstract; and sometimes keywords designed to help others find the article in databases. An abstract is a short summary of the article that appears both at the very beginning of the article and in search databases. Abstracts are designed to aid readers by giving them the opportunity to learn enough about an article that they can determine whether it is worth their time to read the complete text. They are written about the article, and thus not in the first person, and clearly summarize the research question, methodological approach, main findings, and often the implications of the research.

After the abstract comes an “introduction” of a page or two that details the research question, why it matters, and what approach the paper will take. This is followed by a literature review of about a quarter to a third the length of the entire paper. The literature review is often divided, with headings, into topical subsections, and is designed to provide a clear, thorough overview of the prior research literature on which a paper has built—including prior literature the new paper contradicts. At the end of the literature review it should be made clear what researchers know about the research topic and question, what they do not know, and what this new paper aims to do to address what is not known.

The next major section of the paper is the section that describes research design, data collection, and data analysis, often referred to as “research methods” or “methodology.” This section is an essential part of any written or oral presentation of your research. Here, you tell your readers or listeners “how you collected and interpreted your data” (Taylor, Bogdan, and DeVault 2016:215). Taylor, Bogdan, and DeVault suggest that the discussion of your research methods include the following:

  • The particular approach to data collection used in the study;
  • Any theoretical perspective(s) that shaped your data collection and analytical approach;
  • When the study occurred, over how long, and where (concealing identifiable details as needed);
  • A description of the setting and participants, including sampling and selection criteria (if an interview-based study, the number of participants should be clearly stated);
  • The researcher’s perspective in carrying out the study, including relevant elements of their identity and standpoint, as well as their role (if any) in research settings; and
  • The approach to analyzing the data.

After the methods section comes a section, variously titled but often called “data,” that takes readers through the analysis. This section is where the thick description narrative; the quotes, broken up by theme or topic, with their interpretation; the discussions of case studies; most data displays (other than perhaps those outlining a theoretical model or summarizing descriptive data about cases); and other similar material appear. The idea of the data section is to give readers the ability to see the data for themselves and to understand how this data supports the ultimate conclusions. Note that all tables and figures included in formal publications should be titled and numbered.

At the end of the paper come one or two summary sections, often called “discussion” and/or “conclusion.” If there is a separate discussion section, it will focus on exploring the overall themes and findings of the paper. The conclusion clearly and succinctly summarizes the findings and conclusions of the paper, the limitations of the research and analysis, any suggestions for future research building on the paper or addressing these limitations, and implications, be they for scholarship and theory or policy and practice.

After the end of the textual material in the paper comes the bibliography, typically called “works cited” or “references.” The references should appear in a consistent citation style—in sociology, we often use the American Sociological Association format (American Sociological Association 2019), but other formats may be used depending on where the piece will eventually be published. Care should be taken to ensure that in-text citations also reflect the chosen citation style. In some papers, there may be an appendix containing supplemental information such as a list of interview questions or an additional data visualization.

Note that when researchers give presentations to scholarly audiences, the presentations typically follow a format similar to that of scholarly papers, though given time limitations they are compressed. Abstracts and works cited are often not part of the presentation, though in-text citations are still used. The literature review presented will be shortened to only focus on the most important aspects of the prior literature, and only key examples from the discussion of data will be included. For long or complex papers, sometimes only one of several findings is the focus of the presentation. Of course, presentations for other audiences may be constructed differently, with greater attention to interesting elements of the data and findings as well as implications and less to the literature review and methods.

Concluding Your Work

After you have written a complete draft of the paper, be sure you take the time to revise and edit your work. There are several important strategies for revision. First, put your work away for a little while. Even waiting a day to revise is better than nothing, but it is best, if possible, to take much more time away from the text. This helps you forget what your writing looks like and makes it easier to find errors, mistakes, and omissions. Second, show your work to others. Ask them to read your work and critique it, pointing out places where the argument is weak, where you may have overlooked alternative explanations, where the writing could be improved, and what else you need to work on. Finally, read your work out loud to yourself (or, if you really need an audience, try reading to some stuffed animals). Reading out loud helps you catch wrong words, tricky sentences, and many other issues. But as important as revision is, try to avoid perfectionism in writing (Warren and Karner 2015). Writing can always be improved, no matter how much time you spend on it. Those improvements, however, have diminishing returns, and at some point the writing process needs to conclude so the writing can be shared with the world.

Of course, the main goal of writing up the results of a research project is to share with others. Thus, researchers should be considering how they intend to disseminate their results. What conferences might be appropriate? Where can the paper be submitted? Note that if you are an undergraduate student, there are a wide variety of journals that accept and publish research conducted by undergraduates. Some publish across disciplines, while others are specific to disciplines. Other work, such as reports, may be best disseminated by publication online on relevant organizational websites.

After a project is completed, be sure to take some time to organize your research materials and archive them for longer-term storage. Some Institutional Review Board (IRB) protocols require that original data, such as interview recordings, transcripts, and field notes, be preserved for a specific number of years in a protected (locked for paper or password-protected for digital) form and then destroyed, so be sure that your plans adhere to the IRB requirements. Be sure you keep any materials that might be relevant for future related research or for answering questions people may ask later about your project.

And then what? Well, then it is time to move on to your next research project. Research is a long-term endeavor, not a one-time-only activity. We build our skills and our expertise as we continue to pursue research. So keep at it.

  • Find a short article that uses qualitative methods. The sociological magazine Contexts is a good place to find such pieces. Write an abstract of the article.
  • Choose a sociological journal article on a topic you are interested in that uses some form of qualitative methods and is at least 20 pages long. Rewrite the article as a five-page research summary accessible to non-scholarly audiences.
  • Choose a concept or idea you have learned in this course and write an explanation of it using the Up-Goer Five Text Editor (https://www.splasho.com/upgoer5/), a website that restricts your writing to the 1,000 most common English words. What was this experience like? What did it teach you about communicating with people who have a more limited English-language vocabulary, and what did it teach you about the utility of having access to complex academic language?
  • Select five or more sociological journal articles that all use the same basic type of qualitative methods (interviewing, ethnography, documents, or visual sociology). Using what you have learned about coding, code the methods sections of each article, and use your coding to figure out what is common in how such articles discuss their research design, data collection, and analysis methods.
  • Return to an exercise you completed earlier in this course and revise your work. What did you change? How did revising impact the final product?
  • Find a quote from the transcript of an interview, a social media post, or elsewhere that has not yet been interpreted or explained. Write a paragraph that includes the quote along with an explanation of its sociological meaning or significance.

The style or personality of a piece of writing, including such elements as tone, word choice, syntax, and rhythm.

A quotation, usually one of some length, which is set off from the main text by being indented on both sides rather than being placed in quotation marks.

A classification of written or artistic work based on form, content, and style.

A short summary of a text written from the perspective of a reader rather than from the perspective of an author.

Social Data Analysis Copyright © 2021 by Mikaila Mariel Lemonik Arthur is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

The Difference Between Analysis & Findings in a Research Paper

25 jun 2018.

The exact format of a research paper varies across disciplines, but they share certain features in common. They have the following sections, which may have different names in different fields: introduction, literature review (these first two are often combined), methodology, data analysis, results or findings, discussion and conclusion. These last two are also often combined into one section.

1 Basic Description of Analysis and Findings

In the analysis section, you describe what you did with your data. If it is a quantitative paper, this will include details of statistical procedures. If it is a qualitative paper, it may include a SWOT analysis, which looks at the strengths, weaknesses, opportunities and threats revealed by the data. For example, a SWOT analysis can be used in business applications to choose a future business path based on an assessment of the current situation. In the findings or results section, you report what the analysis revealed, but only the factual matter of the results, not their implications or meaning. The findings are the answers to your research questions that the analysis produced.

2 What is Needed to Write the Analysis and Findings Sections

To write the analysis section, you need to know what the analysis findings are. You do not necessarily need the specific data unless the analysis changed as a result of looking at that data. To write the findings section, you need to have already performed the data analysis.

3 Who Should Write the Analysis and Findings Sections

If the paper has more than one author, as many research papers do, then different people may write the analysis and findings sections. The author who writes the analysis section should be knowledgeable about the methods used. If it is a quantitative paper, he or she may be a statistician or data analyst. The author who writes the findings section should be knowledgeable about the way findings in the field are reported. The author of the findings section will often also be the lead author of the paper.

4 Style of the Analysis and Findings Sections

The analysis section often includes a justification of the methods used. As it is often technical in nature, it may be skipped by many readers. By contrast, the findings section is purely descriptive and should be easily understood by all members of the paper's targeted audience. The findings section is usually written in the past tense and should be clear and concise enough for that audience to understand the reported results. Consulting the appropriate style guide for your research paper and reading the corresponding sections of similar papers are two ways to guide the writing of these sections.

About the Author

Peter Flom is a statistician and a learning-disabled adult. He has been writing for many years and has been published in many academic journals in fields such as psychology, drug addiction, epidemiology and others. He holds a Ph.D. in psychometrics from Fordham University.

© 2020 Leaf Group Ltd. / Leaf Group Media, All Rights Reserved. Based on the Word Net lexical database for the English Language. See disclaimer .

College & Research Libraries ( C&RL ) is the official, bi-monthly, online-only scholarly research journal of the Association of College & Research Libraries, a division of the American Library Association.

Leo S. Lo is Dean, College of University Libraries and Learning Sciences at the University of New Mexico, email: [email protected] .

Evaluating AI Literacy in Academic Libraries: A Survey Study with a Focus on U.S. Employees

Leo S. Lo *

This survey investigates artificial intelligence (AI) literacy among academic library employees, predominantly in the United States, with a total of 760 respondents. The findings reveal a modest self-rated understanding of AI concepts, limited hands-on experience with AI tools, and notable gaps in discussing ethical implications and collaborating on AI projects. Despite recognizing the benefits, readiness for implementation appears low among participants. Respondents emphasize the need for comprehensive training and the establishment of ethical guidelines. The study proposes a framework defining core components of AI literacy tailored for libraries. The results offer insights to guide professional development and policy formulation as libraries increasingly integrate AI into their services and operations.

Introduction

In a world increasingly dictated by algorithms, artificial intelligence (AI) is not merely a technological phenomenon; it is a transformative force that redefines our intellectual, social, and professional landscapes (McKinsey and Company, 2023). The rapid integration of AI in our everyday lives has profound implications for higher education, a sector entrusted with preparing individuals to navigate, contribute to, and thrive in this AI-driven era. From personalized learning environments to automated administrative tasks, AI’s influence in higher education is omnipresent and its potential boundless. However, this potential can only be harnessed effectively if those at the frontline of academia (our educators, researchers, administrators, and, notably, academic library employees) are equipped with the necessary AI literacy (UNESCO, 2021). Without an understanding of AI’s principles, capabilities, and ethical considerations, higher education risks falling prey to AI’s pitfalls rather than leveraging its benefits.

The potential risks and benefits underscore a pressing need to scrutinize and elevate AI literacy within the higher education community—a task that begins with understanding its current state. As facilitators of information and knowledge, academic library employees stand at the crossroads of this AI revolution, making their AI literacy an imperative, not a choice, for the future of higher education.

AI Literacy: Context and Background

In an era marked by exponential growth in digital technology, the concept of literacy has evolved beyond traditional reading and writing skills to encompass a wide array of digital competencies. One such competency, which is gaining critical importance in higher education, is AI literacy. With AI systems beginning to permeate every facet of university operations—from learning management systems to research analytics—the ability to understand and navigate these AI tools has become an essential skill for academic library employees.

AI literacy, a subset of digital literacy, specifically pertains to understanding AI’s principles, applications, and ethical considerations. It involves not only the ability to use AI tools effectively, but also the capacity to evaluate their outputs critically, to understand their underlying mechanisms, and to contemplate their ethical and societal implications. AI literacy is not just for computer professionals; as Lo (2023b) and Cetindamar et al. (2022) emphasize, operationalizing AI literacy for non-specialists is essential.

The significance of AI literacy in higher education is underscored by several contemporary trends and challenges. Companies and governments globally are engaged in fierce competition to stay at the forefront of AI integration. Concurrently, the rapid proliferation of AI is giving rise to a host of ethical and privacy concerns that require informed stewardship (Cox, 2022). Furthermore, the COVID-19 pandemic has accelerated the digital transformation of higher education, leading to an increased reliance on AI technologies for remote learning and operations. This reliance further points to the necessity of AI literacy among academic library employees, who play a pivotal role in facilitating online learning and research.

As artificial intelligence proliferates across higher education, developing AI literacy is increasingly recognized as a priority to prepare students, faculty, staff, and administrators to harness AI’s potential, while mitigating risks (Ng et al., 2021). Hervieux and Wheatley’s (2021) 2019 study (n=163) found that academic librarians require more training regarding artificial intelligence and its potential applications in libraries. The U.S. Department of Education’s recent report (2023) on AI emphasizes the growing importance of AI literacy for educators and students, highlighting the necessity of understanding and integrating AI technologies in educational settings. This report aligns with the broader discourse on AI literacy and emphasizes the need to equip library professionals with skills needed to evaluate and utilize AI tools effectively (Lo, 2023a).

While efforts to promote AI literacy are growing, the required content for different target groups remains ambiguous. Some promising measurement tools have been proposed, such as Pinski and Benlian’s (2023) multidimensional scale assessing perceived knowledge of AI technology, processes, collaboration, and design. However, further validation of AI literacy assessments is required. Developing rigorous definitions and measurements is crucial for implementing effective AI literacy initiatives.

Ridley and Pawlick-Potts (2021) put forth the concept of algorithmic literacy, involving understanding algorithms and their influence, recognizing their uses, assessing their impacts, and positioning individuals as active agents rather than passive recipients of algorithmic decision-making. They propose libraries can contribute to algorithmic literacy by integrating it into information literacy education and supporting explainable AI.

Ocaña-Fernández et al. (2019) argued that changes to curricula and skills training are critical to prepare students and faculty for an AI future, though they also warned about digital inequality issues. Laupichler et al.’s (2022) scoping review reveals that efforts to teach foundational AI literacy to non-specialists are still in formative stages. Proposed essential skills vary considerably across frameworks, and robust evaluations of AI literacy programs are lacking. Findings indicate that carefully designed AI literacy courses show promise for knowledge gains; however, research substantiating appropriate frameworks, core competencies and effective instructional approaches for diverse audiences remains an open need.

Within libraries, Heck et al. (2019) discussed the interplay of information literacy and AI. They propose that AI could aid information literacy teaching through timely feedback and tracking skill development, but note that common evaluation approaches would need establishing first. Information literacy empowers learners to actively engage with, not just passively consume from, AI systems. Lo (2023c) proposed a framework to utilize prompt engineering to enhance information literacy and critical thinking skills.

Oliphant (2015) examined intelligent agents for library reference services. The analysis found they rapidly retrieve information but lack human evaluation abilities. Findings suggest librarians will need to guide users in critically evaluating AI-generated results, indicating that information literacy instruction remains crucial. Furthermore, Lund et al. (2023) discuss the ethical implications of using large language models, such as ChatGPT, in scholarly publishing, emphasizing the need for ethical considerations and the potential impact of AI on research practices.

While research is still emerging, initial findings highlight the need for rigorous, tailored AI literacy initiatives encompassing technical skills, critical perspectives, and ethical considerations. As AI becomes further entwined with education and work, developing validated frameworks, assessments, and instructional approaches to enhance multidimensional AI literacy across contexts and roles is an urgent priority. This study seeks to contribute by investigating AI literacy specifically among academic library employees.

Purpose of the Study

The rapid pace of AI development and integration in higher education heightens the need to address this research gap. As AI continues to evolve and permeate further into academic libraries, the demand for AI-literate library employees will only increase. Failure to understand the current state of AI literacy, and to identify the gaps, could result in a significant skills deficit that would impede the effective utilization of AI in academic libraries.

In light of this, the purpose of this study is to embark on an investigation of AI literacy among academic library employees. The study seeks to answer the following critical research questions:

  • What is the current level of AI literacy among academic library employees?
  • What gaps exist in their AI literacy, and how can these gaps be addressed through professional development and training programs?
  • What are their perceptions of generative AI, and what implications do they foresee for the library profession?

By addressing these questions, this study aims to fill a research gap and provide insights that can inform policy and practice in higher education. It strives to shed light on the competencies that academic library employees possess, identify the gaps that need to be addressed, and propose strategies for enhancing AI literacy among this essential group of higher education professionals.

Theoretical Framework

The Technological Pedagogical Content Knowledge (TPACK) framework developed by Mishra and Koehler (2006) serves as the theoretical foundation for this study. TPACK has also been advocated as a useful decision-making structure for librarians evaluating instructional technologies (Sobel & Grotti, 2013).

Mishra and Koehler (2006) explain that TPACK involves flexible, context-specific application of technology, pedagogy, and content knowledge. It goes beyond isolated knowledge of the concepts to an integrated understanding. TPACK development requires moving past viewing technology as an “add-on” and focusing on the connections between technology, content, and pedagogy in particular educational contexts.

In the context of this study, the researcher applied the TPACK framework to examine AI literacy specifically among academic library professionals. The three key components of the TPACK framework are interpreted as:

  • Technological Knowledge (TK)—Knowledge about AI itself, including its principles, capabilities, and limitations. This encompasses understanding AI as a technology and its potential applications in library settings.
  • Pedagogical Knowledge (PK)—Knowledge about how AI can be used to enhance library services and facilitate learning. This relates to understanding how AI can be integrated into library services to improve user experience, streamline operations, and support learning.
  • Content Knowledge (CK)—Knowledge about the library’s content and services. This involves perceiving the potential impact of AI on the library’s content and services, and how AI can enhance their management and delivery.

This tailored application of the TPACK framework will allow a multidimensional assessment of AI literacy among academic library employees. It facilitates examining employees’ understanding of AI as a technology (TK), perceptions of how AI can enhance library services (PK), and the potential impact of AI on the library’s content and services (CK).

Significance of the Study

The significance of this study lies in its potential to contribute to academic library policy, practice, and theory in several ways. Firstly, it utilizes the TPACK framework to evaluate AI literacy among academic library employees, identifying competencies, gaps, and necessary strategies. This insight is crucial for designing effective professional development programs, as well as for resource allocation. Secondly, it adds to the discourse on digital literacy in higher education by specifically focusing on AI literacy, aiding in understanding its role and implications. Thirdly, the study provides insights into the ethical, practical, and opportunity dimensions of AI technology integration in libraries, informing best practices and guidelines for its responsible use. Lastly, by applying the TPACK framework to AI literacy in libraries, the study expands its theoretical applications and offers a robust basis for future research in technology integration in academic settings.

Methodology

Research Design

This study employs a survey-based approach to explore AI literacy among academic library employees, chosen for its ability to quickly gather extensive data across a geographically diverse group. The method aligns with the TPACK framework, highlighting the integration of technological, pedagogical, and content knowledge. Surveys facilitate the collection of standardized data, allowing for comparisons across different roles and demographics. This design is particularly effective for descriptive research in higher education, making it suitable for assessing the current state of AI literacy in academic libraries.

Participants

The researcher utilized a comprehensive approach to recruit a diverse group of academic library employees for the survey. This involved posting on professional listservs across various roles and regions in librarianship (Appendix A), as well as directly contacting directors of prominent library associations: the Association of Research Libraries (ARL), the Greater Western Library Alliance (GWLA), and the New Mexico Consortium of Academic Libraries (NMCAL). These organizations represent a broad spectrum of academic libraries in terms of size, location, and type. The directors were asked to share the survey with their staff to help reach a wide and varied sample for the study.

Data Collection

Data collection was facilitated through a custom-designed survey instrument, which was built and administered using the Qualtrics platform (Appendix B). The survey itself was developed to address the study’s research questions and was structured into four main sections, each focusing on a specific aspect of AI literacy among academic library employees.

The first section sought to capture respondents’ understanding and knowledge of AI, including their familiarity with AI concepts and terminology. The second section focused on respondents’ practical skills and experiences with AI tools and applications in professional settings. The third section aimed to identify areas of AI literacy where respondents felt less confident, signaling potential gaps in knowledge or skills that could be addressed through professional development initiatives. Finally, the last section explored respondents’ perspectives on the ethical implications and challenges presented by AI technologies in the library context.

The survey employed a mix of question types to engage respondents and capture nuanced data. These included Likert-scale questions, multiple choice, and open-ended questions. Prior to the full-scale administration, the survey was pilot-tested with a small group of academic library employees to ensure clarity, relevance, and appropriateness of the questions.

The survey questions were designed to tap into different dimensions of the TPACK framework. For instance, questions asking about practical experiences with AI tools and self-identified areas of improvement indirectly assess the intersection of technological and pedagogical knowledge (TPK), as they relate to AI.

Upon finalizing the survey, an invitation to participate, along with a link to the survey, was distributed via the listservs and direct outreach methods. The survey remained open for two weeks, with reminders sent out at regular intervals to maximize the response rate.

Limitations

While the study offers insights into AI literacy among academic library employees, it is crucial to acknowledge its limitations. Firstly, given the survey’s self-report nature, the findings may be subject to social desirability bias, where respondents might have over- or under-estimated their knowledge or skills in AI.

Secondly, despite best efforts to reach a wide range of academic library employees, the sample may not be entirely representative of the population. The voluntary nature of participation, coupled with the distribution methods used, may have skewed the sample towards those with an existing interest or engagement in AI.

Moreover, while the use of professional listservs and direct outreach to library directors helped widen our reach, this strategy might have excluded those academic library employees who are less active, or not included, in these communication channels. The inclusion of Canadian libraries through the Association of Research Libraries suggests a small number of non-U.S. respondents.

Finally, the rapidly evolving nature of AI and its applications in libraries means that our findings provide a snapshot at a specific point in time. As AI continues to advance and integrate more deeply into academic libraries, the landscape of AI literacy among library employees is likely to shift, necessitating ongoing research in this area.

These limitations, while important to note, do not invalidate our findings. Instead, they offer points of consideration for interpreting the results and highlight areas for future research to build on our understanding of AI literacy among academic library employees.

Results and Analysis

Descriptive Statistics

The survey drew a diverse response: 760 participants started the survey, and 605 completed it. The participants represented a cross-section of the academic library landscape, with the largest share (45.20%) serving in Research Universities. A significant proportion also came from institutions offering both graduate and undergraduate programs (29.64%) and undergraduate-focused Colleges or Universities (10.76%). Community Colleges and specialized professional schools (e.g., Law, Medical) were represented as well, albeit to a lesser extent.

Over half of the respondents (61.25%) were from libraries affiliated with the Association of Research Libraries (ARL), signifying an extensive representation from research-intensive institutions. Respondents were predominantly from larger academic institutions. As reported, the two largest groups were those serving in institutions with enrollments ranging from 10,000 to 29,999 (34.66%) and those in institutions with enrollments of 30,000 or more (30.67%).

As for professional roles, the survey drew heavily from the library specialists or professionals (60.99%) who directly support the academic community’s research, learning, and teaching needs. Middle (20.00%) and senior (9.09%) management personnel were also well-represented, providing a leadership perspective to the survey insights.

Table 1
Role or Position in Organization

Role or Position in Organization | Percentage of Respondents | Number of Respondents
Senior management (e.g., director, dean, associate dean/director) | 9.09% | 55
Middle management (e.g., department head, supervisor, coordinator) | 20.00% | 121
Specialist or professional (e.g., librarian, analyst, consultant) | 60.99% | 369
Support staff or administrative | 8.93% | 54
Other | 0.99% | 6

Most of the respondents were primarily involved in Reference and Research Services (25.17%) or Library Instruction and Information Literacy (24.34%)—two areas integral to the academic support infrastructure.

In terms of professional experience, participants exhibited a broad range, from novices with less than a year’s experience (2.81%) to seasoned veterans with over 20 years in the field (22.68%).

Table 2
Primary Work Area in Academic Librarianship

Primary Work Area in Academic Librarianship | Percentage of Respondents | Number of Respondents
Administration or management | 10.93% | 66
Reference and research services | 25.17% | 152
Technical services (e.g., acquisitions, cataloging, metadata) | 8.11% | 49
Collection development and management | 4.64% | 28
Library instruction and information literacy | 24.34% | 147
Electronic resources and digital services | 4.30% | 26
Systems and IT services | 3.64% | 22
Archives and special collections | 3.31% | 20
Outreach, marketing, and communications | 1.66% | 10
Other | 13.91% | 84

Table 3
Years of Experience as a Library Employee

Years of Experience as a Library Employee | Percentage of Respondents | Number of Respondents
Less than 1 year | 2.81% | 17
1–5 years | 21.19% | 128
6–10 years | 19.54% | 118
11–15 years | 19.04% | 115
16–20 years | 14.74% | 89
More than 20 years | 22.68% | 137

The survey group was highly educated, with most holding a master’s degree in library and information science (65.51%), and a significant number having completed a doctoral degree or a master’s in another field.

The survey also collected demographic information. A substantial majority identified as female (71.97%), and the largest age group was 35–44 years (27.97%). While the majority identified as White (76.11%), other ethnicities, including Asian, Black or African American, and Hispanic or Latino, were also represented.

This diverse participant profile offers a broad-based view of AI literacy in the academic library landscape, setting the stage for insightful findings and discussions.

Table 4
Level of Understanding of AI Concepts and Principles

Level of Understanding of AI Concepts and Principles | % of Respondents | Number of Respondents
1 (Very Low) | 7.50% | 57
2 | 20.13% | 153
3 (Moderate) | 45.39% | 345
4 | 23.29% | 177
5 (Very High) | 3.68% | 28

RQ 1 AI Literacy Levels

At a broad level, participants expressed a modest understanding of AI concepts and principles, with a significant portion rating their knowledge at an average level. However, the number of respondents professing a high understanding of AI was quite small, revealing a potential area for further training and education.
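The percentages in Table 4 can be recomputed from the reported counts as a quick consistency check. The sketch below does that and also derives a mean rating, which is not reported in the article and is shown only for illustration; the function name `summarize` is hypothetical.

```python
# Counts per self-rating as reported in Table 4 (1 = very low, 5 = very high).
counts = {1: 57, 2: 153, 3: 345, 4: 177, 5: 28}


def summarize(counts):
    """Return (total, percentage per rating, mean rating) for a Likert distribution."""
    total = sum(counts.values())
    # Share of respondents choosing each rating, rounded to two decimals.
    shares = {rating: round(100 * n / total, 2) for rating, n in counts.items()}
    # Weighted mean of the ratings (an illustrative figure, not one the article reports).
    mean = sum(rating * n for rating, n in counts.items()) / total
    return total, shares, mean


total, shares, mean = summarize(counts)
print(total, shares, round(mean, 2))
```

Treating a 1 to 5 self-rating as numeric is itself an assumption; the distribution in the table is the safer summary for ordinal data.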

A similar pattern was observed when participants were queried about their understanding of generative AI specifically. This suggests that while librarians have begun to grasp AI and its potential, there is a considerable scope for growth in terms of knowledge and implementation (Figure 1).

Figure 1

Understanding of Generative AI

Regarding familiarity with AI tools, the most common response was a moderate level of experience (30.94%). Only a handful of participants reported a high level of familiarity (3.87%), signaling an opportunity for more hands-on training with these tools.

In examining the prevalence of AI usage in the library sector, the researcher found a varied landscape. While some technologies have found significant adoption, others remain relatively unused. Notably, Chatbots and text or data mining tools were the most widely used AI technologies.

Participants’ understanding of specific AI concepts followed a similar trend. More straightforward concepts such as Machine Learning and Natural Language Processing had a higher average rating, whereas complex areas like Deep Learning and Generative Adversarial Networks were less understood. This trend underscores the need for targeted educational programs on AI in library settings.

Table 5
Understanding of Specific AI Concepts

AI Concept | Average Rating
Machine Learning | 2.50
Natural Language Processing (NLP) | 2.38
Neural Network | 1.93
Deep Learning | 1.79
Generative Adversarial Networks (GANs) | 1.37

Notably, there was almost a nine percent drop in responses from the previous questions to the questions that asked about the more technical aspects of AI. This could signify a gap in knowledge or comfort level with these topics among the participants.

In the professional sphere, AI tools have yet to become a staple in library work. The majority of participants do not frequently use these tools, with 41.79% never using generative AI tools and 28.01% using them less than once a month. This might be attributed to a lack of familiarity, resources, or perceived need. However, for those who do use them, text generation and research assistance are the primary use cases.

Concerns about ethical issues, quality, and accuracy of generated content, as well as data privacy, were prevalent among the participants. This finding indicates that while there’s interest in AI technologies, the perceived challenges are significant barriers to full implementation and adoption.

In their personal lives, AI tools have yet to make a significant impact among the participants. The majority (63.98%) reported using these tools either ‘less than once a month’ or ‘never.’ This could potentially reflect the current state of AI integration in non-professional or leisurely activities, and may change as AI continues to permeate our everyday lives.

A chi-square test of independence was performed to examine the relation between the position of the respondent and the understanding of AI concepts and principles. The relation between these variables was significant, χ²(16, N = 760) = 26.31, p = .05. This means that the understanding of AI concepts and principles varies depending on the position of the respondent.
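For readers less familiar with the test, the statistic reported above has the standard textbook form shown below. The symbols are notation introduced here rather than taken from the survey report: O_ij is the observed count in row i and column j of the cross-tabulation, E_ij is the count expected if the two variables were independent, R_i and C_j are the row and column totals, and r and c are the numbers of rows and columns.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\mathit{df} = (r - 1)(c - 1)
```

Under this formula, 16 degrees of freedom would be consistent with a 5 × 5 cross-tabulation (for example, five position categories by five rating levels), although the report does not state the table dimensions explicitly.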

The distributions suggest that—while there is a significant association between the position of the respondent and their understanding of AI concepts and principles—the majority of respondents across all positions have a moderate understanding of AI. However, there are differences in the proportions of respondents who rate their understanding as high or very high, with Senior Management and Middle Management having higher proportions than the other groups.

There is also a significant relation between the area of academic librarianship and the understanding of AI concepts and principles, χ²(36, N = 760) = 68.64, p = .00084. This means that the understanding of AI concepts and principles varies depending on the area of academic librarianship. The distributions show that there are differences in the proportions of respondents who rate their understanding as high or very high, with Administration or management and Library Instruction and Information Literacy having higher proportions than the other groups.

Furthermore, a chi-square test shows that the relation between the payment for a premium version of at least one of the AI tools and the understanding of AI concepts and principles is significant, χ²(4, N = 539) = 85.42, p < .001. The distributions suggest that respondents who have paid for a premium version of at least one of the AI tools have a higher understanding of AI concepts and principles compared to those who have not. This could be because those who have paid for a premium version of an AI tool are more likely to use AI in their work or personal life, which could enhance their understanding of AI. Alternatively, those with a higher understanding of AI might be more likely to see the value in paying for a premium version of an AI tool.
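As a rough check on the three tests reported above, the p-values can be recomputed from the published statistics and degrees of freedom alone. The sketch below is illustrative and not the author's analysis code. It relies on the closed-form upper-tail probability of the chi-square distribution, which holds only for even degrees of freedom (all three tests here have even df). The function name `chi2_sf_even_df` and the dictionary labels are hypothetical.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term = 1.0   # the i = 0 term of the sum
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# (statistic, degrees of freedom) as reported in the text; labels are shorthand.
reported = {
    "position": (26.31, 16),
    "area of librarianship": (68.64, 36),
    "premium subscription": (85.42, 4),
}

p_values = {name: chi2_sf_even_df(stat, df) for name, (stat, df) in reported.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3g}")
```

The first test comes out just under .05, which matches the rounded value reported, and the other two fall below .001.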

It’s important to note that these findings are based on the respondents’ self-rated understanding of AI, which may not accurately reflect their actual understanding. Further research could involve assessing the respondents’ understanding of AI through objective measures. Additionally, other factors not considered in this analysis, such as the respondent’s educational background, years of experience, and exposure to AI in their work, could also influence their understanding of AI.

RQ2 Identifying Gaps

In this section, the researcher delved deeper into the gaps in knowledge and confidence among academic library professionals regarding AI applications. These gaps highlight the urgent need for targeted professional development and training in AI literacy.

Confidence Levels in Various Aspects of AI

The survey data pointed to moderate levels of confidence across a spectrum of AI-related tasks, indicating room for growth and learning. For evaluating ethical implications of using AI, a modest 31.12% of respondents felt confident (levels 4 and 5 combined), while 29.50% were not confident (levels 1 and 2 combined), and the largest group (39.38%) remained neutral.

Discussing AI integration revealed similar patterns. Here, 32.10% reported high confidence, 34.85% expressed low confidence, and the remaining 33.06% were neutral. These distributions suggest an overall hesitation or lack of assurance in discussing and ethically implementing AI, potentially indicative of inadequate training or exposure to these topics.

When it came to collaborating on AI-related projects, fewer respondents (31.39%) felt confident, while 40.16% reported low confidence, and 28.46% chose a neutral stance. This might point to the necessity of not only individual proficiency in AI but also the need for collaborative skills and shared understanding among teams working with AI.

Troubleshooting AI tools and applications emerged as the most significant gap, with 69.76% rating their confidence as low and only 10.9% expressing high confidence. This highlights an essential area for targeted training, as troubleshooting is a fundamental aspect of successful technology implementation.

Table 6

Confidence Levels in Various Aspects of AI

| Aspect | % at Confidence Level 1 | % at Confidence Level 2 | % at Confidence Level 3 | % at Confidence Level 4 | % at Confidence Level 5 |
| --- | --- | --- | --- | --- | --- |
| Evaluating Ethical Implications of AI | 12.48% | 17.02% | 39.38% | 24.64% | 6.48% |
| Participating in AI Discussions | 13.29% | 21.56% | 33.06% | 20.75% | 11.35% |
| Collaborating on AI Projects | 15.77% | 24.39% | 28.46% | 21.63% | 9.76% |
| Troubleshooting AI Tools | 41.79% | 27.97% | 19.35% | 9.76% | 1.14% |
| Providing Guidance on AI Resources | 25.65% | 24.51% | 25.81% | 20.13% | 3.90% |
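The combined "low", "neutral", and "high" confidence figures quoted in this section can be re-derived from the rows of Table 6 by adding levels 1 and 2, keeping level 3, and adding levels 4 and 5. The sketch below shows that arithmetic. The percentages are copied from Table 6, while the names `table6` and `collapse` are illustrative.

```python
# Percentages copied from Table 6, ordered from confidence level 1 to level 5.
table6 = {
    "Evaluating Ethical Implications of AI": [12.48, 17.02, 39.38, 24.64, 6.48],
    "Participating in AI Discussions": [13.29, 21.56, 33.06, 20.75, 11.35],
    "Collaborating on AI Projects": [15.77, 24.39, 28.46, 21.63, 9.76],
    "Troubleshooting AI Tools": [41.79, 27.97, 19.35, 9.76, 1.14],
    "Providing Guidance on AI Resources": [25.65, 24.51, 25.81, 20.13, 3.90],
}


def collapse(levels):
    """Collapse five Likert percentages into (low, neutral, high).

    low = levels 1 + 2, neutral = level 3, high = levels 4 + 5.
    """
    low = round(levels[0] + levels[1], 2)
    neutral = round(levels[2], 2)
    high = round(levels[3] + levels[4], 2)
    return low, neutral, high


collapsed = {aspect: collapse(levels) for aspect, levels in table6.items()}
for aspect, (low, neutral, high) in collapsed.items():
    print(f"{aspect}: low {low:.2f}%, neutral {neutral:.2f}%, high {high:.2f}%")
```

For Troubleshooting AI Tools, for example, this yields 69.76% low and 10.90% high confidence, the widest gap among the five aspects.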

Reflecting on Professional Development and Training in AI

Approximately one-third of survey participants have engaged in AI-focused professional development, showcasing several key themes:

  • Modes of Training: Librarians access training via various formats, including webinars, workshops, and self-guided learning. Online options are popular, providing accessibility for diverse professionals.
  • AI Tools and Applications: Training sessions mainly introduce tools like ChatGPT and others, with an emphasis on functionality and applications in academia.
  • Ethical Implications: Sessions often address ethical concerns such as bias and privacy, and the potential misuse of ‘black box’ AI models.
  • Integration into Librarian Workflows: Programs explore AI’s integration into library work, including instruction, cataloging, and citation analysis.
  • AI Literacy: There is a recurring focus on understanding and teaching AI concepts, tied to broader information literacy discussions.
  • AI in Instruction: Training includes using AI tools in library instruction and understanding its impacts on academic integrity.
  • Community of Practice: Responses highlight collaborative learning, suggesting a communal approach to understanding AI’s challenges and opportunities.
  • Self-guided Learning: Some librarians actively pursue independent learning opportunities, reflecting a proactive stance on AI professional development.

The findings emphasize the multifaceted nature of AI in libraries, underlining the need for ongoing, comprehensive professional development. This includes addressing both technical and ethical aspects, equipping librarians with practical AI skills, and fostering a supportive community of practice.

A chi-square test indicates a significant association between the respondents’ positions and their participation in any training focused on generative AI, χ²(4, N = 595) = 26.72, p < .001. The proportion of respondents who have participated in training or professional development programs focused on generative AI is highest among those in Senior Management (47.27%), followed by Specialist or Professional (37.40%), Middle Management (29.75%), and Other (16.67%). The proportion is lowest among Support Staff or Administrative (3.70%).

This suggests that individuals in higher positions, such as Senior Management and Specialist or Professional roles, are more likely to have participated in training or professional development programs focused on generative AI. This could be due to a variety of reasons, such as these roles potentially requiring a more in-depth understanding of AI and its applications, or these individuals having more access to resources and opportunities for such training. On the other hand, Support Staff or Administrative personnel are less likely to have participated in such programs, which could be due to less perceived need or fewer opportunities for training in these roles.

These findings highlight the importance of providing access to training and professional development opportunities focused on AI across all roles in an organization, not just those in higher positions or those directly involved in AI-related tasks. This could help ensure a more widespread understanding and utilization of AI across the organization.

Despite these efforts, many participants did not feel adequately prepared to utilize generative AI tools professionally. A notable 62.91% disagreed to some extent with the statement: “I feel adequately prepared to use generative AI tools in my professional work as a librarian,” underscoring the need for more effective training programs.

Interestingly, the areas identified for further training weren’t just about understanding the basics of AI. Participants showed a clear demand for advanced understanding of AI concepts and techniques (13.53%), familiarity with AI tools and applications in libraries (14.21%), and addressing privacy and data security concerns related to generative AI (14.36%). This suggests that librarians are looking to move beyond a basic understanding and are keen to engage more deeply with AI.

Preferred formats for professional development opportunities leaned towards remote and flexible learning opportunities, such as online courses or webinars (26.02%) and self-paced learning modules (22.44%). This preference reflects the current trend towards digital and remote learning, providing a clear direction for future training programs.

Notably, almost half of the participants (43.99%) rated the need for academic librarians to receive training on AI tools and applications within the next twelve months as ‘extremely important.’ This emphasis on urgency indicates a significant and immediate gap to be addressed.

In summary, a deeper analysis of the data reveals a landscape where academic librarians possess moderate to low confidence in understanding, discussing, and handling AI-related tasks, despite some exposure to professional development in AI. This finding indicates the need for more comprehensive, in-depth, and accessible AI training programs. By addressing these knowledge gaps, the library community can effectively embrace AI’s potential and navigate its challenges.

RQ3 Perceptions

The comprehensive results of our survey, as illustrated in Table 7, offer a detailed portrait of librarians’ perceptions towards the integration of generative AI tools in library services and operations.

Table 7

Perceptions Towards the Integration of Generative AI Tools In Library Services

| Statement | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| To what extent do you agree or disagree with the following statement: “I believe generative AI tools have the potential to benefit library services and operations.” (1 = strongly disagree, 5 = strongly agree) | 3.32% | 10.96% | 35.88% | 27.91% | 21.93% |
| How important do you think it is for your library to invest in the exploration and implementation of generative AI tools? (1 = not at all important, 5 = extremely important) | 7.24% | 15.95% | 29.93% | 28.78% | 18.09% |
| In your opinion, how prepared is your library to adopt generative AI tools and applications in the next 12 months? (1 = not at all prepared, 5 = extremely prepared) | 32.28% | 37.75% | 23.84% | 4.80% | 1.32% |
| To what extent do you think generative AI tools and applications will have a significant impact on academic libraries within the next 12 months? (1 = no impact, 5 = major impact) | 2.81% | 20.03% | 36.09% | 26.16% | 14.90% |
| How urgent do you feel it is for your library to address the potential ethical and privacy concerns related to the use of generative AI tools and applications? (1 = not at all urgent, 5 = extremely urgent) | 2.15% | 5.46% | 18.05% | 29.47% | 44.87% |
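One way to compare the five statements in Table 7 on a single scale is to compute a mean rating from each response distribution, in the spirit of the average ratings given in Table 5. The sketch below does this under two assumptions that should be read as such: the ordinal levels 1 to 5 are treated as equally spaced numbers, and each row is renormalized by its own total because rounding leaves some rows at 99.99%. These means are not reported in the study, and the short statement labels are shorthand introduced here.

```python
# Percentages copied from Table 7, ordered from rating 1 to rating 5.
# The keys are shortened labels for the five survey statements.
table7 = {
    "Potential to benefit library services": [3.32, 10.96, 35.88, 27.91, 21.93],
    "Importance of investing in generative AI": [7.24, 15.95, 29.93, 28.78, 18.09],
    "Library preparedness (next 12 months)": [32.28, 37.75, 23.84, 4.80, 1.32],
    "Expected impact (next 12 months)": [2.81, 20.03, 36.09, 26.16, 14.90],
    "Urgency of ethical and privacy concerns": [2.15, 5.46, 18.05, 29.47, 44.87],
}


def mean_rating(pcts):
    """Weighted mean of ratings 1..5, using the percentages as weights."""
    total = sum(pcts)  # close to 100, but not exactly, because of rounding
    weighted = sum(level * pct for level, pct in enumerate(pcts, start=1))
    return weighted / total


means = {statement: mean_rating(pcts) for statement, pcts in table7.items()}
for statement, mean in sorted(means.items(), key=lambda item: item[1]):
    print(f"{mean:.2f}  {statement}")
```

Sorted this way, library preparedness has the lowest mean and the urgency of ethical and privacy concerns the highest, which mirrors the readiness gap described in the surrounding text.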

When considering the potential benefits of AI, the responses indicate a degree of ambivalence, with 35.88% choosing a neutral stance. However, combining those who ‘agree’ and ‘strongly agree’ shows that nearly half of respondents, 49.84%, view AI as beneficial to some extent. Similarly, on the question of the importance of investment in AI, 46.87% rated investment as important (4 or 5 on the five-point scale).

However, this optimism is juxtaposed with concerns about readiness. When asked how prepared their library is to adopt generative AI tools within the next 12 months, 70.03% of respondents chose one of the two lowest ratings (1 or 2, toward ‘not at all prepared’). This suggests that despite recognizing the potential value of AI, there are considerable obstacles to be overcome before implementation becomes feasible.

The uncertainty surrounding AI’s impact on libraries in the short term further illuminates this complexity. A significant proportion of librarians (36.09%) chose a neutral response when asked to predict the impact of AI on academic libraries within the next twelve months. Nonetheless, a considerable group (41.06%, ratings of 4 or 5) foresees a significant short-term impact.

A key finding from the survey was the collective recognition of the urgency to address ethical and privacy issues tied to AI usage. In fact, 74.34% of respondents rated the urgency at 4 or 5 on the five-point scale, underscoring the need to address potential ethical and privacy concerns related to AI and highlighting the weight of responsibility librarians feel in maintaining the integrity of their services in the age of AI (Figure 2).

Figure 2

Perceived Urgency for Addressing Ethical and Privacy Concerns of Generative AI in Libraries

The qualitative responses provide a rich understanding of the perceptions of generative AI among library professionals and the implications they foresee for the library profession. The responses were categorized into several key themes, each of which is discussed below with relevant quotes from the respondents.

Ethical and Privacy Concerns

A significant theme that emerged from the responses was the ethical and privacy concerns associated with the use of generative AI tools in libraries. Respondents expressed apprehension about potential misuse of data and violations of privacy. As one respondent noted, “Library leaders should not rush to implement AI tools without listening to their in-house experts and operational managers.” Another respondent cautioned, “We need to be cautious about adopting technologies or practices within our own workflows that pose significant ethical questions, privacy concerns.”

Need for Education and Training

The need for education and training on AI for librarians was another prevalent theme. Respondents emphasized the importance of understanding AI tools and their implications before implementing them. One respondent suggested: “quickly education on AI is needed for librarians. As with anything else, there will be early adopters and then a range of adoption over time.” Another respondent highlighted the need for an AI specialist, stating, “I also think it would be valuable to have an AI librarian, someone who can be a resource for the rest of the staff.”

Potential for Misuse

Respondents expressed concern about the potential for misuse of AI tools, such as generating false citations or over-reliance on AI systems. They emphasized the importance of critical thinking skills, and cautioned against replacing human judgment and learning processes with AI. As one respondent put it, “Critical thinking skills and learning processes are vital and should not be replaced by AI.” Another respondent warned: “there are potential risks from misuse such as false citations being provided or too much dependence on systems.”

Concerns about Implementation

Several respondents expressed doubts about the ability of libraries to quickly and effectively implement AI tools. They cited issues such as frequent updates and refinements to AI tools, the need for significant investment, and the potential for AI to be used in ways that do not benefit the library or its users. One respondent noted, “the concern I have with AI tools is the frequent updates and refinements that occur. For libraries with small staff size, it seems daunting to keep up.”

Role of AI in Libraries

Some respondents suggested specific ways in which AI could be used in libraries, such as for collection development, instruction, and answering frequently asked questions. However, they also cautioned against viewing AI as a panacea for all library challenges. One respondent stated: “using them for FAQs will be more useful than answering a complicated reference question.”

Concerns about AI’s Impact on the Profession

Some respondents expressed concern that the use of AI could lead to job displacement or a devaluation of the human elements of librarianship. They suggested that AI should be used to complement, not replace, human librarians. One respondent remarked, “I could see a future where only top research institutions have human reference librarians as a concierge service.”

Need for Critical Evaluation

Respondents emphasized the need for critical evaluation of AI tools, including understanding their limitations and potential biases. They suggested that libraries should not rush to implement AI without fully understanding its implications. One respondent advised: “the framing of AI usage as a forgone conclusion is concerning. It’s a tool, not a solution, and should not be implemented without due consideration.”

AI Literacy

Some respondents suggested that libraries have a role to play in teaching AI literacy to students and other library users. They emphasized the importance of understanding how AI tools work and how to use them responsibly. One respondent stated: “I think we need to teach AI literacy to students.” Another respondent echoed this sentiment, saying, “it is essential that we prepare our students to use generative AI tools responsibly.”

The perceptions of generative AI among library professionals are multifaceted, encompassing both the potential benefits and challenges of these technologies. While there is recognition of the potential of AI to enhance library services, there is also a strong emphasis on the need for ethical considerations, education and training, critical evaluation, and responsible use of these tools. The implications for the library profession are significant, with concerns about job displacement, the need for new skills and roles, and the potential for changes in library practices and services. These findings highlight the need for ongoing dialogue and research on the use of generative AI in libraries.

While library employees acknowledge the potential advantages of AI in library services, they also express concerns regarding readiness, and emphasize the urgency to address ethical and privacy considerations. These findings indicate the need for support systems, training, and resources to address readiness gaps, alongside rigorous discussion, and guidelines to navigate ethical and privacy issues as libraries explore the possibilities of AI integration.

Discussions

The survey results cast light on the current state of artificial intelligence literacy, training needs, and perceptions within the academic library community. The findings reveal a landscape of recognition for the potential of AI technologies, yet, simultaneously, a lack of in-depth understanding and preparedness for their adoption.

A detailed examination of the data reveals that a considerable number of library professionals rate their own understanding of AI at or below the midpoint of the scale. While this does suggest a basic level of familiarity with AI concepts and principles, it likely falls short of the proficiency required to navigate the rapidly evolving AI landscape confidently and competently. This gap in understanding holds implications for the library field as AI continues to infiltrate various sectors and increasingly permeates library services and operations.

Moreover, an analysis of the familiarity of library professionals with AI tools lends further credence to this call for more comprehensive AI education initiatives. An understanding of AI extends beyond mere theoretical comprehension—it necessitates hands-on familiarity with AI tools and the ability to use and apply them in practice. Direct interaction with AI technologies provides an avenue for library professionals to bolster their practical understanding and thus equip them to incorporate these tools into their work more effectively.

However, formulating training initiatives that address these gaps is a multifaceted task. AI usage in libraries is as diverse as the scope of AI applications themselves. Customer service chatbots, text or data mining tools, and advanced technologies like neural networks and deep learning systems each offer unique applications and therefore require distinct expertise and understanding. Accordingly, training programs must be flexible and comprehensive, encompassing the full range of potential AI applications while also delving deep enough to provide a solid grasp of each specific tool’s functionality and potential uses.

The study also sheds light on the varying degrees of understanding across different AI concepts. Participants generally exhibited a higher level of comprehension for simpler AI concepts. However, their understanding waned when it came to more complex concepts, often the bedrock of cutting-edge AI applications. This variation in comprehension underscores the need for a stratified approach to AI education. Such an approach could start with foundational concepts and gradually progress towards more advanced topics, providing a scaffold on which a deeper understanding of AI can be built.

Addressing the AI literacy gap in the library sector thus requires a concerted approach, one that offers comprehensive and layered educational strategies that bolster both theoretical understanding and practical familiarity with AI. The aim should be not only to impart knowledge but also to empower library professionals to navigate the AI landscape confidently and to adopt and adapt AI technologies in their work effectively and, crucially, responsibly. Through such training and professional development initiatives, libraries can harness the potential of AI, ensuring they continue to be at the forefront of technological advancements.

As the focus shifts to the professional use of AI tools in libraries, the data reveal that their adoption is not yet commonplace. Text generation and research assistance are the most commonly reported uses, reflecting the immediate utility these technologies offer to librarians. However, a significant proportion of participants do not frequently use AI tools, indicating barriers to adoption. These barriers could include a lack of understanding or familiarity with these tools, a perceived lack of necessity for their use, or limitations in resources necessary for implementation and maintenance. To overcome these barriers, the field may need to do more than provide education and resources. Demonstrating the tangible benefits and efficiencies AI tools can bring to library work could play a pivotal role in their wider adoption.

The data show a strong enthusiasm among librarians for professional development related to AI. While introductory training modalities are popular, the findings reveal a demand for more advanced, hands-on training. This need aligns with the complexity and rapid evolution of AI technologies, which require a deeper understanding to be fully leveraged in library contexts.

Furthermore, the findings highlight the importance of ethical considerations and the potential benefits of fostering communities of practice in AI training. With the increasing integration of AI technology into library services, the issues related to AI ethics will likely become more complex. Proactively addressing these concerns through in-depth, focused training can help libraries continue to serve as ethical stewards of information. Communities of practice provide a platform for shared learning, mutual support, and the pooling of resources, equipping librarians to better navigate the intricacies of AI integration.

Importantly, the data show that the diversity in librarians’ roles and contexts necessitates a tailored approach to AI training. Libraries differ in their services, target audiences, resources, and strategic goals, and so do their AI training needs. A one-size-fits-all approach to AI training may fall short. Future AI training could therefore take these variations into account, offering specialized tracks or modules catering to specific roles or institutional contexts.

Likewise, the perceptions surrounding the use of generative AI tools in libraries are intricate and multifaceted. While the potential benefits of AI are acknowledged and the importance of investing in its implementation recognized, there is also a pronounced lack of readiness to adopt these tools. This readiness gap could stem from various factors, such as a lack of technical skills, insufficient funding, or institutional resistance. Future research should delve into these possibilities to better understand and address this gap.

Library professionals express uncertainty about the short-term implications of AI for libraries. This could reflect the novelty of these technologies and a lack of clear use cases, or it could echo the experiences of early adopters. The findings also emphasize a heightened sense of urgency in addressing the ethical and privacy concerns associated with AI technologies. These concerns underline the necessity for ongoing dialogue, education, and policy development around AI use in libraries.

Conclusions and Future Directions

The results reveal an intricate landscape of AI understanding, usage, and perception in the library field. While the benefits of AI tools are acknowledged, a comprehensive understanding and readiness to implement these technologies remain less than ideal. This reality underlines the pressing need for an investment in targeted educational strategies and ongoing professional development initiatives.

Crucially, the wide variance in AI literacy, understanding of AI concepts, and hands-on familiarity with AI tools among library professionals points towards the need for a stratified and tailored approach to AI education. Future training programs must aim beyond just knowledge acquisition—they must equip library professionals with the capabilities to apply AI technologies in their roles effectively, ethically, and responsibly. Ethical and privacy concerns emerged as significant considerations in the adoption of AI technologies in libraries. Our findings reinforce the crucial role that libraries have historically played, and must continue to play, in advocating for ethical information practices.

The readiness gap in AI adoption uncovered by the study suggests a disconnect between understanding the potential of AI and the ability to harness it effectively. This invites a deeper investigation into potential barriers, including technical proficiency, resource allocation, and institutional culture, among others.

Framework and Key Competencies

This study presents a framework for defining AI literacy in academic libraries, encapsulating seven key competencies:

  • Understanding AI System Capabilities and Limitations: Recognizing what AI can and cannot do, knowing its strengths and weaknesses.
  • Identifying and Evaluating AI Use Cases: Discovering and assessing potential AI applications in library settings.
  • Utilizing AI Tools Effectively and Appropriately: Applying AI technologies in library operations.
  • Critically Assessing AI Quality, Biases, and Ethics: Evaluating AI for accuracy, fairness, and ethical considerations.
  • Engaging in Informed AI Discussions and Collaborations: Participating knowledgeably in conversations and cooperative efforts involving AI.
  • Recognizing Data Privacy and Security Issues: Understanding and addressing concerns related to data protection and security in AI systems.
  • Anticipating AI’s Impacts on Library Stakeholders: Preparing for how AI will affect library users and staff.

This multidimensional definition of AI literacy for libraries provides a foundation for developing comprehensive training programs and curricula. For instance, the need to understand AI system capabilities and limitations highlighted in the definition indicates that introductory AI education should provide a solid grounding in how common AI technologies like machine learning work, where they excel, and their constraints. This conceptual comprehension equips librarians to set realistic expectations when evaluating or implementing AI.

The definition also accentuates that gaining practical skills to use AI tools appropriately should be a core training component. Hands-on learning focused on identifying appropriate applications, utilizing AI technologies effectively, and critically evaluating outputs can empower librarians to harness AI purposefully.

Moreover, emphasizing critical perspectives and ethical considerations reflects that AI training for librarians should move beyond technical proficiency. Incorporating modules examining biases, privacy implications, misinformation risks, and societal impacts is key for fostering responsible AI integration.

Likewise, the collaborative dimension of the definition demonstrates that cultivating soft skills for productive AI discussions and teamwork should be part of the curriculum. AI literacy has an important social element that training programs need to nurture.

Overall, this definition provides a skills framework that can inform multipronged, context-sensitive AI training tailored to librarians’ diverse needs. It constitutes an actionable guide for developing AI curricula and professional development that advance both technical and social aspects of AI literacy.

Future Research

Based on the findings and limitations of the current study, the following are specific recommendations for future research:

  • Longitudinal Studies: This study provides a snapshot of AI literacy among academic library employees at a specific point in time. Future research could conduct longitudinal studies to track changes in AI literacy over time, which would provide insights into the effectiveness of interventions and the evolution of AI literacy in the library profession.
  • Comparative Studies: This study focused on academic library employees. Future research could conduct comparative studies to examine AI literacy among different types of library employees (e.g., public library employees, school library employees), or among library employees in different countries. Such studies could provide insights into the factors that influence AI literacy and the strategies that are effective in different contexts.
  • Intervention Studies: This study identified the need for education and training on AI. Future research could design and evaluate interventions aimed at enhancing AI literacy among library employees. Such studies could provide evidence-based recommendations for the development of training programs and resources.
  • Ethical Considerations: This study highlighted ethical concerns about the use of AI in libraries. Future research could delve deeper into these ethical issues, examining the perspectives of different stakeholders (e.g., library users, library administrators) and exploring strategies for addressing these concerns.
  • Impact of AI on Library Services: This study explored library employees’ perceptions of the potential impact of AI on library services. Future research could examine the actual impact of AI on library services, assessing the effectiveness of AI in enhancing user experience, streamlining operations, and supporting learning.

By pursuing these avenues for future research, we can continue to deepen our understanding of AI literacy in the library profession, inform strategies for enhancing AI literacy, and promote the effective and ethical use of AI in libraries.

Cetindamar, D., Kitto, K., Wu, M., Zhang, Y., Abedin, B., & Knight, S. (2021). Explicating AI literacy of employees at digital workplaces. IEEE Transactions on Engineering Management , 68(5), 1259–1271.

Cox, A. (2022). The ethics of AI for information professionals: Eight scenarios.  Journal of the Australian Library and Information Association , 71(3), 201–214.

Heck, T., Weisel, L., & Kullmann, S. (2019). Information literacy and its interplay with AI . In A. Botte, P. Libbrecht, & M. Rittberger (Eds.), Learning Information Literacy Across the Globe (pp. 129–131). https://doi.org/10.25656/01:17891

Hervieux, S., & Wheatley, A. (2021). Perceptions of artificial intelligence: A survey of academic librarians in Canada and the United States.  The Journal of Academic Librarianship , 47(1), 102270.

Laupichler, M. C., Aster, A., Schirch, J., & Raupach, T. (2022). Artificial intelligence literacy in higher and adult education: A scoping literature review. Computers and Education: Artificial Intelligence , 3, 100101. https://doi.org/10.1016/j.caeai.2022.100101

Lo, L. S. (2023a). An initial interpretation of the U.S. Department of Education’s AI report: Implications and recommendations for Academic Libraries. The Journal of Academic Librarianship , 49(5), 102761. https://doi.org/10.1016/j.acalib.2023.102761

Lo, L. S. (2023b). The art and science of prompt engineering: A new literacy in the information age. Internet Reference Services Quarterly , 27(4), 203–210. https://doi.org/10.1080/10875301.2023.2227621

Lo, L. S. (2023c). The clear path: A framework for enhancing information literacy through prompt engineering. The Journal of Academic Librarianship , 49(4), 102720. https://doi.org/10.1016/j.acalib.2023.102720

Lund, B. D., Wang, T., Mannuru, N. R., Nie, B., Shimray, S., & Wang, Z. (2023). ChatGPT and a new academic reality: artificial intelligence‐written research papers and the ethics of the large language models in scholarly publishing. Journal of the Association for Information Science and Technology , 74(5), 570–581. https://doi.org/10.1002/asi.24750

McKinsey & Company. (2023). The state of AI in 2023 : Generative AI’s breakout year . McKinsey & Company. https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year

Mishra, P., & Koehler, M.J. (2006). Technological pedagogical content knowledge: A framework for teacher knowledge. Teachers College Record , 108(6), 1017–1054.

Mishra, P. (2019). Considering contextual knowledge: The TPACK diagram gets an upgrade. Journal of Digital Learning in Teacher Education , 35(2), 76–78. https://doi.org/10.1080/21532974.2019.1588611

Ng, D. T. K., Leung, J. K. L., Chu, S. K. W., & Qiao, M. S. (2021). Conceptualizing AI literacy: An exploratory review. Computers and Education: Artificial Intelligence , 2, 100041. https://doi.org/10.1016/j.caeai.2021.100041

Ocaña-Fernández, Y., Valenzuela-Fernández, L., & Garro-Aburto, L. (2019). Artificial intelligence and its implications in higher education. Propósitos y Representaciones , 7(2), 536–568. https://doi.org/10.20511/pyr2019.v7n2.274

Oliphant, T. (2015). Social media and web 2.0 in information literacy education in libraries: New directions for self-directed learning in the digital age. Journal of Information Literacy , 9(2), 37–49.

Pinski, M., & Benlian, A. (2023). AI literacy—Towards measuring human competency in artificial intelligence. Proceedings of the 56th Hawaii International Conference on System Sciences, 165–174. https://doi.org/10.24251/HICSS.2023.012

Ridley, M., & Pawlick-Potts, D. (2021). Algorithmic literacy and the role for libraries. Information Technology and Libraries , 40(2), 1–15. https://doi.org/10.6017/ital.v40i2.12963

Sobel, K., & Grotti, M.G. (2013). Using the TPACK framework to facilitate decision making on instructional technologies. Journal of Electronic Resources Librarianship , 25(4), 255–262. https://doi.org/10.1080/1941126X.2013.847671

UNESCO. (2021). AI and education: Guidance for policy-makers . United Nations Educational, Scientific and Cultural Organization. https://unesdoc.unesco.org/ark:/48223/pf0000376709

U.S. Department of Education. (2023). (rep.). Artificial Intelligence and the Future of Teaching and Learning: Insights and Recommendations . Retrieved from https://www2.ed.gov/documents/ai-report/ai-report.pdf .

Appendix A. Recruitment—Listservs

  • American Indian Library Association (AILA)
  • American Library Association (ALA) Members
  • Asian Pacific American Librarians Association (APALA)
  • □ Members
  • □ University Libraries Section
  • □ Distance and Online Learning Section
  • □ Instruction Section
  • Association of Research Libraries (ARL) Directors Listserv
  • Black Caucus American Library Association (BCALA)
  • Chinese American Librarians Association (CALA)
  • Greater Western Library Alliance (GWLA) Directors’ listserv
  • Minnesota Institute Graduates (MIECL)
  • New Mexico Consortium of Academic Libraries (NMCAL) Directors’ Listserv

Appendix B. AI and Academic Librarianship

Survey flow.

Standard: Block 1 (1 Question)

Block: Knowledge and Familiarity (12 Questions)

Standard: Perceived Competence and Gaps in AI Literacy (5 Questions)

Standard: Training on Generative AI for Librarians (6 Questions)

Standard: Desired Use of Generative AI in Libraries (7 Questions)

Standard: Demographic (10 Questions)

Standard: End of Survey (1 Question)
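The flow above can be written down as a small data structure, which makes it easy to check the question total and to trace the one skip rule that appears in the instrument (a "do not agree" answer to Q1.1 jumps to End of Survey). This is only an illustrative sketch: the block names and counts are copied from the flow, while `SURVEY_FLOW`, `total_questions`, and `next_block` are hypothetical names and do not come from the survey platform.

```python
# Block names and question counts are taken from the survey flow above.
# The structure and function names are hypothetical, for illustration only.
SURVEY_FLOW = [
    ("Block 1", 1),
    ("Knowledge and Familiarity", 12),
    ("Perceived Competence and Gaps in AI Literacy", 5),
    ("Training on Generative AI for Librarians", 6),
    ("Desired Use of Generative AI in Libraries", 7),
    ("Demographic", 10),
    ("End of Survey", 1),
]


def total_questions(flow):
    """Sum the question counts across all blocks."""
    return sum(count for _, count in flow)


def next_block(flow, current, consented=True):
    """Return the block that follows `current`, or None after the last block.

    Declining consent in Block 1 skips straight to End of Survey,
    mirroring the skip rule attached to Q1.1 in the instrument.
    """
    names = [name for name, _ in flow]
    if current == "Block 1" and not consented:
        return "End of Survey"
    idx = names.index(current)
    return names[idx + 1] if idx + 1 < len(names) else None


if __name__ == "__main__":
    print(total_questions(SURVEY_FLOW))
    print(next_block(SURVEY_FLOW, "Block 1", consented=False))
```

Keeping the flow as plain data, rather than hard-coding it in branching logic, is one simple way to audit an instrument before fielding it.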

Start of Block: Block 1

Q1.1 Introduction

Dr. Leo Lo from the University of New Mexico is conducting a research project. You are invited to participate in a research study aiming to assess AI literacy among academic library employees, identify gaps in AI literacy that require further professional development and training, and understand the differences in AI literacy levels across different roles and demographic factors. Before you begin the survey, please read this Informed Consent Form carefully. Your participation in this study is voluntary, and you may choose to withdraw at any time without any consequences.

Artificial Intelligence (AI) refers to the development of computer systems and software that can perform tasks that would typically require human intelligence. These tasks may include problem-solving, learning, understanding natural language, recognizing patterns, perception, and decision-making.

You are being asked to participate based on the following inclusion and exclusion criteria:

Inclusion Criteria:

  • Currently employed as an employee in a college or university library setting.
  • Willing and able to provide informed consent for participation in the study.

The Exclusion Criteria are as Follows:

  • Library employees working in non-academic library settings (e.g., public libraries, school libraries, special libraries).
  • Individuals who are not currently library employees or who are employed in non-library roles within academic institutions.

The purpose of this study is to evaluate the current AI literacy levels of academic librarians and identify areas where further training and development may be needed. The findings will help inform the design of targeted professional development programs and contribute to the understanding of AI literacy in the library profession.

If you agree to participate in this study, you will be asked to complete an online survey that will take approximately 15–20 minutes to complete. The survey includes questions about your AI knowledge, familiarity with AI tools and applications, perceived competence in using AI, and your opinions on training needs.

Potential Risks and Discomforts

There are no known risks or discomforts associated with participating in this study. Some questions might cause minor discomfort due to self-reflection, but you are free to skip any questions you prefer not to answer.

Benefits

While there are no direct benefits to you for participating in this study, your responses will help contribute to a better understanding of AI literacy among academic librarians and inform the development of relevant professional training programs.

Confidentiality

Your responses will be anonymous, and no personally identifiable information will be collected. Data will be stored securely on password-protected devices or encrypted cloud storage services, with access limited to the research team. The results of this study will be reported in aggregate form, and no individual responses will be identifiable. Your information collected for this project will NOT be used or shared for future research, even if we remove the identifiable information like your name.

Voluntary Participation and Withdrawal

Your participation in this study is voluntary, and you may choose to withdraw at any time without any consequences. Please note that if you decide to withdraw from the study, the data that has already been collected from you will be kept and used. This is necessary to maintain the integrity of the study and ensure that the data collected is reliable and valid.

Contact Information

If you have any questions or concerns about this study, please contact the principal investigator, Leo Lo, at [email protected]. If you have questions regarding your rights as a research participant, or about what you should do in case of any harm to you, or if you want to obtain information or offer input, please contact the UNM Office of the IRB (OIRB) at (505) 277-2644 or irb.unm.edu.

By clicking “I agree” below, you acknowledge that you have read and understood the information provided above, had an opportunity to ask questions, and voluntarily agree to participate.

I agree (1)

I do not agree (2)

Skip To: End of Survey If Q1.1 = I do not agree

End of Block: Block 1

Start of Block: Knowledge and Familiarity

Q2.1 Artificial Intelligence (AI) refers to the development of computer systems and software that can perform tasks that would typically require human intelligence. These tasks may include problem-solving, learning, understanding natural language, recognizing patterns, perception, and decision-making.

Please rate your overall understanding of AI concepts and principles (using a Likert scale, e.g., 1 = very low, 5 = very high)

Q2.2 On a scale of 1 to 5, how would you rate your understanding of generative AI? (1 = not at all knowledgeable, 5 = extremely knowledgeable)

Q2.3 Rate your familiarity with generative AI tools (e.g., ChatGPT, DALL-E, etc.) (using a Likert scale, e.g., 1 = not familiar, 5 = very familiar)

Q2.4 Which of the following AI technologies or applications have you encountered or used in your role as an academic librarian? (Select all that apply)

  • □ Chatbots (1)
  • □ Text or data mining tools (2)
  • □ Recommender systems (3)
  • □ Image or object recognition (4)
  • □ Automated content summarization (5)
  • □ Sentiment analysis (6)
  • □ Speech recognition or synthesis (7)
  • □ Other (please specify) (8) __________________________________________________

Q2.5 For each of the following AI concepts, indicate your understanding of the concept by selecting the appropriate response.

I don’t know what it is (1)

I know what it is but can’t explain it (2)

I can explain it at a basic level (3)

I can explain it in detail (4)

Machine Learning (1)

Natural Language Processing (NLP) (2)

Neural Network (3)

Deep Learning (4)

Generative Adversarial Networks (GANs) (5)

Q2.6 Which of the following generative AI tools have you used at least a few times? (Select all that apply)

  • □ Text generation (e.g., ChatGPT) (1)
  • □ Image generation (e.g., DALL-E, Midjourney) (2)
  • □ Music generation (e.g., OpenAI’s MuseNet) (3)
  • □ Video generation (e.g. Synthesia) (4)
  • □ Presentation generation (e.g. Tome) (5)
  • □ Voice generation (e.g. Murf) (6)
  • □ Data synthesis for research purposes (7)
  • □ Other (please specify) (8) __________________________________________________

Display This Question:

If Which of the following generative AI tools have you used at least a few times? (Select all that a… q://QID5/SelectedChoicesCount Is Greater Than 0

Q2.7 Have you ever paid for a premium version of at least one of the AI tools (for example, ChatGPT Plus or a Midjourney subscription plan)?

Q2.8 How frequently do you use generative AI tools in your professional work? (Select one)

Several times per week (2)

A few times per month (4)

Monthly (5)

Less than once a month (6)

Q2.9 For what purposes do you use generative AI tools in your professional work? (Select all that apply)

  • □ Content creation (e.g., blog posts, social media updates) (1)
  • □ Research assistance (e.g., literature reviews, data synthesis) (2)
  • □ Data analysis or visualization (3)
  • □ Cataloging or metadata generation (4)
  • □ User support or assistance (e.g., chatbots, virtual reference) (5)
  • □ Other (please specify) (6) __________________________________________________

Q2.10 On a scale of 1 to 5, how would you rate how reliable generative AI tools have been in fulfilling your professional needs? (1 = not at all reliable, 5 = extremely reliable)

Please explain your choice. 

1 (1) __________________________________________________

2 (2) __________________________________________________

3 (3) __________________________________________________

4 (4) __________________________________________________

5 (5) __________________________________________________

Q2.11 What level of concern do you have for the following potential challenges in implementing generative AI technologies in academic libraries? (Rate each challenge on a scale of 1 to 5, where 1 = not at all concerned and 5 = extremely concerned)

1 (1)

2 (2)

3 (3)

4 (4)

5 (5)

Obtaining adequate funding and resources for AI implementation (1)

Ethical concerns, such as bias and fairness (2)

Intellectual property and copyright issues (3)

Staff resistance or lack of buy-in (4)

Quality and accuracy of generated content (5)

Ensuring accessibility and inclusivity of AI tools for all users (6)

Potential job displacement due to automation (7)

Data privacy and security (8)

Technical expertise and resource requirements (9)

Other (please specify) (10)

Q2.12 How frequently do you use generative AI tools in your personal life? (Select one)

End of Block: Knowledge and Familiarity

Start of Block: Perceived Competence and Gaps in AI Literacy

Q3.1 On a scale of 1 to 5, how confident are you in your ability to evaluate the ethical implications of using AI in your library? (1 = not at all confident, 5 = extremely confident)

Q3.2 On a scale of 1 to 5, how confident are you in your ability to participate in discussions about AI integration within your library? (1 = not at all confident, 5 = extremely confident)

Q3.3 On a scale of 1 to 5, how confident are you in your ability to collaborate with colleagues on AI-related projects in your library? (1 = not at all confident, 5 = extremely confident)

Q3.4 On a scale of 1 to 5, how confident are you in your ability to troubleshoot issues related to AI tools and applications used in your library? (1 = not at all confident, 5 = extremely confident)

Q3.5 On a scale of 1 to 5, how confident are you in your ability to provide guidance to library users about AI resources and tools? (1 = not at all confident, 5 = extremely confident)

End of Block: Perceived Competence and Gaps in AI Literacy

Start of Block: Training on Generative AI for Librarians

Q4.1 Have you ever participated in any training or professional development programs focused on generative AI?

If Q4.1 = Yes

Q4.2 Please briefly describe the nature and content of the training or professional development program(s) you attended.

________________________________________________________________

Q4.3 To what extent do you agree or disagree with the following statement: “I feel adequately prepared to use generative AI tools in my professional work as a librarian.” (1 = strongly disagree, 5 = strongly agree)

Q4.4 In which of the following areas do you feel the need for additional training or professional development related to AI? (Select all that apply)

  • □ Basic understanding of AI concepts and terminology (1)
  • □ Advanced understanding of AI concepts and techniques (2)
  • □ Familiarity with AI tools and applications in libraries (3)
  • □ Ethical considerations of AI in libraries (4)
  • □ Collaborating on AI-related projects (5)
  • □ Addressing privacy and data security concerns related to generative AI (6)
  • □ Troubleshooting AI tools and applications (7)
  • □ Providing guidance to library users about AI resources (8)
  • □ Other (please specify) (9) __________________________________________________

Q4.5 What types of professional development opportunities related to AI would be most beneficial to you? (Select all that apply)

  • □ Online courses or webinars (1)
  • □ In-person workshops or seminars (2)
  • □ Conference presentations or panel discussions (3)
  • □ Self-paced learning modules (4)
  • □ Mentoring or coaching (5)
  • □ Peer learning groups or communities of practice (6)
  • □ Other (please specify) (7) __________________________________________________

Q4.6 How important do you think it is for academic librarians to receive training on generative AI tools and applications in the next 12 months? (1 = not at all important, 5 = extremely important)

End of Block: Training on Generative AI for Librarians

Start of Block: Desired Use of Generative AI in Libraries

Q5.1 To what extent do you agree or disagree with the following statement: “I believe generative AI tools have the potential to benefit library services and operations.” (1 = strongly disagree, 5 = strongly agree)

Q5.2 How important do you think it is for your library to invest in the exploration and implementation of generative AI tools? (1 = not at all important, 5 = extremely important)

Q5.3 If you have any additional thoughts or suggestions on how your library could or should use (or not use) generative AI tools, please share them here.

Q5.4 How soon do you think your library should prioritize implementing generative AI tools and applications? (Select one)

Immediately (1)

Within the next 6 months (2)

Within the next year (3)

Within the next 2–3 years (4)

More than 3 years from now (5)

Not a priority at all (6)

Q5.5 In your opinion, how prepared is your library to adopt generative AI tools and applications in the next 12 months? (1 = not at all prepared, 5 = extremely prepared)

Q5.6 To what extent do you think generative AI tools and applications will have a significant impact on academic libraries within the next 12 months? (1 = no impact, 5 = major impact)

Q5.7 How urgent do you feel it is for your library to address the potential ethical and privacy concerns related to the use of generative AI tools and applications? (1 = not at all urgent, 5 = extremely urgent)

End of Block: Desired Use of Generative AI in Libraries

Start of Block: Demographic

Q6.1 In which type of academic institution is your library located? (Select one)

Community college (1)

College or university (primarily undergraduate) (2)

College or university (graduate and undergraduate) (3)

Research university (4)

Specialized or professional school (e.g., law, medical) (5)

Other (please specify) (6) __________________________________________________

Q6.2 Is your library an ARL member library?

Q6.3 Approximately how many students are enrolled at your institution? (Select one)

Fewer than 1,000 (1)

1,000–4,999 (2)

5,000–9,999 (3)

10,000–19,999 (4)

20,000–29,999 (5)

30,000 or more (6)

Q6.4 What is your current role or position in your organization? (Select one)

Senior management (e.g. Director, Dean, associate dean/director) (1)

Middle management (e.g. department head, supervisor, coordinator) (2)

Specialist or professional (e.g., librarian, analyst, consultant) (3)

Support staff or administrative (4)

Other (please specify) (5) __________________________________________________

Q6.5 In which area of academic librarianship do you primarily work? (Select one)

Administration or management (1)

Reference and research services (2)

Technical services (e.g., acquisitions, cataloging, metadata) (3)

Collection development and management (4)

Library instruction and information literacy (5)

Electronic resources and digital services (6)

Systems and IT services (7)

Archives and special collections (8)

Outreach, marketing, and communications (9)

Other (please specify) (10) __________________________________________________

Q6.6 How many years of experience do you have as a library employee?

Less than 1 year (1)

1–5 years (2)

6–10 years (3)

11–15 years (4)

16–20 years (5)

More than 20 years (6)

Q6.7 What is the highest level of education you have completed? (Select one)

High school diploma or equivalent (1)

Some college or associate degree (2)

Bachelor’s degree (3)

Master’s degree in library and information science (e.g., MLIS, MSLS) (4)

Master’s degree in another field (5)

Doctoral degree (e.g., PhD, EdD) (6)

Other (please specify) (7) __________________________________________________

Q6.8 What is your gender? (Select one)

Non-binary / third gender (3)

Prefer not to say (4)

Q6.9 What is your age range?

Under 25 (1)

65 and above (5)

Q6.10 How do you describe your ethnicity? (Select one or more)

  • □ American Indian or Alaskan Native (1)
  • □ Asian (2)
  • □ Black or African American (3)
  • □ Hawaiian or Other Pacific Islander (4)
  • □ Hispanic or Latino (5)
  • □ White (6)
  • □ Prefer not to say (7)
  • □ Other (8) __________________________________________________

End of Block: Demographic

Start of Block: End of Survey

Q7.1 Thank you for participating in our survey!

Your input is incredibly valuable to us and will contribute to our understanding of AI literacy among academic librarians. We appreciate the time and effort you have taken to share your experiences and opinions. The information gathered will help inform future professional development opportunities and address potential gaps in AI knowledge and skills.

We will carefully analyze the responses and share the findings with the academic library community. If you have any further comments or questions about the survey, please do not hesitate to contact us at [email protected].

Once again, thank you for your contribution to this important research. Your insights will help shape the future of AI in academic libraries.

Best regards,

University of New Mexico

End of Block: End of Survey

* Leo S. Lo is Dean, College of University Libraries and Learning Sciences at the University of New Mexico, email: [email protected] . ©2024 Leo S. Lo, Attribution-NonCommercial (https://creativecommons.org/licenses/by-nc/4.0/) CC BY-NC.

© 2024 Association of College and Research Libraries , a division of the American Library Association

Print ISSN: 0010-0870 | Online ISSN: 2150-6701

  • Open access
  • Published: 01 July 2024

Analysis of anthropometric outcomes in Indian children during the COVID-19 pandemic using National Family Health Survey data

  • Amit Summan 1,
  • Arindam Nandi (ORCID: orcid.org/0000-0002-3967-2424) 1, 2 &
  • Ramanan Laxminarayan (ORCID: orcid.org/0000-0002-1390-9016) 3, 4

Communications Medicine volume 4, Article number: 127 (2024)

  • Epidemiology
  • Paediatric research
  • Public health
  • Viral infection

Background

Disruptions in food, health, and economic systems during the COVID-19 pandemic may have adversely affected child health. There is currently limited research on the potential effects of the COVID-19 pandemic on stunting, wasting, and underweight status of young children.

Methods

We examine the short-term associations between the pandemic and anthropometric outcomes of under-5 children (n = 232,920) in India, using data from the National Family Health Survey (2019–2021). Children surveyed after March 2020 are considered as the post-COVID group, while those surveyed earlier are considered as pre-COVID. Potential biases arising from differences in socioeconomic characteristics of the two groups are mitigated using propensity score matching methods.

Results

Post-COVID children surveyed in 2020 and 2021 have 1.2% higher underweight rates, 1.2% lower wasting rates, 0.1 lower height-for-age z-scores (HAZ), and 0.04 lower weight-for-height z-scores as compared with matched pre-COVID children. Post-COVID children surveyed in 2020 have 1.6%, 4.6%, and 2.4% higher stunting, underweight, and wasting rates, respectively, and 0.07 lower HAZ, as compared with matched pre-COVID children. Reductions in nutritional status are largest among children from households in the poorest wealth quintiles.

Conclusions

These findings indicate a trend towards a recovery in child anthropometric outcomes in 2021 after the initial post-pandemic reductions. The resilience of health and food systems to shocks such as COVID-19 should be strengthened while immediate investments are required to decrease child malnutrition and improve broader child health outcomes.

Plain language summary

This study examined how the COVID-19 pandemic affected the health of children under five years of age in India. We compared children surveyed before and after the pandemic. We find that children surveyed after the pandemic began in 2020 had decreased height and weight when compared to pre-pandemic measurements. In 2021, these outcomes improved but some outcomes, primarily weight, did not recover completely. These effects were most pronounced in the poorest households. Overall, our findings suggest that some of the effects of the pandemic may be short-term, but these require further study. Investments are required to reduce child malnutrition and improve the resilience of health and foods systems to shocks.

Introduction

The COVID-19 pandemic posed unprecedented challenges to health and economic systems globally. Governments around the world responded with non-pharmaceutical interventions (NPIs) such as lockdowns and travel restrictions, especially before widespread vaccine availability, which limited mobility and caused economic shocks [1, 2, 3]. Health systems were overwhelmed and resources were diverted from routine to COVID-19-related care [4]. These breakdowns potentially affected health in excess of the direct effects due to COVID-19 infection: as of 2021, 15 million excess deaths globally have been attributed to the combined direct and indirect effects of COVID-19 [5], of which 3–4 million deaths were estimated to be in India [5, 6, 7, 8, 9]. Newer data at the regional level support these excess mortality estimates. For example, surveillance data from Madurai, a large city in India, show that all-cause deaths were 30% higher than expected levels between March 2020 and July 2021 [10].

Beyond pediatric COVID-19 infections, the pandemic in India may have affected child health in additional ways. India entered a complete national lockdown on March 25, 2020, which suspended public transit and prohibited all gatherings, and only allowed essential services to operate. After the lockdown was lifted on June 1, 2020, local containment measures and other NPI restrictions continued for several months. During this time, resources were diverted from maternal and child healthcare programs to pandemic-related care, which may have adversely affected birth outcomes and early childhood health [11, 12]. A modeling study estimated that during the six months starting in May 2020, reduced access to antenatal and postnatal care, immunization, and preventative child healthcare due to the pandemic could have resulted in 253,500 additional child deaths and 12,200 additional maternal deaths across 118 low- and middle-income countries (LMICs) [13]. In India, coverage and timely receipt rates of routine childhood vaccines were estimated to have fallen by 2–10% due to the pandemic [14]. There were also large reductions in antenatal care seeking, emergency obstetric care delivery, and institutional childbirth rates because of the pandemic in India [11].

Food system disruptions may have also affected child health and nutrition. Global food production and delivery systems operated at limited capacity due to worker shortages and supply chain bottlenecks [15]. In 2020, the number of food-insecure people rose by 28% (211 million) globally [16]. In India, wheat prices increased by 4% and rice prices increased by 11% from March 2020 to May 2020, and the national economy contracted by 6.6% through the end of 2020 [17, 18]. A lack of food security may have impacted child nutrition. Modeling estimates projected that pandemic-induced disruptions in economic, food, and health systems could have resulted in an additional 9.3 million low weight-for-height (wasting) and 2.6 million low height-for-age (stunted) children by 2022 in LMICs [15]. Stunting, wasting, and underweight status are associated with higher levels of morbidity and mortality in childhood [15, 19, 20]. These effects may represent a lower bound of the negative consequences since poor nutrition during the first 1000 days of life could have a lasting impact on health, schooling, and economic outcomes into later childhood and adulthood [21, 22].
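To make the three indicators concrete, the sketch below flags each from a child's z-scores and computes a prevalence rate. It rests on assumptions that are not stated in this passage: the cutoff of z below -2 follows the widely used WHO convention, and underweight is taken to mean low weight-for-age. The function and field names (`classify`, `prevalence`, `haz`, `whz`, `waz`) are hypothetical, and the sample values are invented purely for illustration.

```python
# Assumed cutoff: the common WHO convention of z < -2 (not stated in the text).
CUTOFF = -2.0


def classify(haz, whz, waz):
    """Flag stunting, wasting, and underweight from three z-scores.

    haz: height-for-age z-score (low value -> stunted)
    whz: weight-for-height z-score (low value -> wasted)
    waz: weight-for-age z-score (low value -> underweight; assumed definition)
    """
    return {
        "stunted": haz < CUTOFF,
        "wasted": whz < CUTOFF,
        "underweight": waz < CUTOFF,
    }


def prevalence(children, indicator):
    """Percentage of children flagged for the given indicator."""
    flags = [classify(**child)[indicator] for child in children]
    return 100.0 * sum(flags) / len(flags)


if __name__ == "__main__":
    # Invented example values, not survey data.
    sample = [
        {"haz": -2.5, "whz": -0.3, "waz": -2.1},
        {"haz": -1.0, "whz": -2.4, "waz": -1.9},
        {"haz": 0.2, "whz": 0.1, "waz": 0.0},
        {"haz": -3.1, "whz": -1.0, "waz": -2.6},
    ]
    for name in ("stunted", "wasted", "underweight"):
        print(name, prevalence(sample, name))
```

A real analysis would also need to handle missing or implausible measurements and apply survey weights, both of which this sketch omits.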

There is limited research on the potential effects of the COVID-19 pandemic on stunting, wasting, and underweight status of young children. Predictive modeling studies [13, 15, 23, 24] based on projections of the early trajectory of the pandemic may not accurately reflect the true impact of the pandemic on child health due to the inherent uncertainties of such analysis. Empirical investigation of the impact remains limited due to a lack of data. An as-yet unpublished study using national data [25] estimated that children born during the COVID-19 pandemic in India weighed 8.98 g less than children born before the pandemic. Another study based on data from a single health center in Mumbai found that the rate of preterm birth (babies born alive before 37 weeks of pregnancy are completed) decreased from 14% to 10% from the first wave to the second wave of the pandemic, although the authors did not control for confounding factors [26]. Preterm birth is a known contributing factor for stunting, wasting, and underweight status in infants [27].

In this study, we provide the first national estimates of the associations of the COVID-19 pandemic with anthropometric outcomes of children under the age of five years in India during late 2020 and early 2021. Even before the pandemic, India had among the highest prevalence of childhood undernutrition globally. An estimated 36% of India’s 120 million under-5 children were underweight in 2016 28 . Considering that underweight prevalence has improved slowly over recent years, from 43% during 1998–1999, the COVID-19 pandemic could potentially roll back progress by several years 28 .

We used data from the fifth round of the National Family Health Survey, 2019–2021 (NFHS-5) of India. We considered under-5 children who were surveyed after March 25, 2020 (the first date of national lockdown) as the post-COVID group, i.e., those who experienced systemic shocks such as food insecurity, reduced access to healthcare, lower immunization rates, and economic instability due to the pandemic. In comparison, under-5 children surveyed prior to March 25, 2020, were considered as the pre-COVID group. To our knowledge, these are the first national estimates of the associations of the COVID-19 pandemic with child nutritional status in India or any large low- and middle-income country. Our findings indicate that Indian children measured after the pandemic had higher stunting, underweight, and wasting rates, and lower height-for-age z-scores as compared with similar children who were measured before the pandemic.

Data and outcome variables

We used data from the fifth round of the National Family Health Survey (NFHS-5) 29 , a cross-sectional, nationally representative demographic and health survey in India conducted by the International Institute for Population Sciences, which is supported by the Ministry of Health and Family Welfare. Phase 1 of the survey was conducted from June 2019 to January 2020, covering 22 states and union territories (UTs), and phase 2 was conducted from January 2020 to April 2021, covering the remaining 14 states and 3 UTs (Supplementary Table 7 ). Due to COVID-19 lockdowns, survey activities ceased in April 2020 and resumed in November 2020. The survey collected data from 232,920 children under the age of five years (those born since 2016) in 636,699 households across 707 districts of India.

We examined the following growth outcomes of under-5 children: stunting, wasting, underweight, height-for-age z-scores, and weight-for-height z-scores (WHZ). Z-scores were calculated based on WHO Child Growth Standards. While wasting or underweight status can reflect either recent acute weight loss or cumulative malnutrition from birth, stunting is considered to be a function of cumulative infections and nutrition since birth or even from the in-utero stage. Stunting, wasting, and underweight status were binary variables with a value of 1 if the child was more than two standard deviations below the WHO reference median in height-for-age, weight-for-height, and weight-for-age, respectively.
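The binary outcome definitions above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `classify_growth`, its arguments, and the example z-scores are hypothetical, and the z-scores are assumed to have been computed already against the WHO Child Growth Standards.

```python
def classify_growth(haz, whz, waz, cutoff=-2.0):
    """Return binary undernutrition indicators from WHO z-scores.

    haz: height-for-age z-score, whz: weight-for-height z-score,
    waz: weight-for-age z-score. A child is flagged when the z-score is
    more than two standard deviations below the reference median.
    """
    return {
        "stunted": int(haz < cutoff),
        "wasted": int(whz < cutoff),
        "underweight": int(waz < cutoff),
    }


# Hypothetical child: short for age, but weight indicators above the cutoff.
example = classify_growth(haz=-2.4, whz=-1.1, waz=-2.0)
print(example)  # {'stunted': 1, 'wasted': 0, 'underweight': 0}
```

A z-score of exactly −2 is not flagged here, following the "more than two standard deviations" wording.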

We considered the effects of the COVID-19 pandemic to begin on March 25, 2020, which was the first date of the national lockdown in India. We considered children surveyed by NFHS-5 after the start of the pandemic (those surveyed from November 2020 to April 2021) to be in the post-COVID group. Children surveyed before the pandemic (June 2019–March 2020) were included in the pre-COVID group. These definitions were based on the timing of data collection and not based on whether a child was infected with COVID-19 or exposed to COVID-19 (close contact with an infected person) as NFHS-5 did not collect data on infections or exposure. Our analysis therefore captured the broad population-level effect of the pandemic and related shocks to the health, food supply, and economic systems on child health.
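As a minimal sketch of this group assignment, the snippet below labels a child by interview date. The function name `covid_group` is hypothetical, and treating an interview on March 25, 2020 itself as pre-COVID is an assumption, since the text only specifies "before" and "after" that date.

```python
from datetime import date

# First date of the national lockdown in India, as stated in the text.
LOCKDOWN_START = date(2020, 3, 25)


def covid_group(interview_date):
    """Assign a child to the pre- or post-COVID group by survey date.

    Assumption: interviews on the lockdown date itself count as pre-COVID.
    """
    return "post-COVID" if interview_date > LOCKDOWN_START else "pre-COVID"


print(covid_group(date(2019, 8, 14)))   # pre-COVID
print(covid_group(date(2020, 12, 3)))   # post-COVID
```

The label depends only on when the child was measured, not on infection or exposure status, which matches the population-level interpretation described above.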

We used publicly available anonymized data from the NFHS-5 survey, which received ethics clearance from the International Institute for Population Sciences of India. No separate ethics clearance was necessary for this study due to the anonymized nature and public availability of the data.

Propensity score matching analysis

We used propensity score matching (PSM) to estimate the associations of COVID-19 with child health outcomes. PSM is a quasi-experimental approach used to analyze the effects of interventions in non-experimental data 30 , 31 . In observational data, background characteristics such as socioeconomic or demographic factors often differ systematically between the intervention and control groups. If these differences are also correlated with the outcome indicator, a comparison of unadjusted group means or ordinary least squares estimates of the association between the intervention status and outcome will be biased. Children in the post-COVID group were solely from NFHS-5 phase 2 states, whereas the pre-COVID group had children from both phase 1 and phase 2 states. If inherent differences between the two groups (e.g., standard of living) are not adequately accounted for, they could influence perceived differences in child growth outcomes. For example, if phase 2 of the survey consists of richer states or those with better health systems on average, least squares estimates of the negative association between the pandemic and child growth outcomes may be smaller in magnitude than the true parameter.

PSM reduces the differences in observed characteristics of the two groups. It matches each post-COVID child with a child who was pre-COVID but had a similar probability of ‘being post-COVID’ based on observable characteristics. After matching, the difference in outcomes between post-COVID and pre-COVID children would be attributable to the pandemic assuming that unobservable factors were evenly distributed between the two groups. The average difference in outcomes between the two matched groups is known as the average treatment effect on the treated (ATT) 30 , 31 , 32 , 33 .
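In potential-outcomes notation, the ATT and its matched-pair estimator can be written as below. The symbols are illustrative and not taken from the article: D_i = 1 marks a post-COVID child, Y_i(1) and Y_i(0) are that child's outcomes with and without exposure to the pandemic period, j(i) is the pre-COVID child matched to child i, and N_T is the number of matched post-COVID children.

```latex
\mathrm{ATT} = E\left[\, Y_i(1) - Y_i(0) \mid D_i = 1 \,\right],
\qquad
\widehat{\mathrm{ATT}} = \frac{1}{N_T} \sum_{i:\, D_i = 1} \left( Y_i - Y_{j(i)} \right)
```

The unobserved counterfactual Y_i(0) of each post-COVID child is replaced by the observed outcome of the matched pre-COVID child, which is why the interpretation rests on unobservable factors being evenly distributed between the groups.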

We employed a probit model to regress the binary indicator of whether a child was in the post-COVID group on a set of covariates which included indicators of the state of residence, type of residence (urban vs. rural), wealth index quintile, religion, caste, household size, sex of household head, marital status of the mother, mother and household head’s education level and age, mother’s height, child’s age in months, sex, birth order (first, second, third, fourth or higher), and a binary indicator of whether the child was born in a health facility (instead of home birth). Wealth index was a composite index of household ownership of durable assets such as TV, radio, and car, along with housing condition indicators such as the type of construction material, and the availability of toilet and electricity 29 . Meta-analyses have indicated that these variables are all associated with the nutritional status of children 34 , 35 , 36 . Social and economic status, sex, age, and education level of the household head, and place of residence of child’s household may be associated with access to resources and nutritious foods. Child sex, birth order, and household size may affect intrahousehold resource allocation for the child relative to others within the household. Mother’s education level and place of delivery may reflect the quality of parenting and level of investment in child health. Mother’s height impacts child birth outcomes; for example, mothers with short stature are more likely to have babies with low birth weight and small for gestational age status 37 . These children may also experience lower than average physical growth rates.

Using the predicted probability (known as the propensity score) from this regression, we matched each post-COVID child with a pre-COVID child. We used one-to-one, nearest-neighbor matching with replacement. Heteroskedasticity-consistent analytical standard errors were used 38 . After matching, we examined the average difference in child growth outcomes across all matched pairs of post-COVID and pre-COVID children. These estimators can be interpreted as the ATT of the pandemic.
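The matching step can be illustrated with a small, self-contained sketch. It is not the authors' Stata code: the function name `att_nearest_neighbor` and the toy data are hypothetical, and the propensity scores are assumed to have been estimated beforehand (for example, from the probit model described above).

```python
def att_nearest_neighbor(treated, controls):
    """One-to-one nearest-neighbor matching with replacement.

    treated, controls: lists of (propensity_score, outcome) tuples.
    Returns the mean outcome difference across matched pairs (the ATT estimate).
    """
    diffs = []
    for p_t, y_t in treated:
        # With replacement: every treated unit searches the full control pool,
        # so the same control can serve as the match for several treated units.
        p_c, y_c = min(controls, key=lambda c: abs(c[0] - p_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)


# Hypothetical data: outcome 1 = underweight, 0 = not underweight.
post_covid = [(0.62, 1), (0.35, 0), (0.80, 1)]
pre_covid = [(0.60, 0), (0.30, 0), (0.78, 1), (0.10, 1)]

att = att_nearest_neighbor(post_covid, pre_covid)
print(round(att, 3))  # 0.333
```

For a binary outcome such as underweight status, the estimate reads as a difference in rates between post-COVID children and their matched pre-COVID counterparts.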

Sensitivity analysis and matching quality tests

In a sensitivity test, we accounted for possible differences in past trends in nutritional status between the pre- and post-COVID groups. We repeated our analysis after including three indicators of past nutrition as covariates in our propensity score matching analysis: percentage of under-5 children that were 1) stunted, 2) wasted, and 3) underweight. These indicators were obtained, at the state level, from the National Family Health Survey 2015-2016 (NFHS-4) and combined with the child-level NFHS-5 data in our analysis. Minor differences in state boundaries between the two survey rounds were adjusted in the following way. Jammu and Kashmir (J&K) estimates from NFHS-4 were assigned to J&K and Ladakh in NFHS-5, and estimates of Daman & Diu and Dadra & Nagar Haveli were combined due to small sample sizes.

In further sensitivity analysis, we also examined other matching algorithms—matching of observations to the nearest three neighbors and kernel matching. In kernel matching, each treated observation is matched with a weighted average of control observations, where the weight is an inverse function of the distance between control and treatment observation propensity scores. We imposed “common support” in all models—all observations below the minimum or above the maximum propensity score for the post-COVID group were excluded.
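A minimal sketch of kernel matching with the common-support rule described above is given below. The Epanechnikov kernel, the bandwidth of 0.06, the function names, and the toy data are all assumptions made for illustration; the article does not state which kernel or bandwidth was used.

```python
def epanechnikov(u):
    """Epanechnikov kernel: weight falls with distance and is zero beyond |u| >= 1."""
    return 0.75 * (1 - u * u) if abs(u) < 1 else 0.0


def kernel_att(treated, controls, bandwidth=0.06):
    """Kernel-matching ATT with common support.

    treated, controls: lists of (propensity_score, outcome) tuples.
    Controls outside the treated group's score range are dropped first.
    """
    lo = min(p for p, _ in treated)
    hi = max(p for p, _ in treated)
    support = [(p, y) for p, y in controls if lo <= p <= hi]

    diffs = []
    for p_t, y_t in treated:
        weights = [epanechnikov((p_c - p_t) / bandwidth) for p_c, _ in support]
        total = sum(weights)
        if total == 0:
            continue  # no control within the bandwidth; unit stays unmatched
        # Counterfactual outcome: weighted average of control outcomes.
        y_hat = sum(w * y for w, (_, y) in zip(weights, support)) / total
        diffs.append(y_t - y_hat)
    if not diffs:
        raise ValueError("no treated unit could be matched")
    return sum(diffs) / len(diffs)


# Hypothetical data; the controls at 0.38 and 0.90 fall outside common support.
post_covid = [(0.40, 1.0), (0.60, 0.0)]
pre_covid = [(0.38, 1.0), (0.42, 0.0), (0.44, 1.0), (0.58, 1.0), (0.90, 1.0)]

print(round(kernel_att(post_covid, pre_covid), 4))  # -0.1923
```

Unlike nearest-neighbor matching, each post-COVID child is compared with a weighted blend of several pre-COVID children, which uses more of the comparison data at the cost of including less similar matches.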

Previous analysis has shown that health service delivery improved partially in late 2020 14 , which could have potentially improved child anthropometric outcomes in 2021. To capture such trends, we separately analyzed the entire sample of post-COVID children (surveyed in 2020 and 2021) and those surveyed in 2020. The comparison group for both analyses was the same (pre-COVID children surveyed prior to the first lockdown). This allowed us to understand the potential medium-term and short-term effects of the pandemic. We also conducted subsample analysis, restricting the sample to male or female children, and rural, urban, high-wealth (top three wealth quintiles), and low-wealth (bottom two wealth quintiles) households.

We tested the validity of our PSM method by evaluating matching quality in two ways. First, we examined the difference in mean and median percentage bias across all matching variables (covariates of the first stage probit regression of PSM) before and after matching. Bias measures the difference in the sample mean (median) of a covariate between the post-COVID and pre-COVID groups, expressed as a percentage of the square root of the average (median) of the sample variances of the two groups, and is computed both before and after matching. A reduction in bias indicates the matching procedure has made the two groups more comparable. Second, we examined the pseudo R² of the PSM model. The subsample of only matched observations from both groups is taken, then the first-stage PSM regression is estimated again on this subsample, providing a pseudo R² value. A higher p value or lower pseudo R² after matching would indicate there is a reduction in systematic differences in variables. All analyses were conducted using Stata version 14.2 and p < 0.05 was used for statistical significance.
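The percentage bias for a single covariate can be sketched as follows. This is an illustration of the standardized-difference idea under stated assumptions, not the authors' exact computation: the function name `pct_bias` and the toy covariate values are hypothetical, and both variances are taken from the samples passed in.

```python
from math import sqrt
from statistics import mean, variance


def pct_bias(treated_vals, control_vals):
    """Standardized percentage bias of one covariate between two groups.

    Difference in sample means as a percentage of the square root of the
    average of the two sample variances.
    """
    pooled_sd = sqrt((variance(treated_vals) + variance(control_vals)) / 2)
    return 100 * (mean(treated_vals) - mean(control_vals)) / pooled_sd


# Hypothetical covariate (e.g., household size) before and after matching.
post_covid = [2, 4, 6]
all_pre_covid = [1, 3, 5]
matched_pre_covid = [2, 4, 5]

print(pct_bias(post_covid, all_pre_covid))               # 50.0
print(round(pct_bias(post_covid, matched_pre_covid), 1))
```

A drop in the absolute value of this statistic after matching, averaged over all covariates, is what the text describes as a reduction in mean percentage bias.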

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Summary statistics

Table 1 presents the differences in key demographic and socioeconomic variables between children surveyed before and after the COVID-19 lockdown. There were 73,349 post-COVID and 159,571 pre-COVID under-5 children in NFHS-5. Across the country, there were more post-COVID children in the Central region (51% vs. 14%, p < 0.01) and fewer in the Northeast region (4% vs. 19%, p < 0.01), relative to the pre-COVID group. Other variables with large significant differences include the greater share of post-COVID children in Hindu households (81% vs. 70%, p < 0.01) and with mothers completing higher education (16% vs. 13%, p < 0.01). Children in the post-COVID group were 0.46 months younger ( p < 0.01) than pre-COVID children.

Estimates of the associations of the pandemic with child growth outcomes

Table  2 presents the propensity score matching (PSM)-based (one-to-one nearest neighbor matching) summary estimates of the associations of the pandemic with child growth outcomes. Estimates are reported separately for all post-COVID children (surveyed in 2020 and 2021) and post-COVID children surveyed in 2020. The same sample of pre-COVID children was used as the comparison group for both analyses.

During 2020 and 2021, post-COVID children had 1.2% (95% CI: 0.5–1.9%, p < 0.01) higher underweight rates, 0.10 (95% CI: 0.06–0.13, p < 0.01) lower height-for-age Z-scores, and 0.04 (95% CI: 0.01–0.07, p < 0.01) lower weight-for-height Z-scores as compared with matched pre-COVID children. However, wasting rates were 1.2% (95% CI: 0.5%–1.9%, p < 0.01) lower in post-COVID children as compared with the matched comparison group.

During 2020, post-COVID children had 4.6% (95% CI: 3.4%–5.9%, p  < 0.01), 1.6% (95% CI: 0.2%–2.9%, p  < 0.05), and 2.4% (95% CI: 1.3%–3.5%, p  < 0.01) higher underweight, stunting, and wasting rates, respectively than matched pre-COVID children. They also had 0.07 (95% CI: 0.01–0.12, p  < 0.05) lower height-for-age Z-scores than matched pre-COVID children.

These estimates were similar in sensitivity analyses in which we used two alternative matching algorithms (three nearest neighbor matching and kernel matching) instead of one-to-one nearest neighbor matching (Supplementary Tables 1 and 2 ). The results were also not sensitive to the inclusion of past state-level nutrition trends (anthropometric indicators from NFHS 2015-2016 included as covariates), as presented in Table 3 .

Subsample analysis

Table 4 presents the subsample results by wealth group, rural or urban location, and sex of the child. In our analysis with children surveyed in both 2020 and 2021, post-COVID children in the rural and low-wealth (two poorest wealth quintiles) subsamples were more likely to be underweight than matched pre-COVID children, while no differences in underweight rates were seen in high-wealth and urban subsamples. Height-for-age z-scores were lower among post-COVID children across all subsamples, and the largest differences with pre-COVID children were in low-wealth households (−0.17) followed by rural households (−0.06).

When we separately considered children surveyed in 2020, rural and low-wealth post-COVID children were more likely to be wasted as compared to matched pre-COVID children from the corresponding subgroups, while wasting rates in urban and high-wealth post-COVID children were not different from their matched pre-COVID counterparts. Stunting rates were higher in post-COVID children than matched pre-COVID children across all subsamples except for low-wealth households and girls, where the difference was not statistically significant. In all subsamples, underweight rates were higher in post-COVID children than in the matched comparison group.

Differences in growth indicators between post-COVID and matched pre-COVID groups were larger for boys as compared with girls. Our subsample estimates were not sensitive to alternative three nearest neighbors and kernel matching algorithms (Supplementary Tables  3 and 4 ).

Matching quality test results

PSM substantially reduced systematic differences between the post-COVID and pre-COVID groups. There were substantial reductions in mean and median percentage bias in the values of the covariates from the unmatched data to the matched sample. The goodness of fit of the propensity score estimation model (pseudo R²) was substantially lower in the matched sample than in the unmatched data: the pseudo R² values fell from 0.08–0.10 in all models to 0.00 (Supplementary Tables 5 and 6 ). The results show that our PSM estimator was valid 39 , 40 , 41 .

The COVID-19 pandemic has caused substantial disruptions in food, health, and economic systems globally, causing reductions in health service utilization and access to nutritious foods. We used national health survey data in India to estimate the potential effect of the pandemic on child health and nutritional status. We found that after accounting for socioeconomic factors, weight and height indicators of under-5 children surveyed after the pandemic were worse as compared with children surveyed before the pandemic. The largest differences were concentrated in children from rural and low-income households. The differences between the two groups were also larger when only children surveyed in 2020 were considered, suggesting a partial recovery in nutritional status of children in 2021.

There is limited empirical evidence with which we could compare our results. Our findings are consistent with one as-yet unpublished study 25 in the Indian context that estimated that children born during the pandemic had significantly lower birth weight than those born before the pandemic. Another study from a single health center in Mumbai found that preterm birth rates decreased from 14% to 10% from the first wave to the second wave of the pandemic, although the authors did not account for confounding factors 26 . In the South Asia region, a study from a tertiary health center in Dhaka, Bangladesh, examined 9290 hospitalized children and found higher stunting, wasting, and risk of mortality in children under six months of age who were admitted to the hospital during the COVID-19 pandemic compared to children admitted pre-pandemic 42 .

Globally, several studies have focused on preterm birth rates and neonatal birth weight during the pandemic. A 2022 meta-analysis of 66 studies (more than 70% of which were from upper middle- or high-income countries including four from China) found that there was a significant reduction in preterm birth rates during the pandemic 43 . A 2023 meta-analysis 43 of 36 studies estimated that mean birth weight increased during the COVID-19 pandemic, but there was no change among children in LMICs. There were 15 LMIC studies included in this meta-analysis, with six studies from China. Most studies included in both meta-analyses used single health center data, unlike the population-level data used in our analysis. Besides birth outcomes, studies focusing on children have primarily analyzed weight changes in school-age children (older than five years of age) and adolescents in higher-income countries 44 , 45 , 46 .

Previous modeling studies 13 , 15 , 23 , 24 conducted in the early stages of the pandemic had predicted negative impacts of the pandemic on nutritional status. One study estimated that COVID-19 may cause an additional 9.3 million wasted children, 2.6 million stunted children, and 168,000 additional child deaths by 2022 globally due to disruptions in health and economic systems 15 . Future productivity losses of $29.7 billion were estimated globally due to increased stunting and mortality 15 . Another projection based on a model of 118 LMICs predicted at least 253,500 additional child deaths over a six-month period, of which 18–23% would be due to increased child wasting and 41% due to reduced access to antibiotics for pneumonia and neonatal sepsis and oral rehydration solution for diarrhea 13 . An analysis of economic disruptions in 129 countries estimated that a 5% reduction in gross domestic product per capita in 2020 would have caused 282,996 additional deaths in under-5 children 24 . With 43,063 under-5 deaths, India was the largest contributor to this estimated burden, and the deaths were estimated to double for every additional 5% decrease in economic activity 24 .

Our estimates suggest that the worst-case scenarios predicted by modeling studies were not realized in India in the short term. We found an increase in probability of stunting of 1.6% by the end of 2020, but a partial recovery in 2021. Height-for-age z-scores did not fully recover during this study period, with a reduction of 0.1 through mid-2021. The underweight rate among pandemic-affected children was 1.2% higher, with weight-for-height z-scores decreasing by 0.04 at the end of the study period. These estimates are equivalent to an additional 1.4 million underweight under-5 children in 2021 or a loss of two years of progress in underweight rate reduction based on historical progress. Both wasting rates and WHZ were lower in the post-pandemic period up to April 2021, relative to the pre-pandemic periods. Therefore, the distribution of weight-for-height improved for those with the lowest WHZ, but decreased everywhere else. In sub-sample analysis, a rise in child underweight rates was observed only in rural and low-wealth households. Children in low-wealth households experienced the highest reduction in height-for-age and weight-for-height z-scores.
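The order of magnitude of the additional underweight children can be checked with simple arithmetic. The sketch below assumes the 1.2 percentage-point increase is applied to the roughly 120 million under-5 children cited earlier in the article; the authors' exact population input is not stated, so this is a plausibility check rather than a reproduction of their calculation.

```python
# Assumed inputs, both taken from figures quoted in the article.
under5_population = 120_000_000   # approximate number of under-5 children in India
underweight_increase = 0.012      # 1.2 percentage-point higher underweight rate

additional_underweight = under5_population * underweight_increase
print(f"{additional_underweight / 1e6:.2f} million")  # 1.44 million
```

The result of about 1.44 million is consistent with the figure of roughly 1.4 million reported above.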

Our study has important policy implications. According to the fetal origins hypothesis, shocks to child health during early development stages, such as acute or sustained malnutrition, can have long-lasting effects 21 , 22 . Stunting and wasting have been associated with fewer years of schooling completed, poor cognition, greater risk of mortality, and lower wages 47 . Wasting and stunting in children are associated with a 12-fold increased mortality risk 19 . Increases in stunting may have resulted in a 7% reduction in optimal cognitive function in Africa and South Asia. Childhood stunting is estimated to cause annual losses of 9–10% GDP per capita when stunted children reach adulthood 48 . Catch-up growth is possible with proper nutrition after initial stunting, but it cannot completely undo the damage to child development and developmental epigenetics caused by shocks during sensitive growth periods.

The COVID-19 pandemic may have adversely affected child nutrition indicators through several mechanisms: 1) disruptions to food systems and the food supply chain may have limited the availability of foods for mothers during pregnancy and in postpartum, and for their children, 2) access to maternal and child health services may have been limited during the pandemic, especially due to lockdowns, and 3) mothers may have been infected by COVID-19 during pregnancy or in the postpartum period, affecting child health 49 , 50 . As there are a myriad of factors potentially affecting child nutrition, a multipronged approach will be necessary to recover from the damage caused to child health by the pandemic.

First, in Asia and Africa, where 80% of food consumption relies on the supply chain 51 , investments to enhance food and health system resilience are crucial to mitigate shocks such as the COVID-19 pandemic 16 . The pandemic led to reduced food supply due to production restrictions and hoarding. By May 2020, wheat and rice prices in India increased by 4% and 11% 51 , and prices of grocery staples like potatoes and tomatoes increased by 15% and 28%, respectively, from the pre- to post-lockdown period 52 . Longer global food supply chains were more susceptible than shorter ones 53 . In Andhra Pradesh, household food insecurity rose from 21% in December 2019 to 80% in August 2020 54 . Children in food-insecure households were almost half as likely to have a diverse diet (at least four of the seven food groups consumed in a 24 h period) as compared with food-secure households 54 . Despite expanded government initiatives such as free rations, only half of households received food supplementation, with consistently food-insecure households having lower access 54 . A case study on Maharashtra found that the closure of wholesale markets disrupted supply chains and producers faced challenges due to financial and resource constraints, resulting in higher food prices 55 . A robust local agricultural production and delivery system, including a strengthened public food distribution system, is essential to prevent similar food availability shocks.

Second, access to maternal and child health and nutrition programs must be fully restored to their pre-pandemic levels. During the pandemic, programs that directly provide food to mothers and children were affected. Nationwide, schools were closed for over 18 months and the daily free school lunch program (known as the mid-day meal scheme)—which is a major source of supplementary child nutrition in India—was suspended 56 , 57 . Additionally, some mothers and their children may have delayed healthcare seeking during the pandemic, resulting in reductions in antenatal care visits and institutional childbirths 11 , 58 . Maternal malnutrition during pregnancy and post-pregnancy, poor feeding and care practices, and childhood infections may be associated with lower maternal care access, which in turn may negatively affect child nutritional status 59 .

Finally, universalizing key childhood health programs such as routine immunization is critical. Previous work has shown positive relationships between routine childhood vaccinations and anthropometric outcomes of children in India 60 , 61 , 62 . Despite the role vaccines play in improving child growth outcomes, the coverage of DPT3 (diphtheria, pertussis, and tetanus, third dose) vaccine among Indian children reduced from 91% in 2019 to 85% in 2020 63 . Globally, an estimated 23 million children did not receive DPT3 in 2020 63 . There is an urgent need for catch-up vaccinations for missed doses and continued efforts towards universal coverage.

These efforts will require a multisectoral approach and international support 15 . However, funding to multilateral organizations such as the WHO, UNICEF, and the World Food Programme may decrease during a crisis. Increased pressure on donor countries and the ability to mobilize domestic resources will be key. Increased investments for these interventions are required immediately: a study suggested an additional $1.2 billion per year will be needed to meet global nutrition targets due to the COVID-19 pandemic, on top of the previously estimated need of $7 billion 15 .

There are important limitations to our analysis. First, while we accounted for a wide range of potentially confounding factors in our propensity score matching, there may remain unobserved characteristics of children that are different between the pre-COVID and post-COVID groups. If such differences are correlated with child growth outcomes, they may bias our estimates. Second, while we have primarily focused on undernutrition in the context of weight and height indicators, it is also possible that some children may have experienced increased rates of overweight and obesity due to reduced physical activity during the lockdown. A meta-analysis of 15 countries found a link between COVID-19 lockdowns and rates of obesity and overweight among children and adolescents 46 . Third, because the NFHS-5 survey ended in April 2021, our study could not capture the potential negative effects of the pandemic during the COVID-19 delta variant surge and related lockdowns and other measures during April to June of 2021 in India. Fourth, our work could not identify the mechanisms through which the identified changes occurred. For example, we could not measure the relative contribution of environmental stressors, infections, and lack of nutrition in reducing a child’s nutritional status. Finally, we considered height and weight outcomes as they are the most commonly used and reported growth indicators. Other biomarkers such as head circumference and anemia rates could also be examined in the future.

The COVID-19 pandemic led to decreases in anthropometric outcomes of children during their sensitive developmental stages. Although a partial recovery in child health outcomes was observed in 2021, child height and weight have not fully recovered, and the effects are concentrated in vulnerable households. The resilience of health and food systems to shocks such as COVID-19 should be strengthened, while immediate investments are required to decrease child malnutrition and improve broader child health outcomes.

Data availability

Raw household survey data are publicly available from the Demographic and Health Surveys, https://dhsprogram.com/data/ . Source data for this study are available from Dataverse 64 .

Code availability

Code is available from Dataverse 64 .

References

Borkowski, P., Jażdżewska-Gutta, M. & Szmelter-Jarosz, A. Lockdowned: everyday mobility changes in response to COVID-19. J. Transp. Geogr. 90 , 102906 (2021).

Ferguson, N. M. et al. Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. https://doi.org/10.25561/77482 (2020).

Summan, A. & Nandi, A. Timing of non-pharmaceutical interventions to mitigate COVID-19 transmission and their effects on mobility: a cross-country analysis. Eur. J. Health Econ. https://doi.org/10.1007/s10198-021-01355-4 (2021).

Moynihan, R. et al. Impact of COVID-19 pandemic on utilisation of healthcare services: a systematic review. BMJ Open 11 , e045343 (2021).

World Health Organization. 14.9 million excess deaths associated with the COVID-19 pandemic in 2020 and 2021. https://www.who.int/news/item/05-05-2022-14.9-million-excess-deaths-were-associated-with-the-covid-19-pandemic-in-2020-and-2021 (2022).

Jha, P. et al. COVID mortality in India: National survey data and health facility deaths. Science 375 , 667–671 (2022).

Malani, A. & Ramachandran, S. Using household rosters from survey data to estimate all-cause excess death rates during the COVID pandemic in India. J. Dev. Econ. 159 , 102988 (2022).

Banaji, M. & Gupta, A. Estimates of pandemic excess mortality in India based on civil registration data. PLOS Glob. Public Health 2 , e0000803 (2022).

Guilmoto, C. Z. An alternative estimation of the death toll of the Covid-19 pandemic in India. PLOS One 17 , e0263187 (2022).

Lewnard, J. A., B, C. M., Kang, G. & Laxminarayan, R. Attributed causes of excess mortality during the COVID-19 pandemic in a south Indian city. Nat. Commun. 14 , 3563 (2023).

Sharma, S. et al. Impact of COVID-19 on utilization of maternal and child health services in India: health management information system data analysis. Clin. Epidemiol. Glob. Health 21 , 101285 (2023).

Jain, R. & Dupas, P. The effects of India’s COVID-19 lockdown on critical non-COVID health care and outcomes: evidence from dialysis patients. Soc. Sci. Med. 296 , 114762 (2022).

Roberton, T. et al. Early estimates of the indirect effects of the COVID-19 pandemic on maternal and child mortality in low-income and middle-income countries: a modelling study. Lancet Glob. Health 8 , e901–e908 (2020).

Summan, A., Nandi, A., Shet, A. & Laxminarayan, R. The effect of the COVID-19 pandemic on routine childhood immunization coverage and timeliness in India: retrospective analysis of the National Family Health Survey of 2019–2021 data. Lancet Reg. Health Southeast Asia 8 , 100099 https://doi.org/10.1016/j.lansea.2022.100099 (2022).

Osendarp, S. et al. The COVID-19 crisis will exacerbate maternal and child undernutrition and child mortality in low- and middle-income countries. Nat. Food 2 , 476–484 (2021).

Beckman, J., Baquedano, F. & Countryman, A. The impacts of COVID-19 on GDP, food prices, and food security. Q. Open 1 , qoab005 (2021).

Reserve Bank of India. Reserve Bank of India Report on Currency and Finance 2021-22. 167 https://m.rbi.org.in/scripts/AnnualPublications.aspx?head=Report%20on%20Currency%20and%20Finance (2023).

Bairagi, S., Mishra, A. K. & Mottaleb, K. A. Impacts of the COVID-19 pandemic on food prices: evidence from storable and perishable commodities in India. PLOS One 17 , e0264355 (2022).

Thurstans, S. et al. The relationship between wasting and stunting in young children: a systematic review. Matern. Child Nutr. 18 , e13246 (2022).

Mertens, A. et al. Child wasting and concurrent stunting in low- and middle-income countries. Nature 621 , 558–567 (2023).

Almond, D. & Currie, J. Killing me softly: the fetal origins hypothesis. J. Econ. Perspect. 25 , 153–172 (2011).

Barker, D. J. The fetal and infant origins of adult disease. BMJ 301 , 1111 (1990).

Busch-Hallen, J., Walters, D., Rowe, S., Chowdhury, A. & Arabi, M. Impact of COVID-19 on maternal and child health. Lancet Glob. Health 8 , e1257 (2020).

Cardona, M., Millward, J., Gemmill, A., Jison Yoo, K. & Bishai, D. M. Estimated impact of the 2020 economic downturn on under-5 mortality for 129 countries. PLoS One 17 , e0263245 (2022).

Kumar, S., Hill, C. & Halliday, T. Association between births during COVID-19 pandemic and neonatal outcomes: a nationwide cross-sectional study in India. SSRN Scholarly Paper at https://doi.org/10.2139/ssrn.4361270 (2023).

Mahajan, N. N. et al. Increased spontaneous preterm births during the second wave of the coronavirus disease 2019 pandemic in India. Int. J. Gynaecol. Obstet. 157 , 115–120 (2022).

Christian, P. et al. Risk of childhood undernutrition related to small-for-gestational age and preterm birth in low- and middle-income countries. Int. J. Epidemiol. 42 , 1340–1355 (2013).

International Institute for Population Sciences. National Family Health Survey (NFHS-4) 2015-2016: India. (2017).

National Family Health Survey -India. http://rchiips.org/nfhs/NFHS-5Reports/NFHS-5_INDIA_REPORT.pdf .

Dehejia, R. H. & Wahba, S. Propensity score-matching methods for nonexperimental causal studies. Rev. Econ. Stat. 84 , 151–161 (2002).

Dehejia, R. H. & Wahba, S. Causal effects in nonexperimental studies: reevaluating the evaluation of training programs. J. Am. Stat. Assoc. 94 , 1053–1062 (1999).

Heckman, J. J., Ichimura, H. & Todd, P. E. Matching as an econometric evaluation estimator: evidence from evaluating a job training programme. Rev. Econ. Stud. 64 , 605–654 (1997).

Rosenbaum, P. R. & Rubin, D. B. The central role of the propensity score in observational studies for causal effects. Biometrika 70 , 41–55 (1983).

Vilcins, D., Sly, P. D. & Jagals, P. Environmental risk factors associated with child stunting: a systematic review of the literature. Ann. Glob. Health 84 , 551–562 (2018).

Wondimagegn, Z. T. Magnitude and determinants of stunting among children in Africa: a systematic review. Curr. Res. Nutr. Food Sci. J. 2 , 88–93 (2014).

Charmarbagwala, R., Ranger, M., Waddington, H., & White, H. The Determinants of Child Health and Nutrition: a Meta-Analysis. (World Bank, 2004).

Softa, S. M. et al. The association of maternal height with mode of delivery and fetal birth weight at King Abdulaziz University Hospital, Jeddah, Saudi Arabia. Cureus 14 , e27493 (2022).

Abadie, A. & Imbens, G. W. Large sample properties of matching estimators for average treatment effects. Econometrica 74 , 235–267 (2006).

Sianesi, B. An evaluation of the Swedish system of active labor market programs in the 1990s. Rev. Econ. Stat. 86 , 133–155 (2004).

Leuven, E. & Sianesi, B. PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. Statistical Software Components S432001, Boston College Department of Economics (2003).

Nandi, A., Behrman, J. R., Kinra, S. & Laxminarayan, R. Early-life nutrition is associated positively with schooling and labor market outcomes and negatively with marriage rates at age 20–25 years: evidence from the Andhra Pradesh Children and Parents Study (APCAPS) in India. J. Nutr. 148 , 140–146 (2018).

Nuzhat, S. et al. Health and nutritional status of children hospitalized during the COVID-19 pandemic, Bangladesh. Bull. World Health Organ 100 , 98–107 (2022).

Yao, X. D., Zhu, L. J., Yin, J. & Wen, J. Impacts of COVID-19 pandemic on preterm birth: a systematic review and meta-analysis. Public Health 213 , 127–134 (2022).

Wen, J., Zhu, L. & Ji, C. Changes in weight and height among Chinese preschool children during COVID-19 school closures. Int. J. Obes. 45 , 2269–2273 (2021).

Karatzi, K., Poulia, K.-A., Papakonstantinou, E. & Zampelas, A. The impact of nutritional and lifestyle changes on body weight, body composition and cardiometabolic risk factors in children and adolescents during the pandemic of COVID-19: a systematic review. Children 8 , 1130 (2021).

Chang, T.-H. et al. Weight gain associated with COVID-19 lockdown in children and adolescents: a systematic review and meta-analysis. Nutrients 13 , 3668 (2021).

Nandi, A., Behrman, J. R., Bhalotra, S., Deolalikar, A. B. & Laxminarayan, R. Human capital and productivity benefits of early childhood nutritional interventions. In Disease Control Priorities, Third Edition, Volume 8: Child & Adolescent Development 385–402 (World Bank Publications, Washington D.C., 2017).

Galasso, E. & Wagstaff, A. The aggregate income losses from childhood stunting and the returns to a nutrition intervention aimed at reducing stunting. Econ. Hum. Biol. 34 , 225–238 (2018).

Wei, X. et al. Impact of China’s essential medicines scheme and zero-mark-up policy on antibiotic prescriptions in county hospitals: a mixed methods study. Trop. Med. Int. Health 22 , 1166–1174 (2017).

Elsaddig, M. & Khalil, A. Effects of the COVID pandemic on pregnancy outcomes. Best. Pract. Res Clin. Obstet. Gynaecol. 73 , 125–136 (2021).

Akseer, N., Kandru, G., Keats, E. C. & Bhutta, Z. A. COVID-19 pandemic and mitigation strategies: implications for maternal and child health and nutrition. Am. J. Clin. Nutr. 112 , 251–256 (2020).

Narayanan, S. & Saha, S. Urban food markets and the COVID-19 lockdown in India. Glob. Food Secur. 29 , 100515 (2021).

Picchioni, F., Goulao, L. F. & Roberfroid, D. The impact of COVID-19 on diet quality, food security and nutrition in low and middle-income countries: a systematic review of the evidence. Clin. Nutr. 41 , 2955–2964 (2022).

Nguyen, P. H. et al. Impact of COVID-19 on household food insecurity and interlinkages with child feeding practices and coping strategies in Uttar Pradesh, India: a longitudinal community-based study. BMJ Open 11 , e048738 (2021).

Sukhwani, V., Deshkar, S. & Shaw, R. COVID-19 lockdown, food systems and urban–rural partnership: case of Nagpur, India. Int. J. Environ. Res. Public Health 17 , 5710 (2020).

Thankachan, P. et al. There should always be a free lunch: the impact of COVID-19 lockdown suspension of the mid-day meal on nutriture of primary school children in Karnataka, India. BMJ Nutr. Prev. Health 5 , 364–366 (2022).

Afridi, F. Child welfare programs and child nutrition: Evidence from a mandated school meal program in India. J. Dev. Econ. 92 , 152–165 (2010).

Singh, A. K. et al. Impact of COVID-19 pandemic on maternal and child health services in Uttar Pradesh, India. J. Fam. Med Prim. Care 10 , 509–513 (2021).

UNICEF. Nutrition and Care for Children with Wasting | UNICEF. https://www.unicef.org/nutrition/child-wasting (2022).

Nandi, A. et al. Anthropometric, cognitive, and schooling benefits of measles vaccination: Longitudinal cohort analysis in Ethiopia, India, and Vietnam. Vaccine 37 , 4336–4343 (2019).

Anekwe, T. D. & Kumar, S. The effect of a vaccination program on child anthropometry: evidence from India’s Universal Immunization Program. J. Public Health 34 , 489–497 (2012).

Nandi, A., Deolalikar, A. B., Bloom, D. E. & Laxminarayan, R. Haemophilus influenzae type b vaccination and anthropometric, cognitive, and schooling outcomes among Indian children. Ann. N.Y. Acad. Sci. 1449 , 70–82 (2019).

COVID-19 pandemic leads to major backsliding on childhood vaccinations, new WHO, UNICEF data shows. https://www.who.int/news/item/15-07-2021-covid-19-pandemic-leads-to-major-backsliding-on-childhood-vaccinations-new-who-unicef-data-shows .

Summan, A. Replication data for: ‘COVID-19 pandemic and anthropometric outcomes in Indian children: analysis of the 2019-2021 National Family Health Survey data’. Harvard Dataverse https://doi.org/10.7910/DVN/E4DVWH (2024).

Acknowledgements

This work was supported, in part, by the Bill & Melinda Gates Foundation [INV-029062]. Under the grant conditions of the Foundation, a Creative Commons Attribution 4.0 Generic License has already been assigned to the Author Accepted Manuscript version that might arise from this submission. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Authors and affiliations.

One Health Trust, 5636 Connecticut Avenue NW, PO Box 42735, Washington, DC, 20015, USA

Amit Summan & Arindam Nandi

The Population Council, 1 Dag Hammarskjold Plaza, New York, NY, 10017, USA

Arindam Nandi

One Health Trust, Obeya Pulse, First Floor, 7/1, Halasur Road, Bengaluru, Karnataka, 560042, India

Ramanan Laxminarayan

High Meadows Environmental Institute, Princeton University, Guyot Hall, Princeton, NJ, 08544, USA

Contributions

A.S. and A.N. designed the study. A.S. conducted the analysis and wrote the first version of the manuscript. A.S. and A.N. had access to, and verified, the data. A.S., A.N., and R.L. interpreted the findings and critically evaluated and edited the manuscript. A.S., A.N., and R.L. approved the final draft and accepted the responsibility for publication.

Corresponding author

Correspondence to Arindam Nandi .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Communications Medicine thanks Kaushik Bose, Aritra Ghosh and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information, reporting summary, rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Summan, A., Nandi, A. & Laxminarayan, R. Analysis of anthropometric outcomes in Indian children during the COVID-19 pandemic using National Family Health Survey data. Commun Med 4 , 127 (2024). https://doi.org/10.1038/s43856-024-00543-6

Received : 11 May 2023

Accepted : 03 June 2024

Published : 01 July 2024

DOI : https://doi.org/10.1038/s43856-024-00543-6

  • Volume 14, Issue 7
  • Suicidal behaviours and associated factors among medical students in Bangladesh: a protocol for systematic review and meta-analysis (2000–2024)
  • http://orcid.org/0000-0002-2832-7254 Mantaka Rahman 1 , 2 ,
  • http://orcid.org/0009-0003-0467-3771 M H M Imrul Kabir 3 ,
  • http://orcid.org/0009-0006-5683-0013 Sharmin Sultana 1 ,
  • Afroza Tamanna Shimu 4 ,
  • http://orcid.org/0000-0001-8880-6524 Mark D Griffiths 5
  • 1 International Centre for Diarrhoeal Disease Research Bangladesh , Dhaka , Bangladesh
  • 2 MSc Student, Applied Statistics and Data Science , East West University , Dhaka , Bangladesh
  • 3 Department of Mathematical and Physical Science (MPS) , East West University , Dhaka , Bangladesh
  • 4 Green Life Medical College and Hospital , Dhanmondi , Bangladesh
  • 5 Department of Psychology , Nottingham Trent University , Nottingham , UK
  • Correspondence to Dr Mantaka Rahman; drmantaka.icddrb{at}gmail.com

Introduction Suicidal behaviour is common among medical students, and the prevalence rates might vary across various regions. Even though various systematic reviews have been conducted to assess suicidal behaviours among medical students in general, no review has ever assessed or carried out a sub-analysis to show the burden of suicidal behaviours among Bangladeshi medical students.

Methods and analysis The research team will search the PubMed (Medline), Scopus, PsycINFO and Google Scholar databases for papers published between January 2000 and May 2024 using truncated and phrase-searched keywords and relevant subject headings. Cross-sectional studies, case series, case reports and cohort studies published in English will be included in the review. Review papers, commentaries, preprints, meeting abstracts, protocols and letters will be excluded. Two reviewers will screen the retrieved papers independently. Disagreements between the two reviewers will be resolved by a third reviewer. Exposure will be different factors that initiate suicidal behaviours among medical students. The prevalence of suicidal behaviours (suicidal ideation, suicide plans and suicide attempts), the factors responsible and the types of suicide method will be extracted. Narrative synthesis and meta-analysis will be conducted and the findings will be summarised. For enhanced visualisation of the included studies, forest plots will be constructed. Heterogeneity among the studies will be assessed and sensitivity analysis will be conducted based on study quality. Included studies will be critically appraised using the Joanna Briggs Institute critical appraisal tools developed for different study designs.

Ethics and dissemination The study will synthesise evidence extracted from published studies. As the review does not involve the collection of primary data, ethical approval will not be required. Findings will be disseminated orally (eg, conferences, webinars) and in writing (ie, journal paper).

PROSPERO registration number CRD42023493595.

  • meta-analysis
  • suicide & self-harm

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:  http://creativecommons.org/licenses/by-nc/4.0/ .

https://doi.org/10.1136/bmjopen-2023-083720

STRENGTHS AND LIMITATIONS OF THIS STUDY

The present study will be a rigorous systematic review and meta-analysis focusing on the prevalence of suicidal behaviours and associated factors among Bangladeshi medical students.

The Cochrane Handbook’s strict methodology will be followed, and the results will be published in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement.

The review will only include peer-reviewed papers containing primary data on Bangladeshi medical students.

Most studies will comprise self-reported data, which are subject to various methodological biases.

The potential low quality of the individual studies may limit the conclusions that can be made.

Introduction

Suicidal behaviour is a broad term that includes three subcategories: (1) suicide ideation (SI), which refers to thoughts of wanting to end one’s life; (2) suicide plan (SP), which refers to the formulation of a specific method to die; and (3) suicide attempt (SA), which refers to engaging in potentially self-injurious behaviour with at least some intent to die or with a non-fatal outcome. 1 2 According to the World Health Organization (WHO), approximately 77% of suicides occur in low-income and middle-income countries (LMICs). Suicide rates in Southeast Asia (10.2 per 100,000) were higher than the global average (9.0 per 100,000) in 2019 due to population growth and population age structure. 3 Suicide ranks as the fourth largest cause of death for those between the ages of 15 and 29 years and claims more lives annually than HIV, malaria, breast cancer, war and homicide. 4

According to the 2022 Bangladesh Education Statistics, out of 174,888 students and 826 institutions, 28.93% of students were admitted to medical college, and 4.11% to dental college, with approximately two-thirds being female (64.42%). 5 Bangladesh is considered a hub of medical studies in south-east Asia. However, medical students appear to have had greater suicide rates (up to 3–5 times higher) than the general community over the past 130 years, with some estimates being even higher. 6 7 In addition, systematic reviews and meta-analyses have reported a high prevalence of suicidal behaviours among medical students, ranging from 3.8% to 18.7%, compared with 9.0%–9.7% among university students. 1 8 9 Although the number of suicides among medical students has been little studied globally, 10 surveys show that suicidal ideation among Bangladeshi public and private medical students ranges from 23.8% to 27.4%, 11 12 which is of concern. By comparison, according to several study findings, the 1-year rates of suicidal thoughts and attempts among medical students were 11.3% and 0.3% in Austria, 12% and 2.1% in Turkey, 35.6% and 4.8% in Pakistan, and 8.2% and 4.3% in China. 13 14

Such rates may be because medical school teaching and learning environments are highly competitive, with high expectations for achievement from students, teachers and parents alike. Furthermore, since many psychiatric illnesses among adults begin around the age of 24 years (when medical students are at the height of their training), it is possible that the high incidence of psychiatric disorders among medical students may result from this. 13 One study reported that 33.5% of Bangladeshi medical students had poor mental health status, 15 and another reported that 39.1% of Bangladeshi medical students had various degrees of depression. 16 In contrast, a web-based study reported 80.2% of Bangladeshi medical students had moderate to severe depression symptoms. 17 In other countries, a systematic review reported that the prevalence of depression among medical students in China was 32.74%, in Turkey 39%, in Nepal 29.9%, in Egypt 65% and outside North America 7.7%–65.5%. 18–20 Psychiatric disorders (primarily depression) contribute greatly to suicidal behaviour and are among the most important risk factors for suicidality.

Throughout the world, the study of medicine is seen as being intrinsically difficult and demanding 21 due to the pressures of the classroom, overexpectations, 13 the demands of the workplace, burn-out and depression (particularly among younger doctors), as well as the ongoing trouble of balancing job, family and financial obligations. 7 In addition, other factors that contribute to suicidal behaviour among medical students include chronic stress, 22 poor mental health status, 15 academic stress, familial pressure, depression, 10 relationship status, drug addiction, alcohol use, 12 online addictions, 23 sleeping difficulties, thoughts of dropping out, physical or sexual assault, 1 parenting style 24 and family history. 10 11

The under-reporting of suicides is a well-known phenomenon in the field of suicidology, potentially complicating the accurate estimation of medical student suicide rates. 25 A meta-analysis of the prevalence of suicide behaviours among Bangladeshi medical students has never been previously conducted, even though numerous reviews have been carried out evaluating suicidal behaviours among medical students and university students more generally. Although numerous studies have reported on suicide behaviours in Bangladesh, as aforementioned, no meta-analysis has previously examined the factors contributing to suicidal behaviours among medical students in Bangladesh.

Aim and research question

The overall aim of this systematic review and meta-analysis is to estimate the prevalence of suicidal behaviours (suicidal ideation, suicide plans and suicide attempts), the factors associated with suicidal behaviour and the methods used for suicidal behaviours among the medical students of Bangladesh. The Joanna Briggs Institute (JBI) mnemonic, Condition, Context and Population, 26 was used to formulate the research question. Here, the condition is suicidal behaviour (SI, SP and SA), the context is Bangladesh and the population is Bangladeshi medical students. The research question is ‘What is the prevalence of suicidal behaviours (SI, SP and SA) among medical students of Bangladesh?’

Study design

This systematic review will be conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) Protocols 2015 guidelines 27 and the Meta-analysis Of Observational Studies in Epidemiology (MOOSE) guidelines for systematic review and meta-analysis of observational studies. 28 The protocol has been registered in PROSPERO (CRD42023493595).

Eligibility criteria

The study will include all empirical studies with available full texts, published from 1 January 2000 to May 2024. This time frame ensures a comprehensive inclusion of recent research while capturing a substantial body of literature for the systematic review and meta-analysis. All papers published in English with human participants will be considered. Cross-sectional studies, case series, case reports and cohort studies will all be included in this study. Studies concerning university students who did not specify the exact number of medical students who had suicidal behaviour will be excluded. Review papers, study protocols, books, chapters, preprints, meeting abstracts, commentaries, letters and editorials will also be excluded.
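The eligibility rules above can be expressed as a simple screening function. The following is a minimal Python sketch, not part of the protocol: the `Record` fields, the design labels and the reading of "May 2024" as 31 May 2024 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record structure; field names are illustrative only.
@dataclass
class Record:
    title: str
    published: date
    language: str
    design: str  # e.g. "cross-sectional", "cohort", "review"
    full_text_available: bool
    reports_medical_student_counts: bool

# Study designs the protocol lists for inclusion.
INCLUDED_DESIGNS = {"cross-sectional", "case series", "case report", "cohort"}
WINDOW_START = date(2000, 1, 1)
WINDOW_END = date(2024, 5, 31)  # assumption: "May 2024" taken as end of month


def exclusion_reasons(record: Record) -> list[str]:
    """Return every reason a record fails the criteria (empty list = eligible)."""
    reasons = []
    if not (WINDOW_START <= record.published <= WINDOW_END):
        reasons.append("outside publication window")
    if record.language.lower() != "english":
        reasons.append("not published in English")
    if record.design.lower() not in INCLUDED_DESIGNS:
        reasons.append("ineligible publication type or design")
    if not record.full_text_available:
        reasons.append("full text inaccessible")
    if not record.reports_medical_student_counts:
        reasons.append("number of medical students with suicidal behaviour not specified")
    return reasons
```

Returning all reasons rather than a single yes/no makes it straightforward to keep the exclusion log that the selection process calls for.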

Information sources

Using comprehensive and advanced search strategies, the research team will search the major databases including Medline (PubMed), Scopus, PsycINFO and Google Scholar. The search strategy will include terms related to exposure and outcome, and built-in filters in the databases will be used to customise the final search output.

Search strategy

A comprehensive search strategy has been developed in consultation with an expert systematic reviewer. It will be adapted for the selected bibliographic databases using a combination of Medical Subject Headings (MeSH), keyword terms and filters ( Figure 1 ), using the VOSviewer software tool for visualising bibliometric networks. 29 The tentative search strategy for the different databases is presented in Table 1, which summarises the key search terms for population and outcome. All studies published in English will be considered for inclusion in the meta-analyses.

Figure 1. Cluster analysis showing searched keywords from PubMed database.

Table 1. Key terms which will be used for developing search strategy.

Condition/domain being studied

The conditions being studied include suicidal behaviours and medical students from Bangladesh.

Population/participants

The review will include studies of all ethnicities, genders and all over the country including medical students (bachelor in medicine, bachelor in dental surgery), undergraduate medical students, intern doctors, preclinical students, clinical students, and residency or non-residency medical graduate trainees.

Being a medical student.

Comparator(s)/control

Not applicable. There will be no comparison group.

Understanding suicidal behaviour and associated factors among Bangladeshi medical students.

Exclusion criteria

The types of output that will be excluded include:

Review papers, study protocols, books, chapters, preprints, meeting abstracts, commentaries, letters and editorials.

Full-text inaccessible studies.

Papers not published in the English language.

Studies regarding university students without specifying the exact number of medical students with suicidal behaviour.

Any relevant studies published before 1 January 2000.

The population/participants that will be excluded include:

Non-medical students.

Medical students outside Bangladesh.

The primary outcome will be to identify the prevalence of suicidal behaviours and its associated factors. The secondary outcome will be to address the methods used for suicide (hanging, poisoning, etc.) among the medical students of Bangladesh.

Study records

Data management.

EndNote V.21.0 reference management software (Clarivate Analytics, Philadelphia, USA) will be used to compile the papers retrieved from the comprehensive literature search. 30 The search results from the databases and relevant references (if needed) will be combined and duplicate articles will be removed. The remaining papers will be exported to the web-based application ‘Rayyan QCRI’ to facilitate article screening and collaboration among the reviewers. 31 Citation abstracts and full-text papers will be uploaded to Rayyan web application.
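The protocol performs duplicate removal in EndNote. Purely to illustrate the idea, the following Python sketch merges records exported from several databases and drops duplicates. The dictionary keys (`doi`, `title`) and the matching rule (same DOI or same normalised title) are assumptions, not a description of EndNote's own algorithm.

```python
import re


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each record.

    Two records are treated as duplicates when they share a DOI or a
    normalised title (an assumed rule, chosen for illustration).
    """
    seen: set[tuple[str, str]] = set()
    unique = []
    for rec in records:
        keys = set()
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        title = normalise_title(rec.get("title") or "")
        if title:
            keys.add(("title", title))
        if keys & seen:
            continue  # already seen under at least one key
        seen |= keys
        unique.append(rec)
    return unique
```

Matching on either key catches the common case in which one database supplies a DOI and another supplies only the title.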

Selection process

To identify the studies that qualify, two reviewers (SS, ATS) will independently check the titles and abstracts of all retrieved papers. Then, for final inclusion, both reviewers will examine the full-text papers of the qualifying studies. A third reviewer will settle any disagreements between the two reviewers. A log of the reasons for exclusion will be kept. A PRISMA flow diagram ( Figure 2 ) will be used to summarise the inclusion and exclusion of research papers. 32

Figure 2. PRISMA flow diagram of study selection process. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Data extraction

Data extraction will be conducted using a Microsoft Excel spreadsheet (Microsoft, Washington, USA). To present individual study characteristics and participant characteristics, descriptive statistics and qualitative narrative analysis will be used. To determine the pooled prevalence of suicidal behaviours, a random-effects meta-analysis will be performed using the R statistical package V.4.3.2 and its meta-analysis packages, based on the number of students who have various suicidal behaviours. Assuming that the selected studies will be convenience samples from a larger population, a random-effects model will be used to generalise findings beyond the included studies. 33
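To make the pooling step concrete, here is a minimal Python sketch of one common way to pool prevalences under a random-effects model: logit-transformed proportions combined with the DerSimonian-Laird estimator of between-study variance. The protocol specifies R and does not name an estimator or transformation, so both choices here are assumptions, and the function names are hypothetical.

```python
import math


def logit_effects(events, totals):
    """Logit-transformed prevalences and their approximate variances.

    Assumes 0 < events < total for every study (no continuity correction).
    """
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]
    return y, v


def dersimonian_laird(y, v):
    """Random-effects pooled estimate, its standard error, and tau^2."""
    k = len(y)
    if k < 2:
        raise ValueError("at least two studies are needed")
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, floored at 0
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return mu, se, tau2


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


def pooled_prevalence(events, totals, z=1.96):
    """Pooled prevalence with an approximate 95% confidence interval."""
    y, v = logit_effects(events, totals)
    mu, se, tau2 = dersimonian_laird(y, v)
    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), tau2
```

For example, `pooled_prevalence([24, 55, 30], [100, 200, 150])` pools three made-up studies. The numbers are illustrative only and are not taken from any included study.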

Risk of bias (quality) assessment

The JBI Critical Appraisal Checklist will be used to assess study quality. 34 35 Cochran’s Q statistic and the I² statistic will be used to assess between-study heterogeneity. Heterogeneity will also be examined using prediction intervals for a comprehensive assessment. The results will be displayed on forest plots, and funnel plots will be created to visually assess publication bias. Two review authors will independently assess the risk of bias in studies being considered after full-text review. Disagreements between the review authors over the risk of bias in particular studies will be resolved by discussion, with the involvement of a third review author where necessary.
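As a rough illustration of the two heterogeneity statistics named above, the following Python sketch computes Cochran's Q and the I² percentage from study effect sizes and their variances. The function names are hypothetical, the inputs are assumed to be on a common scale (for example logit prevalences), and prediction intervals are not covered.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted sum of squared deviations from the
    inverse-variance (fixed-effect) pooled estimate."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q, k):
    """I² as a percentage: share of total variation attributed to
    between-study heterogeneity, for k studies (df = k - 1)."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q) * 100
```

With two studies of equal variance 1 and effects 0 and 4, Q is 8 and I² is 87.5%, which makes the formula easy to check by hand.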

Strategy for data synthesis

An electronic search will be performed using the PubMed, Scopus, PsycINFO and Google Scholar databases, combining the main key elements of the Population, Exposure, Comparator and Outcomes inclusion criteria. To develop the search strategy, a list of relevant index terms and keywords will be derived from the existing databases and relevant literature and combined using Boolean operators, truncations and explode functions. The search strategy will be refined in consultation with experts in systematic review. A preliminary search conducted on 9 December 2023 yielded almost 690 studies. All included studies will be summarised and tabulated for data extraction. Egger’s test and funnel plots will be used to examine the possibility of publication bias. Moreover, a subgroup analysis will be conducted to calculate the pooled prevalence of suicidal behaviours across different study characteristics. In addition, a narrative synthesis will be carried out in the event that quantitative synthesis is not possible.
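For readers unfamiliar with Egger's test, the following Python sketch shows the underlying regression: each study's standardised effect (effect divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero suggests funnel-plot asymmetry. This is an illustration under assumed inputs, not the authors' R workflow. The function name is hypothetical, and the p-value (from a t distribution with k − 2 degrees of freedom) is left to a statistics library.

```python
import math


def egger_test(effects, std_errors):
    """Egger's regression for small-study effects.

    Returns (intercept, intercept_se, t_statistic, degrees_of_freedom).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x = [1.0 / se for se in std_errors]                 # precision
    z = [y / se for y, se in zip(effects, std_errors)]  # standardised effect
    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    df = k - 2
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    residual_var = rss / df
    intercept_se = math.sqrt(residual_var * (1 / k + x_bar ** 2 / sxx))
    t_stat = intercept / intercept_se if intercept_se > 0 else float("nan")
    return intercept, intercept_se, t_stat, df
```

The test has little power with few studies, so the result is usually read alongside the funnel plot rather than on its own.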

Patients and public involvement

This is a protocol for a systematic review, and no patients will be directly involved in the review. The review will identify the prevalence of suicidal behaviours (SI, SP and SA) and their associated factors among Bangladeshi medical students, which has been a matter of concern.

Ethics and dissemination

The study will synthesise evidence extracted from published studies. As the review does not involve the collection of primary data, ethical approval will not be required. A manuscript will be developed and submitted to an international peer-reviewed journal for publication based on the PRISMA statement as well as the PRISMA for Network Meta-Analyses (PRISMA-NMA) guidelines. In addition, the findings may also be verbally disseminated (eg, conferences, webinars).

Ethics statements

Patient consent for publication.

Not applicable.

  • Kaggwa MM ,
  • Najjuka SM ,
  • Favina A , et al
  • World Health Organization
  • ↵ Bangladesh Education Statistics , 2022 . Available : https://banbeis.portal.gov.bd/
  • Blacker CJ ,
  • Swintak CC , et al
  • Govil N , et al
  • Rukundo GZ ,
  • Byakwaga H ,
  • Kinengyere A , et al
  • Griffiths MD
  • Mozaffor M ,
  • Islam MS , et al
  • Ventriglio A ,
  • Yohannis Z , et al
  • Hossain S ,
  • Gupta RD , et al
  • Likhon RA ,
  • Biswas MAAJ ,
  • Samir N , et al
  • Henderson M
  • Puthran R ,
  • Zhang MWB ,
  • Tam WW , et al
  • Liu J , et al
  • Ashiq MAR ,
  • Jubayer Biswas MAA , et al
  • Rosiek-Kryszewska A ,
  • Leksowski Ł , et al
  • Akhter S , et al
  • Tugnoli S ,
  • Casetta I ,
  • Caracciolo S , et al
  • Sampson HH ,
  • Lisy K , et al
  • Liberati A ,
  • Tetzlaff J , et al
  • Stroup DF ,
  • Berlin JA ,
  • Morton SC , et al
  • Lessa M , et al
  • Gotschall T
  • Ouzzani M ,
  • Hammady H ,
  • Fedorowicz Z , et al
  • McKenzie JE ,
  • Bossuyt PM , et al
  • Cheung MW-L ,
  • Lim Y , et al
  • Institute TJB
  • Aromataris E ,
  • Fernandez R ,
  • Godfrey CM , et al

Contributors MR conceptualised the review. M H M IK and MDG provided expert opinions in designing the review. MR drafted the protocol manuscript. MR, SS and ATS screened the papers. M H M IK and MDG reviewed and revised the protocol manuscript for intellectual content. All authors read and approved the final version of the protocol manuscript and MR is responsible for the overall content (as guarantor). Chat GPT, Claude AI.

Funding The authors did not receive a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient and public involvement No patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

Provenance and peer review Not commissioned; externally peer reviewed.

First qualitative research study conducted in Turkmenistan focuses on HPV vaccination

Within the framework of a WHO–European Union joint project on immunization in central Asia, the WHO Country Office in Turkmenistan and the Ministry of Health and Medical Industry of Turkmenistan jointly conducted the country’s first qualitative research study.

The project aimed at identifying factors influencing parents' decisions related to human papillomavirus (HPV) vaccination for their children. Consisting of focus-group discussions and in-depth interviews, the research provided an understanding of parents’ attitudes and beliefs about HPV as well as barriers to HPV vaccination.

The results of the research conducted over 3 weeks in late 2023 will serve as the basis for activities to increase public awareness about HPV and sustain confidence in HPV vaccination in the future.

HPV vaccination in Turkmenistan

Turkmenistan included the HPV vaccine in its routine immunization schedule starting in 2016, for boys and girls at 9 years of age. Although national vaccination coverage remains high, a slight downward trend has been observed in both urban and rural areas: from 99.2% in 2021 to 98.5% in 2023.

With a relatively young population increasingly turning to the internet for information, it is important that evidence-based answers to potential questions about vaccination are readily available. However, official online information about vaccines remains limited, creating an opportunity for misinformation to spread with the potential to decrease vaccination uptake in the coming years.

The Ministry asked WHO to conduct a qualitative research study to identify what parents know about HPV, the diseases it can cause, the effectiveness of vaccination in preventing these diseases, and especially what questions or concerns they have on HPV vaccination that need to be addressed in a transparent and accessible manner.

The study, conducted jointly by experts from the Ministry and WHO, aimed to develop targeted interventions to better inform the public and health-care workers about HPV vaccines. Focus groups and in-depth interviews with health-care providers, parents and staff of public organizations were conducted to identify participants’ knowledge, attitudes and behavioural determinants for uptake of HPV vaccine and childhood vaccines in general.

The study was conducted in cities, including the capital, as well as in rural sites in 2 regions. Data collection and analysis were conducted using the COM-B framework, which looks at 3 key components: capability, opportunity and motivation for behaviour change.

Study outcomes

Study findings revealed that attitudes toward HPV vaccination were generally positive, partially due to positive attitudes toward vaccination in general but also due to preparatory steps taken by health authorities prior to introducing the HPV vaccine in 2016.

These steps included informing and training health workers to administer and answer questions about the vaccine and to inform parents and children about the benefits of HPV vaccination in preventing HPV infection, emphasizing its role in preventing the spread of the virus rather than only in preventing cervical cancer.

Despite high levels of knowledge and trust in vaccination, study participants did reveal certain gaps in knowledge and potential vulnerability to misinformation. Based on the findings, researchers proposed several measures, including:

  • making up-to-date information on childhood vaccination available through a single online portal to ensure accessibility and availability for the public;
  • training health workers to increase their capacity to effectively communicate with parents on HPV and other vaccines in the routine immunization schedule; and
  • using existing facility-level data and ongoing activities to conduct local, community-based interventions to effectively engage the minority of parents delaying or rejecting vaccination.

Based on these recommendations, the Ministry is developing an action plan that will include regular training for health workers and provision of information to parents via online resources and individual consultations.

With an eye to sustaining high demand for vaccination in the future, the Ministry is also planning to pilot an education module for 10–12-year-olds called Immune Patrol in several schools. WHO developed Immune Patrol to increase health literacy, resistance to misinformation, and knowledge about the immune system and immunization. WHO will provide technical support to the Ministry to implement the action plan and to pilot the Immune Patrol package in 2024 and beyond.

Getting stuck in a collective stigma: sex offense registrants, liminality liminoid experiences, and identity limbo groups

Journal of Criminal Psychology

ISSN : 2009-3829

Article publication date: 3 July 2024

Purpose

The purpose of this study was to examine the internalization of group-level identities held by people who are on the sex offense registry and how these influence emotions and the willingness to accept treatment. The types and consequences of identities and stigmas are often examined at the individual level, but most people belong to groups that hold collective identities that can be detected in phrases such as “we, us, our,” etc.

Design/methodology/approach

Longitudinal data from 2008 to 2024 were used to examine registrants’ group identities. Interviews were conducted with 115 registrants and 40 of their family members, and narrative research analysis was used to assess how participants’ levels of liminality influence why some on the registry never come to see themselves as sex criminals.

Findings

Three group-level identities were found that corresponded with varying phases of liminality. The first group had a fixed mindset, no liminality and a strong sense of self. The second group of registrants had liminoid experiences, allowing them to change the way they saw themselves over time. This group had a growth mindset that believed change was attainable. The third group exhibited a fixed mindset, as they either always saw themselves as sex criminals and required no transition or came to see themselves as sex offenders post-punishment.

Originality/value

To the best of the authors’ knowledge, no studies have examined group-level identities among people convicted of sex crimes or the consequences that group identities have for behavior.

  • Sex offending

Acknowledgements

The authors would like to thank the University of Nebraska Sponsored programs for internal grant money to facilitate this study as well as the School of Criminology and Criminal Justice at the University of Nebraska for the use of PhD students as members of the research team. Students were granted access to the data for their own use to create dissertations and journal articles.

Cooley Webb, B. , Petersen, C. and Sample, L.L. (2024), "Getting stuck in a collective stigma: sex offense registrants, liminality liminoid experiences, and identity limbo groups", Journal of Criminal Psychology , Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/JCP-03-2024-0017

Emerald Publishing Limited

Copyright © 2024, Emerald Publishing Limited

Regional fertility predictors based on socioeconomic determinants in Slovakia

  • Original Research
  • Open access
  • Published: 02 July 2024
  • Volume 41 , article number  20 , ( 2024 )

  • Janetta Nestorová Dická   ORCID: orcid.org/0000-0003-1027-6871 1 &
  • Filip Lipták 1  

The study's primary purpose was to recognise the effects of determinants on the level of fertility and thereby explain the differences in trends in the regions of Slovakia. At the turn of the century, the differences in fertility between regions increased, but the total fertility rate decreased. Multivariate statistical methods clarified the regional effects on the level and nature of fertility. Initial regression surveys indicated weak effects between regions, which led to applying factor and cluster analysis to establish regional types. Comprehensive regression analysis was then applied. The strength and nature of regional relationships differed at the inter- and intra-regional levels. The research demonstrated significant differences in fertility rates dependent on the socioeconomic environment, as regional types are uniquely linked to determinants. Moreover, each determinant has specific spatial patterns with unequal regression coefficients at different regional levels, so its effect cannot be treated as constant. Knowing how spatial variation in fertility occurs will enable future studies to elucidate the processes involved. Finally, because fertility is vital for social assessment and policy formulation, the study’s findings could inform local decision-makers and planners in identifying the socioeconomic conditions underlying fertility at the regional level and planning appropriate intervention strategies.

Introduction

Like many other post-communist countries, Slovakia experienced a fertility transition in the early 1990s, achieving stable fertility below the replacement level. However, while differences in regional fertility levels increased at the turn of the century, the total fertility rate (TFR) decreased significantly. The TFR dropped from 2.1 in the early 1990s to 1.2 children per woman in the early 2000s. As a common phenomenon in post-communist countries, this trend has been the focus of much research. This study, in particular, aims to contribute to understanding this trend in Slovakia, where fertility is slowly increasing but remains below the replacement level, currently standing at 1.6 children per woman.

The political changes in Eastern and Central Europe at the end of the 1980s profoundly impacted various societal levels. The socialist system, seemingly stable and stagnant, was replaced by unprecedented social and economic changes (Sobotka, 2011). The establishment of democracy and the transformation of the Slovak economy led to the privatisation of state property and the restoration of private property and business. The emergence of unemployment and the fundamental changes in housing and family policy created a sense of social insecurity. Ultimately, these societal changes heralded ideological and cultural change, significantly altering population reproduction.

The previous socialist conditions had ensured a high degree of reproductive uniformity, which was especially evident in early fertility, a strong inclination towards the two-child family model, and low childlessness. Šprocha and Tišliar (2016) noted that in the post-communist period, societal changes produced divergent tendencies in this area, and women's reproductive paths began to differ. In addition, Potančoková (2011) reported that the significant factors affecting birth rates in the transformation period were decreased fertility intensity, childbirth postponement, and more children born outside marriage. The strength of these tendencies varied across Slovak regions due to differences in demographic, economic, social, and cultural factors. There is general agreement that these factors indirectly but significantly affect current fertility characteristics and inter-regional differences (Abdennadher et al., 2022; Campisi et al., 2020; Iwasaki & Kumo, 2020; Lieming et al., 2022).

The study evaluates the effects of socioeconomic determinants of fertility to explain differences between regions in Slovakia, where regional fertility varies significantly. Herein, we identify significant fertility predictors based on regional strengths and characteristics. The study's originality lies in identifying significant predictors of the differences in fertility rates between Slovak regions while connecting the spatial context with the determinants that explain those differences.

Theoretical framework

Demographic development is a multidimensional process influenced by a combination of factors, and their interactions may vary in different regions and periods. Scientists from various fields have investigated the complex interplay of factors that shape demographic trends and changes in population composition (Bongaarts, 2009 ; Bontje, 2020 ; Pampel, 2011 ; Raymo, 2015 ; Sobotka & Fürnkranz-Prskawetz, 2020 ; Tulchinsky & Varavikova, 2014 ). Birth rates and fertility intensity are critical demographic components that determine population increase or decrease, and determinants with varying effects influence the levels. In addition, the set and structure of fertility determinants differ in countries and regional areas depending on the specific cultural, social, political, and economic situation. Moreover, the determinants interact and vary in relative importance in different contexts, so understanding specific population dynamics is crucial in establishing fertility and birth patterns.

Determinants affecting natality and fertility can be perceived in several broader categories (El-Ghannam, 2005 ; Marenčaková, 2006 ; Adhikari, 2010 ; Wang & Sun, 2016 ), while in the conditions of the Slovak regional structure, the level of fertility varies most often depending on social, economic or cultural factors (Potančoková et al., 2008 ; Šprocha & Bačík, 2021a , 2021b ; Šprocha & Bleha, 2018 ; Šprocha & Tišliar, 2019 ; Šprocha et al., 2022 ).

Sociocultural conditions

Sociocultural conditions may contribute to regional variation in fertility levels by influencing individuals living in certain areas. A major societal shift in Slovakia in the past three decades is the rise in women's education, and its association with fertility has been extensively researched. Results indicate that increased education is consistently related to lower TFR values (Adsera, 2017; Cheng et al., 2022; Gray & Evans, 2019; Maulida et al., 2023; Shirahase, 2000). Šprocha and Tišliar (2019) add that education is one of the critical factors in the intensity and timing of motherhood and family formation and in the realisation of fertility outcomes in Slovakia.

D'Addio and D'Ercole (2005) explained observed changes in OECD countries’ fertility rates by socio-economic influences, stating that as women's education increases, women's participation in the labor market also increases, the average age of first-time mothers rises, and contraceptive use becomes more common. More educated women are also more likely to delay childbearing until an older age because they participate in the labor market, are more economically independent, and are more likely to be career-oriented and to establish themselves professionally. Consequently, they often have fewer children than initially planned (Ní Brolcháin & Beaujouan, 2012; Iacovou & Tavares, 2011; Testa, 2014). At the same time, the desire for children increases at higher income levels, which improve the chances of providing for a bigger family, and higher income, as well as better employment prospects, results from accumulated human capital in the form of higher educational attainment.

Becker (1992) assumes that people with higher education might want to have a larger number of children, thus pointing to the positive effect of income, although this contrasts with the negative effect of opportunity costs (Nisén, 2016). Age is also a crucial determinant of fertility because the ability to conceive decreases with age. Fekiačová (2019) adds that postponing parenthood until an older age can reduce the intensity of fertility among highly educated women, resulting in a smaller number of children or in permanent childlessness, which may be reflected in lower TFR values in the region. The opposite trend can be observed among less educated women, who enter motherhood soon after completing their education, thereby trying to reduce the uncertainty associated with the official labor market (Hechter & Kanazawa, 1997). In addition, Potančoková et al. (2008) cite ethnicity and cultural determinants of fertility, where Slovak Roma women with low education secure a more certain source of finance, which often indicates higher TFR values.

Further research provides evidence of an unequal relationship between education and fertility. The negative gradient dominates primarily in Central and Eastern European and German-speaking countries (Beaujouan et al., 2016 ; Klesment et al., 2014 ; Nisén et al., 2021 ; Wood et al., 2014 ). Deviations from this pattern have been reported in northern and northwestern European states (except Finland), where the gradients have narrowed and are often no longer observable (Jalovaara et al., 2019 ; Testa, 2014 ).

The theory of gender equality (Anderson & Kohler, 2015 ; McDonald, 2013 ; Mills, 2010 ; Neyer et al., 2013 ) also points out the negative impact of women's higher education on their fertility. However, the effect is not uniform and varies by gender and parity. The demographic revolution theory further elucidates the relationship between education and fertility (Axinn & Barber, 2001 ; Lesthaeghe, 2014 ; Sobotka et al., 2008 ). As women's educational attainment increases, fertility rates tend to decline, often associated with improved gender equality and socio-economic development in many developed countries.

Religiosity, which helps explain deviations in reproductive outcomes, has long been perceived as one of the significant and consistent factors determining actual fertility, especially in Slovakia (Šprocha & Tišliar, 2019). Numerous studies confirm its positive influence on fertility (Philipov & Berghammer, 2007; Frejka & Westoff, 2008; Zhang, 2008; Dilmaghani, 2019; Herzer, 2019; Götmark & Andersson, 2020; Bein et al., 2021a; Buber-Ennser & Berghammer, 2021). All measures of religiosity are generally related to a higher ideal number of children, a higher probability of having another child, and a higher expected and actual number of children, primarily for the Catholic faith in the European environment. Among various dimensions of religiosity, religious involvement is a more significant predictor of fertility than religious tradition, affiliation, and self-rated religiousness (Philipov & Berghammer, 2007; Dilmaghani, 2019; Bein et al., 2021b; Perry & Schleiffer, 2019; Buber-Ennser & Berghammer, 2021), which suggests higher TFR values where such involvement is strong.

The scientific community has focused on clarifying the relationship between religion and fertility, and on explaining the differences in its influence on fertility in different countries. For example, religion is important in the lives of half of American women but in less than one-sixth of European women. In addition, Frejka and Westoff ( 2008 ) report that women in Northern and Western Europe are less religious. However, they have the same or even higher fertility than American women, and significantly higher fertility than Southern European women. The authors consider that a slight increase in European fertility could theoretically be expected if Europeans were as religious as Americans. Buber-Ennser and Berghammer ( 2021 ) add that the positive influence of religion on actual fertility outcomes is much more substantial in Western Europe than in Central and Eastern Europe. Moreover, the effects of religion in the latter countries are considered generally weak and inconsistent, and religious significance and relationship to fertility there would benefit from further investigation.

Many studies have examined variations in fertility levels among ethnic groups (Adebowale, 2019; Bagavos et al., 2008; Booth, 2010; Chui & Trovata, 1989; Dubuc & Haskey, 2010; Forste & Tienda, 1996; Jasilioniene et al., 2014; Martin, 2019; Muhammad, 1996; Urale et al., 2019), explaining them in connection with cultural, social, economic and minority postulates. Some ethnic and cultural groups maintain relatively high birth rates despite the global shift towards smaller families, and traditional gender roles, religious beliefs, cultural norms, and socio-economic conditions all influence the outcome. Moreover, established patterns are not absolute determinants because individual choices and circumstances vary considerably within a given ethnic group (Forste & Tienda, 1996; Majo, 2014; Šprocha, 2014).

Another important differentiating factor in fertility outcomes is marital status, which affects basic demographic processes such as birth rate and mortality, as well as marriage and abortion rates. Marriage has long been considered a prerequisite for fertility (Ridfuss & Parnell, 1989), and the high degree of correlation between the two in the past has been empirically documented (Magdalenić, 2016). However, with the onset of the demographic transition, this relationship began to loosen, and new forms of family behavior appeared in society. Several studies have clarified the relationship between marital status and fertility and described differences in the level of fertility between subpopulations with different marital status (Van Bavel et al., 2012; Hiekel & Castro-Martín, 2014; Magdalenić, 2016; Meggiolaro & Ongaro, 2010; Nedomová, 2015; Perelli-Harris et al., 2009; 2014; Perelli-Harris, 2014; Raley, 2001; Ridfuss & Parnell, 1989). Perelli-Harris et al. (2012) point out that the increase in non-marital births in Europe is driven by an increasing number of conceptions and births within cohabitation, while in most of the observed countries, the percentage of births to unmarried women decreased or remained relatively similar. According to Magdalenić (2016), the reproductive function of marriage thus loses its relevance. In addition to the type of partnership itself, fertility is also affected by its instability. In line with Meggiolaro and Ongaro (2010), divorce can theoretically be considered a factor that depresses fertility, but empirical studies have not consistently confirmed this hypothesis. Remarriage and cohabitation increase the probability of giving birth after the dissolution of marriage, whereby separated childless women have a higher risk of conception. However, according to Van Bavel et al. (2012), divorced women have lower fertility than continuously married women, even when re-partnering. In the case of men, it is the opposite.

A similar pattern is seen in Slovakia, where, with the onset of the second demographic transition (cf. Mládek, 1998) in the early 1990s, new forms of family behavior began to appear, with a gradual decline in marital cohabitation and an increase in cohabiting partnerships. Population aging over the last three decades (Kačerová, 2009; Šprocha & Ďurček, 2019) has also changed the age composition of the Slovak population, expanding the thirty- and forty-year-old age groups, which enter into marriage more often. These changes are behind the significant increase in the share of the reproductive Slovak population living in marriage and only a slight increase in the divorced. Up to 85% of the population lived in marriages in Slovakia in the last census of 2021, and 17% of the reproductive population were divorced. Among the EU countries, Slovakia still has a higher proportion of married people. The shift in society from the traditional model of the family, which typically involved a married couple and their children living together, to modern and postmodern forms of family relations with an inclination towards individualism, personal freedom, and independence (Kraus et al., 2020; Mendelová, 2018) causes a general decline in marital cohabitation, with the prospect of further decline; the relevance of using the women's divorce rate in the research would then be low.

Nevertheless, among women of reproductive age, married Slovak women continue to outnumber divorced women, with a notable correlation observed between Slovak regions and TFR. Šprocha and Tišliar (2021) also indicate that it is middle-aged and older women who enter marriage, and that the union remains more stable when it is entered by women with higher education who postpone motherhood. However, many factors influence the formation of the regional structure of the reproductive population according to family status. Demographic factors include primarily the intensity and timing of marriage and divorce. Therefore, in the north and east of Slovakia, a region with a larger share of married women was formed (cf. Bleha et al., 2014). In the western region, it is precisely the hinterland of the capital city that, due to its migration attraction for the productive population, registered a similarly higher proportion of married women in the last period (cf. Pregi & Novotný, 2019). The southern region of Slovakia is characterised by a higher share of cohabitation and of divorced women in the reproductive population, which some authors (Džupinová et al., 2008; Korec, 2005; Šprocha & Ďurček, 2017) explain by a higher share of low-income persons, higher unemployment and a lower level of development of local economies.

Economic conditions

The income-fertility hypothesis describing the relationship between income and fertility suggests that fertility falls as income rises. D'Addio and D'Ercole ( 2005 ) consider women's income and earnings as critical influences for childbearing, and they record the complexity of the relationship between income and fertility. Findings have shown that fertility rates can reflect the difference between current and past income levels of each cohort, and wealthier OECD countries have higher fertility rates and higher average age at first birth (Herzer et al., 2012 ), while in all countries, women with higher household income levels have fewer children compared to other women. Although there is a general correlation between income and fertility, this relationship can vary across countries and cultures (Fox et al., 2019 ; Luci-Greulich & Thévenon, 2014 ).

Individual preferences and choices regarding family size can differ significantly, even within income groups. Shifts in family policies (Adema & Thévenon, 2014), changes in the spatial organization of the economic sphere (Ciminelli et al., 2021; Wachsmuth, 2022), and selective processes of international and internal migration (Rees et al., 2017; Billari & Dalla Zuana, 2013; Pregi & Novotný, 2019) contribute to this. In addition, Fox et al. (2019) suggest a stronger convex relationship between fertility and income in Western European countries, while among Eastern European countries only Poland and the Slovak Republic seem to have transitioned to a positive fertility–income relationship. The income level at which the association between income and fertility changes from negative to positive was much lower in the East than in the West but varied in both regions.
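
The idea of an income level at which the fertility–income association changes sign can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: it fits a quadratic curve to hypothetical, noise-free data (not taken from Fox et al. or any other cited study) and locates the turning point of the fitted curve. The function name `turning_point` and all numbers are invented for this illustration, and a quadratic is only one possible way to represent a convex relationship.

```python
import numpy as np


def turning_point(income, fertility):
    """Fit fertility = a + b*income + c*income**2 by least squares.

    Returns the coefficients (a, b, c) and the income level -b / (2c)
    at which the fitted slope changes from negative to positive.
    """
    # np.polyfit returns coefficients from the highest degree down
    c, b, a = np.polyfit(income, fertility, deg=2)
    if c <= 0:
        # A non-positive quadratic term means the fitted curve is not convex
        raise ValueError("fitted curve is not convex, so there is no U-shape")
    return (a, b, c), -b / (2 * c)


# Hypothetical illustration data, NOT from the cited studies:
# income in arbitrary units, fertility in children per woman
income = np.linspace(5, 60, 12)
fertility = 2.4 - 0.06 * income + 0.001 * income**2

(a, b, c), x_star = turning_point(income, fertility)
print(f"fitted turning point at income = {x_star:.1f}")
```

Below the turning point the fitted slope is negative, and above it the slope is positive; a lower turning point in one group of regions than in another would correspond to the East–West contrast described in the text.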

Some research has documented an inverse relationship between fertility rates and women's participation in the labor market (see Bernhardt, 1993; Mishra & Smyth, 2010; Altuzarra et al., 2019; Del Rey et al., 2021). However, some studies point to a change in the correlation between total fertility and the women's employment rate or the labor force participation rate (Ahn & Mira, 2002; D'Addio & D'Ercole, 2005; Del Boca et al., 2003), or they track their joint growth (Hwang et al., 2018) in some OECD countries. In fact, according to Klasen et al. (2021), heterogeneity in returns to women’s own characteristics and family circumstances (including education, income, and fertility) explains most of the between-country differences in participation rates, indicating that the economic, social, and institutional constraints that shape women’s labor force participation are still largely country-specific.

Hwang et al. (2018) attribute the change in correlation and the increase in both indicators to an increase in childcare substitutability, i.e. the degree to which parents perceive market-provided child care as a sufficient substitute for the mother’s time. As substitutability increases, the female labor force participation rate increases, while a convex (U-shaped) relationship emerges in its interaction with TFR, which can be explained by a combination of behavioral and compositional effects. According to Thévenon and Luci (2012), fertility trends also depend decisively on mothers' ability to combine work and family life, pointing to a higher fertility rate in countries where women have greater access to the labor market (Northern European countries and France). Fertility outcomes can be influenced by the availability of formal childcare facilities, which is considered an essential pronatal tool of family policy that mitigates the conflict between work and family (Ridfuss et al., 2010; Haan & Wrohlich, 2011). Jung et al. (2019), as well as Wood and Neels (2019), state that despite the generally widespread hypothesis about the positive impact of available childcare on the level of fertility in the developed world, the results of several empirical studies are ambiguous and contrary to hypothetical expectations. A positive relationship between these indicators was recorded most often only in regions with high participation of women in the labor market. However, the research results of Ridfuss et al. (2010) showed that increasing the availability, i.e., capacity, of facilities for preschool children in Norway from 0 to 60% led to an increase in fertility of 0.5–0.7 children per woman under 35 years of age.

In addition to women's participation in the labor market, unemployment also affects the timing and intensity of fertility, although according to some authors the effects are unclear (Andersen & Özcan, 2021; Bono et al., 2015; Huttunen & Kellokumpu, 2016). D'Addio and D'Ercole (2005) point out that in the 1990s, there was a change in the correlation between fertility and unemployment, with the fertility rate of most OECD countries being higher in times of low unemployment and falling as unemployment increased. Adsera (2011) states that high and persistent unemployment is associated with delayed childbearing and, as a result, likely fewer children. However, Fernandez-Crehuet et al. (2020) report that fertility and unemployment rates are unrelated in the long run. Andersen and Özcan (2021) concluded that unemployment positively affects the transition to motherhood and has no significant effect on second births, possibly due to the specificity of the Danish context. Overall, studies indicate that unemployment does not necessarily negatively affect fertility, and that its negative impact is selective.

Housing availability further determines fertility, but its influence in Slovakia has rarely been analyzed. However, Katuša ( 2012 ) demonstrated that many young Slovak couples could not afford appropriate housing for child-rearing, and this was a prominent cause of the decline in birth rates and fertility, regardless of employment, education status, and other factors. European researchers (Campisi et al., 2020 ; Makszin & Bohle, 2020 ; Mulder & Billari, 2006 ; Stoenchev & Hrischeva, 2023 ; Zavisca & Gerber, 2016 ) have led efforts to understand how housing conditions, especially ownership, are related to the transition to adulthood, the formation and dissolution of partnerships, and fertility.

In addition, Zavisca and Gerber (2016) identified housing as a primary source of socioeconomic difference, with the home being critical for everyday life and family dynamics, consumer lifestyle, subjective well-being, and family planning. High housing prices also mean that individuals and couples may face financial constraints that make it difficult to afford the costs of raising children. According to Zhang et al. (2012) and Saguin (2021), an increase in housing prices leads to only a weak decrease in the birth rate, and changes in housing prices do not affect fertility immediately; they can even have a positive impact (Clark & Ferrer, 2019; Simo-Kengne & Bonga-Bonga, 2020). Moreover, housing availability problems affect the timing of births more than the intensity of fertility (Kostelecký & Vobecká, 2009), and the relationship between housing affordability and fertility therefore varies with the social, economic, and cultural background of the region or country.

Urban–rural conditions

The degree of urbanization similarly plays a role in the fertility rates. Urban areas tend to have a higher prevalence of apartments with smaller living spaces, which is natural due to population density and limited land availability. Research by Kulu and Vikat ( 2007 ) revealed that people living in apartments have lower fertility than those living in single-family homes. The smaller living spaces of housing may force families to limit childbearing due to space constraints. In contrast, the family environment often associated with family houses may facilitate reproduction in rural areas (Felson & Solaún, 1975 ).

High housing costs can lead individuals to migrate from urban centers to rural or suburban environments, where they can generally have more children. Such selective moves, made with childbearing in mind, may also contribute to patterns of low urban fertility (Kulu, 2013; Rusterholz, 2015). However, studies also show that selective moves from urban centers to family-friendly environments do not cause significant differences in urban–rural fertility levels (Kulu & Washbrook, 2014).

Marenčáková (2001) considers the size of the settlement to be an important differentiating factor for several demographic phenomena and processes in Slovakia and adds that the difference in reproductive behavior between the populations of cities and rural communities is generally accepted. Scientific studies (Kulu & Boyle, 2009; Vobecká & Piguet, 2012; Kulu, 2013; Kulu & Washbrook, 2014; Riederer & Buber-Ennser, 2019; Lopéz-Gay & Salvati, 2021; Rodrigo-Comino et al., 2021; Salvati, 2021) describe and explain the differences in fertility levels based on the urban–rural dichotomy or along the city-suburb-rural gradient. Most confirm that rural areas and small towns have higher fertility than large cities. Urban–rural fertility disparities decrease over time (Bleha et al., 2020; Kulu, 2013) and when factors describing the economic environment, family and gender norms, and population composition are taken into account (Riederer & Beaujouan, 2024).

While urbanization generally correlates with lower birth rates, there can be considerable variation between countries and regions. According to Billari and Kohler et al. ( 2002 ), European regional fertility regimes have rapidly changed due to common socio-economic transformations. However, these changes vary across the geographical gradient north–south and west–east of European cities (Rodrigo-Comino et al., 2021 ). These findings largely mirror the dominant phase of the urban life cycle (Morelli et al., 2014 ). Northwestern cities have entered a phase of reurbanization (Dembski et al., 2021 ), leading to new forms of urban expansion such as polycentric development. This trend has seen inner cores gradually attracting a younger population and couples with a high propensity to marry and have children, indirectly contributing to a higher average birth rate.

Conversely, Eastern European cities are still undergoing the final phase of suburbanization (Stryjakiewicz, 2022 ; Szmytkie, 2021 ), characterized by intense population growth in suburban districts and significant stability or even shrinkage of inner urban cores. Rodrigo-Comino et al. ( 2021 ) state that suburbanization is only associated with younger and larger families—and thus higher fertility levels—in Eastern and Southern Europe. Compositional and contextual factors and the influence of selective migration and housing conditions can explain differences in fertility between urban and rural areas (Kulu, 2013 ; Kulu & Washbrook, 2014 ; Nestorová Dická et al., 2019 ; Riederer & Buber-Ennser, 2019 ).

Other conditions

The theoretical overview above covers the factors of regional fertility that we include in our research. However, regional fertility can also be conditioned by other indicators not included in our study. Studies such as Hesketh and Xing (2006), Wu et al. (2006), Tafuro and Guilmoto (2020), Chao et al. (2021), and Becquet et al. (2022) have pointed out that the sex ratio of the reproductive population can also play an important role. Slovakia's secondary sex ratio shows a slight predominance of males, as in other developed regions (Orzack et al., 2015). The ratio can be read as a reflection of living conditions and health status (Chao et al., 2022): deteriorating conditions are associated with a decrease in its value and vice versa. Golian and Liczbińska (2022) confirmed this for Slovak conditions, and the period in which the current reproductive population was formed saw no significant exogenous shocks that could have affected the sex ratio at birth and thereby the demographic composition of the current reproductive populations in the regions of Slovakia. In almost every region of Slovakia, men slightly predominate in the reproductive population, or the ratio is largely balanced. Finally, our initial survey of the relationship between the female population of fertile age and the level of fertility across Slovak regions did not find it significant.

The migration factor can also affect the total fertility rate. In regions with significant migration, the birth rate can be affected by changes in the population's demographic composition (e.g., Bagavos, 2019; Sabater & Graham, 2019). In Europe, some authors note that economic and migration factors often explain changes in fertility (cf. Majelantle & Navaneetham, 2013; Sobotka et al., 2011), and migration may be responsible for reshaping the ethnic and social composition of many highly developed countries (Sobotka, 2008). According to research by Parr (2021) on the replacement level of fertility in the presence of positive net immigration, Slovakia's current migration replacement TFR is low. This is related to its low birth rate and low net migration; the relatively lower life expectancy at birth compared to Western, more developed countries also plays a role. Migration therefore does not significantly affect the overall demographic situation in the Slovak regions or, consequently, the level of fertility; unlike in Western, more developed countries, where migration can be a more significant factor, its influence in Slovakia is minimal. As for life expectancy, although it is lower than in Western countries and may affect the demographic structure of the population, it does not directly affect the fertility rate. Fertility levels are influenced more by other social, economic, and cultural factors, which are the primary focus of our research.

In the same way, access to health care plays an important role in the issue of this research. Regions with better access to health facilities and information on reproductive health tend to have lower fertility due to increased use of contraception and better outcomes in the area of maternal health, as confirmed by studies such as Yüceşahin and Özgür ( 2008 ), Phillips et al. ( 2019 ), Herrera-Almanza and Rosales-Rueda ( 2020 ). From another point of view, according to Brodeur et al. ( 2022 ) and Jones et al. ( 2023 ), access to reproductive health care services, including family planning resources and prenatal care, can affect fertility rates. Stefko et al. ( 2018 ) demonstrated the diversity of healthcare facilities' performance in Slovakia's regions. However, over time, there is an indirect dependence between the variables and the results of the estimated efficiency in all regions, and thus, all regions of Slovakia have increased their productivity compared to ten years before the COVID-19 pandemic. Technological improvements significantly impacted this improvement (Vaňková & Vrabková, 2022 ). Even though Soltes ( 2016 ) states that the situation in individual regions is uneven and there are regional differences in access to care, all Slovak districts have medical facilities of varying quality. It is important to note that access to health care can affect a population's overall health, but the direct effect on fertility levels is unclear. Our research focused on other factors that directly and significantly impact fertility levels, such as economic, social, and cultural determinants. Although access to health care is important for the population's overall health status, based on available data and research, it was not considered a key factor that would directly influence the fertility rate in Slovak regions.

In addition to the determinants mentioned above, researchers have also investigated the relationship between fertility and abortion, average life expectancy (Trynov et al., 2020), the size of the university population, the share of employees in agriculture (Campisi et al., 2020), and quality of life (Palomba et al., 2018; Koert, 2021). These factors influence overall fertility in the regions to a greater or lesser degree.

Abortion rates can have an impact on the level of fertility in Slovak regions, but this relationship is often complex and influenced by many other determinants that influence reproductive behavior. In our research, we focused on identifying factors that directly and significantly affect the level of fertility. In the initial phase of the investigation, no significant relationship between abortion rate and fertility level was demonstrated. This confirmed that abortion is not the primary determinant of fertility in Slovak regions in the context of our research.

Educating women plays a significant role in the final level of fertility. Research indicates that women's low education is directly linked to a higher fertility rate (cf. Šprocha et al., 2020 ). Although higher education also affects fertility, its effect may be more indirect and complex. Higher education is often associated with postponing parenthood, lower fertility rates, and greater emphasis on career and personal development (Šprocha & Bačík, 2021a , 2021b ). As part of our research, we decided to focus on women's low education because its impact on fertility is more direct and pronounced in the context of Slovakia.

The level of employment in agriculture indicates a society's development level, which can impact the overall fertility level. Less developed regions often show higher fertility rates, influenced by various social and cultural factors. However, in the case of Slovakia, employment in agriculture is very low, representing only 2% of the economically active population, and this share does not differ significantly in individual regions. Although theoretically, higher employment in agriculture can be related to higher fertility, empirical data for Slovakia do not confirm this relationship. Based on the analysis of regional differences, we found that higher employment in agriculture does not automatically mean higher female fertility. Therefore, we decided not to include this factor in our research because it would not contribute to a more accurate understanding of the determinants of fertility in Slovak regions.

The influence of quality of life on fertility is complex and may vary according to specific conditions in a given country or region. A higher quality of life can lead to lower fertility through better economic stability, higher education, and better access to health care. Conversely, a lower quality of life may be associated with higher fertility due to traditional values and poorer access to contraception and health care. In our research, however, we found that quality of life is already embedded in the social, economic, and cultural determinants of fertility that we analyze. Specifically, we considered different aspects of quality of life within these determinants, such as economic stability, education, and access to health care, and analyzed their individual impact on fertility levels. Therefore, we decided not to include quality of life as a separate factor but to integrate its various aspects into the broader context of our research.

Research focus

To understand what drives the differences in fertility levels across the regions at the beginning of the twenty-first century, we create a unified empirical framework that enables comparative spatial analyses within Slovakia. The research goal is to identify the effects of socioeconomic determinants on fertility in the Slovak regions. This objective can be investigated only to the extent that the relevant items are available in the last population census in Slovakia and in the public databases of the Slovak statistical office. Therefore, the question of how a region's culture contributes to fertility may remain unanswered. Correlations between fertility rates and selected determinants may provide insight into some correlates or primary predictors of fertility in these regions. They may also open avenues for further, more targeted research into the differences in fertility levels between Slovak regions.

Data and methods

The study relies on publicly available statistics from the Statistical Office of the Slovak Republic (SOSR) and the Center for Scientific and Technical Information of the Slovak Republic (CSTI SR). The primary determinants are selected indicators from the demographic, social, and economic statistics in the public DATAcube database. The primary database consisted of data on women’s birth rate and fertility, followed by data on the unemployment rate of women, the average monthly income, and the number of completed housing units. The second section of the statistical data includes key data about religion, nationality, women's family status, and women's education from the last population census in 2021.

The CSTI SR provides data on the number of children, pupils, and students at various types of schools in the Slovak regions. Childcare substitutability in kindergartens for children aged 3–6 years was examined in relation to the level and nature of fertility in Slovak regions. The last census revealed the ethnic structure of Slovakia's inhabitants, and this information was used to estimate the Roma population, which has significantly different reproductive behaviors (Nestorová Dická, 2021). Correcting these data with the Atlas of Roma Communities (2019) gave a more realistic picture than the state population censuses alone (Šprocha, 2014).

Spatial research was implemented at the LAU1 level, which corresponds to Slovak districts. Bezák (1996) proposed merging the city districts of Bratislava and of Košice into single spatial units for research purposes. In this study of the determinants of fertility they were not merged, and the 9 urban districts of Bratislava and 22 urban districts of Košice were preserved because they differ in the birth rate and fertility of the population as well as in socioeconomic determinants.

The basic spatial units are 79 districts of varying area and population size. For example, the Prešov, Nitra, and Žilina districts each have more than 160,000 inhabitants, whereas Stropkov, Turčianske Teplice, Banská Štiavnica, and Medzilaborce each have fewer than 20,000. Bezák (1996, 1997) pointed out the inequality and injustice of the Slovak regional structure. Data for regional analysis are available only for these spatial units. Statistical analyses of the determinants of women's fertility provide ample opportunity to investigate socioeconomic effects in this regional structure. The reference period for analyzing fertility data and selected determinants was 2019–2021, and the census data are from 2021.

The primary purpose of the research is a regression analysis of the influence of socioeconomic determinants (Table 1) on the regional level of fertility, with TFR as the dependent variable. The key step was therefore the selection of determinants as independent variables, which was guided by the results of the many studies of socioeconomic effects on fertility listed in the theoretical framework.

Initial regression analysis of the influence of selected socioeconomic indicators on the fertility level in spatial units revealed only weak dependencies, and the strength of correlation with individual indicators varied between regions. The reason is the significant regional diversity in fertility and socioeconomic determinants.

That led us to create a regional typification of spatial units, in which individual types are quasi-homogeneous sets of spatial units similar in the nature and level of fertility. Factor analysis (FA) was crucial for the typification. It reduced the number of intercorrelated input variables without much loss of information and created new variables, i.e., factors (Nestorová Dická, 2013). FA assumes that each input variable can be expressed as a linear combination of a few common latent factors.

The input database consisted of 10 variables related to various aspects of fertility (Table 1). The Kaiser–Meyer–Olkin (KMO) measure, which tests the suitability of input data for FA by comparing the magnitudes of the observed correlation coefficients with those of the partial correlation coefficients, reached 0.83, indicating that the original input variables were highly suitable for FA.
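For reference, the KMO measure is usually written as below. This is the standard textbook definition rather than a formula quoted from the study; the symbols are introduced here for illustration, with r_ij the observed correlation between input variables i and j and p_ij their partial correlation given all other variables.

```latex
\mathrm{KMO} =
  \frac{\sum_{i \neq j} r_{ij}^{2}}
       {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} p_{ij}^{2}}
```

When the partial correlations are small relative to the observed correlations, the measure approaches 1, which is the sense in which a value of 0.83 signals that the variables share enough common variance for FA.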

The factor model used the principal component analysis (PCA) method to identify interrelated groups of variables. PCA is one of the most frequently used extraction methods in FA; it identifies linear combinations of the observed variables that capture the maximum variability of the data. The components were extracted from the reduced correlation matrix, in which initial estimates of the communalities of the individual variables were monitored. Factors were retained according to Kaiser's eigenvalue criterion, i.e., only factors with an eigenvalue greater than one were considered significant. Two primary factors, which together explain almost 91% of the total variability of the input variables, can thus clarify the structure and level of fertility in the Slovak regions (Table 2).
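Kaiser's criterion can be illustrated with a minimal Python sketch. The eigenvalues below are hypothetical: the first two are simply back-computed from the variance shares reported in the text (49% and 42% of 10 standardized variables), and the remaining eight are invented so that the values sum to 10. They are not the study's actual eigenvalues, and the function name is an assumption made for this example.

```python
# Hypothetical eigenvalues of a 10-variable correlation matrix (they sum to 10).
# Only the first two are back-computed from the shares reported in the text;
# the rest are invented for illustration.
eigenvalues = [4.9, 4.2, 0.3, 0.2, 0.15, 0.1, 0.06, 0.04, 0.03, 0.02]


def kaiser_retained(eigs):
    """Return (factor number, eigenvalue, variance share) for eigenvalues > 1."""
    total = sum(eigs)  # equals the number of variables for a correlation matrix
    return [(i + 1, ev, ev / total) for i, ev in enumerate(eigs) if ev > 1.0]


retained = kaiser_retained(eigenvalues)
explained = sum(share for _, _, share in retained)

for number, ev, share in retained:
    print(f"factor {number}: eigenvalue {ev:.2f}, share {share:.0%}")
print(f"retained factors explain {explained:.0%} of total variance")
```

Because each standardized variable contributes one unit of variance, an eigenvalue above one means the factor explains more than a single original variable would, which is the rationale behind the cut-off.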

The factor scores of the extracted factors express the degree of influence of each factor in each spatial unit (Fig. 1). These factors became the basis for regional typification. Cluster analysis was applied to create regional classes that are relatively homogeneous in the nature and intensity of fertility (Fig. 2). Spatial units were classified through a hierarchical procedure using Euclidean distances and Ward's clustering method. Finally, the results of discriminant analysis verified that the distribution of spatial units was optimal for the selected number of types. The 10 resulting regional types entered the regression analysis of selected fertility determinants in Slovak regions (Fig. 2 and Table 3).
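The hierarchical procedure can be sketched in a few lines of plain Python. This is a naive illustration of Ward's criterion on Euclidean distances, not the software or settings used in the study; the factor scores are hypothetical, and the function name `ward_clusters` is an assumption made for this example.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion until k clusters remain."""
    clusters = [[i] for i in range(len(points))]  # each cluster holds point indices

    def centroid(members):
        dim = len(points[0])
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def ward_cost(a, b):
        # Increase in the within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > k:
        # Pick the pair of clusters whose merge is cheapest (i < j always holds).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical factor scores (factor 1, factor 2) for six spatial units.
scores = [(-1.2, -0.8), (-1.0, -1.0), (0.1, 0.2), (0.0, 0.3), (1.5, 2.0), (1.7, 1.8)]
types = ward_clusters(scores, 3)
print(types)
```

At each step the two clusters whose merger adds the least within-cluster variance are joined, which tends to produce compact groups of similar size. For real data, a library routine would normally be preferable to this quadratic-time sketch.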

Fig. 1 Extracted factors differentiating fertility nature and intensity in the Slovak regions

Fig. 2 Regional typification of fertility nature and intensity

The second part of the research examined the effects of socioeconomic determinants on fertility in individual regional types using regression analysis at two hierarchical levels (see Götmark & Andersson, 2020). The first level established the relationship between regional types, i.e., the units investigated are the regional types, each represented by its average values of TFR and socioeconomic determinants. The second level is an investigation within individual regional types, i.e., the units investigated are the spatial units belonging to a regional type, each with its own values of TFR and socioeconomic determinants. All analysed regional types had equal weight. For each socioeconomic determinant, a regression trend, a correlation coefficient, and a coefficient of determination R² were obtained at both levels. According to Klein (2020), R² gives the share of total variability explained by the independent variable. The model is more "effective" the closer its value is to 1, meaning the linear regression model explains a larger percentage of variability. Řehák (2023) adds that the coefficient of determination also represents a measure of statistical dependence between the independent and dependent variables, with a higher value indicating closer dependence. The value of R² expresses the strength of the dependency, and the coefficient β1 in the linear Eq. (1) expresses the character of the influence. A positive coefficient means that Y increases as X increases, and a negative coefficient indicates an inverse relationship. A linear regression model at the intraregional level was applied only to regional types containing more than one spatial unit.
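As an illustration of how the sign of β1 and the value of R² are read, the following minimal Python sketch fits a simple linear regression of the assumed form Y = β0 + β1·X by ordinary least squares. The form of Eq. (1) is inferred from the description, since the equation itself is not reproduced in this text. The data are hypothetical divorce rates and TFR values for five made-up regional types, not the study's figures.

```python
def simple_ols(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares; return (b0, b1, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope: character (direction) of the influence
    b0 = mean_y - b1 * mean_x    # intercept
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot     # share of variability in y explained by x
    return b0, b1, r2


# Hypothetical averages for five regional types (illustration only).
divorce_rate = [4, 6, 8, 10, 14]      # percent
tfr = [2.3, 1.9, 1.7, 1.5, 1.4]       # children per woman

b0, b1, r2 = simple_ols(divorce_rate, tfr)
print(f"beta0 = {b0:.3f}, beta1 = {b1:.3f}, R2 = {r2:.3f}")
```

With these invented numbers the slope is negative, so the fitted line describes an inverse relationship, and R² indicates how much of the variation in TFR the single determinant accounts for.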

In the final phase of the investigation, the determinants of regional fertility were identified using multiple linear regression (MLR). The MLR evaluated the relationship between a continuous target, i.e., the factors from the factor analysis, and the 9 selected socioeconomic determinants. The multiple regression model extends the simple regression model: the dependent variable Y is a function of several independent variables X1, X2, X3, …, X9 and a residual component (error term). The study uses the model in Eq. (2), where Y is the dependent variable, X1 to X9 are the independent variables, β0 is the intercept, i.e., the point where the regression line intersects the Y axis, β1 to β9 are the regression coefficients, which give the direction and size of each variable's effect, and e is the error term. The MLR thus also measures the net effect of each independent variable on the dependent variable after controlling for the effect of the other predictors.
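The equation referred to as (2) is not reproduced in this text. Based on the symbol definitions given above, it presumably takes the standard multiple linear regression form, shown here as a reconstruction rather than a quotation:

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_9 X_9 + e
```

Each coefficient β_k is read as the expected change in Y for a one-unit change in X_k with the other eight determinants held constant, which is what measuring the net effect after controlling for the other predictors amounts to.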

As a post-communist country, Slovakia has had fertility below the replacement rate since the beginning of the 1990s. Although some authors (cf. Šprocha & Bačík, 2021a, 2021b; Šprocha & Tišliar, 2016) emphasize a partial reversal of development trends during the new millennium, the increased level remains below the replacement rate, while the spatial picture of fertility in Slovakia is changing. Figure 3 shows the spatial distribution of TFR values and fertility timing in the Slovak regions. A key factor in the spatial variability of fertility in Slovakia is the advanced process of postponing childbearing. Closely linked with it are indicators of the timing and age distribution of fertility, which contribute to the current differentiation of regions (Fig. 3). Significant differences in the intensity and timing of fertility were also confirmed by studies such as Bleha et al. (2014), Šídlo and Šprocha (2018), and Šprocha et al. (2019). During the new millennium, a vast region with very low fertility formed in Slovakia, located in the country's western and easternmost areas. The districts of this region are marked by the postponement of childbearing and, therefore, by a significant increase in fertility at older ages but, according to Šprocha et al. (2019), only by a very limited catch-up rate. The capital region, in turn, is forming an area with a relatively favorable recuperation phase. The region with a favorable level of fertility is significantly smaller and occupies areas in the middle east and north of Slovakia, where the favorable level of fertility reflects specific socio-demographic features (cf. Drinka & Majo, 2016; Nestorová Dická, 2021; Roupa & Kusendová, 2013).

Fig. 3 Total fertility rate and timing of fertility in Slovakia regions, 2019–2021. Source: SO SR

Factorial analysis and regional typification

The two factors extracted from factor analysis are independent variables with special links to human fertility nature and level in the Slovak regions. Figure  1 highlights the spatial distribution of factor scores and the relationship of regions to both generated fertility factors.

The first factor is significantly related to indicators of the age distribution of fertility, including the fertility of higher-order children and non-marital fertility; it can be interpreted as a factor of fertility timing, cohabitation, and family size. The factor covers 49% of the total variance of the variables. From a spatial perspective, the country's eastern region, particularly its southern and partly northern areas, exhibits a higher prevalence of fertility among younger women, along with increased rates of cohabitation and higher-order childbearing (Fig. 1). These regions are characterized by a peripheral location, underdeveloped economies with high long-term unemployment rates, inadequate or absent transport infrastructure, and lower human potential. The latter is primarily evidenced by a low proportion of the population with higher education and a high proportion of residents from socially excluded environments and marginalized Roma communities. The earlier onset of fertility, coupled with a higher prevalence of non-marital and higher-order births in these regions, may be attributed to attempts to mitigate the financial uncertainty associated with the labor market (Potančoková et al., 2008) and to secure at least a certain income in unfavorable structural conditions, in which the opportunity costs of childcare are low (Šprocha & Tišliar, 2019). In contrast, the north-western regions are prosperous and attractive to a higher-income population. There, the postponement of childbearing can be explained by the higher participation of women in the labor market: their higher education offers greater career prospects and economic independence while they build a job position (Ní Bhrolcháin & Beaujouan, 2012), and as a result, the opportunity costs are high.

The second factor expresses the intensity of fertility and covers 42% of the total variability of the input information. Low and insufficient fertility, well below the replacement rate, is typical for many regions in the extreme east and west and in the southern and central parts of Slovakia (Fig. 1). Ethnic, cultural, and economic factors determine the higher intensity of fertility. Low intensity of fertility can stem from peripherality or rurality (Nestorová Dická et al., 2019), a poor economic situation, and a greater concentration of the population without religion or belonging to Protestant denominations. These communities generally have lower fertility levels and often a higher share of the aged population.

The regional typification of fertility in Slovakia was created by combining the levels of the obtained factor scores. Figure 2 shows the ten regional types generated by cluster analysis, representing the different structures of human fertility and natality. Regional types A to E reflect low fertility and natality, with more intensive fertility among older women. They occupy the far east of Slovakia and the western region, except the capital region. Regional types F to J, which have higher fertility levels, represent the opposite. They are found in the central part of eastern Slovakia and in the north. The positive extreme is regional type J, which reached a fertility level above the replacement rate, with fertility concentrated among younger women and a higher rate of higher-order births. Its existence is connected with the Roma population, which has a significant presence mainly in the central part of eastern Slovakia. In contrast, the negative extreme is regional type A, which has lowest-low fertility in the classification of fertility levels by Kohler et al. (2002) and covers districts primarily in the far east and west of the country.

The regression analysis applied to the regional types from the previous step established the effect of selected socioeconomic determinants on fertility (TFR) at two hierarchical levels. Because the types result from cluster analysis, the variation in TFR values within individual regional types, i.e., at the intraregional level, was relatively low. TFR values varied somewhat more between districts in types J and F. Variability is higher between the average TFR values of individual types, i.e., at the interregional level. Regional types I and J represent the highest levels of TFR (2.4–2.3), whereas types A to H registered TFR values well below the replacement rate.

The key findings on the relationship between selected socioeconomic determinants and TFR in regional types are documented in Fig. 4. Regression analysis reveals that, at the interregional level, the fertility level of the population was negatively related primarily to the women's divorce rate, the childcare substitutability rate, urbanization, and monthly income. A positive trend was found for Roma ethnicity, the Catholic population, unemployment, and low education of women. Residential construction, however, did not affect fertility. The women's divorce rate had the closest relationship with fertility (R² = 0.62), followed by Roma ethnicity (R² = 0.44).

Fig. 4 Total Fertility Rate (TFR) in Regional Types and its Relationship to Socioeconomic Determinants

Figures 2 and 4 show that regional type I, which consists of a single district known for strong tradition, religious beliefs, and national unity with low ethnic diversity, records the lowest average divorce rate (3.7%) together with an extremely high TFR. Type F, the region of the capital with its hinterland, had the highest divorce rate, at up to 14%, and a TFR below the replacement level. This stark contrast illustrates the relationship: TFR decreases as the divorce rate increases across regional types. By contrast, the fertility level increased significantly with Roma ethnicity: the average share of the Roma population varied from zero in types F and I to 8.1% in type J, where the TFR of 2.3 children per woman represents the highest recorded value among the Slovak regions.

A significant correlation with TFR was also found for the enrollment rate of children in kindergartens and for the degree of urbanization: women's fertility decreases as either increases. The enrollment rate ranged from 535‰ in type J, with 2.3 children per woman, to 808‰ in type A, which has the lowest fertility level of 1.3. The same holds for urbanization, where values ranged from 12.3% in type I, with a fertility level above the replacement rate, to 82.7% in type F, with a TFR of 1.6 children per woman. Monthly income also showed a slight negative dependence, indicating that economically developed regions tend to have a lower TFR. On the contrary, TFR values increased more markedly with the level of Catholicism and female unemployment. This is illustrated by types I and J, which have the highest shares of Catholics and simultaneously the highest levels of TFR, i.e., more than 2.0 children. Regions dominated by the Protestant population record a lower TFR, below 1.5 children. Female unemployment was lowest in the regional type of the capital city with its hinterland (type F), at 5.5%. The TFR of type F reached only 1.6 children per woman, while the highest fertility level, 2.3 children, was recorded in type J, where 20% of women are unemployed.

From the point of view of the broader socioeconomic context, however, there are also significant correlations among the individual socioeconomic controls (Table 4) across the ten regional types. A low educational level of women is significantly positively correlated with the share of the Roma community, where higher female unemployment also persists together with low monthly incomes and low rates of children's schooling, residential construction, and urbanization. The larger Roma community in regional types G, H, and J coincides with higher unemployment and low incomes in predominantly rural populations with a low level of residential development. The female divorce rate, on the other hand, is significantly related to urbanization and higher monthly incomes, but also to lower shares of Catholics, who are more likely to be concentrated in rural regions. Residential construction primarily concerns regions with higher incomes, educational levels, and female employment. The Bratislava regional type F illustrates these relationships.

The relationship between TFR and the socioeconomic determinants was also analyzed at the level of individual districts within regional types. Correlations between TFRs and socioeconomic determinants are presented in Fig. 5. However, the strength and nature of the relationships often differed from those at the interregional level. Figures 4 and 5 highlight deviations in intraregional regression. We registered the most significant variations in four socioeconomic indicators. Predominantly opposite trends appear for monthly income, Catholicism, and women's unemployment. Residential building development, on the other hand, showed a positive trend in several types at the intraregional level, while independence was demonstrated in only one type, J. All regional types showed a negative association between TFR and urbanization, confirming the relationship reported in many studies.
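As an illustration of the intraregional step, the sketch below groups districts by regional type and computes a Pearson correlation between one determinant and TFR within each type. It is a minimal sketch, not the authors' procedure: the type labels "X" and "Y", the district values, and the function names are hypothetical, and the paper does not state which correlation coefficient underlies Fig. 5.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_type(districts):
    """Group (type, determinant, tfr) rows by regional type and correlate within each."""
    groups = defaultdict(lambda: ([], []))
    for region_type, determinant, tfr in districts:
        groups[region_type][0].append(determinant)
        groups[region_type][1].append(tfr)
    return {t: pearson_r(xs, ys) for t, (xs, ys) in groups.items()}


# Hypothetical districts: (regional type, urbanization in %, TFR).
districts = [
    ("X", 20.0, 2.0), ("X", 40.0, 1.8), ("X", 60.0, 1.6),
    ("Y", 30.0, 1.5), ("Y", 50.0, 1.4), ("Y", 70.0, 1.2),
]
result = correlation_by_type(districts)
for region_type, r in sorted(result.items()):
    print(region_type, round(r, 3))
```

Both hypothetical types give a negative coefficient, which mirrors the direction reported for urbanization. With only a handful of districts per type, such coefficients are unstable, which is one reason intraregional results can differ from the interregional ones.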

Fig. 5 Correlations between Total Fertility Rates and socioeconomic determinants

In five types, TFR increases with the share of women with low education, a relationship that is strongest in Bratislava type F. In five regions, the TFR decreases as the women's divorce rate grows, most strongly in type C. Surprisingly, the opposite but significant trend was noted in type A. Regression analysis confirmed a significant positive association between fertility and Roma ethnicity in four types, but the relationship was negative in three.

Types C and F show a certain similarity in the observed trends, which, apart from the high intensity of dependence, differ only in the unemployment of women, where the opposite trends have taken place. Regional type D, significantly spatially disjointed with districts scattered throughout the country, did not capture a significant relationship to the given determinants.

Compared to the relationships found at the interregional level, no type showed complete similarity in the relationship of TFR to socioeconomic determinants. However, the greatest similarity was found in Bratislava type F, whose districts showed opposite trends only for Roma ethnicity and residential building. Regional type A shows significant differences, with the interregional tendency confirmed only for women's low education and the degree of childcare substitutability. The closeness of the relationships is not significant.

In assessing the intensity of the association between TFR and the individual determinants, it is evident that the individual region types are associated with different combinations of socioeconomic determinants (Fig. 5). This is also evidenced by the one-way ANOVA test, which shows statistically significant differences in the socioeconomic indicators between the region types (Table 5). Thus, there are significant differences in TFR according to women's socioeconomic background.
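To make the one-way ANOVA concrete, the following minimal sketch computes the F statistic for one variable measured in several groups, here standing in for regional types. The three groups and their values are hypothetical and are not taken from Table 5; the function name is likewise an illustrative choice.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical indicator values for districts in three regional types.
values_by_type = [
    [1.3, 1.4, 1.5],
    [1.6, 1.7, 1.8],
    [2.1, 2.2, 2.3],
]
f_stat, df_b, df_w = one_way_anova_f(values_by_type)
print(f_stat, df_b, df_w)
```

A large F means the group means differ by more than the within-group scatter would suggest. Turning F into a p-value requires the F distribution; in practice a library routine such as `scipy.stats.f_oneway` returns both.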

Determinants of the spatial differentiation of fertility

In three separate multivariate analyses, the factor scores of the two factors and the TFR served as the dependent variables. A multiple linear regression (MLR) model was used to determine the joint effect of the nine socioeconomic determinants, using the Enter method to determine statistical significance with respect to the dependent variable. Before conducting the multivariate analysis, the regression assumptions (normality of residuals, homoscedasticity, absence of multicollinearity, and independence of residuals) were tested. Statistical significance was assessed at the p < 0.05 and p < 0.001 levels. The aim was to identify the primary predictors of Slovak regional fertility.
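The regression step can be sketched as an ordinary least squares fit in which all predictors enter the model at once, which is how the Enter method is usually understood. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names, the two predictors, and the data are hypothetical, and the data were constructed to follow TFR = 1.0 + 0.05·x1 − 0.01·x2 exactly so that the fit can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... with all predictors entered simultaneously."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical districts: (x1, x2) could stand for two determinants in percent.
rows = [(0.0, 0.0), (10.0, 0.0), (0.0, 20.0), (10.0, 20.0), (5.0, 10.0)]
tfr = [1.0, 1.5, 0.8, 1.3, 1.15]
coefficients = fit_ols(rows, tfr)
print([round(c, 4) for c in coefficients])
```

The normal equations are used here only to keep the sketch dependency-free. A statistics package would normally be preferred, since it also reports standard errors, p-values, and the diagnostics for the assumptions listed above.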

A linear relationship was initially assumed between each independent variable and Slovak regional fertility. The MLRs included only statistically significant effects of the chosen variables. The values of the multiple correlation coefficient R, R-squared, and adjusted R-squared revealed a gradual decrease in explanatory power across the three models. However, the models remained of high quality, and ANOVA confirmed that they all explain a substantial percentage of regional variation. Table 6 shows that the variables included in the models explain 92 per cent (p < 0.001) of the variance in fertility timing, 69 per cent (p < 0.001) of the variance in fertility intensity, and 68 per cent (p < 0.001) of the variance in the total fertility rate.
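The difference between R-squared and adjusted R-squared can be shown with the standard adjustment formula. In the sketch below, the sample size of 79 is an assumption made only for illustration (this section does not state the number of observations), and it is also an assumption that the quoted 69 per cent is an unadjusted R-squared.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalise R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Assumed inputs: R-squared of 0.69 with four predictors and 79 observations.
adj = adjusted_r_squared(0.69, n_obs=79, n_predictors=4)
print(round(adj, 4))
```

The adjusted value is always at most the unadjusted one, and the gap widens as predictors are added relative to the number of observations.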

The prediction of the first factor, which explains the difference in fertility timing in Slovakia's regions, was statistically significant with only seven variables. Ethnicity and divorce rate did not affect the first factor and were excluded from the multiple linear regression. The multivariate analysis determined that women's low education, residential building, and women's unemployment are significant predictors of fertility timing. The coefficients in Table 6 represent the effect of each independent variable on the dependent variable. Women's low education significantly and positively affects the value of the fertility timing factor at p < 0.001: an increasing rate of women with low education tended to increase the value of the factor (β = 0.348). The coefficient of unemployment (β = 0.176) showed a significant (p < 0.05) positive effect on the timing factor, i.e. a shift toward the fertility of younger women. The other variables had a negative effect in the model. Residential building development significantly and negatively affects the timing factor's values, indicating fertility in older women (β = −0.288). Similarly, the coefficients for kindergarten enrollment, income, urbanization, and Catholicism (β = −0.173, β = −0.152, β = −0.151, β = −0.118, respectively) showed a significant influence at p < 0.05. This indicates that when the kindergarten enrollment rate, income, urbanization, or the share of Catholics increases, the values of the timing factor decrease, which indicates the fertility of older women. These findings inform our understanding of fertility patterns in Slovakia's regions.
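One way to read these coefficients is as the expected change in the timing factor when a predictor changes and the others are held fixed. The sketch below assumes the reported β values are standardized coefficients, so that changes are expressed in standard deviations; that interpretation, the dictionary keys, and the function name are assumptions for illustration, while the numbers are those quoted above.

```python
# Coefficients for the fertility timing factor as quoted in the text (Table 6).
TIMING_BETAS = {
    "low_education": 0.348,
    "unemployment": 0.176,
    "residential_building": -0.288,
    "kindergarten_enrolment": -0.173,
    "income": -0.152,
    "urbanization": -0.151,
    "catholicism": -0.118,
}


def predicted_shift(betas, changes_in_sd):
    """Sum beta * change for each listed predictor; unlisted predictors are held fixed."""
    return sum(betas[name] * delta for name, delta in changes_in_sd.items())


# A region one standard deviation higher in both low education and urbanization.
shift = predicted_shift(TIMING_BETAS, {"low_education": 1.0, "urbanization": 1.0})
print(round(shift, 3))
```

Under these assumptions the two effects partly cancel: the positive contribution of low education outweighs the negative contribution of urbanization, leaving a net shift toward the fertility of younger women.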

The prediction of the second factor, which captures the difference in fertility intensity in Slovakia's regions, was statistically significant with only four variables at the 0.001 significance level. Five of the original variables were excluded from the model. The MLR revealed that Roma ethnicity, monthly income level, residential building, and the women's divorce rate significantly affect fertility intensity in the Slovak regions. Notably, increasing Roma ethnicity tended to increase the values of the factor, i.e., fertility intensity (β = 0.550). Similarly, fertility intensity in Slovakia's regions increases with monthly income and with residential construction development (β = 0.479, β = 0.388). The women's divorce rate had a negative effect in the model: a higher rate of divorced women significantly lowers the values of the factor, indicating lower fertility intensity (β = −0.533).

The final multivariate analysis examined the total fertility rate, which also indicates the potential reproductive capacity of the population. As a statistical measure, it estimates the average number of children a woman would bear in her lifetime if current age-specific fertility rates remained constant. It is a valuable tool for demographers and policymakers to understand and forecast population trends. The prediction of the level of TFR was statistically significant with five variables at the 0.001 and 0.05 levels, with Roma ethnicity (β = 0.697) and residential building (β = 0.302) being significant positive predictors of the level of TFR in the regions. Low levels of these variables predict low levels of TFR. On the other hand, significant negative predictors of the level of TFR in the regions are unemployed women, enrollment of children in kindergartens, and divorced women in the region's population. If the values of these variables increase (β = −0.345, β = −0.306, β = −0.275), the level of TFR decreases.
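The definition of the total fertility rate given above translates directly into a short calculation: age-specific fertility rates are summed and multiplied by the width of the age groups. The sketch assumes seven five-year age groups (15–19 to 45–49) and uses hypothetical rates, not Slovak data.

```python
def total_fertility_rate(asfr_per_1000, interval_years=5):
    """TFR = interval width * sum of age-specific rates, converted to births per woman."""
    return interval_years * sum(asfr_per_1000) / 1000.0


# Hypothetical age-specific fertility rates: births per 1,000 women per year
# for the age groups 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49.
asfr = [20.0, 60.0, 100.0, 80.0, 40.0, 10.0, 2.0]
tfr = total_fertility_rate(asfr)
print(round(tfr, 2))
```

Because the measure depends only on age-specific rates, two regions with the same TFR can still differ in timing, for example when one concentrates births in the younger age groups. This is why the paper treats timing and intensity as separate factors.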

The demographic transition and the related changes in people's reproductive behavior also affected the Slovak regions from the late 1990s (Mládek, 1999 ). The postponement of childbearing, the reduction of the marital birth rate, and the increase of one-child families and childless families due to the changed living, social, economic, and political conditions of the socialist model of reproductive behavior were the main reasons for the significant change in the fertility of the Slovak population with a decrease in the birth intensity and its stabilization far below the replacement rate (Šprocha & Tišliar, 2016 ). These findings are the driving force for the current research on fertility and natality under the conditions of Slovakia (Babinčák & Kačmárová, 2023 ; Bleha & Ďurček, 2019 ; Lentner & Horbulák, 2021 ; Šprocha & Fitalová, 2022 ; Šprocha & Tišliar, 2019 ; Šprocha et al., 2022 ). Moreover, the study's results highlight the main determinants of regional fertility in Slovakia, which are spatially significantly differentiated due to their nature and intensity.

The basic dimensions of Slovakia's regional fertility were developed through factor analysis, which provided a two-factor solution to the fertility assessment model. The two-factor model is supported by the independent research of Šprocha et al. (2022), but variations in input variables reverse the factors' nature. The first factor indicates the nature of fertility, which differentiates regions in terms of the timing of fertility, and according to Šprocha and Šidlo (2016), the timing of fertility is the main factor in the variability of fertility in Slovakia. The second factor captures fertility intensity, which is influenced by demographic, socioeconomic, and cultural determinants, as well as the timing of fertility (Šprocha & Bačík, 2021a, 2021b). To construct primary regional fertility types, a two-factor model of fertility was applied using selected socioeconomic determinants at two hierarchical levels. The cluster analysis generated almost homogeneous regional types of fertility nature and intensity. Regions A–F show low fertility levels, and G–J have higher fertility, above the national average. The number of clusters is purposefully higher than in Šprocha et al. (2022), which allows a more detailed regression analysis with the selected determinants.

Research on the inter-regional fertility level confirmed the expected relations for the selected indicators, especially regarding their nature and less so regarding their strength. Women's divorce rates show a negative dependence, which is consistent with the findings of some studies (see Meggiolaro & Ongaro, 2010; Van Bavel et al., 2012) demonstrating that divorce has a negative effect on fertility. A low female divorce rate in Slovakia is, therefore, associated with higher TFR values, as evidenced by regional types I and J. The study found a positive trend for the Catholic religion, in line with the findings and theoretical concepts of Philipov and Berghammer (2007) and Zhang (2008), albeit one less pronounced than for the women's divorce rate. Catholics are among the religious groups that support marriage and children and oppose contraception and abortion, practices that would decrease fertility. The less pronounced dependence confirms Šprocha and Tišliar's (2019) claim that religiosity is losing its power as a differential factor in reproductive behavior in Slovakia. However, it is still true that Slovak regions with a low concentration of Catholics achieve low TFR. The strong positive dependence of TFR on Roma ethnicity is related to the significantly different reproductive behavior of the Roma (Nestorová Dická, 2021), which manifests itself in higher fertility intensity, especially in socially excluded environments (Šprocha, 2014). Areas with a higher concentration of the Roma population also tend to have a higher TFR. As in this study, Roma ethnicity was a significant predictor of fertility in the study by Šprocha and Bleha (2018). Women's education also plays a significant role in the final level of fertility, particularly in influencing the TFR. Research indicates that women's low education is directly linked to higher TFR, as lower educational attainment often correlates with earlier and more frequent childbearing (cf. Šprocha et al., 2020). Our analysis confirms this, showing a statistically significant relationship between lower educational levels and higher TFR across Slovak regions.

At the same time, socioeconomic controls revealed the connection of the Roma community with low levels of education, women's employment, and income, which was also pointed out by Rosičová et al. (2009), Preoteasa (2013), Andrei et al. (2014), and others. Similarly, urbanization, which reflects social and economic progress, is more closely associated than rurality with higher education (Marginson, 2016) and divorce rates (Zhang et al., 2014), as well as with higher economic potential, documented by higher incomes or GDP (Henderson, 2003; Sancar & Sancar, 2017). Research indicates that economically developed regions are directly related to a lower TFR, as a higher level of education of women often correlates with later and less frequent childbearing (cf. Šprocha et al., 2020). Several studies have also confirmed the relationship between religion, including Catholicism, and the divorce rate (e.g., Corley & Woods, 2021; Sander, 2019) and between the availability of childcare facilities and unemployment for women and men (Kim, 2018; Legazpe & Davia, 2019).

Similarly, women's unemployment showed a positive dependence. This is explained by Becker's neoclassical theory, according to which unemployment reduces the opportunity cost of childbearing by providing time for childbearing and child care, thus promoting fertility (Cazzola et al., 2016). Urbanisation exhibited a negative trend in fertility outcomes, which can be explained by the compositional or contextual hypothesis (see Kulu, 2013; Kulu & Washbrook, 2014). Riederer and Buber-Ennser (2019) add that in Eastern European countries the main reason for the postponement of births in urban regions, where actual fertility rates are lower, may be precisely the difference in context along the urban-suburban-rural gradient. Although monthly income shows a negative influence, the association is weaker, which agrees with the earlier findings of Marenčáková (2006), who also confirmed the negative relationship between income and fertility in Slovakia.

The kindergarten enrolment rate has a significantly negative association with fertility: across regions, the total fertility rate decreases as the enrolment rate increases. Reports on the influence of childcare on fertility are mixed. Some researchers expect a positive effect of available childcare but add that the results of several empirical studies do not support this expectation (Wood & Neels, 2019; Rindfuss et al., 2010). Our research confirmed this, but the relationship may reflect the many children from socially disadvantaged environments who receive no pre-primary education. Indeed, Varsik (2019) notes that children from marginalized Roma communities have low enrolment rates, which produces the negative relationship in these communities. These communities are more concentrated in regional types with a higher level of fertility.

Only residential building demonstrated independence, but findings vary. Some confirm the causal relationship between housing affordability as the independent variable and fertility behavior as the dependent variable, often observed in less developed societies where the state controls the housing market (Felson & Solaún, 1975 ). Kostelecký and Vobecká ( 2009 ) found the existence of specific connections in Czechia when the improvement of the economic situation (increase in GDP and housing construction, decrease in inflation and unemployment) was accompanied by an improvement in the availability of owner-occupied housing and an increase in the birth rate. While the importance of housing availability has been scientifically proven, Salvati ( 2021 ) recorded that the positive impact of residential building was confirmed only in the initial phase of urban development. Considering the observed independence at the inter-regional level and the prevailing positive dependence at the intra-regional level, research in this direction requires further investigation into other housing factors that significantly impact fertility outcomes. For example, affordability (Clark et al., 2020 ) may significantly impact actual birth rates and TFR more than the development of new residential buildings.

However, the nature and strength of the relationship between the selected socioeconomic determinants and TFR are not identical within the individual types at the intraregional and interregional levels. The assessment of the effects on TFR shows that individual regional types have different associations with the determinants studied, which was also confirmed by a one-way ANOVA test showing statistically significant differences between regional types in the socioeconomic determinants. Thus, the research demonstrated significant differences in TFR depending on the socioeconomic background of women, which was also found in the study of Polykretis and Alexakis (2021).

The two MLR analyses determined that urbanisation, Catholicism, and women's low level of education do not influence fertility level or intensity. However, these variables significantly affect fertility timing and differentiate younger and older women's fertility, while Roma ethnicity is not statistically significant for timing, which contradicts other analyses. In Slovak regions, fertility timing is not associated only with Roma issues: in some "non-Roma" regions, the fertility of younger women is also significant, especially in the south of Slovakia (Levice, Poltár, Veľký Krtíš, and Krupina districts), where fertility timing is also associated with the increase in out-of-wedlock births (Šprocha & Tišliar, 2021), which was predicted for both the Hungarian and Roma populations. In addition, Roma ethnicity is a significant predictor of fertility intensity and level in Slovak regions. This is supported by the results of other studies, e.g. Šprocha et al. (2022), Nestorová Dická (2021), Szabó et al. (2021), Šprocha (2014), and Potančoková et al. (2008), which point to the higher reproductive characteristics of the Roma population in Slovak regions. Šprocha and Bleha (2018) add that at least two-thirds of the overall variability is explained by selected "socioeconomic" indicators and the share of the Roma population.

In the initial phase of the research, it was established that variations in fertility and birth rates are predominantly stratified along the urban–rural continuum. Notably, rural areas in Slovakia exhibit significantly higher fertility rates. However, as Nestorová Dická et al. ( 2019 ) highlighted, extreme rural municipalities characterized by low population density experience notably diminished population reproduction rates. Consequently, it becomes evident that fertility within Slovak rural areas primarily thrives within suburban municipalities and those with moderate population sizes. Nevertheless, two multiple linear regression (MLR) models indicate that the urban–rural gradient exerts negligible influence on the intensity and magnitude of fertility outcomes within the Slovak interregional framework, a finding corroborated by Šprocha et al.'s ( 2022 ) study in Slovakia. The distribution of the populace across urban and rural settings finds validation in fertility timing, as underscored by the research conducted by Riederer and Buber-Ennser ( 2019 ).

Unemployment has also become an important positive predictor of fertility timing, but it has a negative effect on the total fertility rate. Slovak economically developed regions (Korec, 2009 ) achieve higher fertility levels with high employment (Šprocha & Bleha, 2018 ) due to selective migration (Novotný et al., 2023 ), but also housing availability. On the other hand, Šprocha et al. ( 2022 ) supplement that young people in this area face problems related to high housing prices and low childcare substitutability rate (Križan et al., 2022 ; Madajová et al., 2021 ) due to insufficient capacity. Consistent with these conditions, an important finding of Mills et al. ( 2011 ) is that educated people are more likely to focus on building a career and, therefore, seek to combine parenthood at an older age, which affects their fertility outcomes, confirmed by the results of the first MLR.

The most important determinant affecting regional fertility differences is women's education, especially young women's education, according to research by Kostelecký and Vobecká ( 2009 ). The authors add that when women's education is controlled, housing affordability plays an important role in explaining regional differences in fertility—both the total fertility rate and the fertility timing. However, women’s low education was not significantly related to the level or intensity of fertility outcomes in Slovak regions. Götmark and Andersson ( 2020 ) support this finding in Eastern European countries, where there was no or a weak relation between education and fertility outcomes.

Slovakia's reproductive potential seems to have stabilised for a long time below the level needed for population replacement. Therefore, research aimed at understanding and predicting total and regional fertility and its further development is important for various scientific disciplines and interest groups.

Each determinant influencing fertility has specific spatial patterns with unequal regression coefficients at different regional levels, which cannot be summarised by a single constant. Multivariate statistical techniques can therefore describe the relationships between fertility and its determinants more adequately. Mathematical-statistical techniques can contribute to demography, geography, and the social sciences by revealing patterns that usually remain hidden, enabling practitioners to better understand the spatial dimension of these relationships. In this context, the study aimed to reveal the effect of a set of nine socioeconomic determinants on the level of regional fertility. To achieve this, spatial changes in the distribution of fertility and its determinants were captured, and spatial heterogeneity in their relationships was examined using regression techniques.

We analyzed fertility variations across Slovak regions, examining whether our regional findings converge with or diverge from widely accepted theories on the determinants influencing fertility levels. The results showed that none of the selected socioeconomic determinants was redundant; all showed certain regional connections to the nature and level of fertility. We have demonstrated that regional variations in fertility rates arise from economic, social, and cultural factors. The models differ regarding the nature and intensity of fertility in relation to the urban–rural dichotomy, education, and the Roma issue. The persistence of geographic variation will be important for understanding Slovak regional fertility levels.

Knowledge of prevailing geographic and population influences is essential to understanding Slovakia's regional fertility levels and intensity. Given that fertility is vital for social assessment and policy formulation, the study's findings could help local decision-makers and planners identify the socioeconomic conditions underlying fertility at the regional level and plan appropriate intervention strategies.

Abdennadher, M., Bouabid, A., & de Peretti, C. (2022). Socio-economic and regional disparities in fertility in Tunisia: A spatial econometrics approach. Available at SSRN . https://doi.org/10.2139/ssrn.4240447

Adebowale, A. S. (2019). Ethnic disparities in fertility and its determinants in Nigeria. Fertility Research and Practice, 5 (1), 1–16. https://doi.org/10.1186/s40738-019-0055-y

Adema, W., Ali, N., & Thévenon, O. (2014). Changes in Family Policies and Outcomes: Is There Convergence. OECD Social, Employment and Migration Working Papers , No. 157. Paris: OECD. https://doi.org/10.1787/1815199X

Adhikari, R. (2010). Demographic, socioeconomic, and cultural factors affecting fertility differentials in Nepal. BMC Pregnancy and Childbirth, 10 (1), 1–11. https://doi.org/10.1186/1471-2393-10-19

Adsera, A. (2011). Where are the babies? Labor market conditions and fertility in Europe. European Journal of Population, 27 (1), 1–32. https://doi.org/10.1007/s10680-010-9222-x

Adsera, A. (2017). Education and fertility in the context of rising inequality. Vienna Yearbook of Population Research, 15 , 63–92. https://doi.org/10.1553/populationyearbook2017s063

Ahn, N., & Mira, P. (2002). A note on the changing relationship between fertility and female employment rates in developed countries. Journal of Population Economics, 15 (4), 667–682. https://doi.org/10.1007/s001480100078

Altuzarra, A., Gálvez-Gálvez, C., & González-Flores, A. (2019). Economic development and female labour force participation: The case of European Union countries. Sustainability, 11 (7), 1962. https://doi.org/10.3390/su11071962

Andersen, S. H., & Özcan, B. (2021). The effects of unemployment on fertility. Advances in Life Course Research, 49 (1), 100401. https://doi.org/10.1016/j.alcr.2020.100401

Anderson, T., & Kohler, H. P. (2015). Low fertility, socioeconomic development, and gender equity. Population and Development Review, 41 (3), 381–407. https://doi.org/10.1111/j.1728-4457.2015.00065.x

Andrei, R., Martinidis, G., & Tkadlecova, T. (2014). Challenges faced by Roma women in Europe on education, employment, health and housing-focus on Czech Republic, Romania and Greece. Balkan Social Science Review, 4 , 323–351.

Atlas of Roma Communities, (2019). Office of the Slovak government Plenipotentiary for Roma communities, Bratislava. Available at: https://www.minv.sk/?atlas-romskych-komunit-2019

Axinn, W. G., & Barber, J. S. (2001). Mass education and fertility transition. American Sociological Review, 66 (4), 481–505. https://doi.org/10.2307/3088919

Babinčák, P., & Kačmárová, M. (2023). Attitudes towards multi-child families and their relationship with fertility preferences and other characteristics. Journal of Family Studies, 29 (3), 1362–1378. https://doi.org/10.1080/13229400.2022.2048963

Bagavos, C. (2019). On the multifaceted impact of migration on the fertility of receiving countries. Demographic research , 41, 1–36. https://www.jstor.org/stable/26850641

Bagavos, Ch., Tsimbos, C., & Verropoulou, G. (2008). Native and migrant fertility patterns in Greece: A cohort approach. European Journal of Population, 24 (3), 245–263. https://doi.org/10.1007/s10680-007-9142-6

Beaujouan, E., Brzozowska, Z., & Zeman, K. (2016). The limited effect of increasing educational attainment on childlessness trends in twentieth-century Europe, women born 1916–65. Population Studies, 70 (3), 275–291. https://doi.org/10.1080/00324728.2016.1206210

Becker, G. S. (1992). Fertility and the economy. Journal of Population Economics, 5 (3), 185–201. https://doi.org/10.1007/BF00172092

Becquet, V., Sacco, N., & Pardo, I. (2022). Disparities in gender preference and fertility: Southeast Asia and Latin America in a comparative perspective. Population Research and Policy Review, 41 (3), 1295–1323. https://doi.org/10.1007/s11113-021-09692-1

Bein, Ch., Gauthier, A. H., & Mynarska, M. (2021b). Religiosity and fertility intentions: can the gender regime explain cross-country differences? European Journal of Population, 37 (2), 443–472. https://doi.org/10.1007/s10680-020-09574-w

Bein, Ch., Mynarska, M., & Gauthier, A. H. (2021). Do costs and benefits of children matter for religious people? Perceived consequences of parenthood and fertility intentions in Poland. Journal of Biosocial Science, 53 (3), 419–435. https://doi.org/10.1017/S0021932020000280

Bernhardt, E. M. (1993). Fertility and employment. European Sociological Review, 9 (1), 25–42. https://doi.org/10.1093/oxfordjournals.esr.a036659

Bezák, A. (1996). Reflexie nad novým administratívnym členením Slovenskej republiky. Geografické Informácie, 4 , 7–9.

Bezák, A. (1997). Priestorová organizácia spoločnosti a územno-správne členenie štátu. Geografické Štúdie, 3 , 6–13.

Billari, F. C., & Dalla-Zuana, G. (2013). Cohort replacement and homeostasis in world population, 1950–2100. Population and Development Review, 39 (4), 563–585. https://doi.org/10.1111/j.1728-4457.2013.00628.x

Billari, F., & Kohler, H. P. (2004). Patterns of low and lowest-low fertility in Europe. Population Studies , 58 (2), 161–176. https://doi.org/10.1080/0032472042000213695

Bleha, B., Mészáros, J., Pilinská, V., Šprocha, B., & Vaňo, B. (2020). Analýza demografického vývoja oblastí a obcí podľa štatútu a veľkosti v Slovenskej republike. INFOSTAT Bratislava, Výskumné demografické centrum, Prírodovedecká fakulta UK, Centrum spoločenských a psychologických vied SAV. http://www.infostat.sk/vdc/pdf/Analyza_oblasti_obce_Slovensko.pdf

Bleha, B., & Ďurček, P. (2019). An interpretation of the changes in demographic behaviour at a sub-national level using spatial measures in post-socialist countries: A case study of the Czech Republic and Slovakia. Papers in Regional Science, 98 (1), 331–351. https://doi.org/10.1111/pirs.12318

Bleha, B., Šprocha, B., & Vaňo, B. (2014). Demografická prognóza okresov Slovenska do roku 2035 v kontexte odhaľovania geografickej nerovnomernosti a konvergencie. Acta Geographica Universitatis Comenianae, 58 (1), 11–44.

Del Boca, D., Aaberge, R., Colombino, U., Ermisch, J., Francesconi, M., Pasqua, S., & Strom, S. (2003). Labour market participation of women and fertility: the effect of social policies. In FRDB Child Conference . Alghero (Vol. 1206). https://www.frdb.org/wp-content/uploads/2003/06/copy_0_paper_delboca.pdf

Bongaarts, J. (2009). Human population growth and the demographic transition. Philosophical Transactions of the Royal Society B: Biological Sciences, 364 (1532), 2985–2990. https://doi.org/10.1098/rstb.2009.0137

Bono, E. D., Weber, A., & Winter-Ebmer, R. (2015). Fertility and economic instability: The role of unemployment and job displacement. Journal of Population Economics , 28(2), 463–478. http://www.jstor.org/stable/44289905

Bontje, M. (2020). Population and development. International encyclopedia of human geography (2nd ed.), 229–234. Elsevier. https://doi.org/10.1016/B978-0-08-102295-5.10338-5

Booth, H. (2010). Ethnic differentials in the timing of family formation: A case study of the complex interaction between ethnicity, socioeconomic level, and marriage market pressure. Demographic Research, 23 (7), 153–190. https://www.jstor.org/stable/26349592

Brodeur, T. Y., Grow, D., & Esfandiari, N. (2022). Access to fertility care in geographically underserved populations, a second look. Reproductive Sciences, 29 (7), 1983–1987. https://doi.org/10.1007/s43032-022-00991-2

Buber-Ennser, I., & Berghammer, C. (2021). Religiosity and the realisation of fertility intentions: A comparative study of eight European countries. Population, Space and Place, 27 (6), 433. https://doi.org/10.1002/psp.2433

Campisi, N., Kulu, H., Mikolai, J., Klüsener, S., & Myrskylä, M. (2020). Spatial variation in fertility across Europe: Patterns and determinants. Population, Space and Place, 26 (4), e2308. https://doi.org/10.1002/psp.2308

Cazzola, A., Pasquini, L., & Angeli, A. (2016). The relationship between unemployment and fertility in Italy: A time-series analysis. Demographic Research, 34 (1), 1–38. https://www.jstor.org/stable/26332027

Chao, F., Gerland, P., Cook, A. R., & Alkema, L. (2021). Global estimation and scenario-based projections of sex ratio at birth and missing female births using a Bayesian hierarchical time series mixture model. The Annals of Applied Statistics, 15 (3), 1499–1528.

Chao, F., Kc, S., & Ombao, H. (2022). Estimation and probabilistic projection of levels and trends in the sex ratio at birth in seven provinces of Nepal from 1980 to 2050: A Bayesian modeling approach. BMC Public Health, 22 (1), 358. https://doi.org/10.1186/s12889-022-12693-0

Cheng, H., Luo, W., Si, S., Xin, X., Peng, Z., Zhou, H., Liu, H., & Yu, Y. (2022). Global trends in total fertility rate and its relation to national wealth, life expectancy and female education. BMC Public Health, 22 (1), 1–13. https://doi.org/10.1186/s12889-022-13656-1

Chui, T. W. L., & Trovato, F. (1989). Ethnic Variations in Fertility: Microeconomic and Minority Group Status Effects. International Review of Modern Sociology, 19 (1), 37–52. http://www.jstor.org/stable/41420942

Ciminelli, G., Schwellnus, C., & Stadler, B. (2021). Sticky floors or glass ceilings? The role of human capital, working time flexibility and discrimination in the gender wage gap. OECD Economics Department Working Papers , No. 1668, OECD Publishing, Paris, https://doi.org/10.1787/02ef3235-en

Clark, J., & Ferrer, A. (2019). The effect of house prices on fertility: Evidence from Canada. Economics, 13 (1), 20190038. https://doi.org/10.5018/economics-ejournal.ja.2019-38

Clark, W. A., Yi, D., & Zhang, X. (2020). Do house prices affect fertility behavior in China? An empirical examination. International Regional Science Review, 43 (5), 423–449. https://doi.org/10.1177/0160017620922885

Corley, C. J., & Woods, A. Y. (2021). Socioeconomic, sociodemographic and attitudinal correlates of the tempo of divorce. In The Consequences of Divorce (pp. 47–68). Routledge.

D'Addio, A. Ch., & D'Ercole, M. M. (2005). Trends and Determinants of Fertility Rates: The Role of Policies. OECD Social, Employment and Migration Working Papers , No. 27, OECD Publishing, Paris, https://doi.org/10.1787/880242325663

Del Rey, E., Kyriacou, A., & Silva, J. I. (2021). Maternity leave and female labor force participation: Evidence from 159 countries. Journal of Population Economics, 34 , 803–824. https://doi.org/10.1007/s00148-020-00806-1

Dembski, S., Sykes, O., Couch, C., Desjardins, X., Evers, D., Osterhage, F., Siedentop, S., & Zimmermann, K. (2021). Reurbanisation and suburbia in Northwest Europe: A comparative perspective on spatial trends and policy approaches. Progress in Planning, 150 , 100462. https://doi.org/10.1016/j.progress.2019.100462

Dilmaghani, M. (2019). Religiosity, secularity and fertility in Canada. European Journal of Population, 35 (2), 403–428. https://doi.org/10.1007/s10680-018-9487-z

Drinka, R., & Majo, J. (2016). Veľké vidiecke obce na Slovensku – vybrané charakteristiky plodnosti na začiatku 21. storočia. Geografický Časopis, 68 (4), 301–317.

Dubuc, S., & Haskey, J. (2010). Ethnicity and Fertility in the United Kingdom. Ethnicity and Integration: Understanding Population Trends and Processes, 3 , 63–82. https://doi.org/10.1007/978-90-481-9103-1_4

Džupinová, E., Halás, M., Horňák, M., Hurbánek, P., Káčerová, M., Michniak, D., Ondoš, S., & Rochovská, A. (2008). Periférnosť a priestorová polarizácia na území Slovenska. Bratislava (Geo-grafika).

El-Ghannam, A. R. (2005). An examination of factors affecting fertility rate differentials as compared among women in less and more developed countries. Journal of Human Ecology, 18 (3), 181–192. https://doi.org/10.1080/09709274.2005.11905828

Fekiačová, E. (2019). Vplyv vzdelania na reprodukčné správanie v kontexte nízkej plodnosti na Slovensku po roku 1992. Diplomová práca, Univerzita Karlova, Praha. https://relik.vse.cz/2021/download/pdf/487-Fekiacova-Eva-paper.pdf

Felson, M., & Solaún, M. (1975). The fertility-inhibiting effect of crowded apartment living in a tight housing market. American Journal of Sociology, 80 (6), 1410–1427. https://doi.org/10.1086/225997

Fernandez-Crehuet, J. M., Gil-Alana, L. A., & Barco, C. M. (2020). Unemployment and fertility: A long run relationship. Social Indicators Research, 152 , 1177–1196. https://doi.org/10.1086/225997

Forste, R., & Tienda, M. (1996). What’s behind racial and ethnic fertility differentials? Population and Development Review, 22 , 109–133. https://doi.org/10.2307/2808008

Fox, J., Klüsener, S., & Myrskylä, M. (2019). Is a positive relationship between fertility and economic development emerging at the sub-national regional level? Theoretical considerations and evidence from Europe. European Journal of Population, 35 , 487–518. https://doi.org/10.1007/s10680-018-9485-1

Frejka, T., & Westoff, Ch. F. (2008). Religion, religiousness and fertility in the US and in Europe. European Journal of Population, 24 (1), 5–31. https://doi.org/10.1007/s10680-007-9121-y

Golian, J., & Liczbińska, G. (2022). The Influence of Extreme Exogenous Shocks on the Sex Ratio at Birth: A Study of the Population of Detva (Upper Hungary), 1801–1920. Romanian Journal of Population Studies. https://doi.org/10.24193/RJPS.2022.2.02

Götmark, F., & Andersson, M. (2020). Human fertility in relation to education, economy, religion, contraception, and family planning programs. BMC Public Health, 20 (1), 1–17. https://doi.org/10.1186/s12889-020-8331-7

Gray, E., & Evans, A. (2019). Changing education, changing fertility: a decomposition of completed fertility in Australia. Australian Population Studies , 3(2), 1–15. https://doi.org/10.37970/aps.v3i2.42

Haan, P., & Wrohlich, K. (2011). Can child care policy encourage employment and fertility? Evidence from a structural model. Labour Economics, 18 (4), 498–512. https://doi.org/10.1016/j.labeco.2010.12.008

Hechter, M., & Kanazawa, S. (1997). Sociological rational choice theory. Annual Review of Sociology, 23 (1), 191–214. https://doi.org/10.1146/annurev.soc.23.1.191

Henderson, V. (2003). The urbanization process and economic growth: The so-what question. Journal of Economic Growth, 8 , 47–71. https://doi.org/10.1023/A:1022860800744

Herrera-Almanza, C., & Rosales-Rueda, M. F. (2020). Reducing the cost of remoteness: Community-based health interventions and fertility choices. Journal of Health Economics, 73 , 102365. https://doi.org/10.1016/j.jhealeco.2020.102365

Herzer, D. (2019). A note on the effect of religiosity on fertility. Demography, 56 (3), 991–998. https://doi.org/10.1007/s13524-019-00774-6

Herzer, D., Strulik, H., & Vollmer, S. (2012). The long-run determinants of fertility: One century of demographic change 1900–1999. Journal of Economic Growth, 17 , 357–385. https://doi.org/10.1007/s10887-012-9085-6

Hesketh, T., & Xing, Z. W. (2006). Abnormal sex ratios in human populations: Causes and consequences. Proceedings of the National Academy of Sciences, 103 (36), 13271–13275. https://doi.org/10.1073/pnas.0602203103

Hiekel, N., & Castro-Martín, T. (2014). Grasping the diversity of cohabitation: Fertility intentions among cohabiters across Europe. Journal of Marriage and Family, 76 (3), 489–505. https://doi.org/10.1111/jomf.12112

Huttunen, K., & Kellokumpu, J. (2016). The effect of job displacement on couples’ fertility decisions. Journal of Labor Economics, 34 (2), 403–442. https://doi.org/10.1086/683645

Hwang, J., Park, S., & Shin, D. (2018). Two birds with one stone: Female labor supply, fertility, and market childcare. Journal of Economic Dynamics & Control, 90 , 171–193. https://doi.org/10.1016/j.jedc.2018.02.008

Iacovou, M., & Tavares, L. P. (2011). Yearning, learning, and conceding: Reasons men and women change their childbearing intentions. Population and Development Review, 37 (1), 89–123. https://doi.org/10.1111/j.1728-4457.2011.00391.x

Iwasaki, I., & Kumo, K. (2020). Determinants of regional fertility in Russia: A dynamic panel data analysis. Post-Communist Economies, 32 (2), 176–214. https://doi.org/10.1080/14631377.2019.1678333

Jalovaara, M., Neyer, G., Andersson, G., Dahlberg, J., Dommermuth, L., Fallesen, P., & Lappegård, T. (2019). Education, gender, and cohort fertility in the Nordic countries. European Journal of Population, 35 (3), 563–586. https://doi.org/10.1007/s10680-018-9492-2

Jasilioniene, A., Stankuniene, V., & Jasilionis, D. (2014). Census-linked study on ethnic fertility differentials in Lithuania. Studies of Transition States and Societies , 6(2), 57–67. https://doi.org/10.58036/stss.v6i2.182

Jones, B., Peri-Rotem, N., & Mountford-Zimdars, A. (2023). Geographic opportunities for assisted reproduction: A study of regional variations in access to fertility treatment in England. Human Fertility, 26 (3), 494–503. https://doi.org/10.1080/14647273.2023.2190040

Jung, M., Ko, W., Choi, Y., & Cho, Y. (2019). Spatial variations in fertility of South Korea: A geographically weighted regression approach. International Journal of Geo-Information, 8 (6), 262. https://doi.org/10.3390/ijgi8060262

Káčerová, M. (2009). Populačné starnutie obyvateľstva Slovenska. Populačný vývoj Slovenska na prelome tisícročí, kontinuita, či nová éra, 105–125.

Katuša, M. (2012). Differences in family and reproductive behavior of the inhabitants of Bratislava with high school and university education. Acta Geographica Universitatis Comenianae , 56(2), 139–160. http://actageographica.sk/stiahnutie/56_02_02_Katusa.pdf

Kim, J. (2018). Childcare facilities, availability of substitute workers and parental leave utilization. Korea and the World Economy, 19 (2), 137–168.

Klasen, S., Le, T. T. N., Pieters, J., & Santos Silva, M. (2021). What drives female labour force participation? Comparable micro-level evidence from eight developing and emerging economies. The Journal of Development Studies, 57 (3), 417–442. https://doi.org/10.1080/00220388.2020.1790533

Klein, D. (2020). Pokročilé štatistické metódy. Univerzita Pavla Jozefa Šafárika v Košiciach. https://unibook.upjs.sk/img/cms/2020/pf/pokrocile-statisticke-metody.pdf

Klesment, M., Puur, A., Rahnu, L., & Sakkeus, L. (2014). Varying association between education and second births in Europe: Comparative analysis based on the EU-SILC data. Demographic Research , 31(27), 813–860. https://www.jstor.org/stable/26350081

Koert, E., Takefman, J., & Boivin, J. (2021). Fertility quality of life tool: Update on research and practice considerations. Human Fertility, 24 (4), 236–248. https://doi.org/10.1080/14647273.2019.1648887

Kohler, H. P., Billari, F. C., & Ortega, J. A. (2002). The emergence of lowest-low fertility in Europe during the 1990s. Population and Development Review, 28 (4), 641–680. https://doi.org/10.1111/j.1728-4457.2002.00641.x

Korec, P. (2005). Regionálny rozvoj Slovenska v rokoch 1989–2004: Identifikácia menej rozvinutých regiónov Slovenska. Bratislava (Geo-grafika).

Korec, P. (2009). General and individual reasons of development of regional structure of the Slovak Republic. Russia and Slovakia: modern tendencies of demographic and socioeconomic processes , Ekaterinburg, Institute of Economics, 50–72.

Kostelecký, T., & Vobecká, J. (2009). Housing affordability in Czech regions and demographic behaviour–Does housing affordability impact fertility. Czech Sociological Review , 45(6), 1191–1213. https://doi.org/10.13060/00380288.2009.45.6.02

Kraus, B., Stašová, L., Junová, I., Kraus, B., Ondrejkovič, P., Krzysztof Świątkiewicz, W., Vilka, L., Rieke, U., Trapenciere, I., & Pankiv, L. (2020). Characteristics of Family Lives in Central Europe. Contemporary Family Lifestyles in Central and Western Europe: Selected Cases , 21–47. https://doi.org/10.1007/978-3-030-48299-2_2

Križan, F., Gurňák, D., & Švecová, A. (2022). Transformation of the Kindergarten Network in Slovakia: Notes on the Temporal and Spatial Changes. Geographical Information , 26(2), 17–28. https://doi.org/10.17846/GI.2022.26.2.17-28

Kulu, H., & Vikat, A. (2007). Fertility differences by housing type: The effect of housing conditions or of selective moves? Demographic Research, 17, 775–802. https://www.jstor.org/stable/26347971

Kulu, H. (2013). Why Do Fertility Levels Vary between Urban and Rural Areas? Regional Studies, 47 (6), 895–912. https://doi.org/10.1080/00343404.2011.581276

Kulu, H., & Boyle, P. J. (2009). High fertility in city suburbs: compositional or contextual effects? European Journal of Population, 25 (2), 157–174. https://doi.org/10.1007/s10680-008-9163-9

Kulu, H., & Washbrook, E. (2014). Residential context, migration and fertility in a modern urban society. Advances in Life Course Research, 21 , 168–182. https://doi.org/10.1016/j.alcr.2014.01.001

Legazpe, N., & Davia, M. A. (2019). Women’s employment and childcare choices in Spain through the great recession. Feminist Economics, 25 (2), 173–198. https://doi.org/10.1080/13545701.2019.1566754

Lentner, C., & Horbulák, Z. (2021). Some state financial segments of the childbirth and family support system in Slovakia. Public Finance Quarterly , 66(4), 482–500. https://doi.org/10.35551/PFQ_2021_4_2

Lesthaeghe, R. (2014). The second demographic transition: A concise overview of its development. Proceedings of the National Academy of Sciences, 111 (51), 18112–18115. https://doi.org/10.1073/pnas.1420441111

Lieming, F., Shatalova, E., & Kalabikhina, I. E. (2022). Determinants of regional fertility in China during the first years of reaching below-replacement fertility. BRICS Journal of Economics, 3 (3), 101–127. https://doi.org/10.3897/brics-econ.3.e83259

López-Gay, A., & Salvati, L. (2021). Polycentric development and local fertility in metropolitan regions: An empirical analysis for Barcelona, Spain. Population, Space and Place, 27 (2), e2402. https://doi.org/10.1002/psp.2402

Luci-Greulich, A., & Thévenon, O. (2014). Does economic advancement “cause” a re-increase in fertility? An empirical analysis for OECD countries (1960–2007). European Journal of Population, 30 , 187–221. https://doi.org/10.1007/s10680-013-9309-2

Madajová, M. S., Šveda, M., & Výbošťok, J. (2021). Will there be a place for all children? Capacity of pre-school facilities in the Bratislava self-governing region. Geographical Journal , 73(4), 301–322. https://doi.org/10.31577/geogrcas.2021.73.4.16

Magdalenić, I. (2016). The influence of marital status on fertility in Serbia and the European Union. Demografija, 13 , 175–190.

Majelantle, R. G. N. K., & Navaneetham, K. (2013). Migration and fertility: A review of theories and evidences. Journal of Global Economics, 1 (1), 1–3. https://doi.org/10.4172/2375-4389.1000101

Majo, J. (2014). Some Remarks about the Ethnicity Concept in Current Slovakian Human Geography. Acta Geographica Universitatis Comenianae , 58(2), 149–172. http://www.actageographica.sk/stiahnutie/58_2_03_Majo.pdf

Makszin, K., & Bohle, D. (2020). Housing as a fertility trap: The inability of states, markets, or families to provide adequate housing in East Central Europe. East European Politics and Societies, 34 (4), 937–961. https://doi.org/10.1177/0888325419897748

Marenčáková, J. (2001). Veľkosť obcí Slovenska ako diferenciačný faktor vybraných populačných javov. Súčasný populačný vývoj na Slovensku v európskom kontexte , Bratislava: Slovenská štatistická a demografická spoločnosť, 130–137. ISBN 80-88946-11-5.

Marenčáková, J. (2006). Reproductive and family behaviour of the population in Slovakia after 1989. Temporal and Spatial Aspects. Geographical Journal, 58 (3), 197–224.

Marginson, S. (2016). High participation systems of higher education. The Journal of Higher Education, 87 (2), 243–271. https://doi.org/10.1080/00221546.2016.11777401

Martin, T. F. (2019). Toward a Theory of Fertility and Ethnic Social Capital. Marriage & Family Review, 56 (1), 1–19. https://doi.org/10.1080/01494929.2019.1630046

Maulida, Y., Harlen, H., Sari, D. R., & Zacharias, T. (2023). Factors Predicting Fertility Rate in Indonesia. Jurnal Ekonomi Pembangunan: Kajian Masalah Ekonomi dan Pembangunan , 24(1), 1–11. https://doi.org/10.23917/jep.v24i1.20076

McDonald, P. (2013). Societal foundations for explaining low fertility: Gender equity. Demographic Research, 28, 981–994. https://www.jstor.org/stable/26349977

Meggiolaro, S., & Ongaro, F. (2010). The implications of marital instability for a woman's fertility: Empirical evidence from Italy. Demographic Research, 23, 963–996. https://www.jstor.org/stable/26349619

Mendelová, E. (2018). Manželstvo a kohabitácia z pohľadu troch generácií. Pedagogika, 9 (3), 122–134.

Mills, M. (2010). Gender roles, gender (in)equality and fertility: An empirical test of five gender equity indices. Canadian Studies in Population, 37 (3–4), 445–474. https://doi.org/10.25336/P6131Q

Mills, M., Rindfuss, R. R., McDonald, P., te Velde, E., & ESHRE Reproduction and Society Task Force. (2011). Why do people postpone parenthood? Reasons and social policy incentives. Human Reproduction Update, 17 (6), 848–860. https://doi.org/10.1093/humupd/dmr026

Mishra, V., & Smyth, R. (2010). Female labor force participation and total fertility rates in the OECD: New evidence from panel cointegration and Granger causality testing. Journal of Economics and Business, 62 (1), 48–64. https://doi.org/10.1016/j.jeconbus.2009.07.006

Mládek, J. (1998). Druhý demografický prechod a Slovensko. Folia Geographica, 2 (30), 42–52.

Mládek, J. (1999). Population development in Slovakia in the European context. Acta Geographica Universitatis Comenianae, 2 (1), 59–71.

Morelli, V. G., Rontos, K., & Salvati, L. (2014). Between suburbanisation and re-urbanisation: Revisiting the urban life cycle in a Mediterranean compact city. Urban Research & Practice, 7 (1), 74–88. https://doi.org/10.1080/17535069.2014.885744

Muhammad, A. (1996). Ethnic Fertility Differentials in Pakistan. The Pakistan Development Review , 35(4), 733–744. https://www.jstor.org/stable/41259995

Mulder, C. H., & Billari, F. C. (2006). Home-ownership regimes and lowest-low fertility. International workshop: Home ownership in Europe: policy and research issues, Delft, The Netherlands, November 23–24, 2006. Delft University of Technology, OTB Research Institute for the Built Environment.

Nedomová, R. (2015). Rodinný stav jako diferencujíci faktor demografického chování. Diplomová práca. Vysoká škola ekonomická v Prahe.

Nestorová Dická, J. (2013). Sociálno-demografické dimenzie postsocialistického mesta Košice. Univerzita Pavla Jozefa Šafárika v Košiciach, Košice, 176 s.

Nestorová Dická, J. (2021). Demographic changes in Slovak Roma communities in the new millennium. Sustainability, 13 (7), 3735. https://doi.org/10.3390/su13073735

Nestorová Dická, J., Gessert, A., & Sninčák, I. (2019). Rural and non-rural municipalities in the Slovak Republic. Journal of Maps, 15 (1), 84–93. https://doi.org/10.1080/17445647.2019.1615010

Neyer, G., Lappegård, T., & Vignoli, D. (2013). Gender equality and fertility: Which equality matters? European Journal of Population, 29 , 245–272. https://doi.org/10.1007/s10680-013-9292-7

Ní Bhrolcháin, M., & Beaujouan, É. (2012). Fertility postponement is largely due to rising educational enrolment. Population Studies, 66 (3), 311–327. https://doi.org/10.1080/00324728.2012.697569

Nisén, J. (2016). Education and fertility – a study on patterns and mechanisms among men and women in Finland. Academic dissertation. Helsinki: Department of Social Research, University of Helsinki.

Nisén, J., Klüsener, S., Dahlberg, J., Dommermuth, L., Jasilioniene, A., Kreyenfeld, M., & Myrskylä, M. (2021). Educational differences in cohort fertility across sub-national regions in Europe. European Journal of Population, 37 , 263–295. https://doi.org/10.1007/s10680-020-09562-0

Novotný, L., Pregi, L., & Novotná, J. (2023). East-west or up the urban hierarchy? Internal migration patterns in Slovakia since post-socialist transformation to COVID-19 pandemic. Eurasian Geography and Economics. https://doi.org/10.1080/15387216.2023.2220344

Orzack, S. H., Stubblefield, J. W., Akmaev, V. R., Colls, P., Munné, S., Scholl, T., Steinsaltz, D., & Zuckerman, J. E. (2015). The human sex ratio from conception to birth. Proceedings of the National Academy of Sciences, 112 (16), E2102–E2111. https://doi.org/10.1073/pnas.1416546112

Palomba, S., Daolio, J., Romeo, S., Battaglia, F. A., Marci, R., & La Sala, G. B. (2018). Lifestyle and fertility: The influence of stress and quality of life on female fertility. Reproductive Biology and Endocrinology, 16 (1), 113. https://doi.org/10.1186/s12958-018-0434-y

Pampel, F. (2011). Cohort changes in the socio-demographic determinants of gender egalitarianism. Social Forces, 89 (3), 961–982. https://doi.org/10.1353/sof.2011.0011

Parr, N. (2021). A new measure of fertility replacement level in the presence of positive net immigration. European Journal of Population, 37 (1), 243–262. https://doi.org/10.1007/s10680-020-09566-w

Perelli-Harris, B., Kreyenfeld, M., Sigle-Rushton, W., Keizer, R., Lappegård, T., Jasilioniene, A., & Koeppen, K. (2009). The increase in fertility in cohabitation across Europe: Examining the intersection between union status and childbearing (MPIDR Working Paper WP 2009–021). Rostock, Max Planck Institute for Demographic Research. https://doi.org/10.4054/MPIDR-WP-2009-021

Perelli-Harris, B. (2014). How similar are cohabiting and married parents? Second conception risks by union type in the United States and across Europe. European Journal of Population, 30 (4), 437–464. https://doi.org/10.1007/s10680-014-9320-2

Perelli-Harris, B., Kreyenfeld, M., Sigle-Rushton, W., Keizer, R., Lappegård, T., Jasilioniene, A., & Di Giulio, P. (2012). Changes in union status during the transition to parenthood in eleven European countries, 1970s to early 2000s. Population Studies, 66 (2), 167–182. https://doi.org/10.1080/00324728.2012.673004

Perry, S. L., & Schleifer, C. (2019). Are the faithful becoming less fruitful? The decline of conservative protestant fertility and the growing importance of religious practice and belief in childbearing in the US. Social Science Research, 78 , 137–155. https://doi.org/10.1016/j.ssresearch.2018.12.013

Philipov, D., & Berghammer, C. (2007). Religion and fertility ideals, intentions and behaviour: A comparative study of European countries. Vienna Yearbook of Population Research , 5, 271–305. https://www.jstor.org/stable/23025606

Phillips, J. F., Jackson, E. F., Bawah, A. A., Asuming, P. O., & Awoonor-Williams, J. K. (2019). The fertility impact of achieving universal health coverage in an impoverished rural region of Northern Ghana. Gates Open Research , 3(1537), 1537. https://doi.org/10.12688/gatesopenres.12993.1

Polykretis, C., & Alexakis, D. D. (2021). Spatial stratified heterogeneity of fertility and its association with socioeconomic determinants using Geographical Detector: The case study of Crete Island, Greece. Applied Geography, 127 , 102384. https://doi.org/10.1016/j.apgeog.2020.102384

Potančoková, M., Vaňo, B., Pilinská, V., & Jurčová, D. (2008). Slovakia: Fertility between tradition and modernity. Demographic Research, 19, 973–1018. https://www.jstor.org/stable/26349266

Potančoková, M. (2011). Zmena reprodukčného správania populácie Slovenska po roku 1989: trendy, príčiny a dôsledky. Desaťročia premien slovenskej spoločnosti , 142–159. ISBN 9788085544695

Pregi, L., & Novotný, L. (2019). Selective migration of population in functional urban regions of Slovakia. Journal of Maps, 15 (1), 94–102. https://doi.org/10.1080/17445647.2019.1661880

Preoteasa, A. M. (2013). Roma women and precarious work: Evidence from Romania, Bulgaria, Italy and Spain. Revista De Cercetare Şi Intervenţie Socială, 43 , 155–168.

Raley, K. R. (2001). Increasing fertility in cohabiting unions: Evidence for the second demographic transition in the United States? Demography, 38 (1), 59–66. https://doi.org/10.1353/dem.2001.0008

Raymo, J. M. (2015). Second demographic transition. International Encyclopedia of the Social & Behavioral Sciences: Second Edition (pp. 346–348). Elsevier Inc. https://doi.org/10.1016/B978-0-08-097086-8.31083-2

Rees, P., Bell, M., Kupiszewski, M., Kupiszewska, D., Ueffing, P., Bernard, A., & Stillwell, J. (2017). The impact of internal migration on population redistribution: An international comparison. Population, Space and Place, 23 (6), e2036. https://doi.org/10.1002/psp.2036

Řehák, J. (2023). Koeficient determinace. Sociologická encyklopedie . Praha: Sociologický ústav AV ČR. https://encyklopedie.soc.cas.cz/w/Koeficient_determinace

Riederer, B., & Beaujouan, É. (2024). Explaining the urban–rural gradient in later fertility in Europe. Population, Space and Place, 30 (1), e2720. https://doi.org/10.1002/psp.2720

Riederer, B., & Buber-Ennser, I. (2019). Regional context and realisation of fertility intentions: The role of the urban context. Regional Studies, 53 (12), 1669–1679. https://doi.org/10.1080/00343404.2019.1599843

Rindfuss, R. R., Guilkey, D. K., Morgan, S. P., & Kravdal, Ø. (2010). Child-care availability and fertility in Norway. Population and Development Review, 36 (4), 725–748. https://doi.org/10.1111/j.1728-4457.2010.00355.x

Rindfuss, R. R., & Parnell, A. M. (1989). The varying connection between marital status and childbearing in the United States. Population and Development Review, 15 (3), 447–470. https://doi.org/10.2307/1972442

Rodrigo-Comino, J., Egidi, G., Sateriano, A., Poponi, S., Mosconi, E. M., & Gimenez Morera, A. (2021). Suburban fertility and metropolitan cycles: Insights from European cities. Sustainability, 13 (4), 2181. https://doi.org/10.3390/su13042181

Rosičová, K., Gecková, A. M., van Dijk, J. P., Rosič, M., Žežula, I., & Groothoff, J. W. (2009). Socioeconomic indicators and ethnicity as determinants of regional mortality rates in Slovakia. International Journal of Public Health, 54 , 274–282. https://doi.org/10.1007/s00038-009-7108-7

Roupa, M., & Kusendová, D. (2013). Historická podmienenosť regionálnych demografických rozdielov na Slovensku. Historický Časopis, 2 , 343–375.

Rusterholz, C. (2015). Costs of children and models of parenthood: Comparative evidence from two Swiss cities, 1955–1970. Journal of Family History, 40 (2), 208–229. https://doi.org/10.1177/0363199015569710

Sabater, A., & Graham, E. (2019). International migration and fertility variation in Spain during the economic recession: A spatial Durbin approach. Applied Spatial Analysis and Policy, 12 , 515–546. https://doi.org/10.1007/s12061-018-9255-9

Saguin, K. (2021). No flat, no child in Singapore: Cointegration analysis of housing, income, and fertility (No. 1231). ADBI Working Paper Series. http://hdl.handle.net/10419/238588

Salvati, L. (2021). Births and the city: Urban cycles and increasing socio-spatial heterogeneity in a low-fertility context. Tijdschrift Voor Economische En Sociale Geografie, 112 (2), 195–215. https://doi.org/10.1111/tesg.12454

Sancar, C., & Sancar, C. (2017). The Econometrical analysis of the relationship between urbanisation and economic growth (the case of EU countries and Turkey). Uluslararası İktisadi ve İdari İncelemeler Dergisi , 19, 1–24. https://doi.org/10.18092/ulikidince.287053

Sander, W. (2019). The Catholic family: Marriage, children, and human capital. Routledge.

Shirahase, S. (2000). Women's increased higher education and the declining fertility rate in Japan. Review of population and social policy , 9, 47–63. https://www.ipss.go.jp/publication/e/R_s_p/No.9_P47.pdf

Šídlo, L., & Šprocha, B. (2018). Odkládání mateřství a regionální diferenciace plodnosti v Česku a na Slovensku. Geografie, 123 (3), 407–436.

Simo-Kengne, B.D., & Bonga-Bonga, L. (2020). House prices and fertility in South Africa: A spatial econometric analysis. MPRA Paper No. 100546. https://mpra.ub.uni-muenchen.de/100546/

Sobotka, T., Šťastná, A., Zeman, K., Hamplová, D., & Kantorová, V. (2008). Czech Republic: A rapid transformation of fertility and family behaviour after the collapse of state socialism. Demographic Research, 19, 403–454. https://www.jstor.org/stable/26349255

Sobotka, T. (2008). Overview Chapter 7: The rising importance of migrants for childbearing in Europe. Demographic Research, 19, 225–248. https://www.jstor.org/stable/26349250

Sobotka, T. (2011). Fertility in Central and Eastern Europe after 1989: Collapse and gradual recovery. Historical Social Research , 36(2), 246–296. https://www.jstor.org/stable/41151282

Sobotka, T., & Fürnkranz-Prskawetz, A. (2020). Demographic change in Central, Eastern and Southeastern Europe: Trends, determinants and challenges. 30 Years of transition in Europe (pp. 196–222). Edward Elgar Publishing.

Sobotka, T., Skirbekk, V., & Philipov, D. (2011). Economic recession and fertility in the developed world. Population and Development Review, 37 (2), 267–306. https://doi.org/10.1111/j.1728-4457.2011.00411.x

Soltes, V. (2016). Quantification of regional differences in the use of selected groups of medical technology in Slovakia in 2008–2014. In 3rd International Multidisciplinary Scientific Conference on Social Sciences and Arts SGEM 2016 (pp. 803–818).

Šprocha, B. (2014). Reprodukcia rómskeho obyvateľstva na Slovensku a prognóza jeho populačného vývoja. Bratislava: Infostat. http://www.infostat.sk/vdc/pdf/Romovia.pdf

Šprocha, B., & Šídlo, L. (2016). Regionálne rozdiely v charaktere plodnosti v Česku a na Slovensku. Reprodukce lidského kapitálu - vzájemné vazby a souvislosti, 559–571. https://relik.vse.cz/2016/download/pdf/57-Sprocha-Branislav-paper.pdf

Šprocha, B., & Tišliar, P. (2016). Transformácia plodnosti na Slovensku v 20. a na začiatku 21. storočia. Bratislava: Muzeológia a kultúrne dedičstvo. http://www.infostat.sk/vdc/pdf/transformacia.pdf

Šprocha, B., Bleha, B., Garajová, A., Pilinská, V., Mészáros, J., & Vaňo, B. (2019). Populačný vývoj v krajoch a okresoch Slovenska od začiatku 21. storočia. Bratislava: Infostat.

Šprocha, B., & Ďurček, P. (2019). Starnutie populácie Slovenska v čase a priestore. Bratislava: Infostat.

Šprocha, B., Tišliar, P., & Šídlo, L. (2020). Vzdelanie žien a plodnosť: k niektorým diferenčným aspektom transformácie plodnosti na Slovensku. Sociológia - Slovak Sociological Review , 52(5), 499–524. https://doi.org/10.31577/sociologia.2020.52.5.21

Šprocha, B., & Bačík, V. (2021). Vzdelanie žien a časovanie rodenia detí na Slovensku v priestorovej perspektíve. Geografický časopis, 73(1), 43–61. https://doi.org/10.31577/geogrcas.2021.73.1.03

Šprocha, B., & Tišliar, P. (2021). Niektoré aspekty nemanželskej plodnosti a detí narodených mimo manželstva na Slovensku. Sociológia - Slovak Sociological Review , 53(4), 339–376. https://doi.org/10.31577/sociologia.2021.53.4.13

Šprocha, B., & Bačík, V. (2021). Transformácia plodnosti na Slovensku v čase a priestore. Geographia Cassoviensis , 15(1), 37–55. https://doi.org/10.33542/GC2021-1-03

Šprocha, B., & Bleha, B. (2018). Does Socio-Spatial Segregation Matter? “Islands” of High Romany Fertility in Slovakia. Tijdschrift Voor Economische En Sociale Geografie, 109 (2), 239–255. https://doi.org/10.1111/tesg.12270

Šprocha, B., Bleha, B., & Nováková, G. (2022). Three decades of post-communist fertility transition in a subnational context: The case of Slovakia. Tijdschrift Voor Economische En Sociale Geografie, 113 (4), 397–411. https://doi.org/10.1111/tesg.12515

Šprocha, B., & Ďurček, P. (2017). Hodnotenie priestorových aspektov kohabitácií na Slovensku. Geographia Cassoviensis, 11 (1), 70–88.

Šprocha, B., & Fitalová, A. (2022). Late motherhood and spatial aspects of late fertility in Slovakia. Moravian Geographical Reports, 30 (2), 86–98. https://doi.org/10.2478/mgr-2022-0006

Šprocha, B., & Tišliar, P. (2019). Fertility and religious belief: Old and new relationships in Slovakia. Journal for the Study of Religions & Ideologies, 18 (52), 48–62.

Stefko, R., Gavurova, B., & Kocisova, K. (2018). Healthcare efficiency assessment using DEA analysis in the Slovak Republic. Health Economics Review, 8 , 1–12. https://doi.org/10.1186/s13561-018-0191-9

Stoenchev, N., & Hrischeva, Y. (2023). Study of the impact of housing affordability on the fertility rate in Bulgaria (2014–2021): A regional aspect. Baltic Journal of Real Estate Economics and Construction Management, 11 (1), 101–119. https://doi.org/10.2478/bjreecm-2023-0007

Stryjakiewicz, T. (2022). Shrinking cities in postsocialist countries of Central-Eastern and South-Eastern Europe: A general and comparative overview. Postsocialist Shrinking Cities . https://doi.org/10.4324/9780367815011

Szabó, L., Kiss, I., Šprocha, B., & Spéder, Z. (2021). Fertility of Roma Minorities in Central and Eastern Europe. Comparative Population Studies , 46. https://doi.org/10.12765/CPoS-2021-14

Szmytkie, R. (2021). Suburbanisation processes within and outside the city: The development of intra-urban suburbs in Wrocław. Poland. Moravian Geographical Reports, 29 (2), 149–165. https://doi.org/10.2478/mgr-2021-0012

Tafuro, S., & Guilmoto, C. Z. (2020). Skewed sex ratios at birth: A review of global trends. Early Human Development, 141 , 104868. https://doi.org/10.1016/j.earlhumdev.2019.104868

Testa, M. R. (2014). On the positive correlation between education and fertility intentions in Europe: Individual- and country-level evidence. Advances in Life Course Research, 21 , 28–42. https://doi.org/10.1016/j.alcr.2014.01.005

Thévenon, O., & Luci, A. (2012). Reconciling work, family and child outcomes: What implications for family support policies? Population Research and Policy Review, 31 , 855–882. https://doi.org/10.1007/s11113-012-9254-5

Trynov, A., Kostina, S., & Bannykh, G. (2020). Examination of Socio-economic Determinants of Fertility based on the Regional Panel Data Analysis. Economy of region , 16(3), 807–819, https://doi.org/10.17059/ekon.reg.2020-3-10

Tulchinsky, T. H., & Varavikova, E. A. (2014). Measuring, monitoring, and evaluating the health of a population. The New Public Health (Third Edition) , Cambridge, MA: Academic Press, 91–147. https://doi.org/10.1016/B978-0-12-415766-8.00003-3

Urale, P. W., O'Brien, M. A., & Fouché, C. B. (2019). The relationship between ethnicity and fertility in New Zealand. Kōtuitui: New Zealand Journal of Social Sciences Online , 14(1), 80–94. https://doi.org/10.1080/1177083X.2018.1534746

Van Bavel, J., Jansen, M., & Wijckmans, B. (2012). Has divorce become a pro-natal force in Europe at the turn of the 21st century? Population Research and Policy Review, 31 , 751–775. https://doi.org/10.1007/s11113-012-9237-6

Vaňková, I., & Vrabková, I. (2022). Productivity analysis of regional-level hospital care in the Czech republic and Slovak Republic. BMC Health Services Research, 22 (1), 180. https://doi.org/10.1186/s12913-022-07471-y

Varsik, S. (2019). Držím ti miesto. Analysis of capacities of kindergartens for 5 yrs old children. Institute of Education Policy. Bratislava: Ministry of Education of the Slovak Republic. Komentár 2/2019. https://www.minedu.sk/data/att/15248.pdf

Vobecká, J., & Piguet, V. (2012). Fertility, natural growth, and migration in the Czech Republic: An urban–suburban–rural gradient analysis of long-term trends and recent reversals. Population, Space and Place, 18 (3), 225–240. https://doi.org/10.1002/psp.698

Wachsmuth, L. (2022). Underpopulation, an impending economic crisis. Is home office correlated to realised fertility? A case study of Australia's demographic. Maastricht University https://flosse.dss.gov.au/flossejspui/bitstream/10620/18548/1/Homeofficecorrelatedtorealizedfertility_Wachsmuth2022.pdf

Wang, Q., & Sun, X. (2016). The role of socio-political and economic factors in fertility decline: A cross-country analysis. World Development, 87 , 360–370. https://doi.org/10.1016/j.worlddev.2016.07.004

Wood, J., Neels, K., & Kil, T. (2014). The educational gradient of childlessness and cohort parity progression in 14 low fertility countries. Demographic Research , 31(46), 1365–1416. https://www.jstor.org/stable/26350100

Wood, J., & Neels, K. (2019). Local childcare availability and dual-earner fertility: Variation in childcare coverage and birth hazards over place and time. European Journal of Population, 35 (5), 913–937. https://doi.org/10.1007/s10680-018-9510-4

Wu, Z., Viisainen, K., & Hemminki, E. (2006). Determinants of high sex ratio among newborns: A cohort study from rural Anhui province. China. Reproductive Health Matters, 14 (27), 172–180. https://doi.org/10.1016/S0968-8080(06)27222-7

Yüceşahin, M. M., & Özgür, E. M. (2008). Regional fertility differences in Turkey: Persistent high fertility in the southeast. Population, Space and Place, 14 (2), 135–158. https://doi.org/10.1002/psp.480

Zavisca, J. R., & Gerber, T. P. (2016). The socioeconomic, demographic, and political effects of housing in comparative perspective. Annual Review of Sociology, 42 , 347–367. https://doi.org/10.1146/annurev-soc-081715-074333

Zhang, L. (2008). Religious affiliation, religiosity, and male and female fertility. Demographic research , 18(8), 233–262. https://www.jstor.org/stable/26347984

Zhang, C., Wang, X., & Zhang, D. (2014). Urbanization, unemployment rate and China’rising divorce rate. Chinese Journal of Population Resources and Environment, 12 (2), 157–164. https://doi.org/10.1080/10042857.2014.910881

Zhang, Y., Hua, X., & Zhao, L. (2012). Exploring determinants of housing prices: A case study of Chinese experience in 1999–2010. Economic Modelling, 29 (6), 2349–2361. https://doi.org/10.1016/j.econmod.2012.06.025

Download references

Acknowledgements

The authors would like to thank the reviewers for their comments, suggestions, and notes, which significantly helped improve the paper's original version.

Open access funding provided by The Ministry of Education, Science, Research and Sport of the Slovak Republic in cooperation with Centre for Scientific and Technical Information of the Slovak Republic. This research was supported by the Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic and the Slovak Academy of Sciences: grant number VEGA 1/0514/21, “Spatial redistribution of human capital as an indicator of the formation of the regional system in Slovakia”, and grant number VEGA 1/0768/24, “Multiscale assessment of spatial variability of social-economic population stratification”.

Author information

Authors and affiliations.

Institute of Geography, Faculty of Science, Pavol Jozef Šafárik University in Košice, Košice, Slovakia

Janetta Nestorová Dická & Filip Lipták

Corresponding author

Correspondence to Janetta Nestorová Dická.

Ethics declarations

Conflict of interests.

The authors declare no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Nestorová Dická, J., Lipták, F. Regional fertility predictors based on socioeconomic determinants in Slovakia. J Pop Research 41, 20 (2024). https://doi.org/10.1007/s12546-024-09340-3

Accepted: 22 June 2024

Published: 02 July 2024

DOI: https://doi.org/10.1007/s12546-024-09340-3

  • Human fertility
  • Socioeconomic factors
  • Factor analysis
  • Cluster analysis
  • Regional type
  • Multilinear regression

COMMENTS

  1. Data Analysis in Research: Types & Methods

    Definition of research in data analysis: According to LeCompte and Schensul, research data analysis is a process used by researchers to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large chunk of data into smaller fragments, which makes sense. Three essential things occur during the data ...

  2. A practical guide to data analysis in general literature reviews

    This article is a practical guide to conducting data analysis in general literature reviews. The general literature review is a synthesis and analysis of published research on a relevant clinical issue, and is a common format for academic theses at the bachelor's and master's levels in nursing, physiotherapy, occupational therapy, public health and other related fields.

  3. Research Findings

    Research findings refer to the results obtained from a study or investigation conducted through a systematic and scientific approach. These findings are the outcomes of the data analysis, interpretation, and evaluation carried out during the research process. Types of Research Findings. There are two main types of research findings: Qualitative ...

  4. Learning to Do Qualitative Data Analysis: A Starting Point

    Accordingly, thematic analysis can result in a theory-driven or data-driven set of findings and engage a range of research questions (Braun & Clarke, 2006). Second, thematic analysis engages with analytic practices that are fairly common with other approaches to qualitative analysis.

  5. Introduction to Data Analysis

    Many of us associate data with spreadsheets of numbers and values; however, data can encompass much more than that. According to the federal government, data is "The recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (OMB Circular 110). This broad definition can include ...

  6. Data Analysis

    Data Analysis. Definition: Data analysis refers to the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, drawing conclusions, and supporting decision-making. It involves applying various statistical and computational techniques to interpret and derive insights from large datasets.

  7. Research Guide: Data analysis and reporting findings

    Data analysis and findings. Data analysis is the most crucial part of any research. Data analysis summarizes collected data. It involves the interpretation of data gathered through the use of analytical and logical reasoning to determine patterns, relationships or trends.

  8. What Is Data Analysis? (With Examples)

    Written by Coursera Staff • Updated on Apr 19, 2024. Data analysis is the practice of working with data to glean useful information, which can then be used to make informed decisions. "It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts," Sherlock ...

  9. A Practical Guide to Writing Quantitative and Qualitative Research

    INTRODUCTION. Scientific research is usually initiated by posing evidence-based research questions which are then explicitly restated as hypotheses [1,2]. The hypotheses provide directions to guide the study, solutions, explanations, and expected results [3,4]. Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the ...

  10. What is data analysis? Methods, techniques, types & how-to

    A method of data analysis that is the umbrella term for engineering metrics and insights for additional value, direction, and context. By using exploratory statistical evaluation, data mining aims to identify dependencies, relations, patterns, and trends to generate advanced knowledge.

  11. Basic statistical tools in research and data analysis

    Abstract. Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. The statistical analysis gives meaning to the meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise ...

  12. PDF Analyzing and Interpreting Findings

    Data analysis in qualitative research remains somewhat mysterious (Marshall & Rossman, 2006; Merriam, 1998). The problem lies in the fact that there are few agreed-on canons for qualitative analysis in the sense of shared ground rules. There are no formulas for determining the significance of findings or for inter-

  13. What is Data Analysis? (Types, Methods, and Tools)

    December 17, 2023. Data analysis is the process of cleaning, transforming, and interpreting data to uncover insights, patterns, and trends. It plays a crucial role in decision making, problem solving, and driving innovation across various domains. In addition to further exploring the role data analysis plays, this blog post will discuss common ...

  14. Data Analysis Techniques In Research

    Data analysis techniques in research are essential because they allow researchers to derive meaningful insights from data sets to support their hypotheses or research objectives. Data Analysis Techniques in Research: While various groups, institutions, and professionals may have diverse approaches to data analysis, a universal definition captures its essence.

  15. (PDF) Qualitative Data Analysis and Interpretation: Systematic Search

    Qualitative data analysis is concerned with transforming raw data by searching, evaluating, recognising, coding, mapping, exploring and describing patterns, trends, themes and categories in ...

  16. How to Write the Results/Findings Section in Research

    The findings include: data presented in tables, charts, graphs, and other figures (may be placed into the text or on separate pages at the end of the manuscript); a contextual analysis of this data explaining its meaning in sentence form; and all data that corresponds to the central research question(s).

  17. Analysing and Interpreting Data in Your Dissertation: Making Sense of

    In a dissertation, data analysis is crucial as it directly influences the validity and reliability of your findings. The scope of data analysis includes data collection, data cleaning, statistical analysis, and interpretation of results. ... In qualitative research, data coding is a critical step that involves categorizing and labelling data to ...

  18. Presenting the Results of Qualitative Analysis

    This chapter provides an introduction to writing about qualitative research findings. It will outline how writing continues to contribute to the analysis process, what concerns researchers should keep in mind as they draft their presentations of findings, and how best to organize qualitative research writing ... data collection, and analysis ...

  19. PDF Chapter 4 DATA ANALYSIS AND RESEARCH FINDINGS

    4.1 INTRODUCTION. This chapter describes the analysis of data followed by a discussion of the research findings. The findings relate to the research questions that guided the study. Data were analyzed to identify, describe and explore the relationship between death anxiety and death attitudes of nurses in a private acute care hospital and to ...

  20. What is the main difference between findings and analysis?

    More specifically, findings build logically from the problem, research questions, and design, whereas analysis relates to searching for patterns and themes that emerge from the findings ...

  21. (Pdf) Chapter Four Data Analysis and Presentation of Research Findings

    CHAPTER FOUR. DATA ANALYSIS AND PRESENTATION OF RESEARCH FINDINGS. 4.1 Introduction. The chapter contains presentation, analysis and discussion of the data collected by the researcher during the ...

  22. The Difference Between Analysis & Findings in a Research Paper

    Research paper formats vary across disciplines but share certain features. Some features include: introduction, literature review, methodology, data analysis, results or findings, discussion and conclusion. Introduction and literature review are often combined, as are discussion and conclusion.

  23. Evaluating AI Literacy in Academic Libraries: A Survey Study with a

    While research is still emerging, initial findings highlight the need for rigorous, tailored AI literacy initiatives encompassing technical skills, critical perspectives, and ethical considerations. ... In summary, a deeper analysis of the data reveals a landscape where academic librarians possess moderate to low confidence in understanding ...

  24. Analysis of anthropometric outcomes in Indian children during ...

    Sharma, S. et al. Impact of COVID-19 on utilization of maternal and child health services in India: health management information system data analysis. Clin. Epidemiol. Glob. Health 21, 101285 (2023).

  25. Suicidal behaviours and associated factors among medical students in

    Methods and analysis The research team will search the PubMed (Medline), Scopus, PsycINFO and Google Scholar databases for papers published between January 2000 and May 2024 using truncated and phrase-searched keywords and relevant subject headings. Cross-sectional studies, case series, case reports and cohort studies published in English will be included in the review.

  26. First qualitative research study conducted in Turkmenistan focuses on

    Data collection and analysis were conducted using the COM-B framework, which looks at 3 key components: capability, opportunity and motivation for behaviour change. Study outcomes: Study findings revealed that attitudes toward HPV were generally positive, partially due to positive attitudes toward vaccination in general but also due to preparatory ...

  27. Getting stuck in a collective stigma: sex offense registrants

    Longitudinal data from 2008 to 2024 was used to examine registrants' group identities. Interviews were conducted with 115 registrants and 40 of their family members, and narrative research analysis was used to assess how participants' levels of liminality influence why some on the registry never come to see themselves as sex criminals.

  28. The Ideal Age Gap for a Long-Lasting Marriage, According to Research

    According to research by Dr. John Gottman, a renowned psychologist specializing in marital stability and relationship analysis, these perpetual problems can create ongoing tension and dissatisfaction.

  29. Regional fertility predictors based on socioeconomic determinants in

    The study's primary purpose was to recognise the effects of determinants on the level of fertility and thereby explain the differences in trends in the regions of Slovakia. At the turn of the century, the differences in fertility in regions increased, but the total fertility rate decreased. Multivariate statistical methods clarified the regional effects of the level and nature of fertility ...