Published on January 28, 2020 by Rebecca Bevans . Revised on June 22, 2023.
Statistical tests are used in hypothesis testing . They can be used to:
Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.
If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.
Statistical tests flowchart
Statistical tests work by calculating a test statistic – a number that describes how much the relationship between variables in your test differs from the null hypothesis of no relationship.
It then calculates a p-value (probability value). The p-value estimates how likely it is that you would see the difference described by the test statistic if the null hypothesis of no relationship were true.
If the value of the test statistic is more extreme than the statistic calculated from the null hypothesis, then you can infer a statistically significant relationship between the predictor and outcome variables.
If the value of the test statistic is less extreme than the one calculated from the null hypothesis, then you can infer no statistically significant relationship between the predictor and outcome variables.
You can perform statistical tests on data that have been collected in a statistically valid manner – either through an experiment , or through observations made using probability sampling methods .
For a statistical test to be valid , your sample size needs to be large enough to approximate the true distribution of the population being studied.
To determine which statistical test to use, you need to know:
Statistical tests make some common assumptions about the data they are testing:
If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform a nonparametric statistical test , which allows you to make comparisons without any assumptions about the data distribution.
If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts for structure in your data (repeated-measures tests or tests that include blocking variables).
The types of variables you have usually determine what type of statistical test you can use.
Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables include:
Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical variables include:
Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing an experiment , these are the independent and dependent variables ). Consult the tables below to see which test best matches your variables.
Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.
The most common types of parametric test include regression tests, comparison tests, and correlation tests.
Regression tests look for cause-and-effect relationships . They can be used to estimate the effect of one or more continuous variables on another variable.
| Test | Predictor variable | Outcome variable | Research question example |
|---|---|---|---|
| Simple linear regression | Continuous | Continuous | What is the effect of income on longevity? |
| Multiple linear regression | Continuous (2 or more predictors) | Continuous | What is the effect of income and minutes of exercise per day on longevity? |
| Logistic regression | Continuous | Binary | What is the effect of drug dosage on the survival of a test subject? |
Comparison tests look for differences among group means . They can be used to test the effect of a categorical variable on the mean value of some other characteristic.
T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g., the average heights of children, teenagers, and adults).
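As an illustration, the same two-versus-many-groups choice can be sketched with `scipy.stats`; the height samples below are made-up numbers, not data from any real study:

```python
from scipy import stats

# Hypothetical height samples in cm (illustrative numbers only)
men = [175, 180, 169, 178, 183, 171, 177, 174]
women = [162, 168, 159, 165, 170, 163, 166, 161]
children = [120, 125, 118, 130, 122, 127]

# Two groups: independent t-test
t_stat, p_two = stats.ttest_ind(men, women)

# Three or more groups: one-way ANOVA
f_stat, p_anova = stats.f_oneway(men, women, children)

print(t_stat, p_two)
print(f_stat, p_anova)
```

With clearly separated groups like these, both p-values come out far below 0.05, so both tests would reject the null hypothesis of equal means.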
| Test | Predictor variable | Outcome variable | Research question example |
|---|---|---|---|
| Paired t-test | Categorical (1 predictor) | Quantitative (groups come from the same population) | What is the effect of two different test prep programs on the average exam scores for students from the same class? |
| Independent t-test | Categorical (1 predictor) | Quantitative (groups come from different populations) | What is the difference in average exam scores for students from two different schools? |
| ANOVA | Categorical (1 or more predictors) | Quantitative (1 outcome) | What is the difference in average pain levels among post-surgical patients given three different painkillers? |
| MANOVA | Categorical (1 or more predictors) | Quantitative (2 or more outcomes) | What is the effect of flower species on petal length, petal width, and stem length? |
Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.
These can be used to test whether two variables you want to use in (for example) a multiple regression test are autocorrelated.
| Test | Variables | Research question example |
|---|---|---|
| Pearson's r | Two continuous variables | How are latitude and temperature related? |
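A correlation test such as Pearson's r takes only a few lines in `scipy.stats`; the latitude and temperature values below are invented for illustration:

```python
from scipy import stats

# Hypothetical paired observations: latitude (degrees) and mean annual temperature (°C)
latitude = [5, 15, 25, 35, 45, 55, 65]
temperature = [27, 25, 21, 16, 10, 5, -2]

r, p_value = stats.pearsonr(latitude, temperature)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```

Because the invented temperatures fall steadily as latitude rises, r comes out strongly negative, indicating a strong inverse relationship.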
Non-parametric tests don’t make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric tests.
| Test | Predictor variable | Outcome variable | Use in place of… |
|---|---|---|---|
| Spearman's r | Quantitative | Quantitative | Pearson's r |
| Chi square test of independence | Categorical | Categorical | Pearson's r |
| Sign test | Categorical | Quantitative | One-sample t-test |
| Kruskal–Wallis H | Categorical (3 or more groups) | Quantitative | ANOVA |
| ANOSIM | Categorical (3 or more groups) | Quantitative (2 or more outcomes) | MANOVA |
| Wilcoxon Rank-Sum test | Categorical (2 groups) | Quantitative (groups come from different populations) | Independent t-test |
| Wilcoxon Signed-rank test | Categorical (2 groups) | Quantitative (groups come from the same population) | Paired t-test |
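Several of these nonparametric alternatives are available in `scipy.stats`; the group values below are hypothetical, chosen only to show the calls:

```python
from scipy import stats

# Hypothetical groups (illustrative numbers only)
group_a = [12, 15, 11, 19, 14, 13, 16]
group_b = [22, 25, 21, 18, 24, 27, 20]
group_c = [31, 29, 35, 28, 33, 30, 32]

# Wilcoxon rank-sum (Mann-Whitney U) in place of an independent t-test
u_stat, p_u = stats.mannwhitneyu(group_a, group_b)

# Kruskal-Wallis H in place of a one-way ANOVA
h_stat, p_h = stats.kruskal(group_a, group_b, group_c)

# Spearman's rank correlation in place of Pearson's r
rho, p_rho = stats.spearmanr(group_a, group_b)

print(p_u, p_h, rho)
```

These tests operate on ranks rather than raw values, which is why they tolerate non-normal data at the cost of some statistical power.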
This flowchart helps you choose among parametric tests. For nonparametric alternatives, check the table above.
Statistical tests commonly assume that:
If your data does not meet these assumptions you might still be able to use a nonparametric statistical test, which has fewer requirements but also makes weaker inferences.
A test statistic is a number calculated by a statistical test . It describes how far your observed data is from the null hypothesis of no relationship between variables or no difference among sample groups.
The test statistic tells you how different two or more groups are from the overall population mean , or how different a linear slope is from the slope predicted by a null hypothesis . Different test statistics are used in different statistical tests.
Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test. Significance is usually denoted by a p-value, or probability value.
Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that the data is likely to occur less than 5% of the time under the null hypothesis .
When the p-value falls below the chosen alpha value, then we say the result of the test is statistically significant.
Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).
Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).
You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .
Discrete and continuous variables are two types of quantitative variables :
03.17.2022 • 10 min read
This article explains what a test statistic is, how to complete one with formulas, and how to find the value for t-tests.
A test statistic is a standardized score used in hypothesis testing. It tells you how likely the results obtained from your sample data are under the assumption that the null hypothesis is true. The more unlikely your results are under this assumption, the easier it becomes to reject the null hypothesis in favor of an alternative hypothesis. The more likely your results are, the harder it becomes to reject the null hypothesis.
There are different kinds of test statistics, but they all work the same way. A test statistic maps the value of a particular sample statistic (such as a sample mean or a sample proportion) to a value on a standardized distribution, such as the Standard Normal Distribution or the t-distribution. This allows you to determine how likely or unlikely it is to observe the particular value of the statistic you obtained.
As a quick example, say you have a null hypothesis that the average wait time to get seated at your favorite restaurant—at a table for two without a reservation on a Friday night—is 45 minutes. You select a random sample of 100 parties that got seated under these conditions and ask them what their wait times were. You find that the average wait time for your sample is 55 minutes (x̄ = 55 minutes). A test statistic will convert this sample statistic x̄ into a standardized number that helps you answer this question:
“Assuming that my null hypothesis is true—assuming that the average wait time at the restaurant actually is 45 minutes—what is the likelihood that I found an average wait time of 55 minutes for my randomly drawn sample?”
Remember, the lower the likelihood of observing your sample statistic, the more confident you can be rejecting the null hypothesis.
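The arithmetic behind this example can be sketched in a few lines. The article does not give the sample's standard deviation, so the value s = 20 minutes below is a made-up assumption, used purely to show how the standardization works:

```python
import math

# Null hypothesis: the average wait time is 45 minutes
mu_0 = 45
x_bar = 55   # observed sample mean, from the article's example
n = 100      # sample size
s = 20       # hypothetical sample standard deviation (NOT given in the article)

standard_error = s / math.sqrt(n)             # 20 / 10 = 2.0
test_statistic = (x_bar - mu_0) / standard_error
print(test_statistic)  # 5.0: the sample mean sits five standard errors above the null value
```

A standardized value of 5 would be extremely unlikely under the null hypothesis, which is exactly the kind of evidence that lets you reject it.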
The type of test statistic you use in a hypothesis test depends on several factors including:
The type of statistic you are using in the test
The size of your sample
Assumptions you can make about the distribution of your data
Assumptions you can make about the distribution of the statistic used in the test
The formula for calculating test statistics takes the following general form:

test statistic = (statistic − parameter) / (standard deviation of the statistic)
Remember, a statistic is a measure calculated from a single sample or many samples. Examples include the sample mean x̄, the difference between two sample means x̄₁ − x̄₂, or a sample proportion p̂.
A parameter is a measure calculated from a single population or many populations. Examples include the population mean μ, the difference between two population means μ₁ − μ₂, or a population proportion p.
In the denominator of the equation, you have the standard deviation (or the approximated standard deviation) of the statistic used in the numerator. If you use the sample mean x̄ in the numerator, you should use the standard deviation of x̄ or an approximation of it in the denominator.
The test statistics you are most likely to encounter in an introductory statistics class are:
The Z-test statistic for a single sample mean
The Z-test statistic for population proportions
The t-test statistic for a single sample mean
The t-test statistic for two sample means
We use the Z-test statistic (or Z-statistic) for a sample mean in hypothesis tests involving a sample mean x̄ calculated for a single sample.
You use this test statistic when:
Your sample size is greater than or equal to 30 (n ≥ 30)
The sampling distribution of the sample mean is assumed to be normal
The population standard deviation σ is known.
The formula for this type of Z-test statistic is:

Z = (x̄ − μ₀) / (σ / √n)

where:
Z is the symbol for the Z-test statistic
x̄ is the sample mean
μ₀ is the hypothesized value of the population mean according to the null hypothesis
σ is the population standard deviation
n is the sample size
σ/√n is the standard error of x̄. The standard error is just the standard deviation of the sampling distribution of the sample mean.
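Putting the pieces above together, here is a minimal sketch of the Z-statistic calculation; the numbers are hypothetical, chosen so the arithmetic is easy to follow:

```python
import math
from scipy import stats

def z_statistic(x_bar, mu_0, sigma, n):
    """Z-test statistic for a single sample mean: (x̄ − μ₀) / (σ/√n)."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu_0) / standard_error

# Hypothetical numbers: x̄ = 52, μ₀ = 50, known σ = 8, n = 64
z = z_statistic(52, 50, 8, 64)      # (52 − 50) / (8/8) = 2.0
p_one_sided = stats.norm.sf(z)      # right-tail p-value, roughly 0.023
print(z, p_one_sided)
```

Because the Z-statistic maps onto the standard normal distribution, `stats.norm.sf` gives the tail probability directly.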
You may notice that a Z-test statistic is just a z-score for a particular value of a normally distributed statistic. There are many variations of the Z-test statistic, which we can use in hypothesis tests where the sample statistic being used in the test is approximately normally distributed. One such variation is the Z-test for proportions.
We use the Z-test statistic for proportions in hypothesis tests where a sample proportion p̂ is being tested against the hypothesized value of the population proportion, p₀. We use the Z-test for proportions when your sample size is greater than or equal to 30 (n ≥ 30) and the distribution of the sample statistic is assumed to be normal. The formula for the Z-test statistic for population proportions is:

Z = (p̂ − p₀) / √(p₀(1 − p₀) / n)

where:
Z is the symbol for the Z-test statistic for population proportions
p̂ is the sample proportion
p₀ is the hypothesized value of the population proportion according to the null hypothesis
n is the sample size
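A quick sketch of this calculation, using the standard error √(p₀(1 − p₀)/n) from the standard one-proportion Z-test and hypothetical numbers:

```python
import math
from scipy import stats

def z_statistic_proportion(p_hat, p_0, n):
    """Z-test statistic for a proportion: (p̂ − p₀) / √(p₀(1 − p₀)/n)."""
    standard_error = math.sqrt(p_0 * (1 - p_0) / n)
    return (p_hat - p_0) / standard_error

# Hypothetical example: 60 successes in n = 100 trials, null proportion p₀ = 0.5
z = z_statistic_proportion(0.60, 0.50, 100)   # 0.10 / 0.05 = 2.0
p_two_sided = 2 * stats.norm.sf(abs(z))       # roughly 0.046
print(z, p_two_sided)
```

Note that the standard error here is built from the null value p₀ rather than the sample proportion, because the test assumes the null hypothesis is true.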
When your sample size is smaller than 30 (n<30)—or when you cannot assume that the distribution of your sample statistic is normally distributed—you’ll often use a t-test statistic rather than a Z-test.
We use the t-test statistic (or t-statistic) for a sample mean in hypothesis tests involving a sample mean calculated for a single sample drawn from a population. Unlike the Z-test for a single sample mean, you use the t-test when:
Your sample size is less than 30 (n<30)
The distribution of the sample statistic is not approximated by a normal distribution
The population standard deviation σ is unknown
A t-test statistic maps your statistic to a t-distribution, as opposed to the normal distribution used with a Z-test. A t-distribution is like a standard normal distribution, but it has thicker tails and changes depending on your sample size n. When n is large, the t-distribution is closer to the normal distribution; and as the sample size gets larger and larger, a t-distribution will converge to the normal distribution. As n gets smaller, the t-distribution gets flatter with thicker tails.
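You can see this convergence numerically by comparing a t-distribution percentile across increasing degrees of freedom with the corresponding standard normal percentile (about 1.96 at the 97.5th percentile):

```python
from scipy import stats

# 97.5th percentile of the t-distribution for increasing degrees of freedom,
# compared with the same percentile of the standard normal distribution
for df in [2, 5, 10, 30, 100, 1000]:
    print(df, round(stats.t.ppf(0.975, df), 3))
print("normal:", round(stats.norm.ppf(0.975), 3))
```

The small-df percentiles are noticeably larger (thicker tails push the cutoff outward), and by df = 1000 the t value is essentially indistinguishable from the normal one.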
The formula for the t-test statistic for a sample mean is:

t = (x̄ − μ₀) / (s / √n)

where:
t is the symbol for the t-test statistic
μ₀ is the value of the population mean according to the null hypothesis
s is the sample standard deviation
s/√n is an approximation of the standard error of x̄. In a t-test, because you do not know the value of the population standard deviation, you need to approximate the standard error of x̄ using the sample standard deviation s.
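Here is a sketch of that formula computed by hand and checked against `scipy.stats.ttest_1samp`; the sample values are invented for illustration:

```python
import math
from scipy import stats

# Hypothetical small sample (n = 8), null hypothesis μ₀ = 50
sample = [52.1, 48.3, 55.0, 49.7, 51.2, 53.8, 47.9, 54.4]
mu_0 = 50

n = len(sample)
x_bar = sum(sample) / n
# Sample standard deviation (divide by n − 1)
s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
t_manual = (x_bar - mu_0) / (s / math.sqrt(n))

# scipy computes the same statistic, plus a two-sided p-value
t_scipy, p_value = stats.ttest_1samp(sample, mu_0)
print(t_manual, t_scipy, p_value)
```

The hand computation and the library call agree, which is a useful sanity check when learning the formula.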
We can also use t-test statistics in hypothesis tests where the values of two sample means (x̄₁ and x̄₂) are being compared. You do this to test the null hypothesis that the two samples are drawn from the same underlying population. If the null hypothesis is true, then any difference between the sample means is due to random variations in the data. Rejecting the null hypothesis suggests that the samples were drawn from two distinct populations and that the difference in the sample means reflects actual differences in the characteristics of subjects in one population compared to the other.
Like the t-test for a single sample mean, you use the t-test for two sample means when:
Your sample sizes are less than 30 (n<30)
The distributions of the sample statistics are not approximated by a normal distribution
The formula for the t-test statistic for two sample means is:

t = ((x̄₁ − x̄₂) − (μ₁ − μ₂)) / √(s₁²/n₁ + s₂²/n₂)

where:
x̄₁ is the sample mean of sample 1
x̄₂ is the sample mean of sample 2
μ₁ is the mean of the population from which sample 1 was drawn
μ₂ is the mean of the population from which sample 2 was drawn
s₁² is the variance of sample 1
s₂² is the variance of sample 2
n₁ is the sample size for sample 1
n₂ is the sample size for sample 2
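Because the formula above uses the unpooled variances s₁²/n₁ + s₂²/n₂, it matches Welch's t-test, which scipy provides via `ttest_ind(..., equal_var=False)`. A sketch with hypothetical samples, testing the usual null hypothesis μ₁ − μ₂ = 0:

```python
import math
from scipy import stats

# Hypothetical samples (illustrative numbers only)
sample_1 = [14.2, 15.1, 13.8, 16.0, 14.9, 15.5]
sample_2 = [12.9, 13.4, 12.1, 13.8, 12.5, 13.0]

n1, n2 = len(sample_1), len(sample_2)
x1 = sum(sample_1) / n1
x2 = sum(sample_2) / n2
var1 = sum((x - x1) ** 2 for x in sample_1) / (n1 - 1)
var2 = sum((x - x2) ** 2 for x in sample_2) / (n2 - 1)

# Under H0: μ₁ − μ₂ = 0, that term drops out of the numerator
t_manual = (x1 - x2) / math.sqrt(var1 / n1 + var2 / n2)

# Welch's t-test in scipy uses the same unpooled-variance formula
t_scipy, p_value = stats.ttest_ind(sample_1, sample_2, equal_var=False)
print(t_manual, t_scipy, p_value)
```

If you instead assumed equal population variances, you would pool the variances (scipy's default `equal_var=True`), which gives a slightly different statistic.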
T-tests are generally used in place of Z-tests when one or more of the following conditions hold: the sample size is less than 30 (n < 30), the distribution of the sample statistic cannot be assumed to be normal, or the population standard deviation σ is unknown.
If you know the population standard deviation σ and you are confident that the statistic used in your hypothesis test is normally distributed, then you can use a Z-test.
As with all test statistics, you should only use a Z-test or a t-test when your data is from a randomly and independently drawn sample.
We use test statistics together with critical values, p-values, and significance levels to determine whether or not to reject a null hypothesis.
A critical value is a value of a test statistic that marks a cutoff point. If a test statistic is more extreme than the critical value—greater than the critical value in the right tail of a distribution or less than the critical value in the left tail of a distribution—the null hypothesis is rejected.
Critical values are determined by the significance level (or alpha level) of a hypothesis test. The significance level you use is up to you, but the most commonly used significance level is 0.05 (α = 0.05).
A significance level of 0.05 means that if the probability of observing a sample statistic at least as extreme as the one you observed is less than 0.05 (or 5%), you should reject your null hypothesis. In a one-sided hypothesis test that uses a Z-test statistic, a significance level of 0.05 is associated with a critical value of 1.645 when you conduct the test in the right tail and a value of -1.645 when you conduct the test in the left tail.
A p-value is the probability associated with your test statistic’s value. Let’s say you calculate a Z-test statistic that maps to the standard normal distribution. You find that the test statistic is equal to 1.75. For this value of a Z-test statistic, the associated p-value is 0.04 or 4%—you can find p-values using tables or statistical software.
A p-value of 0.04 means that the probability of observing a sample statistic at least as extreme as the one you found from your sample data is 4%. If you choose a significance level of 0.05 for your test, we would reject the null hypothesis, since the p-value of 0.04 is less than the significance level of 0.05.
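Both numbers quoted above can be reproduced with `scipy.stats`:

```python
from scipy import stats

# One-sided p-value for a Z-test statistic of 1.75 (right tail)
p_value = stats.norm.sf(1.75)
print(round(p_value, 4))   # ≈ 0.0401, matching the article's "0.04 or 4%"

# Critical value for a right-tailed test at α = 0.05
critical_value = stats.norm.ppf(0.95)
print(round(critical_value, 3))  # ≈ 1.645, as quoted earlier in the article
```

`norm.sf` is the survival function (1 − CDF), which gives right-tail probabilities directly, and `norm.ppf` is its inverse, which turns a probability into a cutoff point.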
It can be easy to confuse test statistics, critical values, significance levels, and p-values. Remember, these are all different measures involved in determining whether to reject or fail to reject a null hypothesis.
Critical values and significance levels provide cut-offs for your test. The difference between a critical value and a significance level is that the critical value is a point on the distribution, and the significance level is a probability represented by an area under the distribution.
You can compare the test statistic and the p-value against the critical value and the significance level.
If the test statistic is more extreme than the critical value, you reject the null hypothesis.
If the p-value is less than the significance level, you reject the null hypothesis.
If the test statistic is less extreme than the critical value, you fail to reject the null hypothesis.
If the p-value is greater than the significance level, you fail to reject the null hypothesis.
Hypothesis testing is a fundamental statistical technique used to make inferences about populations based on sample data. This blog will guide you through the process of hypothesis testing, helping you understand and apply the concepts to solve similar assignments efficiently. By following this structured approach, you'll be able to solve your hypothesis testing homework problem with confidence.
Hypothesis testing involves making a decision about the validity of a hypothesis based on sample data. It comprises four key steps: defining hypotheses, calculating the test statistic, determining the p-value, and drawing conclusions. Let's explore each of these steps in detail.
The first step in hypothesis testing is to define the null and alternative hypotheses. These hypotheses represent the statements we want to test.
Null Hypothesis (H0)
The null hypothesis (H0) is a statement that there is no effect or difference. It serves as the default assumption that we aim to test against.
Alternative Hypothesis (Ha or H1)
The alternative hypothesis (Ha or H1) is a statement that indicates the presence of an effect or difference. It represents what we want to prove.
Depending on the direction of the hypothesis, we have three types of tests: left-tailed, right-tailed, and two-tailed tests.
Left-Tailed Test
A left-tailed test is used when we want to determine if the population mean is less than a specified value.
Right-Tailed Test
A right-tailed test is used when we want to determine if the population mean is greater than a specified value.
Two-Tailed Test
A two-tailed test is used when we want to determine if the population mean is different from a specified value, either higher or lower.
Consider a scenario where we want to test if the average vehicle price from a sample is less than $27,000. We would set up our hypotheses as follows:
Once the hypotheses are defined, the next step is to calculate the test statistic. The test statistic helps us determine the likelihood of observing the sample data under the null hypothesis.
The t-test statistic is calculated using the formula:
t = (X̄ − μ) / (S / √n)
The denominator of the t-test formula, S/√n, is known as the standard error (SE). It measures the variability of the sample mean.
Let's calculate the test statistic for our vehicle price example. Given: sample mean X̄ = 25650, hypothesized mean μ = 27000, sample standard deviation S = 3488, and sample size n = 10.
First, we calculate the standard error (SE):
SE = S / √n = 3488 / √10 ≈ 1103
Next, we calculate the test statistic (t):
t = (25650 − 27000) / 1103 ≈ −1.2238
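The same arithmetic in Python, using the values from the example:

```python
import math

# Values from the vehicle-price example
x_bar = 25650   # sample mean price
mu_0 = 27000    # hypothesized population mean
s = 3488        # sample standard deviation
n = 10          # sample size

se = s / math.sqrt(n)
t = (x_bar - mu_0) / se
# se ≈ 1103 and t ≈ −1.224, matching the article's result
# (small differences in the last digit come from rounding)
print(round(se, 1), round(t, 4))
```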
The p-value is a critical component of hypothesis testing. It indicates the probability of obtaining a test statistic as extreme as the one observed, assuming the null hypothesis is true.
The method to calculate the p-value depends on the type of test (left-tailed, right-tailed, or two-tailed) and the direction of the alternative hypothesis.
For a left-tailed test, the p-value is calculated using the T.DIST() function in Excel.
For a right-tailed test, the p-value is calculated using the T.DIST.RT() function in Excel.
For a two-tailed test, the p-value is calculated using the T.DIST.2T() function in Excel. When the test statistic is negative, use the absolute value function (ABS()) to remove the negative sign before calculating the p-value.
For our vehicle price example with a left-tailed test, we calculate the p-value using the T.DIST() function in Excel:
p-value = T.DIST(−1.2238, 9, TRUE) ≈ 0.1261
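Outside Excel, the same left-tailed p-value can be computed with the t-distribution CDF in `scipy.stats`, which is what `T.DIST(t, df, TRUE)` evaluates:

```python
from scipy import stats

# Left-tailed p-value: P(T ≤ t) with df = n − 1 = 9,
# equivalent to Excel's T.DIST(-1.2238, 9, TRUE)
t = -1.2238
df = 9
p_value = stats.t.cdf(t, df)
print(round(p_value, 4))  # ≈ 0.1261, matching the Excel result quoted above
```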
The final step in hypothesis testing is to draw a conclusion based on the p-value and a pre-determined significance level (α).
The significance level (α) is the threshold for deciding whether to reject the null hypothesis. Common values for α are 0.05, 0.01, 0.10, and 0.005.
For our vehicle price example with α = 0.05:
Since 0.1261 > 0.05, we fail to reject the null hypothesis. There is not enough evidence to suggest that the average vehicle price is less than $27,000.
To further illustrate hypothesis testing, let's explore three different scenarios: left-tailed test, right-tailed test, and two-tailed test.
In this example, we test if the average vehicle price is less than $27,000.
Step-by-Step Process
Define Hypotheses:
Calculate Test Statistic:
Determine P-Value:
Draw Conclusion:
In this example, we test if the average vehicle price is greater than $23,500.
In this example, we test if the average vehicle price is different from $23,500.
Successfully conducting hypothesis testing involves several critical steps. Here are some tips to help you perform hypothesis testing effectively.
Proper Data Collection
Accurate and reliable data collection is crucial for hypothesis testing. Ensure that your sample is representative of the population and collected using appropriate methods.
Random Sampling
Use random sampling techniques to avoid bias and ensure that your sample accurately represents the population.
Sample Size
Ensure that your sample size is large enough to provide reliable results. Larger sample sizes reduce the margin of error and increase the power of the test.
Hypothesis tests often rely on certain assumptions about the data. Verify these assumptions before proceeding with the test.
Many hypothesis tests, including the t-test, assume that the data follows a normal distribution. Use graphical methods (e.g., histograms, Q-Q plots) or statistical tests (e.g., Shapiro-Wilk test) to check for normality.
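For instance, the Shapiro-Wilk test is available in `scipy.stats`; here it is applied to a synthetic sample that really is drawn from a normal distribution:

```python
import random
from scipy import stats

random.seed(1)
# Synthetic sample drawn from a normal distribution (mean 100, sd 15)
sample = [random.gauss(100, 15) for _ in range(50)]

stat, p_value = stats.shapiro(sample)
# A large p-value means no evidence against normality;
# p < 0.05 would suggest the data are not normally distributed
print(round(stat, 3), round(p_value, 3))
```

Because the sample is genuinely normal, the W statistic should be close to 1. On real data, pair this test with a histogram or Q-Q plot rather than relying on the p-value alone.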
Independence
Ensure that the observations in your sample are independent of each other. Independence is a key assumption for most hypothesis tests.
Software tools like Excel , R , and SPSS can simplify the calculations involved in hypothesis testing and reduce the risk of errors.
Excel provides several functions for hypothesis testing, such as T.DIST(), T.DIST.RT(), and T.DIST.2T(). Use these functions to calculate p-values and make decisions based on your test statistics.
R is a powerful statistical software that offers various packages for hypothesis testing. Use functions like t.test() to perform t-tests and obtain p-values and confidence intervals.
Proper interpretation of the results is crucial for drawing accurate conclusions from hypothesis testing.
Statistical Significance
A statistically significant result (p-value < α) indicates that there is strong evidence against the null hypothesis. However, it does not imply practical significance. Consider the context and the practical implications of the results.
Type I and Type II Errors
Be aware of the potential for Type I and Type II errors. A Type I error occurs when the null hypothesis is incorrectly rejected, while a Type II error occurs when the null hypothesis is not rejected despite being false. The significance level (α) affects the probability of Type I errors, while the sample size and effect size influence the probability of Type II errors.
When reporting the results of hypothesis testing, include all relevant information to ensure transparency and reproducibility.
Detailed Description
Provide a detailed description of the hypotheses, test statistic, p-value, significance level, and the conclusion. This information helps others understand and evaluate your analysis.
Confidence Intervals
Include confidence intervals for the estimated parameters. Confidence intervals provide a range of plausible values for the population parameter and offer additional context for interpreting the results.
Hypothesis testing is a powerful tool, but it is essential to be aware of common pitfalls to avoid incorrect conclusions.
P-values indicate the probability of obtaining a test statistic as extreme as the one observed, assuming the null hypothesis is true. A small p-value suggests strong evidence against the null hypothesis, but it does not provide a measure of the effect size or practical significance.
P-Value Misconceptions
Avoid common misconceptions about p-values, such as believing that a p-value of 0.05 means there is a 5% chance that the null hypothesis is true. P-values do not measure the probability that the null hypothesis is true or false.
Ignoring the assumptions underlying hypothesis tests can lead to incorrect conclusions. Always verify the assumptions before proceeding with the test.
Assumption Violations
If the assumptions are violated, consider using alternative tests that do not rely on those assumptions. For example, if the data is not normally distributed, use non-parametric tests like the Wilcoxon rank-sum test or the Mann-Whitney U test.
Statistical significance does not imply practical significance. A result can be statistically significant but have a negligible practical effect. Always consider the context and practical implications of the results.
Effect Size
Report and interpret effect sizes alongside p-values. Effect sizes provide a measure of the magnitude of the observed effect and offer valuable context for interpreting the results.
Hypothesis testing is a critical tool in statistics for making inferences about populations based on sample data. By understanding the steps involved—defining hypotheses, calculating the test statistic, determining the p-value, and drawing conclusions—you can approach hypothesis testing with confidence.
Ensure proper data collection, verify assumptions, utilize software tools, interpret results carefully, and report findings transparently to enhance the reliability and validity of your hypothesis tests. By avoiding common pitfalls and considering both statistical and practical significance, you'll be well-equipped to tackle statistics homework and research projects effectively.
Statistics By Jim
Making statistics intuitive
By Jim Frost
T-tests are statistical hypothesis tests that you use to analyze one or two sample means. Depending on the t-test that you use, you can compare a sample mean to a hypothesized value, the means of two independent samples, or the difference between paired samples. In this post, I show you how t-tests use t-values and t-distributions to calculate probabilities and test hypotheses.
As usual, I’ll provide clear explanations of t-values and t-distributions using concepts and graphs rather than formulas! If you need a primer on the basics, read my hypothesis testing overview .
The term “t-test” refers to the fact that these hypothesis tests use t-values to evaluate your sample data. T-values are a type of test statistic. Hypothesis tests use the test statistic that is calculated from your sample to compare your sample to the null hypothesis. If the test statistic is extreme enough, this indicates that your data are so incompatible with the null hypothesis that you can reject the null. Learn more about Test Statistics .
Don’t worry. I find these technical definitions of statistical terms are easier to explain with graphs, and we’ll get to that!
When you analyze your data with any t-test, the procedure reduces your entire sample to a single value, the t-value. These calculations factor in your sample size and the variation in your data. Then, the t-test compares your sample mean(s) to the null hypothesis condition.
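A minimal sketch of this reduction for a 1-sample t-test, using SciPy on simulated data (the reference value of 10 and the sample itself are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 21 measurements; we test whether the population
# mean differs from a reference value of 10.
rng = np.random.default_rng(42)
sample = rng.normal(loc=10.5, scale=1.0, size=21)

# ttest_1samp reduces the whole sample to a single t-value, factoring in
# the sample size and the variation in the data.
t_value, p_value = stats.ttest_1samp(sample, popmean=10)

# The same t-value by hand: (sample mean - reference) / standard error
manual_t = (sample.mean() - 10) / (sample.std(ddof=1) / np.sqrt(len(sample)))
print(f"t = {t_value:.3f}, p = {p_value:.4f}")
```

The hand calculation and `ttest_1samp` agree exactly, which makes the point: all of the sample's information relevant to the test has been boiled down to that one number.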
Read the companion post where I explain how t-tests calculate t-values .
The tricky thing about t-values is that they are a unitless statistic, which makes them difficult to interpret on their own. Imagine that we performed a t-test, and it produced a t-value of 2. What does this t-value mean exactly? We know that the sample mean doesn’t equal the null hypothesis value because this t-value doesn’t equal zero. However, we don’t know how exceptional our value is if the null hypothesis is correct.
To be able to interpret individual t-values, we have to place them in a larger context. T-distributions provide this broader context so we can determine the unusualness of an individual t-value.
A single t-test produces a single t-value. Now, imagine the following process. First, let’s assume that the null hypothesis is true for the population. Now, suppose we repeat our study many times by drawing many random samples of the same size from this population. Next, we perform t-tests on all of the samples and plot the distribution of the t-values. This distribution is known as a sampling distribution, which is a type of probability distribution.
Related posts : Sampling Distributions and Understanding Probability Distributions
If we follow this procedure, we produce a graph that displays the distribution of t-values that we obtain from a population where the null hypothesis is true. We use sampling distributions to calculate probabilities for how unusual our sample statistic is if the null hypothesis is true.
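The repeated-sampling procedure above can be simulated directly. This sketch (hypothetical null population, SciPy for the t-tests) draws many same-size samples from a population where the null hypothesis is true and collects the resulting t-values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# A population where the null hypothesis is true: the mean really is 50.
null_mean, sigma, n = 50, 5, 21

# Repeat the study many times: draw a sample, run the t-test, keep the t-value.
t_values = []
for _ in range(10_000):
    sample = rng.normal(null_mean, sigma, size=n)
    t_val, _ = stats.ttest_1samp(sample, popmean=null_mean)
    t_values.append(t_val)

t_values = np.array(t_values)
# The simulated sampling distribution should match a t-distribution with
# n - 1 = 20 DF: centered on zero, with roughly 6% of t-values beyond +/-2.
print(f"mean of simulated t-values: {t_values.mean():.3f}")
print(f"share beyond +/-2: {np.mean(np.abs(t_values) > 2):.3f}")
```

Plotting `t_values` as a histogram reproduces the sampling-distribution graph described here without collecting any real samples.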
Luckily, we don’t need to go through the hassle of collecting numerous random samples to create this graph! Statisticians understand the properties of t-distributions so we can estimate the sampling distribution using the t-distribution and our sample size.
The degrees of freedom (DF) for the statistical design define the t-distribution for a particular study. The DF are closely related to the sample size. For t-tests, there is a different t-distribution for each sample size.
Related posts : Degrees of Freedom in Statistics and T Distribution: Definition and Uses .
T-distributions assume that the null hypothesis is correct for the population from which you draw your random samples. To evaluate how compatible your sample data are with the null hypothesis, place your study’s t-value in the t-distribution and determine how unusual it is.
The sampling distribution below displays a t-distribution with 20 degrees of freedom, which equates to a sample size of 21 for a 1-sample t-test. The t-distribution centers on zero because it assumes that the null hypothesis is true. When the null is true, your study is most likely to obtain a t-value near zero and less likely to produce t-values further from zero in either direction.
On the graph, I’ve displayed the t-value of 2 from our hypothetical study to see how our sample data compare to the null hypothesis. Under the assumption that the null is true, the t-distribution indicates that our t-value is not the most likely value. However, there still appears to be a realistic chance of observing t-values from -2 to +2.
We know that our t-value of 2 is relatively rare when the null hypothesis is true. How rare is it exactly? Our final goal is to evaluate whether our sample t-value is so rare that it justifies rejecting the null hypothesis for the entire population based on our sample data. To proceed, we need to quantify the probability of observing our t-value.
Related post : What are Critical Values?
Hypothesis tests work by taking the observed test statistic from a sample and using the sampling distribution to calculate the probability of obtaining that test statistic if the null hypothesis is correct. In the context of how t-tests work, you assess the likelihood of a t-value using the t-distribution. If a t-value is sufficiently improbable when the null hypothesis is true, you can reject the null hypothesis.
I have two crucial points to explain before we calculate the probability linked to our t-value of 2.
Because I’m showing the results of a two-tailed test, we’ll use the t-values of +2 and -2. Two-tailed tests allow you to assess whether the sample mean is greater than or less than the target value in a 1-sample t-test. A one-tailed hypothesis test can only determine statistical significance for one or the other.
Additionally, it is possible to calculate a probability only for a range of t-values. On a probability distribution plot, probabilities are represented by the shaded area under a distribution curve. Without a range of values, there is no area under the curve and, hence, no probability.
Related posts : One-Tailed and Two-Tailed Tests Explained and T-Distribution Table of Critical Values
Considering these points, the graph below finds the probability associated with t-values less than -2 and greater than +2 using the area under the curve. This graph is specific to our t-test design (1-sample t-test with N = 21).
The probability distribution plot indicates that each of the two shaded regions has a probability of 0.02963—for a total of 0.05926. This graph shows that t-values fall within these areas almost 6% of the time when the null hypothesis is true.
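If you have SciPy available, you can reproduce these shaded-area probabilities directly from the t-distribution's survival function:

```python
from scipy import stats

df = 20  # 1-sample t-test with N = 21

# Area in each tail beyond +/-2 under the t-distribution with 20 DF.
one_tail = stats.t.sf(2, df)   # P(T > 2)
two_tail = 2 * one_tail        # P(|T| > 2), the two-tailed probability

print(f"each tail:  {one_tail:.5f}")   # ~ 0.02963
print(f"two-tailed: {two_tail:.5f}")   # ~ 0.05926
```

The survival function `sf` is simply 1 minus the cumulative distribution function, which is exactly the area under the curve to the right of a given t-value.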
There is a chance that you’ve heard of this type of probability before—it’s the P value! While the likelihood of t-values falling within these regions seems small, it’s not quite unlikely enough to justify rejecting the null under the standard significance level of 0.05.
Learn how to interpret the P value correctly and avoid a common mistake!
Related posts : How to Find the P value: Process and Calculations and Types of Errors in Hypothesis Testing
The sample size for a t-test determines the degrees of freedom (DF) for that test, which specifies the t-distribution. The overall effect is that as the sample size decreases, the tails of the t-distribution become thicker. Thicker tails indicate that t-values are more likely to be far from zero even when the null hypothesis is correct. The changing shapes are how t-distributions factor in the greater uncertainty when you have a smaller sample.
You can see this effect in the probability distribution plot below that displays t-distributions for 5 and 30 DF.
Sample means from smaller samples tend to be less precise. In other words, with a smaller sample, it’s less surprising to have an extreme t-value, which affects the probabilities and p-values. A t-value of 2 has a P value of 10.2% and 5.4% for 5 and 30 DF, respectively. Use larger samples!
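The same survival-function calculation reproduces the p-values quoted above for 5 and 30 DF:

```python
from scipy import stats

# Two-tailed p-values for the same t-value of 2 under different DF,
# showing how thicker tails at low DF inflate the p-value.
p_values = {df: 2 * stats.t.sf(2, df) for df in (5, 30)}
for df, p in p_values.items():
    print(f"df = {df}: p = {p:.3f}")   # ~ 0.102 and ~ 0.055
```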
Click here for step-by-step instructions for how to do t-tests in Excel !
If you like this approach and want to learn about other hypothesis tests, read my posts about:
To see an alternative to traditional hypothesis testing that does not use probability distributions and test statistics, learn about bootstrapping in statistics !
May 25, 2021 at 10:42 pm
what statistical tools, is recommended for measuring the level of satisfaction
May 26, 2021 at 3:55 pm
Hi McKienze,
The correct analysis depends on the nature of the data you have and what you want to learn. You don’t provide enough information to be able to answer the question. However, read my hypothesis testing overview to learn about the options.
August 23, 2020 at 1:33 am
Hi Jim, I want to ask about standardizing data before the t test.. For example I have USD prices of a big Mac across the world and this varies by quite a bit. Doing the t-test here would be misleading since some countries would have a higher mean… Should the approach be standardizing all the usd values? Or perhaps even local values?
August 24, 2020 at 12:37 am
Yes, that makes complete sense. I don’t know what method is best. If you can find a common scale to use for all prices, I’d do that. You’re basically using a data transformation before analysis, which is totally acceptable when you have a good reason.
April 3, 2020 at 4:47 am
Hey Jim. Your blog is one of the only few ones where everything is explained in a simple and well structured manner, in a way that both an absolute beginner and a geek can equally benefit from your writing. Both this article as well as your article on one tailed and two tailed hypothesis tests have been super helpful. Thank you for this post
March 6, 2020 at 11:04 am
Thank you, Jim, for sharing your knowledge with us.
I have a 2 part question. I am testing the difference in walking distance within a busy environment compared with a simple environment. I am also testing walking time within the 2 environments. I am using the same individuals for both scenarios. I was planning to do a paired ttest for distance difference between busy and simple environments and a 2nd paired ttest for time difference between the environments.
My question(s) for you is: 1. Do you feel that a paired ttest is the best choice for these? 2. Do you feel that, because there are 2 tests, I should do a bonferroni correction or do you believe that because the data is completely different (distance as opposed to time), it is okay not to do a multiple comparison test?
August 13, 2019 at 12:43 pm
thank you very eye opening on the use of two or one tailed test
April 19, 2019 at 3:49 pm
Hi Mr. Frost,
Thanks for the breakdown. I have a question … if I wanted to run a test to show that the medical professionals could use more training with data set consisting of questions which in your opinion would be my best route?
January 14, 2019 at 2:22 pm
Hello Jim, I find this statement in this excellent write up contradicting : 1)This graph shows that t-values fall within these areas almost 6% of the time when the null hypothesis is true I mean if this is true the t-value =0 hypothesis is rejected.
January 14, 2019 at 2:51 pm
I can see how that statement sounds contradictory, but I can assure that it is quite accurate. It’s often forgotten but the underlying assumption for the calculations surrounding hypothesis testing, significance levels, and p-values is that the null hypothesis is true.
So, the probabilities shown in the graph that you refer to are based on the assumption that the null hypothesis is true. Further, t-values for this study design have a 6% chance of falling in those critical areas assuming the null is true (a false positive).
Significance levels are defined as the maximum acceptable probability of a false positive. Usually, we set that as 5%. In the example, there’s a large probability of a false positive (6%), so we fail to reject the null hypothesis. In other words, we fail to reject the null because false positives will happen too frequently–where the significance level defines the cutoff point for too frequently.
Keep in mind that when you have statistically significant results, you’re really saying that the results you obtained are improbable enough assuming that the null is true that you can reject the notion that the null is true. But, the math and probabilities are all based on the assumption that the null is true because you need to determine how unlikely your results are under the null hypothesis.
Even the p-value is defined in terms of assuming the null hypothesis is true. You can read about that in my post about interpreting p-values correctly .
I hope this clarifies things!
November 9, 2018 at 2:36 am
Jim …I was involved in in a free SAT/ACT tutoring program that I need to analyze for effectiveness .
I have pre test scores of a number of students and the post test scores after they were tutored (treatment ).
Glenn dowell
November 9, 2018 at 9:05 am
It sounds like you need to perform a paired t-test.