How to Do a T-Test in Python

A t-test is a statistical test used to determine if there is a significant difference between a sample mean and a hypothesized population mean, or between the means of two independent groups. Here's how to perform a t-test in Python.

Written by Sohail Hosseini
Updated by Brennan Whitfield | Sep 22, 2025
Summary: A t-test is a statistical test used to determine if there is a significant difference between the means of two groups. It can be performed in Python using libraries such as SciPy and NumPy.

In statistics and data analysis, t-tests are widely used to compare samples and draw inferences about the larger populations they come from.

What Is a T-Test?

A t-test is a statistical test that determines whether there is a significant difference between the means of two groups. Several Python libraries support t-tests: SciPy's stats module provides the test functions, and NumPy is commonly used to store and prepare the data.

In this article, we will discuss the assumptions of a t-test, its benefits when using Python and how to perform various types of t-tests in Python with examples. We will also provide troubleshooting tips for using t-tests in Python.

 

Assumptions of a T-Test

A t-test is based on several assumptions that need to be met for the results to be valid. Violations of these assumptions may lead to incorrect conclusions.

The assumptions of a t-test are independence of observations, normality and equal variances.

1. Independence of Observations

The data points in each group should be independent of each other. This means that the outcome of one observation should not affect the outcome of another observation. Independence can be achieved by using random sampling or experimental designs that account for potential confounding factors. In cases where independence can’t be assumed, alternative tests such as mixed-effects models or repeated measures analysis of variance (ANOVA) may be more appropriate.

2. Normality

The data should follow a normal distribution in each group. Normality can be visually assessed using histograms or quantile-quantile (Q-Q) plots, or tested using formal tests such as the Shapiro-Wilk test or the Kolmogorov-Smirnov test. However, t-tests are relatively robust to violations of normality when the sample size is large. For small sample sizes or severely non-normal data, non-parametric alternatives such as the Mann-Whitney U-test or the Wilcoxon signed-rank test can be used.
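As a rough illustration of the formal check mentioned above, the sketch below runs the Shapiro-Wilk test with scipy.stats.shapiro() on two made-up samples. The data, seed and variable names are assumptions chosen for illustration only; with real data you would pass each group's measurements instead.

```python
import numpy as np
from scipy import stats

# Hypothetical data: one roughly normal sample and one clearly skewed sample
rng = np.random.default_rng(seed=0)
samples = {
    "normal": rng.normal(loc=50, scale=5, size=40),
    "skewed": rng.exponential(scale=5, size=40),
}

results = {}
for name, sample in samples.items():
    # Shapiro-Wilk: the null hypothesis is that the data come from a normal distribution
    w_stat, p_value = stats.shapiro(sample)
    results[name] = (w_stat, p_value)
    print(f"{name}: W = {w_stat:.3f}, p = {p_value:.4f}")
```

A small p-value here is evidence against normality, so it would point toward a non-parametric alternative, particularly when the sample is small.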

3. Equal Variances

The variances of the two groups should be approximately equal, although there are variations of the t-test that can handle unequal variances. To assess the equality of variances, you can use graphical methods like boxplots, or perform a formal test like Levene’s test or Bartlett’s test. If the variances are not equal, a Welch’s t-test can be used as it does not assume equal variances.
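One possible way to check this assumption and fall back to Welch's t-test is sketched below. The two groups are synthetic and the names are hypothetical; scipy.stats.levene() and the equal_var argument of scipy.stats.ttest_ind() are the SciPy calls being illustrated.

```python
import numpy as np
from scipy import stats

# Hypothetical groups with different spreads and different sample sizes
rng = np.random.default_rng(seed=1)
group_a = rng.normal(loc=10.0, scale=1.0, size=20)
group_b = rng.normal(loc=10.5, scale=4.0, size=40)

# Levene's test: the null hypothesis is that the groups have equal variances
levene_stat, levene_p = stats.levene(group_a, group_b)
print("Levene p-value:", levene_p)

# Student's t-test pools the variances; Welch's t-test (equal_var=False) does not
student_t, student_p = stats.ttest_ind(group_a, group_b, equal_var=True)
welch_t, welch_p = stats.ttest_ind(group_a, group_b, equal_var=False)
print("Student's t-test p-value:", student_p)
print("Welch's t-test p-value:", welch_p)
```

If Levene's test gives a small p-value, the Welch result is usually the safer one to report.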

If these assumptions are not met, the results of the t-test may not be reliable. In these cases, it may be necessary to use a non-parametric test, which does not make these assumptions.

T-tests are a powerful tool for comparing the means of two groups. However, it’s important to make sure that the assumptions of the test are met before using it.

 

How to Perform a T-Test

A t-test involves several steps to determine whether there is a significant difference between the means of two groups.

These steps include:

1. Formulate a Hypothesis 

Start by formulating a null hypothesis, which states that there is no significant difference between the means of the two groups. You will also need an alternative hypothesis, which states that there is a significant difference between the means.

2. Choose a Significance Level 

The significance level, also known as alpha, is the probability of rejecting the null hypothesis when it is actually true. The most common significance level is 0.05, meaning that there is a 5 percent chance of rejecting the null hypothesis when it is actually true.

3. Collect and Prepare the Data

Collect the data for both groups and ensure that it meets the assumptions of the t-test: independence of observations, normality and equal variances.

4. Calculate the T-Statistic

Use the appropriate t-test formula to calculate the t-statistic, which measures the difference between the means of the two groups relative to the variation within the groups.

5. Determine the P-Value

Calculate the p-value, which represents the probability of obtaining a t-statistic as extreme or more extreme than the observed value, assuming that the null hypothesis is true.

6. Make a Decision

Compare the p-value to the significance level. If the p-value is less than the significance level, reject the null hypothesis and conclude that there is a significant difference between the means of the two groups. Otherwise, fail to reject the null hypothesis: the data do not provide sufficient evidence of a difference between the means, which is not the same as showing that the means are equal.
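Steps 4 through 6 can be sketched by hand as follows. This is a minimal illustration of the standard pooled-variance (Student's) two-sample t-test, not a replacement for SciPy's own functions; the measurements and variable names are made up for the example.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for two independent groups
group_a = np.array([5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.2])
group_b = np.array([6.1, 5.9, 6.4, 6.8, 5.7, 6.3, 6.6, 6.2])
alpha = 0.05

n_a, n_b = len(group_a), len(group_b)
df = n_a + n_b - 2  # degrees of freedom for the pooled test

# Step 4: pooled variance, then t = difference in means / standard error
pooled_var = ((n_a - 1) * group_a.var(ddof=1) + (n_b - 1) * group_b.var(ddof=1)) / df
std_error = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (group_a.mean() - group_b.mean()) / std_error

# Step 5: two-tailed p-value from the t-distribution with df degrees of freedom
p_value = 2 * stats.t.sf(abs(t_stat), df)

# Step 6: compare the p-value to the significance level
reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject null: {reject_null}")
```

The same t-statistic and p-value should come back from stats.ttest_ind(group_a, group_b), which is the more convenient route in practice.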

 

Benefits of a T-Test in Python

Using Python for t-tests offers several benefits:

  • Python is a powerful and versatile programming language that can be used for a variety of tasks, including data analysis.
  • Several Python libraries support t-tests: SciPy provides the test functions, and NumPy handles the underlying arrays.
  • T-tests can be performed quickly and easily in Python.
  • The results of t-tests can be easily visualized in Python.

 

One-Sample T-Test in Python Example

In statistical testing, a one-sample t-test is used when we want to compare a sample mean with a population mean. A one-sample t-test examines whether the mean of a sample is statistically different from a known or hypothesized population mean.

For instance, consider a hypothetical scenario where we have test scores from a sample of students and we want to compare the mean of these scores with a hypothesized population mean. Let’s assume the population mean is 70.

In this context, our null hypothesis is that the mean score of the population our sample was drawn from is equal to 70. The alternative hypothesis is typically that this mean is not equal to 70 (a two-tailed test). However, it could also be that the mean is greater than or less than 70 (a one-tailed test), depending on the research question.

How to Calculate a One-Sample T-Test

We can use Python’s SciPy stats package to perform a one-sample t-test. Below is the Python code for this task:

# Import necessary libraries
import numpy as np
from scipy import stats

# Given student scores
student_scores = np.array([72, 89, 65, 73, 79, 84, 63, 76, 85, 75])

# Hypothesized population mean
mu = 70

# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(student_scores, mu)
print("T statistic:", t_stat)
print("P-value:", p_value)

# Setting significance level
alpha = 0.05

# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the sample mean and the hypothesized population mean.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the sample mean and the hypothesized population mean.")

One-Sample T-Test Example Results

The ttest_1samp() function in the code above performs a one-sample t-test and returns two values: the t-statistic and the p-value.

The t-statistic measures the difference between the sample mean and the hypothesized population mean in units of the standard error of the mean. The greater the absolute value of the t-statistic, the larger that difference is relative to the variability in the sample.
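To make the t-statistic less of a black box, the sketch below recomputes it by hand for the same scores, using the textbook formula t = (sample mean - hypothesized mean) / (sample standard deviation / sqrt(n)), and compares the result with SciPy's value. The variable names manual_t and scipy_t are introduced here for illustration.

```python
import numpy as np
from scipy import stats

# Same scores and hypothesized mean as in the example above
student_scores = np.array([72, 89, 65, 73, 79, 84, 63, 76, 85, 75])
mu = 70

sample_mean = student_scores.mean()
# Standard error of the mean, using the sample standard deviation (ddof=1)
std_error = student_scores.std(ddof=1) / np.sqrt(len(student_scores))

# t = (sample mean - hypothesized mean) / standard error
manual_t = (sample_mean - mu) / std_error
scipy_t, _ = stats.ttest_1samp(student_scores, mu)
print("Manual t statistic:", manual_t)
print("SciPy t statistic:", scipy_t)
```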

The p-value is a probability that measures the evidence against the null hypothesis. A smaller p-value indicates stronger evidence against the null hypothesis. If the p-value is less than our chosen significance level (0.05 in this case), we reject the null hypothesis, suggesting there is a significant difference between the sample mean and the hypothesized population mean. 

If the p-value is greater than our significance level, we fail to reject the null hypothesis, suggesting there is no significant difference between the sample mean and the hypothesized population mean.

This way, by performing a one-sample t-test, we can determine whether the mean of a sample is significantly different from a given population mean.

In this case, our p-value (0.0478) is less than the significance level (0.05). Therefore, we reject the null hypothesis and conclude that there is a significant difference between the sample mean and the hypothesized population mean of 70. This suggests that the mean test score of our sample of students is significantly different from the population mean.
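If the research question were one-tailed (for example, whether the mean score is greater than 70), the same function could be called with the alternative argument. The sketch below assumes SciPy 1.6 or later, where ttest_1samp() accepts this argument; the names two_sided and one_sided are hypothetical.

```python
from scipy import stats

# Same scores and hypothesized mean as in the example above
student_scores = [72, 89, 65, 73, 79, 84, 63, 76, 85, 75]
mu = 70

# Default: two-tailed test (the mean differs from mu in either direction)
two_sided = stats.ttest_1samp(student_scores, mu)

# One-tailed test: the alternative hypothesis is that the mean is greater than mu
# (assumes SciPy >= 1.6 for the `alternative` argument)
one_sided = stats.ttest_1samp(student_scores, mu, alternative="greater")

print("Two-tailed p-value:", two_sided.pvalue)
print("One-tailed p-value:", one_sided.pvalue)
```

Because the sample mean here lies above 70, the one-tailed p-value is half the two-tailed one. The direction of a one-tailed test should be chosen before looking at the data.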

 

Two-Sample T-Test in Python Example

To demonstrate the use of a t-test, we will use the famous “Iris” data set available in the Seaborn library. The Iris data set contains information on 150 iris flowers from three different species (setosa, versicolor, and virginica), with 50 samples from each species. The data set has four features: sepal length, sepal width, petal length and petal width.

How to Calculate a Two-Sample T-Test

In this example, we will perform a t-test to compare the mean petal lengths of iris setosa and iris versicolor.

# Import the necessary libraries:
import seaborn as sns
from scipy import stats

# Load the Iris dataset:
iris = sns.load_dataset('iris')

# Filter the dataset for the two species we want to compare:
setosa = iris[iris['species'] == 'setosa']
versicolor = iris[iris['species'] == 'versicolor']

# Extract the petal lengths for each species:
setosa_petal_lengths = setosa['petal_length']
versicolor_petal_lengths = versicolor['petal_length']

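# Perform the t-test (by default, ttest_ind assumes equal variances):
t_stat, p_value = stats.ttest_ind(setosa_petal_lengths, versicolor_petal_lengths)
print("T statistic:", t_stat)
print("P-value:", p_value)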

# Interpret the results:
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the petal lengths of Iris setosa and Iris versicolor.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the petal lengths of Iris setosa and Iris versicolor.")

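This t-test compares the mean petal lengths of iris setosa and iris versicolor. The obtained p-value, which is approximately 5.40e-62, is far below the significance level of 0.05, so we reject the null hypothesis and conclude that there is a significant difference between the two species. Note that ttest_ind() assumes equal variances by default; passing equal_var=False runs Welch's t-test instead.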

 

Tips for Using a T-Test in Python

  • Check data assumptions: Ensure that your data meets the assumptions of the t-test before proceeding with the analysis. If your data violates any of the assumptions, consider using alternative statistical tests.
  • Handle missing values: Remove or impute missing values in your dataset before performing the t-test. By default, SciPy's t-test functions return NaN when the input contains NaN values; passing nan_policy='omit' tells them to ignore those values.
  • Verify the data type: Ensure that your input is numeric. SciPy's t-test functions accept lists, NumPy arrays and Pandas Series alike, but non-numeric entries such as strings can lead to errors.
  • Interpret p-values cautiously: Always consider the context of your study when interpreting p-values. A low p-value indicates that the results are statistically significant, but it doesn't prove causality or guarantee the difference is large enough to be practically significant (meaningful or important in a real-world context).
  • Use appropriate libraries: Use SciPy for the test functions and NumPy for array handling to simplify t-tests and other statistical analyses in Python.
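As a small illustration of the missing-value and data-type tips, the sketch below passes plain Python lists to scipy.stats.ttest_ind() and compares its nan_policy options. The samples are made up, and the names propagated, omitted and cleaned are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical samples passed as plain Python lists; np.nan marks a missing value
group_a = [5.1, 4.9, np.nan, 5.8, 6.0, 5.3]
group_b = [6.1, 5.9, 6.4, 6.8, 5.7, 6.3]

# Default nan_policy="propagate": a NaN in the input produces NaN results
propagated = stats.ttest_ind(group_a, group_b)

# nan_policy="omit" ignores the missing values when computing the test
omitted = stats.ttest_ind(group_a, group_b, nan_policy="omit")

# Equivalent manual cleanup: filter out NaNs yourself, then run the test
clean_a = [x for x in group_a if not np.isnan(x)]
cleaned = stats.ttest_ind(clean_a, group_b)

print("Propagated p-value:", propagated.pvalue)
print("Omitted p-value:", omitted.pvalue)
print("Manually cleaned p-value:", cleaned.pvalue)
```

Dropping values is only one option; whether to drop or impute depends on why the data are missing.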

Frequently Asked Questions

What is a t-test?

A t-test is a statistical test used to determine if there is a significant difference between the means of two groups. It is widely used in statistics and data analysis to compare samples and draw conclusions about a larger population.

What are the assumptions of a t-test?

For the results of a t-test to be valid, the data should meet the following assumptions:

  1. Independence of observations: Each data point should be independent of the others.
  2. Normality: The data in each group should be approximately normally distributed, especially important for small sample sizes.
  3. Equal variances (homogeneity of variance): For a two-sample t-test, the variances of the two groups should be approximately equal. (This applies specifically to the standard independent t-test; the Welch’s t-test can be used if variances are unequal.)

If these assumptions are not met, alternative non-parametric tests such as the Mann-Whitney U test or Wilcoxon signed-rank test may be more appropriate.

How do you perform a t-test?

To perform a t-test, the steps include:

  1. Formulate a hypothesis: Define a null hypothesis (no difference) and an alternative hypothesis (a difference exists).
  2. Choose a significance level: This is the probability of rejecting the null hypothesis when it is true (typically 0.05).
  3. Collect and prepare the data: Ensure your data meets the test's assumptions.
  4. Calculate the t-statistic: This measures the difference between the group means relative to their variation.
  5. Determine the p-value: This is the probability of obtaining a t-statistic as extreme as or more extreme than the one observed, assuming the null hypothesis is true.
  6. Make a decision: Compare the p-value to the significance level to decide whether to reject or fail to reject the null hypothesis.