Z-Test for Statistical Hypothesis Testing Explained

The Z-test is a statistical hypothesis test in which the test statistic we are measuring, such as the mean, is assumed to follow a normal distribution under the null hypothesis.

While there are multiple types of Z-tests, we’ll focus on the easiest and most well-known one, the one-sample mean test. This is used to determine if the difference between the mean of a sample and the mean of a population is statistically significant.

What Is a Z-Test?

A Z-test determines whether there is a statistically significant difference between two means, such as a sample mean and a population mean, or the means of two populations. A Z-test is typically applied only if the standard deviation of each population is known and a sample size of at least 30 data points is available.

The name Z-test comes from the Z-score of the normal distribution. This is a measure of how many standard deviations a raw score or sample statistic is from the population’s mean. Z-tests are among the most common statistical tests conducted in fields such as healthcare and data science, making them essential to understand.
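
As a minimal sketch of the Z-score described above, the following Python function computes how many standard deviations a value lies from the population mean. The function name `z_score` and the example numbers are illustrative assumptions, not part of the original article.

```python
def z_score(value: float, population_mean: float, population_sd: float) -> float:
    """Return how many standard deviations `value` lies from the population mean."""
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    return (value - population_mean) / population_sd


# Hypothetical example: a raw score of 130 in a population
# with mean 100 and standard deviation 20.
print(z_score(130, 100, 20))  # 1.5
```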

Overview of a Z-test. | Video: Vectors Academy

Requirements for a Z-Test

In order to conduct a Z-test, your statistics need to meet a few requirements:

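A sample size that’s greater than 30. This is because we want the distribution of the sample mean to be approximately normal. According to the central limit theorem, the sampling distribution of the mean approaches a normal distribution as the sample size grows, regardless of the shape of the underlying population. More than 30 data points is a common rule of thumb for when this approximation is adequate.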
The standard deviation and mean of the population are known.
The sample data is collected/acquired randomly.
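
To illustrate the sample size requirement, here is a small simulation sketch using only the Python standard library. It repeatedly draws samples of 50 points from a skewed (exponential) population and checks that the sample means cluster around the population mean with a spread close to σ/√n. The choice of an exponential population, the seed, and the helper name `sample_means` are assumptions made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def sample_means(n: int, repeats: int = 2000) -> list:
    """Draw `repeats` samples of size n from an exponential distribution
    (population mean 1, standard deviation 1) and return each sample's mean."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(repeats)
    ]


means = sample_means(n=50)

# The sample means should centre on the population mean (1.0) ...
print(statistics.fmean(means))
# ... with a spread close to sigma / sqrt(n) = 1 / sqrt(50), roughly 0.14.
print(statistics.stdev(means))
```

Although the underlying population is far from normal, a histogram of `means` would look roughly bell-shaped, which is what the Z-test relies on.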

Z-Test Steps

There are four steps to complete a Z-test. Let’s examine each one:

1. State the Null Hypothesis

The first step in a Z-test is to state the null hypothesis, H_0. This is the default assumption about the population, such as the claim that the population mean equals a particular value, μ_0:

Null hypothesis equation generated in LaTeX. | Image: Egor Howell
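
The equation image is not reproduced in this text. Based on the definitions above, the null hypothesis can be written as follows (a reconstruction from the surrounding text, not the original figure):

```latex
H_0 : \mu = \mu_0
```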

2. State the Alternate Hypothesis

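Next, state the alternate hypothesis, H_1. This is the claim you test against the null hypothesis, usually motivated by what you observe in your sample. If the sample mean appears to differ from the population’s mean, then the alternate hypothesis states that the mean is not equal to μ_0: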

Alternate hypothesis equation generated in LaTeX. | Image: Egor Howell
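
The equation image is not reproduced in this text. For the "not equal" case described above, the alternate hypothesis can be written as follows (a reconstruction from the surrounding text, not the original figure):

```latex
H_1 : \mu \neq \mu_0
```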

3. Choose Your Critical Value

Then, choose your significance level, α, which sets the critical value used to decide whether to reject or fail to reject the null hypothesis. Typically, for a Z-test we would use a statistical significance of 5 percent, which for a two-tailed test corresponds to critical values of z = +/- 1.96 standard deviations from the population’s mean in the normal distribution:

Z-test critical value plot. | Image: Egor Howell

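This critical value is based on confidence intervals: 95 percent of the standard normal distribution lies between z = -1.96 and z = +1.96, leaving 2.5 percent in each tail.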
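
The critical values for a given significance level can be looked up from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+). The helper name `critical_values` and its dictionary keys are illustrative assumptions.

```python
from statistics import NormalDist


def critical_values(alpha: float = 0.05) -> dict:
    """Return Z critical values for a significance level `alpha`."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return {
        # Two-tailed: alpha is split evenly between both tails.
        "two_tailed": std_normal.inv_cdf(1 - alpha / 2),
        # One-tailed: all of alpha sits in a single tail.
        "right_tailed": std_normal.inv_cdf(1 - alpha),
        "left_tailed": std_normal.inv_cdf(alpha),
    }


for tail, z in critical_values(0.05).items():
    print(f"{tail}: {z:.3f}")
# two_tailed: 1.960
# right_tailed: 1.645
# left_tailed: -1.645
```

Note that the familiar 1.96 applies to a two-tailed test at 5 percent, while a one-tailed test at the same level uses about 1.645.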

4. Calculate Your Z-Test Statistic

Compute the Z-test statistic using the sample mean, μ_1, the population mean, μ_0, the number of data points in the sample, n, and the population’s standard deviation, σ:

Z-test statistic equation generated in LaTeX. | Image: Egor Howell
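
The equation image is not reproduced in this text. Using the symbols defined above, the standard one-sample Z-test statistic is (a reconstruction, not the original figure):

```latex
z = \frac{\mu_1 - \mu_0}{\sigma / \sqrt{n}}
```

The denominator, σ/√n, is the standard error of the sample mean.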

If the test statistic is greater (or lower, depending on the test we are conducting) than the critical value, then we reject the null hypothesis in favor of the alternate hypothesis, because the sample’s mean differs from the population mean by a statistically significant amount.

Another way to think about this is that if the sample mean is so far away from the population mean, either the alternate hypothesis is true or the sample is a rare anomaly.
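
The calculation and decision rule from this step can be sketched in Python as follows. The function names `z_statistic` and `reject_null`, the `tail` parameter, and the example numbers are hypothetical choices for illustration.

```python
import math


def z_statistic(sample_mean: float, population_mean: float,
                population_sd: float, n: int) -> float:
    """One-sample Z-test statistic: (sample mean - population mean) / standard error."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


def reject_null(z: float, critical_value: float, tail: str = "two") -> bool:
    """Compare z with a positive critical value for the chosen kind of test."""
    if tail == "right":
        return z > critical_value
    if tail == "left":
        return z < -critical_value
    if tail == "two":
        return abs(z) > critical_value
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Hypothetical numbers: sample mean 52, population mean 50,
# population standard deviation 10, sample size 100.
z = z_statistic(52, 50, 10, 100)
print(z)                                 # 2.0
print(reject_null(z, 1.96, tail="two"))  # True
```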

A tutorial of hypothesis testing problems comparing Z-test and T statistics. | Video: The Organic Chemistry Tutor

Z-Test Example

Let’s go through an example to fully understand the one-sample mean Z-test.

A school says that its pupils are, on average, smarter than those at other schools. It takes a sample of 50 students whose average IQ is 110. The population, or the rest of the schools, has an average IQ of 100 and a standard deviation of 20. Is the school’s claim correct?

The null and alternate hypotheses are:

Null hypothesis and alternate hypothesis generated in LaTeX. | Image: Egor Howell
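
The equation image is not reproduced in this text. With a population mean of 100 and a claim of higher IQ, the hypotheses for this example can be written as follows (a reconstruction from the surrounding text, not the original figure):

```latex
H_0 : \mu = 100
\qquad
H_1 : \mu > 100
```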

Here, the alternate hypothesis states that our sample, the school, has a higher mean IQ than the population mean.

Now, this is what’s called a right-sided, one-tailed test, as the alternate hypothesis claims that the mean is greater than the population’s mean. So, choosing a significance level of 5 percent, which for a one-tailed test equals a Z-score of about 1.645 (the value 1.96 applies to a two-tailed test), we can only reject the null hypothesis if our Z-test statistic is greater than 1.645.

If the school claimed its students’ IQs were an average of 90, then we would use a left-tailed test, as shown in the figure above. We would then only reject the null hypothesis if our Z-test statistic is less than -1.645.

Computing our Z-test statistic, we see:
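
The worked equation is not reproduced in this text. Substituting the example’s values (sample mean 110, population mean 100, standard deviation 20, sample size 50) into the one-sample Z-test statistic gives:

```latex
z = \frac{110 - 100}{20 / \sqrt{50}} = \frac{10}{2.83} \approx 3.54
```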

Therefore, with a Z-test statistic of about 3.54, well above the critical value, we have sufficient evidence to reject the null hypothesis, and the data support the school’s claim.
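
The same example can be checked numerically. This is a minimal sketch using only the Python standard library. The variable names are illustrative, and the right-tailed critical value is computed from α rather than hard-coded.

```python
import math
from statistics import NormalDist

# Values taken from the school example above.
sample_mean = 110
population_mean = 100
population_sd = 20
n = 50
alpha = 0.05

# One-sample Z-test statistic.
z = (sample_mean - population_mean) / (population_sd / math.sqrt(n))

# Right-tailed test: all of alpha lies in the upper tail.
critical_value = NormalDist().inv_cdf(1 - alpha)

# Probability of a Z-score at least this large if the null hypothesis holds.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}")                            # z = 3.54
print(f"critical value = {critical_value:.3f}")  # critical value = 1.645
print(f"p-value = {p_value:.4f}")
print(f"reject H_0: {z > critical_value}")       # reject H_0: True
```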

Types of Z-Tests

There are four main types of Z-tests to consider:

One-Tailed Z-Test

A one-tailed Z-test involves an alternative hypothesis that claims the value of a parameter is either greater or less than what the null hypothesis claims it to be. This means the region of rejection falls to one side of the distribution, not both. A one-tailed Z-test can then be either left-tailed or right-tailed.

Left-Tailed Z-Test

A left-tailed Z-test involves an alternative hypothesis that claims the value of a parameter is less than what the null hypothesis claims it to be. In a normal distribution, the region of rejection would then be to the far left of the distribution’s center.

Right-Tailed Z-Test

A right-tailed Z-test involves an alternative hypothesis that claims the value of a parameter is greater than what the null hypothesis claims it to be. In a normal distribution, the region of rejection would then be to the far right of the distribution’s center.

Two-Tailed Z-Test

A two-tailed Z-test involves an alternative hypothesis that claims there is a significant difference between the means of two populations while the null hypothesis claims there is no significant difference. The alternative hypothesis doesn’t designate a direction since it merely includes a “not equal” sign, creating two regions of rejection that fall on either side of a distribution’s center. If results land in either region, the null hypothesis is rejected in favor of the alternative hypothesis.
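
The three kinds of rejection region described above differ only in which tail area is counted. The sketch below shows one way to compute the corresponding p-value for a given Z-test statistic. The function name `p_value` and the `tail` labels are assumptions made for illustration.

```python
from statistics import NormalDist


def p_value(z: float, tail: str = "two") -> float:
    """Return the p-value of a Z-test statistic for the chosen alternative."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        # Area to the left of z.
        return std_normal.cdf(z)
    if tail == "right":
        # Area to the right of z.
        return 1 - std_normal.cdf(z)
    if tail == "two":
        # Area in both tails beyond |z|.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


# The same statistic is judged differently depending on the alternative.
for tail in ("left", "right", "two"):
    print(tail, round(p_value(1.8, tail), 4))
```

In each case the null hypothesis is rejected when the p-value falls below the chosen significance level, which is equivalent to the test statistic landing in the region of rejection.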

Frequently Asked Questions

What is a Z-test used for?

A Z-test is used to determine whether there are any statistically significant differences in the means of two populations. Each population must have a known standard deviation and be large enough to provide a sample size of at least 30 data points.

When should you use a Z-test?

You should use a Z-test if you know a population’s standard deviation and can collect a sample size of at least 30 data points.

What is the difference between a T-test and a Z-test?

A T-test is typically used when the sample size is less than 30 data points or the population’s standard deviation is unknown. On the other hand, a Z-test is used when the sample size is at least 30 data points and the population’s standard deviation is known.