Imagine we want to compare the standardized test results of two students, Zoe and Mike. Zoe took the ACT and scored 25, while Mike took the SAT and scored 1150. Which of them scored better relative to the other test takers? And what proportion of people scored worse than Zoe and Mike?
How to Calculate a Z-Score
To use a z-table and answer these questions, you first have to convert each score into a z-score, which places it on a standard normal distribution, N(mean = 0, std = 1).
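Since the scores on these tests are normally distributed, we can convert both of them to the standard normal scale by using the following formula, where x is the raw score, μ is the mean of the test's scores and σ is their standard deviation.

$$z = \frac{x - \mu}{\sigma}$$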
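With this formula, you can calculate z-scores for Zoe and Mike. Plugging in each test's mean and standard deviation gives a z-score of 1.25 for Zoe and 1.00 for Mike.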
Since Zoe has a higher z-score than Mike, Zoe performed better on her test.
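As a minimal sketch of the calculation above, here is the z-score formula in Python. The means and standard deviations below are hypothetical placeholders, not figures from this article; I picked them only so that the results match the z-scores of 1.25 and 1.00 used throughout. The function name `z_score` is also my own.

```python
def z_score(x, mean, std):
    """Return how many standard deviations x lies above (or below) the mean."""
    return (x - mean) / std

# Hypothetical population parameters (assumptions for illustration only),
# chosen so the results reproduce the z-scores quoted in the text.
ACT_MEAN, ACT_STD = 20.0, 4.0
SAT_MEAN, SAT_STD = 1000.0, 150.0

zoe_z = z_score(25, ACT_MEAN, ACT_STD)      # (25 - 20) / 4
mike_z = z_score(1150, SAT_MEAN, SAT_STD)   # (1150 - 1000) / 150

print(f"Zoe:  z = {zoe_z:.2f}")
print(f"Mike: z = {mike_z:.2f}")
print("Higher z-score:", "Zoe" if zoe_z > mike_z else "Mike")
```

Because both scores are now expressed in standard deviations from their own test's mean, they can be compared directly even though the ACT and SAT use different scales.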
How to Use a Z-Table
While we know that Zoe performed better, a z-table can tell you which percentile each test taker is in. The following partial z-table (cut off to save space) gives the area under the curve to the left of a z-score. That area is the probability of scoring below that z-score.
How to Find Zoe’s Z-Score Probability
Zoe’s z-score was 1.25. To use the z-score table, start on the left side of the table and go down to 1.2. At the top of the table, go to 0.05. Together these correspond to a z-score of 1.2 + 0.05 = 1.25. The value in the table is 0.8944, which is the probability. Roughly 89.44 percent of people scored worse than her on the ACT.
How to Find Mike’s Z-Score Probability
Mike’s z-score was 1.0. To use the z-score table, start on the left side of the table and go down to 1.0. Now at the top of the table, go to 0.00. Together these correspond to a z-score of 1.0 + 0.00 = 1.00. The value in the table is 0.8413, which is the probability. Roughly 84.13 percent of people scored worse than him on the SAT.
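Keep in mind that if you have a negative z-score, you can simply use a table that contains negative z-scores. Because the standard normal curve is symmetric around zero, you can also get the area to the left of -z by subtracting the table value for z from one.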
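If you would rather check a table lookup in code, the sketch below reproduces both values. It assumes the standard closed form of the standard normal CDF in terms of the error function, which Python's standard library exposes as `math.erf`; the helper name `standard_normal_cdf` is my own. It also shows the negative z-score case, using the symmetry of the standard normal curve around zero.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

zoe_p = standard_normal_cdf(1.25)
mike_p = standard_normal_cdf(1.00)
print(f"Zoe:  {zoe_p:.4f}")   # table value for 1.2 + 0.05
print(f"Mike: {mike_p:.4f}")  # table value for 1.0 + 0.00

# A negative z-score: the area to the left of -z equals
# one minus the area to the left of z.
negative_p = standard_normal_cdf(-1.25)
print(f"Left of -1.25: {negative_p:.4f}")
print(f"1 - (left of 1.25): {1.0 - zoe_p:.4f}")
```

The last two printed values agree, which is why a table of positive z-scores is enough to answer questions about negative ones.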
How to Create a Z-Table
This section explains where the values in the z-table come from by walking through the process of creating one. Please don’t worry if you don’t understand this section. It’s not essential if you just want to know how to use a z-score table.
Finding the Probability Density Function
The process is very similar to the one behind the 68–95–99.7 rule, but adapted for creating a z-table. Probability density functions (PDFs) are important to understand if you want to know where the values in a z-table come from. A PDF specifies the probability of a random variable falling within a particular range of values, as opposed to taking on any one value. This probability is given by the integral of the variable’s PDF over that range. That is, it’s given by the area under the density function but above the horizontal axis, between the lowest and greatest values of the range.
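This definition might not make much sense, so let’s clear it up by graphing the probability density function for a normal distribution. The equation below is the probability density function for a normal distribution with mean μ and standard deviation σ.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$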
Let’s simplify it by assuming we have a mean (μ) of zero and a standard deviation (σ) of one (standard normal distribution).

$$f(x) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}}$$
This can be graphed in any language, but I chose to graph it using Python.
```python
# Import all libraries for this portion of the blog post
from scipy.integrate import quad
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Jupyter notebook magic; drop this line when running as a plain script
%matplotlib inline

# Evaluate the standard normal PDF on a grid from -4 to 4
x = np.linspace(-4, 4, num=100)
constant = 1.0 / np.sqrt(2 * np.pi)
pdf_normal_distribution = constant * np.exp((-x**2) / 2.0)

# Plot the density curve
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, pdf_normal_distribution)
ax.set_ylim(0)
ax.set_title('Normal Distribution', size=20)
ax.set_ylabel('Probability Density', size=20)
```
The graph above does not show you the probability of events but their probability density. To get the probability of an event within a given range, you need to integrate.
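To make the integration step concrete, the sketch below integrates the standard normal PDF over the ranges behind the 68–95–99.7 rule. To keep it free of third-party dependencies, it uses a hand-written composite Simpson's rule as a stand-in for a library integrator; the helper names `standard_normal_pdf` and `simpson` are my own and the number of subintervals is an arbitrary choice.

```python
import math

def standard_normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def simpson(f, a, b, n=1000):
    """Approximate the integral of f from a to b with composite Simpson's rule."""
    if n % 2:          # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Probability of landing within 1, 2 and 3 standard deviations of the mean
within = {k: simpson(standard_normal_pdf, -k, k) for k in (1, 2, 3)}
for k, area in within.items():
    print(f"P(-{k} < Z < {k}) = {area:.4f}")
```

The three areas come out at roughly 68, 95 and 99.7 percent, which is where the rule gets its name. Each one is a probability, while the height of the curve at any single point is only a density.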
Finding the Cumulative Distribution Function
Recall that the standard normal table entries are the area under the standard normal curve to the left of z (between negative infinity and z).
To find the area, you need to integrate. Integrating the PDF gives you the cumulative distribution function (CDF), which is a function that maps values to their percentile rank in a distribution. The values in the table are calculated using the cumulative distribution function of a standard normal distribution with a mean of zero and a standard deviation of one. This can be denoted with the equation below, where Φ(z) is the area to the left of z.

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}} \, dx$$
This is not an easy integral to calculate by hand, so I am going to use Python to calculate it. The code below calculates the probability for Zoe, who had a z-score of 1.25, and Mike, who had a z-score of 1.00.
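```python
from scipy.integrate import quad
import numpy as np

def normalProbabilityDensity(x):
    """PDF of the standard normal distribution."""
    constant = 1.0 / np.sqrt(2 * np.pi)
    return constant * np.exp((-x**2) / 2.0)

# Integrate the PDF from negative infinity up to each z-score
zoe_percentile, _ = quad(normalProbabilityDensity, -np.inf, 1.25)
mike_percentile, _ = quad(normalProbabilityDensity, -np.inf, 1.00)

print('Zoe: ', zoe_percentile)
print('Mike: ', mike_percentile)
```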
Repeating this calculation for every z-score on a grid, with the tenths down the left side and the hundredths across the top, produces a z-table.
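Here is one way that table-building step might look. This is a sketch rather than the exact code behind any published table. It evaluates the CDF with the closed form based on `math.erf` instead of calling a numerical integrator for every cell, and it assumes the conventional layout used in the lookups earlier in this post. The names `build_z_table` and `standard_normal_cdf`, and the cutoff of 3.4 for the last row, are my own choices.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def build_z_table(max_row=3.4):
    """Map each row label ('0.0', '0.1', ...) to ten entries,
    one per hundredths column (.00 through .09), rounded to 4 decimals."""
    table = {}
    n_rows = int(round(max_row * 10)) + 1
    for r in range(n_rows):
        label = f"{r / 10:.1f}"
        # Work in integer hundredths to avoid floating-point drift in z
        table[label] = [
            round(standard_normal_cdf((r * 10 + c) / 100.0), 4)
            for c in range(10)
        ]
    return table

z_table = build_z_table()

# Print the header and a few rows around the z-scores used in this post
print("z    " + "  ".join(f"{c / 100:.2f}  " for c in range(10)))
for label in ("1.0", "1.1", "1.2"):
    print(f"{label}  " + "  ".join(f"{p:.4f}" for p in z_table[label]))

# Look up Zoe (row 1.2, column 0.05) and Mike (row 1.0, column 0.00)
print("Zoe: ", z_table["1.2"][5])
print("Mike:", z_table["1.0"][0])
```

The two lookups at the end mirror the manual steps from earlier: pick the row by the first decimal place of the z-score, then the column by the second.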
One important point to emphasize is that calculating this table from scratch when needed is inefficient, so we usually resort to using a standard normal table from a textbook or online source.