“Why should I care about random sampling?”
Here’s why: If you’re a data scientist and want to develop models, you need data. And if you need data, someone needs to collect that data. And if someone is collecting data, they need to make sure that it isn’t biased or it will be extremely costly in the long run.
Therefore, if you want to collect unbiased data, then you need to know about random sampling.
4 Types of Random Sampling Techniques
- Simple random sampling.
- Stratified random sampling.
- Cluster random sampling.
- Systematic random sampling.
What Is Random Sampling?
Random sampling is a way of selecting a sample in which every element in a population has an equal chance of being chosen. Sounds simple, right? Well, it’s a lot easier said than done, because you have to work through a lot of logistics to minimize bias. These four types of random sampling techniques will help you do just that.
1. Simple Random Sampling
Simple random sampling uses randomly generated numbers to choose a sample. More specifically, it first requires a sampling frame, which is a list or database of all members of a population. You can then generate a random number for each element, using Excel for example, sort the elements by that number, and take the first n elements as your sample.
To give an example, imagine your sampling frame is a table listing every member of the population. Using software like Excel, you can assign each element a random rank from one to N, the size of the frame. If you need a sample size of three, then you would take the elements ranked one, two, and three.
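The same procedure can be sketched in Python instead of Excel. This is a minimal illustration, not a definitive implementation: the names in `frame` and the function `simple_random_sample` are hypothetical, and the seed is only there to make the run repeatable.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Give each element a random number, then keep the n smallest."""
    if not 0 <= n <= len(frame):
        raise ValueError("n must be between 0 and the size of the frame")
    rng = random.Random(seed)
    # Pair every element in the frame with its own random number in [0, 1).
    keyed = [(rng.random(), item) for item in frame]
    # Sorting by the random number puts the frame in a random order.
    keyed.sort(key=lambda pair: pair[0])
    # The first n elements of that random order form the sample.
    return [item for _, item in keyed[:n]]


# Hypothetical sampling frame of six people (made up for illustration).
frame = ["Ana", "Ben", "Cleo", "Dev", "Eli", "Fay"]
sample = simple_random_sample(frame, 3, seed=42)
print(sample)
```

In practice, Python's built-in `random.sample(frame, 3)` draws a sample without replacement in a single call; the longer version above just mirrors the spreadsheet steps.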
2. Stratified Random Sampling
Stratified random sampling involves dividing a population into groups with similar attributes and randomly sampling each group.
This method ensures that every segment of a population is represented in the sample, rather than leaving that to chance. To give an example, imagine a survey is conducted at a school to determine overall satisfaction. Here, stratified random sampling can give the students in each department equal representation.
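A rough sketch of the school survey might look like the following. The student records, department names, and the `stratified_sample` helper are all hypothetical, and the sketch assumes you want the same number of students from each department; sampling each department in proportion to its size is another common choice.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_stratum, seed=None):
    """Group records into strata, then randomly sample within each one."""
    rng = random.Random(seed)
    # Build the strata: records that share a key value go in the same group.
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Never ask for more members than the stratum actually has.
        count = min(per_stratum, len(members))
        sample.extend(rng.sample(members, count))
    return sample


# Hypothetical students as (name, department) pairs, with unequal group sizes.
departments = ["Arts"] * 6 + ["Science"] * 10 + ["Math"] * 4
students = [(f"student{i}", dept) for i, dept in enumerate(departments)]

survey = stratified_sample(students, key=lambda s: s[1], per_stratum=2, seed=0)
print(survey)
```

Even though the Science department is much larger here, each department contributes two students to the survey.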
3. Cluster Random Sampling
Cluster sampling starts by dividing a population into groups, or clusters. Unlike the groups in stratified sampling, which bring similar members together, each cluster should be representative of the larger population. Then, you randomly select entire clusters to sample.
For example, if a school had five different eighth-grade classes, cluster random sampling would mean randomly picking one class and using all of its students as the sample.
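The eighth-grade example could be sketched as follows. The class names, rosters, and the `cluster_sample` helper are made up for illustration; the point is only that the random choice happens at the level of whole classes, not individual students.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    # Sample cluster names, not individual members.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


# Hypothetical school with five eighth-grade classes of four students each.
classes = {
    f"8{letter}": [f"8{letter}-student{i}" for i in range(1, 5)]
    for letter in "ABCDE"
}

picked = cluster_sample(classes, 1, seed=1)
print(picked)
```

The result contains one class and all four of its students.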
4. Systematic Random Sampling
Systematic random sampling is a common technique in which you sample every kth element. For example, if you were conducting surveys at a mall, you might survey every 100th person who walks in.
If you have a sampling frame, then you would divide the size of the frame, N, by the desired sample size, n, to get the index number, k. You would then choose every kth element in the frame to create your sample.
Returning to the sampling frame from the first example, if we wanted a sample size of two this time, then we would take every third row in the frame.
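Here is a minimal sketch of that calculation, assuming a hypothetical six-row frame so that k = 6 / 2 = 3. The random starting row is an assumption added here, a common way to keep the first pick from always being the same row; the text above only specifies taking every kth element.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Take every kth element of the frame, where k = N // n."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    rng = random.Random(seed)
    k = len(frame) // n  # the interval between selected rows
    # Assumption: begin at a random row within the first interval.
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]


# Hypothetical six-row sampling frame (made up for illustration).
frame = ["Ana", "Ben", "Cleo", "Dev", "Eli", "Fay"]
pair = systematic_sample(frame, 2, seed=7)
print(pair)
```

Whatever the starting row, the two selected rows are always three positions apart.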
Random Sampling Explained
You should now have an understanding of what random sampling is and four common techniques for conducting it. Understanding these techniques helps you minimize bias in the data you collect and, in turn, build better models.