What Is Selection Effect, and How Can I Avoid It?

Selection effect is a pervasive threat to the validity of any marketing analysis, so analysts should be acutely aware of it to ensure they don’t overstate marketing impact.

This article offers a brief discussion about selection effect and how I try to combat this type of bias in my day-to-day work in marketing analytics.

This is by no means a definitive guide, however. The academic literature on selection effect covers the topic in far more depth.

What Is Selection Effect?

Selection effect is a type of bias introduced when a methodology, respondent sample or analysis is skewed toward a specific subset of a target population. As a result, its conclusions don’t reflect the actual target population as a whole.

Let’s dive into a few quick examples.

Example 1

You run an analysis of a search engine marketing (SEM) campaign. Your analysis measures the return on investment (ROI) of your paid search ads by tracking link click-throughs to purchase. However, the analysis does not account for those “link clickers” who would have purchased your product anyway. Ignoring selection effect here means that your analysis gives undue credit to your SEM ads and overstates the ROI.
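To make the overstatement concrete, here is a minimal arithmetic sketch. Every figure in it (ad spend, number of purchasers, revenue per purchase and the share of clickers who would have bought anyway) is a hypothetical assumption chosen for illustration, not data from a real campaign.

```python
def roi(revenue: float, cost: float) -> float:
    """Return on investment, expressed as a fraction of spend."""
    return (revenue - cost) / cost


# Hypothetical campaign figures, invented purely for illustration.
ad_spend = 10_000.0
purchasers_who_clicked = 500
revenue_per_purchase = 50.0

# Assumed share of clickers who would have purchased without the ad.
# In practice this has to be estimated, for example from a holdout group.
baseline_share = 0.5

# Naive view: every purchase that followed a click is credited to the ad.
naive_revenue = purchasers_who_clicked * revenue_per_purchase

# Incremental view: only purchases the ad actually caused are credited.
incremental_revenue = naive_revenue * (1 - baseline_share)

naive_roi = roi(naive_revenue, ad_spend)
incremental_roi = roi(incremental_revenue, ad_spend)

print(f"Naive ROI:       {naive_roi:.0%}")
print(f"Incremental ROI: {incremental_roi:.0%}")
```

Under these assumed numbers, the naive ROI comes out at 150% while the incremental ROI is 25%. The gap is entirely due to crediting the ads with purchases that would have happened anyway.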

Example 2

You test overall brand awareness of your health food products and decide to collect data via in-person interviews at gyms and health stores. In this example, the data is biased because your methodology targets people who frequent health-related venues and are, therefore, likely predisposed to have knowledge of health food products. This error will likely lead the test to overstate the overall brand awareness of your products.
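A small simulation can show how much the choice of venue might skew the result. This is only a sketch: the population size, the share of gym and health-store regulars and the awareness rates for each group are all assumptions made up for the illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20% are gym or health-store regulars.
# Assumed awareness rates: 70% among regulars, 20% among everyone else.
population = []
for _ in range(100_000):
    is_regular = rng.random() < 0.20
    is_aware = rng.random() < (0.70 if is_regular else 0.20)
    population.append((is_regular, is_aware))


def awareness(sample):
    """Share of people in the sample who know the brand."""
    return sum(is_aware for _, is_aware in sample) / len(sample)


true_rate = awareness(population)

# Biased design: interview only people found at health-related venues.
regulars = [person for person in population if person[0]]
venue_rate = awareness(rng.sample(regulars, 1_000))

# Unbiased design: interview a simple random sample of the population.
random_rate = awareness(rng.sample(population, 1_000))

print(f"True awareness:           {true_rate:.1%}")
print(f"Venue-based estimate:     {venue_rate:.1%}")
print(f"Random-sample estimate:   {random_rate:.1%}")
```

With these assumptions, the venue-based estimate lands far above the true awareness rate, while the random sample stays close to it.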

With a very small leap, example one shows how easily any attribution algorithm could give undue credit to SEM ads when you’ve ignored selection effect. Likewise, example two highlights the dangers of failing to carefully interrogate experiments for possible biases.

Both examples make it easy to imagine how ignoring selection effect produces error-riddled findings, which in turn lead to a spiral of poor investment recommendations and buckets of wasted marketing resources. No one wants that, of course.

Ways to Minimize Selection Effect

Selection effect is an always-present challenge in marketing analytics. This obstacle is partly due to the nature of the work and partly due to organizational biases that favor cherry-picking analysis techniques, fast-tracked experimentation and positive results.

With that in mind, here are a number of ways I try to minimize selection effect in my own practice:

Randomized controlled trials (RCTs): RCTs are my gold standard for experimentation and for measuring the incremental impact of marketing activities. Good experimentation is at the heart of minimizing selection effect, and running RCTs within a target population is one of the best ways of getting representative results. RCTs aren’t always possible in marketing because some media strategies are complex and impressions can’t always be controlled. That said, I always start with RCTs as a best practice.
Validating findings across multiple experiments: As long as the experiments are well designed, validating findings across several of them is an excellent way to build confidence in a specific piece of evidence and minimize unexpected selection effects.
Documenting measurement design, goals and analysis type before starting: Defining the measurement design and analysis technique ahead of time helps minimize any selection effect introduced by the analysis type or segmentation. Selection effect can creep in at different stages of the analysis process, so it’s important to be diligent throughout.
Standardized templates, documented audience definitions and formal reporting processes: In addition to defining measurement design ahead of time, using standardized templates and reporting processes helps minimize bias throughout the analysis. Consistent methods, formats and audience definitions limit the analyst’s ability to introduce selection effect, whether by segmenting the audience or by presenting results in a way that highlights an outcome from only a subset of the target population.
Randomized variability of the media mix: Randomized variability is the practice of deliberately introducing significant, random variation in the number of impressions delivered via media channels in a given time frame. This method applies when RCTs aren’t an option because of a complex media mix, so you need to model marketing impact instead. Random, high-variability media delivery acts as an experimental lever: you manipulate the independent variable (e.g., impressions) and assess the impact on the dependent variable (e.g., purchases). This deliberate randomization reduces selection effect by inserting an element of randomized control (from the RCT playbook) into the campaign even when the overall experiment isn’t controllable.
Peer reviews: Peer reviews are another way of checking the validity of evidence. We can often get so caught up in our own work that it takes an outside opinion to notice unchecked selection effects. Peer reviews of measurement plans and of analysis findings help limit any unintentional selection effects.
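To illustrate why randomization sits at the top of the list above, here is a minimal simulation sketch comparing a naive exposed-versus-unexposed comparison with a randomized one. The customer model is an assumption made up for the example: the share of high-intent customers, their base purchase rates, the exposure probabilities and the `TRUE_LIFT` value are all hypothetical.

```python
import random

TRUE_LIFT = 0.02  # assumed true effect of ad exposure on purchase probability


def estimated_lift(n: int, randomized: bool, rng: random.Random) -> float:
    """Difference in purchase rate between exposed and unexposed customers."""
    buys = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    for _ in range(n):
        # Assumed: 30% of customers are high intent and buy far more often.
        high_intent = rng.random() < 0.30
        base_rate = 0.20 if high_intent else 0.02
        if randomized:
            # RCT: a coin flip decides exposure, independent of intent.
            exposed = rng.random() < 0.50
        else:
            # Observational: high-intent customers see the ad more often.
            exposed = rng.random() < (0.80 if high_intent else 0.20)
        purchase_prob = base_rate + (TRUE_LIFT if exposed else 0.0)
        counts[exposed] += 1
        buys[exposed] += rng.random() < purchase_prob
    return buys[True] / counts[True] - buys[False] / counts[False]


rng = random.Random(7)  # fixed seed so the sketch is repeatable
observational = estimated_lift(200_000, randomized=False, rng=rng)
rct = estimated_lift(200_000, randomized=True, rng=rng)

print(f"True lift:              {TRUE_LIFT:.3f}")
print(f"Observational estimate: {observational:.3f}")
print(f"RCT estimate:           {rct:.3f}")
```

In this setup, the observational comparison mixes the ad's effect with the higher purchase intent of the people who happened to see it, so its estimate is several times larger than the true lift. The randomized comparison breaks the link between intent and exposure, and its estimate stays close to the assumed true value.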

Bias Is Like Death and Taxes

At the end of the day, bias is ever-present, and selection effect is no different. It’s a fact that anything and everything created by humans is biased in one way or another. The best we can do is to be aware of different biases and implement measures that limit these as much as possible.

Selection effect is particularly relevant for those of us in marketing analytics and, as a result, should be high up on our list of biases to track and minimize. In my mind, the best way to limit the possibility of selection effect at all stages in the analysis workflow is via a combination of RCTs, standardized processes and validated findings.