What Are Feature Flags? When Should Devs Use Them?
Feature flags are a popular tool in software development and DevOps. Their basic functionality is pretty simple: they are essentially if-else statements that check an external configuration before deciding which section of code to execute next. Small companies without development teams, or with very few developers, may find feature flags useful for making changes in the application without waiting for engineering.
Zoltán Dávid, founder of ConfigCat, a consulting company that specializes in feature flags, sees this type of use case often. One of his customers, a package delivery company, began using feature flags to mitigate volume spikes during peak shopping seasons.
“Around Christmas, they have a really high volume of requests to deliver packages all around the world,” Dávid said. “Their most important goal is to provide a quality service, so what they do not want is to get too many requests once they are overloaded already.”
The company set up a feature flag so that someone from the company could flip a switch and immediately prevent users from placing additional requests when orders are backed up. The company’s code is configured to make calls to ConfigCat’s API, which in normal circumstances returns a go-ahead message.
“If they are overloaded, they just hide the delivery request buttons and delivery request forms,” Dávid said. “What happens under the hood is that this application just regularly asks ConfigCat, ‘Hey, should I display the pattern?’”
During occasional peak times, an employee from the company can flip a switch, which makes ConfigCat’s API return, for that particular feature flag, a message instructing the application not to allow new orders.
“ConfigCat says, ‘Please display, please display, please display’ — but in the next minute it will say, ‘OK, it’s disabled, you must not display,’” Dávid said. “And then it’s the responsibility of the app to not display.”
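The polling behavior Dávid describes can be sketched roughly as follows. This is a minimal illustration, not ConfigCat's SDK: the `PolledFlag` class, the `fetch` callable that stands in for the call to the flag service, and the 60-second interval are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class PolledFlag:
    """Caches one boolean flag and refreshes it at most once per interval."""

    def __init__(self, fetch: Callable[[], bool], interval: float = 60.0,
                 default: bool = True,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch        # hypothetical call to the flag service
        self._interval = interval  # seconds between polls (assumed value)
        self._value = default      # used until the first successful fetch
        self._clock = clock
        self._last_poll: Optional[float] = None

    def is_enabled(self) -> bool:
        now = self._clock()
        if self._last_poll is None or now - self._last_poll >= self._interval:
            self._last_poll = now
            try:
                self._value = bool(self._fetch())
            except Exception:
                pass  # service unreachable: keep the last known value
        return self._value


def render_order_page(accept_orders: PolledFlag) -> str:
    # The app, not the flag service, is responsible for hiding the form.
    if accept_orders.is_enabled():
        return "<form>delivery request form</form>"
    return "<p>We are not accepting new delivery requests right now.</p>"
```

Keeping the last known value when the service cannot be reached is one possible design choice; an application could just as reasonably fall back to a fixed default.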
The “switch” that customers use to manage their feature flags is not a physical switch — usually it’s a toggle on a dashboard.
Companies can build feature flag capabilities themselves, or use a service like ConfigCat to do it.
“We provide them this dashboard where they can toggle the feature flags,” Dávid said. “And we provide them SDKs or libraries, software development kits, that help them build feature flags into their source code or their applications.”
Feature Flags Allow for Manual Intervention
Feature flags are built around if-else statements in code, but unlike normal if-else statements, feature flags essentially cede control of run-time code decisions to an external source. This allows companies to make speedy adjustments to their applications without needing to go through engineering.
By using a feature flag instead of hard-coded business logic to decide when to stop accepting new orders, ConfigCat’s delivery customer keeps the flexibility to base that decision on different factors under different circumstances.
“If you do the hide-unhide stuff in the business logic in the code, then there are no people involved in the decision each time. The decision is made during application development time,” Dávid said. “But if you want to include the business issues each time a feature is hidden or displayed, that needs a feature flag.”
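The distinction Dávid draws can be illustrated with two small functions. Both are hypothetical: the threshold, the flag name `accept_new_orders` and the JSON document format are assumptions for the sake of the sketch, not details from the delivery company's system.

```python
import json

MAX_PENDING_ORDERS = 10_000  # illustrative threshold, fixed when the code ships


def accepts_orders_hard_coded(pending_orders: int) -> bool:
    # Decision made at development time: nobody is consulted at run time.
    return pending_orders < MAX_PENDING_ORDERS


def accepts_orders_flagged(flag_document: str) -> bool:
    # Decision made by whoever last set the flag; the code only reads it.
    flags = json.loads(flag_document)
    return bool(flags.get("accept_new_orders", True))
```

In the first function the rule can only change with a new release. In the second, a person can change the outcome by editing the flag document, for whatever business reason applies that day.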
Feature Flags Enable Canary Releases
Feature flags can also help companies do phased rollouts of releases.
“Instead of doing your Big Bang release where all your users will get this feature at the same time, you do a phased rollout,” Dávid said. “Maybe you just release it to 1 percent of your global user base.”
This type of release is also called a “canary release,” named after the historical practice of using canaries in coal mines to give advance warning about dangerous fumes. In the case of software development, companies release to a smaller group of users to gauge their reactions before doing a wide release.
“The canaries are your low-risk users,” Dávid said. “Then you keep an eye on your analytics and on Twitter and see if those users give you good feedback or bad. This way, you will see if there is anything in your new feature that you want to fine-tune before releasing it to everyone else.”
Some companies do phased rollouts geographically, such as one of ConfigCat’s customers, which uses a smaller pool of users from Turkey as the canaries for all users in Europe. The process shares similarities with A/B testing, with the difference being what each process is trying to accomplish.
“If you do A/B testing, then your goal is usually to optimize a feature you already have,” Dávid said. “If you do an A/B test, then you must have proper analytics set up in addition to the business logic that decides whether the user sees version A or version B. And those analytics need to be accessible by those decision-makers in your company who have the authority to say, ‘OK, we are just going to throw away this button color and go with the new one.’”
Canary releases, on the other hand, are about making sure the rollout process for new features is as smooth as possible.
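One common way to implement the percentage and geographic rollouts described in this section is to hash each user into a stable bucket. The sketch below assumes that approach; the function names, the flag name and the use of country codes are illustrative, and a real flag service may choose users differently.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_rollout(flag_name: str, user_id: str, percent: int,
               country: str = "",
               canary_countries: frozenset = frozenset()) -> bool:
    # Users in a canary country always get the feature.
    if country in canary_countries:
        return True
    # Everyone else gets it only if their bucket falls below the percentage.
    return rollout_bucket(flag_name, user_id) < percent
```

Because the bucket depends only on the flag name and the user ID, a user who sees the feature at 1 percent keeps seeing it as the percentage is raised, for example with `in_rollout("new_checkout", user_id, 1)` followed later by a call with `10`.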
Feature Flags Can Control the Timing of Rollouts — or Rollbacks
Companies can also use feature flags to coordinate the timing of releases across a variety of platforms.
“This is really useful for mobile developers and mobile apps,” Dávid said. “If you are a company that’s got a mobile application, you haven’t really any control over when your users installed that app.”
Unlike web applications, where users automatically get the latest version each time they visit the website, updates for mobile applications depend on review processes at various app stores and whether users have actually downloaded the latest version.
“If you want most of your users to be able to see your new functionality almost at the same time, then you can put the functionality into the new version and distribute it to all their other mobile phones,” Dávid said. “And once you’ve seen in your analytics that almost all users have this new version, now you just flick a flag in your headquarters and then all the users see the new functionality, but it was already in their applications.”
The same concept works the other way around too — companies can use feature flags to easily remove problematic production releases. Mate Rakic, director of engineering at marketing software company Drift, said his team uses this strategy for time-sensitive rollbacks.
“If we detect that crash rate started going up on a certain version, rollback is as easy as just disabling that feature flag, versus submitting a new application for review,” Rakic said.
This rollback functionality isn’t only used for mobile applications. One of ConfigCat’s customers found feature flags useful for mitigating software update issues to its internet-connected farming equipment.
“We have a customer who builds machinery that works in the fields,” Dávid said. “They want to track their machines, and those machines are always online. They sell these machines, and then farmers buy those machines and use it on their farms.”
The company used to have issues with software updates, where some machines would have problems with the update and start to malfunction. The solution was to create feature flags that would allow the company to roll back changes to those machines.
“Today, whenever someone at the support desk receives a call from an angry farmer, that something went wrong with the machine, they just flip the toggle back,” Dávid said. “The feature flag value is set to false, which means, ‘Please run in the old mode that’s still loaded in your memory.’”
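Rollback by flag works because the old and new code paths ship together and the flag only selects between them. A minimal sketch of that idea follows; the flag name `use_new_mode` and the `make_controller` helper are hypothetical, and the shared `flags` dictionary stands in for whatever mechanism delivers flag values to the device.

```python
from typing import Callable, Dict


def make_controller(flags: Dict[str, bool],
                    new_mode: Callable[[float], float],
                    old_mode: Callable[[float], float]) -> Callable[[float], float]:
    """Return a function that picks the code path on every call."""
    def run(reading: float) -> float:
        # Both implementations ship in the same build; the flag selects one.
        if flags.get("use_new_mode", False):
            return new_mode(reading)
        return old_mode(reading)
    return run
```

Because the flag is read on every call, setting `flags["use_new_mode"] = False` takes effect on the next call without a new release or app store review. Defaulting to the old mode when the flag is missing is an assumption of this sketch.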
Too Many Feature Flags Can Be a Red Flag
Feature flags are not without their downsides, most of which have to do with having too many flags in the code.
“If [companies] use too many, that’s a problem for them,” Dávid said. “And there are multiple reasons for this.”
“One potential downside of a feature flag is anyone can flip a feature flag, which might lead to production outages,” said Rakic from Drift. “And other teams might not realize what just happened, because a feature flag changing is not as visible as engineers releasing something to production servers.”
Someone accidentally bumping a feature flag in production can lead to problems for customers, and could take time to track down. Another problem has to do with testing.
“If the software contains too many feature flags, it means the people doing the testing have a really hard time testing that software thoroughly,” Dávid said. “It makes your software easily incredibly complex.”
With each feature flag introduced in an application, the total number of paths to test in the code doubles.
“If you have two flags, then it’s four combinations, if it’s three flags then it’s eight combinations — it’s exponential,” Dávid said. “And if the software has just 10 flags at the same time, that means 1,024 combinations. Maybe people working in QA will tell you that they have tested it, but it’s highly unlikely.”
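The arithmetic in that quote is 2 to the power of the number of flags, assuming each flag is a simple on/off switch and the flags are independent. A short sketch that enumerates the combinations makes the growth concrete; the helper name and the flag names are made up for the example.

```python
from itertools import product


def flag_combinations(flag_names):
    """Yield every on/off assignment for the given flags as a dict."""
    for values in product([False, True], repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


for n in (2, 3, 10):
    names = [f"flag_{i}" for i in range(n)]
    count = sum(1 for _ in flag_combinations(names))
    print(n, "flags ->", count, "combinations")
# prints:
# 2 flags -> 4 combinations
# 3 flags -> 8 combinations
# 10 flags -> 1024 combinations
```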
The solution to both of these problems is pruning flags once they’ve served their purpose.
“Engineering teams need to regularly clean up their code based on feature flags,” Dávid said. “If you use feature flags only for temporary phased rollouts or canary releases ... once you are confident that you want to release the feature to your global user base and never want to roll it back, you tell your engineering team, ‘Hey, we will never roll this feature back, please delete the feature flags from all over the code.’”
“I would say the most important thing is to think about feature flags in a similar way as thinking about features,” Rakic said. “Some features get killed over time if they’re not useful to our customers or we decide to do something different. And the same approach is important for feature flags — as the features are evolving, or being removed, or there for temporary use, to actually go back and do clean-up.”
Feature Flags Can Be Used for Access Control — but Be Careful
Dávid also cautioned against using feature flags to control user access in an application. He gave the example of a SaaS company that offers tiered services using feature flags to control the services different tiers of users receive.
“But this would end up having too many flags in your software, which is a bad idea,” Dávid said. “Feature flagging sounds like a good way to do this, but at the end of the day it is not. There is subscription management software out there that basically allows developers to achieve the same result.”
Not everyone agrees on that front. Rakic said that, for Drift, a relatively young company, the benefits of using feature flags for access control can be substantial.
“Feature flags have been in the DNA of Drift pretty much from day one,” Rakic said. “It became a part of our core business logic. We use feature flags to determine permissions, or for access control on the app level.”
As a startup, Drift’s product is constantly evolving, and it made sense to manage access using feature flags rather than hard-coding rules into the application.
“We had that situation at one point in time,” Rakic said about hard-coding access control. “Then the product started evolving, and it became obvious that anytime we had to introduce a new role or a new user type we actually had to go through the whole app and modify it.”
Drift subsequently changed to using feature flags, where introducing new roles in the application only required associating feature flags to the role.
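The article does not describe Drift's implementation, but the general idea of associating feature flags with roles can be sketched as a lookup table. The role names, flag names and the `can_use` helper below are all hypothetical.

```python
ROLE_FLAGS = {
    # Hypothetical roles and flag names, for illustration only.
    "viewer": {"view_reports"},
    "agent": {"view_reports", "reply_to_chats"},
    "admin": {"view_reports", "reply_to_chats", "manage_billing"},
}


def can_use(role: str, flag: str, role_flags=ROLE_FLAGS) -> bool:
    # Unknown roles get no flags, so access fails closed.
    return flag in role_flags.get(role, set())


# Adding a role is a data change; no call site of can_use() needs editing.
ROLE_FLAGS["billing_manager"] = {"view_reports", "manage_billing"}
```

With this shape, the application checks a flag such as `can_use(user_role, "manage_billing")` at each point of use, and introducing a new role only means adding one entry to the table.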
Rakic said regularly pruning unnecessary feature flags and writing good unit tests help prevent issues caused by having a lot of feature flags.
“End-to-end tests will definitely take like a lot of time to run, so what we actually do is rely on a lot of unit tests,” Rakic said. “It’s not the business logic that determines what should be shown and what might be wrong, it’s probably just something on the UI side that is not being properly rendered.”
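A unit test for flag-dependent UI logic might look like the sketch below, which checks one small function under each flag state instead of running the whole application. The `visible_buttons` helper and the flag name are assumptions for the example, not Drift's code.

```python
import unittest


def visible_buttons(flags: dict) -> list:
    # Hypothetical UI helper: decides which buttons to render from flag values.
    buttons = ["help"]
    if flags.get("accept_new_orders", True):
        buttons.append("request_delivery")
    return buttons


class VisibleButtonsTest(unittest.TestCase):
    def test_flag_on_shows_request_button(self):
        self.assertIn("request_delivery",
                      visible_buttons({"accept_new_orders": True}))

    def test_flag_off_hides_request_button(self):
        self.assertNotIn("request_delivery",
                         visible_buttons({"accept_new_orders": False}))

    def test_missing_flag_falls_back_to_default(self):
        self.assertIn("request_delivery", visible_buttons({}))
```

Saved in a file, the tests can be run with `python -m unittest`. Each test passes the flag values in directly, so no flag service needs to be reachable while the tests run.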