Soon, you may hear the song streaming in your earbuds fade, and a stranger’s silvery voice start talking to you about mayonnaise: “Want to find out how to take your grilled cheese from cheesy to yes please-y?” the voice asks. “Say ‘yes’ at the tone.”

If you say no, the ad politely directs you back to your music. Reply by saying yes, however, and it goes on to share a “super tasty hack,” just like you’re two friends swapping snacking tips.

This is the future of audio advertising, where ads are designed for you to talk back.

Music-streaming platforms have been experimenting with this new approach for months. Last year, Spotify ran ads asking users to utter verbal commands to play a podcast it was promoting. Pandora, too, has recently been serving some of its users voice-enabled ads from brands like Doritos, Ashley HomeStores and, yes, Hellmann’s mayonnaise. They use voice recognition software to analyze vocal feedback and trigger a follow-up message — or a sequence of ads in some cases — based on the user’s response.

While this new ad format is still in its infancy, positive early feedback may bring forth the day when the average listener will hear from advertisers looking to have a little chat.
 


How interactive voice ads work

Pandora has been testing these ads on some of its mobile users who use Voice Mode — a smart voice assistant, introduced last summer, that lets listeners navigate the app with verbal commands (“Hey Pandora, start my workout playlist,” etc.). The ads facilitate verbal interactions between the user and the brand through vocal calls-to-action. The idea is that they fit neatly into Voice Mode’s screenless, voice-first experience.

The handful of interactive ads I recently listened to all begin roughly the same way. A voice actor name-drops the brand, then informs the listener that “this is a new kind of ad, one you can talk to.” The voice then teases some information that’s unlocked only through a verbal “yes”: a recommendation, a joke, anything to get the listener to answer out loud in the affirmative.

In many cases, a “yes” from the listener will prompt brands to re-target them with follow-up ads, which may steer them toward a firmer call to action. Saying “no” will alert brands that the user’s not interested. Right now, the ads only encourage listeners to reply with a simple “yes” or “no.” But the technology that powers them is capable of much more.
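To make the yes/no flow concrete, here is a minimal Python sketch of how a reply might be routed. It is an illustration only: the word lists, function names and action labels are hypothetical, and the keyword matching stands in for the natural language understanding the vendors describe, not for Pandora's or Instreamatic's actual software.

```python
import re

# Hypothetical word lists; a real system would use trained language models.
YES_WORDS = {"yes", "yeah", "yep", "sure"}
NO_WORDS = {"no", "nope", "nah"}


def classify_reply(transcript: str) -> str:
    """Map a transcribed reply to 'yes', 'no', or 'unknown'."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if any(word in YES_WORDS for word in words):
        return "yes"
    if any(word in NO_WORDS for word in words):
        return "no"
    return "unknown"


def next_step(transcript: str) -> tuple[str, bool]:
    """Return (action, retarget) for a listener's reply.

    A 'yes' plays the follow-up message and flags the listener for
    follow-up ads; anything else sends the listener back to the music.
    """
    if classify_reply(transcript) == "yes":
        return ("play_follow_up", True)
    return ("resume_music", False)


if __name__ == "__main__":
    for reply in ["Yes!", "No thanks, not today", "I like flying cows"]:
        print(reply, "->", classify_reply(reply), next_step(reply))
```

In this sketch, a reply that is neither a yes nor a no is treated like a no and the music resumes, which is one plausible choice among several.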


Stas Tushinskiy, chief executive officer of San Francisco-based Instreamatic, a startup specializing in AI-driven “dialogue advertising” that’s licensed its tech to Pandora, told Built In that its artificial intelligence can understand just about anything a user says.

According to Tushinskiy, the technology relies on the company’s proprietary natural language understanding. Its AI, similar to Amazon’s Alexa, gets better every day at working out what people mean from both what they say and how they say it. He said it’s even capable of holding back-and-forth conversations.

“If what you say makes sense, [the AI] will get it,” Tushinskiy said, before adding the caveat, “If you respond with something like ‘I like flying cows’ it will not get you, because it just doesn’t make sense within the ad context.”

He says Instreamatic’s AI is also able to measure the granularity of the user’s response. One listener could reply to an ad by saying “No thanks, not today,” for example, while another listener might offer an expletive-laced negative response. According to Tushinskiy, the AI can spot and sort the difference. This data may be useful for brands deciding when, or if, to re-target listeners with subsequent ads.

If interactive voice ads are capable of intelligent conversations, why do they only ask a yes/no question? The reason, according to Claire Fanning, Pandora’s vice president of ad innovation strategy, is to get listeners used to the new ad tech.

“It’s simple on purpose, because there’s such a learning curve that both brands and consumers are going through,” Fanning told Built In. “So we’re very upfront about a first-time user experience knowing there are instructions that are necessary.”

The usage of voice assistants like Alexa and Siri may be on the rise, but people still need to learn how to talk back to ads. The dynamic is brand new. But once people get familiar with it, it will likely feel more normal, at which point brands might consider making the ads more complex.
 


Solving the ad attribution problem

Why introduce interactive voice ads at all?

It’s certainly better for audio-streaming platforms, like Pandora and Spotify, who can charge advertisers higher rates for these ads than the non-interactive variety.

And in theory, it’s a better experience for listeners, who might prefer voice over touch and value having a direct line of communication to advertisers. (A vegan can tell Hellmann’s that their egg-based mayo doesn’t interest them, for example.)

The thought is that this format is better for the advertisers too. Many advertisers have struggled to find a tidy way to measure the effectiveness of digital radio ads. Historically, that’s involved tracking click-through rates of display ads that appear on screen while audio ads play, and using third-party research to tease out the return on ad spend.

Audio ad spending has a lot of room to grow, and it hasn’t caught up with the boom in audio consumption. “The audio advertising market is so small in some markets,” Tushinskiy said. “And it's all because of this [engagement] measurement challenge.”


Erik Barraud, senior vice president of product management at the audio advertising tech company AdsWizz, also sees the new interactive format solving other longstanding problems for users and advertisers.

“I think we can create very much a win-win model where you can create relevant, non-disruptive ad experiences for users so they are willing to engage with it,” he told Built In. “Hence higher value for advertisers, [who] are willing to pay a higher price to engage with those audiences.”

Interactive voice ads introduce shiny new metrics. A “say-through rate,” as Pandora calls it, can capture verbal engagement, both positive and negative, which excites advertisers who want a fresh way to reach consumers — and measure the effectiveness of that reach.
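The article's sources don't spell out how the say-through rate is calculated. One plausible reading, by analogy with click-through rate, is the share of ad plays that draw any spoken reply, positive or negative. The Python sketch below works under that assumption; the function name and the sample numbers are invented for illustration and are not Pandora data.

```python
def say_through_rate(yes_replies: int, no_replies: int, impressions: int) -> float:
    """Share of ad plays that drew a spoken reply, positive or negative.

    Assumed definition, modeled on click-through rate; not a published formula.
    """
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return (yes_replies + no_replies) / impressions


if __name__ == "__main__":
    # Made-up example: 1,000 ad plays, 120 listeners say yes, 80 say no.
    rate = say_through_rate(yes_replies=120, no_replies=80, impressions=1000)
    print(f"Say-through rate: {rate:.1%}")
```

Counting the no replies alongside the yes replies reflects the point that negative responses are also a form of engagement that brands can learn from.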

Tushinskiy said that being able to connect with interested users in a novel way is obviously attractive to advertisers. But even more intriguing, he says, is the ability to learn about the non-interested users, the ones who give a negative verbal response.

“This is the very first time, even in digital marketing history, when brands can learn about 99 percent of people who don’t click on traditional digital advertising,” he said. “If an ad is irrelevant, there is no way for people to tell the brand ‘I don’t want to hear or see this ever again because I already got a car.’ Now brands can actually hear that.”

The voice engagement metric is so new, though, that it’s impossible to look at historical data for benchmarks. These early tests will define what’s good. “We still need to discover, we still need to fine-tune,” Barraud said of determining baseline engagement rates for voice ads.

Adweek reported a case study from 2019 in which the car company Infiniti — partnering with Instreamatic — saw a 19 percent verbal engagement rate on its interactive voice ads, which is significantly higher than what radio streaming display ads typically receive.
 


Meeting listeners where they are

A lot of the time, when someone is listening to music or a podcast on their phone, the app window is out of view; the listener is busy doing something else. So, any advertisement that asks listeners to tap a button to learn more is fighting against friction.

That’s why AdsWizz, which Pandora acquired in 2018, began experimenting with screenless engagement, introducing a feature called Shakeme. If a user hears an ad and wants to answer its call to action, all they have to do is shake the phone; no screen-tapping required. The company wanted ads to fit seamlessly into the existing context of listeners, to be unobtrusive but still engaging.

Barraud sees interactive voice ads as a natural extension — an evolution — of the experience that Shakeme was getting at.

“I am a big believer that, in audio, we actually can create an additive ad experience,” he said, “because the audio ad doesn’t come on top of your content. It wraps around it.”

Pandora’s Fanning agrees that interactive ads have the ability to create a value-add for listeners. “With voice ads, I think we’re operating in a world where we can take friction away and create a world where if someone wants to hear more, we can facilitate that,” she said. “And I think for smart brands, they’re thinking about utility in the truest sense of the word.”

 

Is voice the new norm?

Roughly one-third of everyone in the United States uses a voice assistant of some kind. The country’s voice-based ad market is expected to grow along with that usage, to the tune of $19 billion by 2022, according to TechCrunch.

Pandora declined to disclose how many of its 66 million monthly active users have Voice Mode turned on and are thus eligible to engage with the new interactive ads. So far, though, Fanning reports that they are seeing “really high” engagement, which bodes well for the future of the format.

“I think as we get results back and start to understand best practices and what brands can expect from consumer engagement, [the ads] will evolve and become a more sophisticated conversation,” she said. “But right now we're having a lot of fun with it.”
