The most common way to create a deepfake is to take a video and replace a person’s face with that of a celebrity or other well-known person. The underlying face-manipulation techniques have been around since the 1990s, but it’s only in recent years that the technology has become widely accessible and popular.
How to Spot a Deepfake
With deepfake generation tools so widely available, it’s important that everyone has a basic understanding of how to spot a deepfake. In fact, companies including Google, Amazon and Meta have been actively encouraging the community to analyze and understand what gives a deepfake away. This research, along with work by others, has identified a variety of ways to detect a deepfake:
- Unnatural face or environment: Deepfake images or sections of videos can have unnatural facial expressions or facial feature placement, or the environment itself (such as the lighting) can be unrealistic.
- Unnatural behavior: Deepfake videos need continuity from one frame to the next, but this is difficult to achieve. As a result, you might spot unnatural behaviors such as uneven blinking or choppy motion.
- Image artifacts: Deepfake images may have weird artifacts such as blurriness around the neck where the body of one person is stitched together with the face of another.
- Audio: When deepfakes are combined with audio, the lip movements may not match what you would expect from the audio.
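The blurriness cue from the list above can be turned into a rough number. The sketch below uses the variance of the Laplacian, a common sharpness measure, on a small grayscale patch: blurred regions give a lower score than sharp ones. The function names and the synthetic checkerboard patch are assumptions made for illustration. A low score on its own does not prove an image is a deepfake, so treat this as one weak signal rather than a detector.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a list of rows of pixel intensities. Higher values mean
    sharper edges; lower values mean a smoother (blurrier) patch.
    """
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def box_blur(gray):
    """3x3 mean filter; coordinates outside the image are clamped to the edge."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            row.append(sum(window) / 9)
        out.append(row)
    return out


# Synthetic stand-in for an image patch: a checkerboard of 2x2 blocks.
sharp_patch = [[255 if (x // 2 + y // 2) % 2 == 0 else 0 for x in range(16)]
               for y in range(16)]
# Simulate the smoothing that can appear where two faces are stitched together.
blurred_patch = box_blur(sharp_patch)

sharp_score = laplacian_variance(sharp_patch)
blurred_score = laplacian_variance(blurred_patch)
print("sharp patch score:  ", round(sharp_score))
print("blurred patch score:", round(blurred_score))
```

In practice you would compare the score of a suspicious region, such as the area around the neck, against the rest of the same frame, since absolute values depend on resolution and compression.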
How Does a Deepfake Work?
Deepfakes use advanced deep learning techniques to first encode features, then reconstruct images from the encoded features. Autoencoders, a type of neural network, are the most commonly used deep learning architecture for creating deepfakes. Let’s explore this through an example of using the deepfake technique to swap faces of two people. This is shown in detail in the figure below.
The first step in producing a deepfake is to transform the face images into smaller, feature-based representations using an encoder. This compact representation is often referred to as the latent face. The latent face contains representations of features such as nose shape, skin tone and eye color. We use the same encoder for each person so that the representations produced have the same meaning.
We then transform the latent face back into an image using a decoder. The output face depends on which images were used to train the decoder. The key part for face swapping is that the decoder for person A is applied to the latent face of person B, and vice versa. In this way, the output face will have the expression and structure of person B, but the style and look of person A.
This figure shows the deepfake process through training (top two rows) and the creation of a deepfake (bottom row). In the training phase, the encoder (blue) is shared between different images to create the latent faces (middle portraits) from the input images (left portraits).
Individual decoders are trained for each person (green and pink). In the creation phase, the same encoder is used, but with the decoder of a different person. The result is an image that has the pose and expression of the input image but the style of the person whose images were used to train the decoder.
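The wiring described above (one shared encoder, one decoder per person, and a decoder swap at creation time) can be sketched in a few lines. This is a minimal toy, not how production tools are built: it assumes tiny linear layers, four-number vectors standing in for face images, and plain gradient descent, and every name in it is hypothetical. Real systems use deep convolutional networks trained on thousands of images, and this toy data is too simple to produce a convincing swap. It is only meant to show which component is trained on what.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Decoder:
    """Linear decoder with a bias: latent face -> reconstructed face."""

    def __init__(self, n_out, n_latent, rng):
        self.w = rand_matrix(n_out, n_latent, rng)
        self.b = [0.0] * n_out

    def __call__(self, latent):
        return [y + b for y, b in zip(matvec(self.w, latent), self.b)]


def train_step(encoder, decoder, face, lr):
    """One gradient step on the loss 0.5 * ||decoder(encoder(face)) - face||^2."""
    latent = matvec(encoder, face)
    err = [y - t for y, t in zip(decoder(latent), face)]
    # Gradient with respect to the latent face, using the weights before the update.
    grad_latent = [sum(decoder.w[i][j] * err[i] for i in range(len(err)))
                   for j in range(len(latent))]
    for i, e in enumerate(err):
        decoder.b[i] -= lr * e
        for j, zj in enumerate(latent):
            decoder.w[i][j] -= lr * e * zj
    for j, g in enumerate(grad_latent):
        for m, xm in enumerate(face):
            encoder[j][m] -= lr * g * xm
    return 0.5 * sum(e * e for e in err)


def make_faces(style, expression_axis, amounts):
    """Toy 'faces': a fixed per-person style plus a varying expression."""
    return [[s + a * d for s, d in zip(style, expression_axis)] for a in amounts]


rng = random.Random(0)
AXIS = [0.2, 0.2, -0.2, -0.2]          # assumed direction along which expression varies
AMOUNTS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # assumed expression strengths
faces_a = make_faces([1.0, 0.0, 0.5, 0.0], AXIS, AMOUNTS)
faces_b = make_faces([0.0, 1.0, 0.0, 0.5], AXIS, AMOUNTS)

encoder = rand_matrix(2, 4, rng)       # shared between both people
decoder_a = Decoder(4, 2, rng)         # only ever sees person A
decoder_b = Decoder(4, 2, rng)         # only ever sees person B

# Training phase: each decoder learns to rebuild its own person
# from the latent faces produced by the shared encoder.
losses = []
for epoch in range(500):
    total = 0.0
    for face_a, face_b in zip(faces_a, faces_b):
        total += train_step(encoder, decoder_a, face_a, lr=0.05)
        total += train_step(encoder, decoder_b, face_b, lr=0.05)
    losses.append(total)

# Creation phase: encode a face of person A, then decode it with B's decoder.
swapped = decoder_b(matvec(encoder, faces_a[0]))
print("first epoch loss:", round(losses[0], 4))
print("last epoch loss: ", round(losses[-1], 4))
print("swapped output:  ", [round(v, 3) for v in swapped])
```

The design choice that matters is that the encoder receives gradient updates from both people while each decoder receives updates from only one. That is what lets a latent face from one person be fed to the other person's decoder.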
Are Deepfakes Only Videos?
No, deepfakes can be videos, photos and even audio. The idea of swapping faces, or creating synthetic media, is not a new one. We’ve had access to photo-editing software for many years and it’s allowed skilled people to make high-quality face swaps. What makes the technique behind deepfakes so important is that it allows almost anyone to produce the same thing without learning any artistry, and on a much larger scale.
The key applications of deepfake technology include:
- Face swap: Face swap is the most common and obvious technique, where the faces of two people are swapped. This is generally done to produce an image of a celebrity in a scene they weren’t in.
- Face synthesis: Face synthesis stretches reality and generates a face for a person who never existed.
- Facial attributes and expression manipulation: In attribute and expression manipulation, a face is altered either by changing specific features such as the eyes, or by altering its expression, for example, turning a frown upside down.
How Do You Make a Deepfake?
Creating deepfakes is surprisingly easy. A wide array of smartphone apps makes deepfake production readily accessible to 80 percent of the population. In addition to apps, there are computer programs that let a person run far more advanced reproductions on a local CPU (central processing unit), or the best reproductions on a GPU (graphics processing unit).
Some of the most popular tools are:
- FaceApp: FaceApp transforms photos to add and remove features, thereby allowing users to craft celebrity-like photos in just a few clicks.
- Wombo: Wombo creates singing and dancing video clips from users’ uploaded photos.
- DeepFaceLab: DeepFaceLab is PC software that, according to its developers, is used to generate more than 95 percent of deepfake videos.
- First Order Motion Model: First Order Motion Model is GAN-based PC software you can use to swap faces, clothing items and more.
What Are the Risks of Deepfakes?
As with many technological innovations, deepfakes have had both positive and negative effects.
Starting with the most prevalent use-case, some research has estimated that 96 percent of deepfake videos are created for pornography. With deepfake technology, creators can produce new videos that place celebrity faces onto models. This application has drawn criticism: many countries have decided to ban pornographic deepfakes, and various websites have stated that deepfake pornography is against their terms and conditions. In May 2022, Google banned the use of its Colab service, previously a popular way to train deepfake models, for the creation of deepfakes. This may further reduce the prevalence of these types of deepfakes.
Another key use-case of deepfakes is entertainment. Millions of people worldwide have used deepfake apps to create content that they share on social media platforms. Many of these creations have turned into viral trends, such as the “age yourself” and Nicolas Cage trends. Disney is also investing in deepfake technology, with the aim of using it in future productions.
The application of deepfakes in politics is likely the most controversial. During the 2022 Russian invasion of Ukraine, a deepfake of Russian leader Vladimir Putin that showed him surrendering to Ukraine circulated on Twitter. There have also been many deepfakes of Donald Trump, including an entire film developed by South Park creators Matt Stone and Trey Parker.
The high quality of deepfakes makes it difficult for the general public to discern fact from fiction. Even if a deepfake is later revealed to be fake, the video may already have done significant damage.