Generative AI describes artificial intelligence models that, when trained on massive data sets, are capable of automatically producing content in the form of text, images, audio and video — all by predicting the next word or pixel.
What Is Generative AI?
Generative AI is a form of artificial intelligence in which algorithms automatically produce content in the form of text, images, audio and video. These systems have been trained on massive amounts of data, and work by predicting the next word or pixel to produce a creation.
Typically, it starts with a simple text input, called a prompt, in which the user describes the output they want. Then, various algorithms generate new content according to what the prompt was asking for.
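As a rough illustration of what "predicting the next word" means, the sketch below trains a toy model that simply counts which word tends to follow which. It is not how ChatGPT or any production system works (those use large neural networks), and the function names and the tiny training sentence are invented for this example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def generate(follows, start, max_words=5):
    """Produce text by repeatedly predicting the next word from the last one."""
    out = [start.lower()]
    while len(out) < max_words:
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

# Hypothetical training data: one short sentence.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)

print(predict_next(model, "the"))           # prints: cat
print(generate(model, "sat", max_words=4))  # prints: sat on the cat
```

Here the "prompt" is just the starting word. Real systems condition on the whole prompt and on far more context, but the loop of predicting, appending and predicting again is the same basic idea.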
Since OpenAI released ChatGPT in November 2022, generative AI has become a subcategory of artificial intelligence that is growing at a breakneck pace, with tech giants like Microsoft, Google and Amazon hopping on the bandwagon.
“It’s essentially AI that can generate stuff,” Sarah Nagy, the CEO of Seek AI, a generative AI platform for data, told Built In. And, these days, some of the stuff generative AI produces is so good, it appears as if it were created by a human. “The quality of the output is why people are so excited,” Nagy said.
Popular Generative AI Tools
- ChatGPT: ChatGPT is an AI-powered chatbot developed by OpenAI, with a unique ability to not only generate written content but also converse with users fluently.
- DALL-E 2: Also created by OpenAI, DALL-E 2 can generate realistic images with nothing more than a short text prompt thanks to a process called diffusion, which begins with a pattern of random dots that gradually form an image.
- Gemini: Gemini is a generative AI chatbot created by Google, based on its model of the same name. Formerly called Bard, it can answer questions asked by users or create new content from text or image prompts.
- Midjourney: Midjourney is a text-to-image generator whose output is impressive enough that a piece made with it won first place in the digital art category of the 2022 Colorado State Fair’s art competition.
- Claude: Released by the company Anthropic, Claude is an AI assistant that runs on the Claude 2.1 LLM and uses ‘constitutional AI,’ relying on ethical principles to guide its outputs.
- Alexa: Amazon’s newly redesigned voice assistant is powered by a large language model that allows it to be more conversational.
How Does Generative AI Work?
At its core, generative AI rests on three building blocks: generative adversarial networks, transformers and large language models.
Generative Adversarial Networks
None of this was really possible until around 2014, when generative adversarial networks, or GANs, were introduced. GANs are machine learning models in which two neural networks compete with each other in order to become more accurate in their predictions. One neural network manufactures fake outputs disguised as real data, while the other works to distinguish between the artificial data and real data, and both use deep learning methods to improve their techniques. GANs drove much of the early progress in AI-generated images, videos and audio.
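The competition between the two networks can be made concrete with the loss each one tries to minimize. The sketch below is a minimal illustration, assuming the discriminator outputs a probability that a sample is real. It uses the commonly taught cross-entropy form of the losses, and the function names and example probabilities are invented here.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and fake samples score near 0.

    d_real: discriminator's probability that a real sample is real.
    d_fake: discriminator's probability that a generated sample is real.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Low when the discriminator is fooled into scoring fakes near 1."""
    return -math.log(d_fake)

# Hypothetical discriminator outputs for a generated sample.
fooled = generator_loss(0.9)   # discriminator thinks the fake is real
caught = generator_loss(0.1)   # discriminator spots the fake

print(fooled < caught)  # prints: True
```

The two losses pull in opposite directions on `d_fake`, which is what makes the training adversarial: any improvement for the generator is a setback for the discriminator, and vice versa.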
Transformers
Transformers are a type of machine learning model that makes it possible for AI models to process natural language and capture its meaning. Transformers allow models to draw minute connections between the billions of pages of text they have been trained on, resulting in more accurate and complex outputs. Without transformers, we would not have any of the generative pre-trained transformer, or GPT, models developed by OpenAI, Microsoft’s Bing chat feature or Google’s Gemini chatbot.
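The mechanism transformers use to draw those connections is called attention: each token scores its similarity to every other token and then takes a weighted mix of them. The following is a stripped-down sketch of scaled dot-product attention in plain Python. It leaves out the learned projection matrices, multiple heads and everything else a real transformer has, and the token vectors are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Three hypothetical 2-dimensional token vectors attending to each other.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)
print(len(mixed), len(mixed[0]))  # prints: 3 2
```

Each output vector is a blend of all the input vectors, weighted by how strongly the corresponding token relates to the others.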
Large Language Models
The final ingredient of generative AI is large language models, or LLMs, which have billions or even trillions of parameters. LLMs are what allow AI models to generate fluent, grammatically correct text, making them among the most successful applications of transformer models.
In general, the recent acceleration of technical progress and usage of generative AI has been nothing short of revolutionary. And it doesn’t appear to be slowing down anytime soon.
How Are Generative AI Models Trained?
Generative AI models are trained by feeding their neural networks large amounts of preprocessed data, which may be labeled or unlabeled.
A common approach to training generative AI models is diffusion. During training, a diffusion model adds noise to the training data, then learns to remove that noise and reconstruct the data as it was before. Before diffusion models arose, generative adversarial networks were the most popular approach.
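The add-noise, remove-noise idea can be sketched in a few lines. The example below assumes the commonly used formulation in which clean data and Gaussian noise are blended according to a single `alpha_bar` value. In a real diffusion model a neural network would predict the noise, whereas this sketch passes in the true noise as a stand-in for a perfect prediction. All names and numbers are illustrative.

```python
import math
import random

def add_noise(x0, alpha_bar, rng):
    """Blend clean data with Gaussian noise.

    alpha_bar close to 1.0 keeps the data mostly intact;
    alpha_bar close to 0.0 leaves almost pure noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    noisy = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
        for x, n in zip(x0, noise)
    ]
    return noisy, noise

def denoise(noisy, predicted_noise, alpha_bar):
    """Undo the blend given an estimate of the noise (requires alpha_bar > 0)."""
    return [
        (y - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
        for y, n in zip(noisy, predicted_noise)
    ]

rng = random.Random(0)          # fixed seed so the run is repeatable
clean = [0.2, -0.5, 0.9]        # hypothetical "training data"
noisy, noise = add_noise(clean, alpha_bar=0.5, rng=rng)

# A trained model would estimate `noise`; here the true noise is used.
recovered = denoise(noisy, noise, alpha_bar=0.5)
print(all(abs(a - b) < 1e-9 for a, b in zip(clean, recovered)))  # prints: True
```

Training amounts to making the model's noise estimate as close as possible to the noise that was actually added, so that the reconstruction step recovers the original data.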
Regardless of the approach, generative AI models must be evaluated after each iteration to determine how closely their generated data matches the training data. Teams can adjust parameters, add more training data and even introduce new data sets to accelerate the progress of generative AI models.
What Types of Output Can Generative AI Produce?
Generative AI has become known for producing:
- Text: With ChatGPT ushering in a generative AI rush, written text is commonly associated with generative AI tools.
- Images: Lensa first created buzz around generative AI images on social media, and more image generators are now available.
- Videos: Motion visuals are also undergoing change, with AI video generators providing various editing features.
- Audio: AI has left its mark on the music industry, providing audio tools for professional and casual musicians alike.
How Is Generative AI Being Used?
The implementation of generative artificial intelligence is altering the way we work, live and create. It’s a source of entertainment and inspiration, as well as a means of convenience. And if a business or field involves code, words, images or sound, there is likely a place for generative AI. Looking ahead, some experts believe this technology could become just as foundational to everyday life as the cloud, smartphones and the internet itself.
Some Uses of Generative AI
- Debug code
- Write speeches
- Write song lyrics
- Idea generation
- Write personalized emails
- Write social media posts
- Create 3D objects in games
- Speed up game development with code completion
For one, software developers have increasingly been looking to generative AI tools like Tabnine, Magic AI and GitHub Copilot to not only ask specific coding-related questions, but also fix bugs and generate new code. And AI text generators are being used to simplify the writing process, whether it’s a blog, a song or a speech.
“I think it can be helpful for sparking creativity and ideation. That’s what I use it for,” Jordan Harrod, a Ph.D. candidate at Harvard and MIT and host of an AI-related educational YouTube channel, told Built In. In fact, she used an AI text generator to help write a speech for Gen AI, a generative AI conference recently hosted by Jasper. “That did not end up being the final talk, but it helped me get out of that writer’s block because I had something on the page that I could start working with,” she said.
“If you look at sales teams, they’re constantly on the phone, they’re sending emails, they’re on LinkedIn and social media trying to generate a lot of content and use a lot of content,” Srinath Sridhar, the co-founder and CEO of sales-focused generative AI startup Regie.ai, told Built In. His company Regie.ai and other similar tools automate all of that — the personalized emails, the call scripts, and so on. “We take generative AI and then apply it to all the sales workflows for sales people.”
The Democratization of Content Creation
Generative AI has also made waves in the gaming industry, a longtime adopter of artificial intelligence more broadly. Now, generative AI is transforming not only game development, but also game testing and even gameplay. Sony-owned Haven Studios and Electronic Arts have been working to fold this technology into the making of their games, while Roblox unveiled plans to implement generative AI capabilities into its Roblox Studio building tool.
The goal is to “democratize content creation,” Stefano Corazza, the head of Roblox Studio, told Built In, removing the technological hurdles that usually come with game development and allowing anyone to be a content creator — whether they’re a game designer working in a professional studio or a 10-year-old just discovering video games.
“Generative AI is just the best thing that has happened to us,” he continued. “We are seeing this incredible potential, where people can just use natural language to describe things the way they are used to describing them, and then create them.”
In addition to the natural language interface, Roblox also plans to roll out generative AI code-completion functionality to help speed up the game development process.
“We are really pushing real-time collaboration for building worlds, coding and all aspects of experience creation. And at the same time we want to make it easier and faster to create new content,” Corazza added. “Generative AI is the best tool that we have at the moment to really make the process easier and more accessible.”
Advantages of Generative AI
Generative AI promises to simplify various processes, providing businesses, coders and other groups with many reasons to adopt this technology.
Easy to Use
Early versions of this technology typically required submitting data via an API, or some other complicated process. Developers then had to familiarize themselves with special tools and write applications using coding languages like Python. Today, using a generative AI system usually requires nothing more than a plain language prompt of a couple of sentences. And once an output is generated, it can usually be customized and edited by the user.
For instance, Seek allows companies to essentially ask their data questions without ever having to touch the data itself. By adding Seek to their data stack, a company’s employees can get the information they need from their proprietary data quickly and efficiently by typing in a simple query, instead of having to bombard their data science team with ad-hoc questions.
“Anybody is able to ask or instruct AI in natural language,” Seek CEO Nagy said. “And able to get so many things done so quickly that they just can’t get done now without spending weeks of manual work.”
To be sure, generative AI’s promise of increased efficiency is another selling point. This technology can be used to automate tasks that would otherwise require manual labor — days of writing and editing, hours of drawing, and so on.
The speed and automation that generative AI brings to a company not only produce results faster than they would ordinarily be produced, but also have the potential to save businesses money. Products and tasks completed in less time lead to a better customer experience, which then contributes to greater revenue and ROI.
Faster Business Operations
The speed, efficiency and ease of use permitted by generative AI are what make it such an appealing tool to so many companies today. It’s why companies like Salesforce, Microsoft and Google are all scrambling to incorporate generative AI across their products, and why businesses are eager to find ways to fold it into their operations.
“People are looking for nails to hit using this hammer,” Sridhar said. “It’s a very new piece of tech that has fundamentally changed what you can do from even five years back.”
Challenges of Generative AI
Still, this technology also comes with quite a few challenges. Its mass adoption is fueling various concerns around its accuracy, its potential for bias and the prospect of misuse and abuse.
Lack of Accountability
Because tools like ChatGPT and DALL-E were trained on content found on the internet, their capacity for plagiarism has become a big concern. Several related questions still lack clear answers: whether AI companies have the right to use the data that trained their systems, whether the output of generative engines can be copyrighted, and who is responsible if an AI system generates defamatory or dangerous outputs.
“There is going to be an explosion of content ... With big powers comes big responsibility.”
“It’s all coming from the same training data, so the creativity and originality of creating things kind of goes away when you do that,” YouTuber Harrod said. “We don’t really have a great framework for things like attribution, in this particular case. And then compensation and royalty systems.”
Less Supervision and Safeguards
For the most part, laws specific to the creation and use of artificial intelligence do not exist. This means most of these issues will have to be handled through existing law, at least for now. It also means it will be up to companies themselves to monitor the content being generated on their platform — no small task considering just how quickly this space is moving.
“There is going to be an explosion of content,” Roblox Studio’s Corazza said. “[Companies’] responsibility is to make sure the content that’s generated doesn’t offend anyone, and lets people create with civility.”
Generative AI systems also tend to get things completely wrong. Their propensity for “hallucinations,” or creating information that is factually inaccurate, can lead to a mass spread of misinformation.
Nagy likens generative AI to an improv comedy performer: “If you’re pretending to be a character, you have to just spit out content that conveys that you’re that character, when in reality if you don’t know what you’re talking about you’re still going to make the scene work.”
This is true of all generative AI. At the moment, models have no intrinsic mechanism to fact-check or verify their outputs, and users don’t necessarily do it either.
“That’s a really hard problem to solve,” Harrod said. “When it comes to most generative AI outputs, I do worry about people just taking the output as fact and moving on.”
Limited Capabilities and Access
While much of the recent progress pertaining to generative artificial intelligence has focused on text and images, the creation of AI-generated audio and video is still a work in progress.
In 2020, OpenAI released Jukebox, a neural network that generates music (including “rudimentary singing”) as raw audio in a variety of genres and styles. A series of other AI music generators have followed, including one created by Google called MusicLM, and the creations are continuing to improve.
The same goes for AI generated voices. For instance, VALL-E, a new text-to-speech model created by Microsoft, can reportedly simulate anyone’s voice with just three seconds of audio, and can even mimic their emotional tone. It’s worth noting, however, that much of this technology is not fully available to the public yet.
Development of Deepfakes
There are a number of platforms that use AI to generate rudimentary videos or edit existing ones. Unfortunately, this has led to the development of deepfakes, which are deployed in more sophisticated phishing schemes. But this facet of generative AI isn’t quite as advanced as text, still images or even audio.
“We’re not quite at the point where you can type in ‘Make me a YouTube video that does XYZ,’ and have something come out that’s really quite as useful in terms of something that you’d use for actual content,” Harrod said. Still, she added, “it’s definitely a field that’s moving fast.”
A Brief History of Generative AI
While breakthroughs like ChatGPT and DALL-E have certainly placed generative AI in the spotlight, the concept of AI-generated content can actually be traced all the way back to the 1960s with the invention of ELIZA — a simple chatbot created by MIT professor Joseph Weizenbaum.
That being said, generative AI as we understand it now is much more complicated than what it was half a century ago. Thanks to advancements in natural language processing, generative AI systems can take raw data in the form of written and spoken words, represent it as vectors using various encoding techniques, and turn those vectors into new written sentences and speech. Raw images can be broken down into visual elements, too, which are also expressed as vectors.
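To give a sense of what representing words as vectors can look like, here is a deliberately simple sketch that turns a sentence into a list of word counts. Modern systems use learned, dense embeddings rather than counts, so this is only the most basic of the "various encoding techniques." The vocabulary and sentences are invented for the example.

```python
def build_vocab(sentences):
    """Assign each distinct word an index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Represent a sentence as a vector of word counts (bag of words)."""
    vector = [0] * len(vocab)
    for word in sentence.lower().split():
        if word in vocab:  # words outside the vocabulary are ignored
            vector[vocab[word]] += 1
    return vector

# Hypothetical training sentences.
vocab = build_vocab(["AI writes text", "AI draws images"])
print(encode("AI writes images", vocab))  # prints: [1, 1, 0, 0, 1]
```

Once text is in numeric form like this, a model can compare, combine and transform it mathematically, which is what makes the rest of the pipeline possible.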
The Future of Generative AI
Despite its challenges and shortcomings, the future of generative AI looks bright, particularly in the wake of OpenAI announcing the release of API access to ChatGPT — which promises to usher in a wave of new chatbots and other generative AI interfaces.
“I hope we make tools that can be used for good. And I hope we make tools that people need. Not just make tools for the sake of making them, but make tools because they further our goals as people and societies,” Harrod said.
OpenAI also unveiled its much-anticipated GPT-4 in March 2023, which will be used as the underlying engine for ChatGPT going forward. In addition, the company has started selling access to GPT-4’s API so that businesses and individuals can build their own applications on top of it.
While GPT-4 promises more accuracy and less bias, the detail getting top billing is that the model is multimodal, meaning it accepts both images and text as inputs, although it only generates text as outputs. Right now, an AI text generator tends to only be good at generating text, while an AI art generator is only really good at generating images. Multimodal capabilities could be a real game changer.
“What we are going to have in the next few years is all of those pieces coming together so that you can have multiple modes of communication simultaneously,” Sridhar said. “So you can write a script, you can have a video that goes with the script, you can actually have voice-overs that go with the video.”
Frequently Asked Questions
What is generative AI?
Generative AI is a type of artificial intelligence that can produce various types of data — images, text, video, audio, etc. — after being fed large volumes of training data.
What is generative AI vs. traditional AI?
Traditional AI simply analyzes data to reveal patterns and glean insights that human users can apply. Generative AI takes this process a step further, leveraging these patterns and insights to create entirely new data.
What is an example of generative AI?
A common example of generative AI is ChatGPT, which is a chatbot that responds to statements, requests and questions by tapping into its large pool of training data that goes up to 2021.