AI chips are specialized computing hardware used in the development and deployment of artificial intelligence systems. As AI has become more sophisticated, the need for higher processing power, speed and efficiency in computers has also grown, and AI chips are essential for meeting this demand.
Many AI breakthroughs of the last decade — from IBM Watson’s historic Jeopardy! win to Lensa’s viral social media avatars to OpenAI’s ChatGPT — have been powered by AI chips. And if the industry wants to continue pushing the limits of technology like generative AI, autonomous vehicles and robotics, AI chips will likely need to evolve as well.
“As the cutting edge keeps moving and keeps changing,” said Naresh Shanbhag, an electrical and computer engineering professor at the University of Illinois Urbana-Champaign, “then the hardware has to change and follow, too.”
What Is an AI Chip?
The term “AI chip” is a broad classification, encompassing various chips designed to handle the uniquely complex computational requirements of AI algorithms quickly and efficiently. This includes graphics processing units (GPUs), field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). Central processing units (CPUs) can also be used in simple AI tasks, but they are becoming less and less useful as the industry advances.
How Do AI Chips Work?
In general, a chip refers to a microchip, which is an integrated circuit unit that has been manufactured at a microscopic scale using semiconductor material. Components like transistors (tiny switches that control the flow of electrical current within a circuit) are etched into this material to power computing functions, such as memory and logic. While memory chips manage data storage and retrieval, logic chips serve as the brains behind the operation that processes the data.
AI chips largely work on the logic side, handling the intensive data processing that AI workloads demand, a task that general-purpose chips like CPUs struggle to keep up with. To achieve this, they tend to incorporate a large number of faster, smaller and more efficient transistors. This design allows them to perform more computations per unit of energy, resulting in faster processing speeds and lower energy consumption compared to chips with larger and fewer transistors.
AI chips also feature unique capabilities that dramatically accelerate the computations required by AI algorithms. This includes parallel processing — meaning they can perform multiple calculations at the same time.
Parallel processing is crucial in artificial intelligence, as it allows multiple tasks to be performed simultaneously, enabling quicker and more efficient handling of complex computations. Because of the way AI chips are designed, they are “particularly effective for AI workloads and training AI models,” Hanna Dohmen, a research analyst at Georgetown University’s Center for Security and Emerging Technology (CSET), told Built In.
GPUs vs. FPGAs vs. ASICs vs. NPUs
The various types of AI chips differ in their hardware and functionality:
GPUs
GPUs are most often used in the training of AI models. Originally developed for applications that require high graphics performance, like running video games or rendering video sequences, these general-purpose chips are typically built to perform parallel processing tasks. Because AI model training is so computationally intensive, companies connect several GPUs together so they can all train an AI system synchronously.
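One common way to train on several GPUs at once is synchronous data parallelism: each device computes a gradient on its own slice of the data, the gradients are averaged, and every device applies the same update. The sketch below only simulates that idea in plain Python on a toy one-parameter model. The function names (`local_gradient`, `synchronous_step`, `train`) and the data are hypothetical, and the "workers" run one after another here rather than on real hardware.

```python
def local_gradient(w, shard):
    """Mean gradient of the squared error for the model y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def synchronous_step(w, shards, lr=0.01):
    """One synchronous update: average the per-worker gradients, then apply them once."""
    # On real hardware each shard's gradient would be computed on a separate device
    # at the same time; this loop stands in for that step.
    grads = [local_gradient(w, shard) for shard in shards]
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


def train(shards, steps=200, lr=0.01):
    """Repeat synchronous steps starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        w = synchronous_step(w, shards, lr)
    return w


if __name__ == "__main__":
    # Toy data drawn from y = 3x, split across four simulated workers.
    data = [(x, 3.0 * x) for x in range(1, 9)]
    shards = [data[i:i + 2] for i in range(0, len(data), 2)]
    print(train(shards))
```

Because every worker waits for the averaged gradient before moving on, all copies of the model stay identical after each step, which is what "synchronously" refers to in this setting.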
FPGAs
FPGAs are useful in the application of AI models because they can be reprogrammed “on the fly,” as Tim Fist, a fellow with the Technology and National Security Program at CNAS, put it, meaning they are “hyper-specialized.” In other words, an FPGA can be reconfigured to be highly efficient at whichever task it is currently set up for, and FPGAs are particularly well suited to image and video processing.
ASICs
ASICs are accelerator chips custom-built for one specific use, which in this case is artificial intelligence. ASICs offer similar computing ability to FPGAs, but they cannot be reprogrammed. Because their circuitry has been optimized for one specific task, they often offer superior performance compared to general-purpose processors or even other AI chips. Google’s tensor processing unit is an example of an ASIC that has been crafted explicitly to boost machine learning performance.
NPUs
NPUs are modern add-ons that enable CPUs to handle AI workloads and are similar to GPUs, except they’re designed with the more specific purpose of building deep learning models and neural networks. As a result, NPUs excel at processing massive volumes of data to perform a range of advanced AI tasks like object detection, speech recognition and video editing. Because of their capabilities, NPUs often outperform GPUs when it comes to AI processes.
AI Chip Uses
Modern artificial intelligence simply would not be possible without these specialized AI chips. Here are just some of the ways they are being used.
Large Language Models
AI chips speed up the rate at which AI, machine learning and deep learning algorithms are trained and refined, which is particularly useful in the development of large language models (LLMs). They can leverage parallel processing for sequential data and optimize operations for neural networks, enhancing the performance of LLMs and, by extension, generative AI tools like chatbots, AI assistants and text generators.
Edge AI
AI chips make AI processing possible on virtually any smart device — watches, cameras, kitchen appliances — in a process known as edge AI. This means that processing can take place closer to where data originates instead of on the cloud, reducing latency and improving security and energy efficiency. AI chips can be used in anything from smart homes to smart cities.
Autonomous Vehicles
AI chips help advance the capabilities of driverless cars, contributing to their overall intelligence and safety. They are able to process and interpret vast amounts of data collected by a vehicle’s cameras, LiDAR and other sensors, supporting sophisticated tasks like image recognition. And their parallel processing capabilities enable real-time decision-making, helping vehicles to autonomously navigate complex environments, detect obstacles and respond to dynamic traffic conditions.
Robotics
AI chips are useful in various machine learning and computer vision tasks, allowing robots of all kinds to perceive and respond to their environments more effectively. This can be helpful across all areas of robotics, from cobots harvesting crops to humanoid robots providing companionship.
Why Are AI Chips Better Than Regular Chips?
When it comes to the development and deployment of artificial intelligence, AI chips are much better than regular chips, thanks to their many distinctive design attributes.
AI Chips Have Parallel Processing Capabilities
Perhaps the most prominent difference between more general-purpose chips (like CPUs) and AI chips is their method of computing. While general-purpose chips rely largely on sequential processing, working through calculations one after another on a small number of cores, AI chips harness parallel processing, executing numerous calculations at once. This approach means that large, complex problems can be divided up into smaller ones and solved at the same time, leading to swifter and more efficient processing.
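The divide-and-combine idea described above can be sketched in a few lines of Python. This is an illustration of the decomposition only, not of how any particular chip works: the function names are hypothetical, and Python threads will not actually speed up this arithmetic the way thousands of hardware cores would. The example splits one large dot product (a core operation in neural networks) into independent slices and then adds the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_dot(a, b):
    """Reference version: one multiply-add after another."""
    total = 0
    for x, y in zip(a, b):
        total += x * y
    return total


def partial_dot(a, b, start, end):
    """Dot product of one slice; no slice depends on any other."""
    return sum(x * y for x, y in zip(a[start:end], b[start:end]))


def chunked_dot(a, b, workers=4):
    """Split one large dot product into smaller ones and combine the partial sums."""
    n = len(a)
    if n == 0:
        return 0
    size = (n + workers - 1) // workers  # ceiling division so no element is skipped
    bounds = [(start, min(start + size, n)) for start in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda se: partial_dot(a, b, se[0], se[1]), bounds)
        return sum(parts)
```

Both functions return the same answer; the difference is that the chunked version exposes work that can be handed to separate processing units at the same time.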
AI Chips Are More Energy-Efficient
AI chips are designed to be more energy-efficient than conventional chips. Some AI chips incorporate techniques like low-precision arithmetic, enabling them to perform computations with fewer transistors, and thus less energy. And because they are adept at parallel processing, AI chips can distribute workloads more efficiently than other chips, reducing energy consumption. In the long term, this could help reduce the artificial intelligence industry’s massive carbon footprint, particularly in data centers.
Using AI chips could also help edge AI devices run more efficiently. For example, if you want your cellphone to be able to collect and process your personal data without having to send it to a cloud server, the AI chips powering that cellphone must be optimized for energy efficiency so they don’t drain the battery.
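To make the low-precision arithmetic mentioned above more concrete, here is a minimal sketch of one common approach, symmetric 8-bit quantization, written in plain Python. The function names are hypothetical and the scheme is an assumption chosen for illustration; real chips implement this in hardware and differ in the details. Values are mapped to small integers, the multiply-and-add work is done on those integers, and the result is scaled back to a float once at the end.

```python
def quantize(values, bits=8):
    """Map floats to signed integers that share a single scale factor."""
    qmax = 2 ** (bits - 1) - 1  # 127 when bits is 8
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0
    return [round(v / scale) for v in values], scale


def quantized_dot(a, b, bits=8):
    """Dot product using integer multiply-adds, rescaled to a float at the end."""
    qa, scale_a = quantize(a, bits)
    qb, scale_b = quantize(b, bits)
    acc = sum(x * y for x, y in zip(qa, qb))  # integer arithmetic only
    return acc * scale_a * scale_b
```

The result is close to, but not exactly, the full-precision answer. That small loss of precision is the trade-off accepted in exchange for simpler arithmetic.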
AI Chips Yield More Accurate Results
Because AI chips are specifically designed for artificial intelligence, they tend to be able to perform AI-related tasks like image recognition and natural language processing with more accuracy than regular chips. Their purpose is to perform intricate calculations involved in AI algorithms with precision, reducing the likelihood of errors. This makes AI chips an obvious choice for more high-stakes AI applications, such as medical imaging and autonomous vehicles, where rapid precision is imperative.
AI Chips Can Be Customized
Unlike general-purpose chips, some AI chips (FPGAs and ASICs, for example) can be customized to meet the requirements of specific AI models or applications, allowing the hardware to adapt to different tasks.
Customizations include fine-tuning certain parameters (variables within a trained model) and optimizing the chip’s architecture for specific AI workloads. This flexibility is essential to the advancement of AI, as it enables developers to tailor the hardware to their unique needs, accommodating variations in algorithms, data types and computational requirements.
The Future of AI Chips
While AI chips play a crucial role in advancing the capabilities of AI, their future is full of challenges, such as supply chain bottlenecks, a fragile geopolitical landscape and computational constraints.
Monopoly Concerns
At the moment, Nvidia is a top supplier of AI hardware and software, holding about 80 percent of the global GPU market. But this dominance hasn’t come without controversy. Alongside Microsoft and OpenAI, Nvidia has come under scrutiny for potentially violating U.S. antitrust laws.
More recently, Xockets has accused Nvidia of patent theft and antitrust violations. The startup claims networking company Mellanox first committed patent theft, and now Nvidia is responsible since it acquired Mellanox in 2020. If Nvidia is found guilty, the fallout could cause a major shake-up within the AI chip industry.
Additionally, a number of large technology players, including AMD, Amazon, Google, Meta and Microsoft, are hard at work trying to catch up to Nvidia’s AI chip lead. Market research firm Omdia predicts that spending on non-Nvidia computers will grow 49 percent in 2024, according to the New York Times.
Supply Chain Bottlenecks
Taiwan Semiconductor Manufacturing Company (TSMC) makes roughly 90 percent of the world’s advanced chips, powering everything from Apple’s iPhones to Tesla’s electric vehicles. It is also the sole manufacturer of Nvidia’s powerful H100 and A100 processors, which power the majority of AI data centers.
TSMC’s control over the market has created severe bottlenecks in the global supply chain. The company has limited production capacity and resources, which hinders its ability to meet escalating demand for AI chips.
“The demand for these chips is currently far exceeding the supply,” CNAS’ Fist said. “If you’re an AI developer and you want to buy 10,000 of Nvidia’s latest GPUs, it’ll probably be months or years before you can get your hands on them.”
These supply shortages won’t last forever though. TSMC’s subsidiary, Japan Advanced Semiconductor Manufacturing (JASM), is constructing a factory in Kumamoto that is expected to be at full production by the end of 2024. TSMC is also building two state-of-the-art plants in Arizona, the first of which is set to begin chip production in 2025.
In the meantime, prominent AI makers like Microsoft, Google and Amazon are designing their own custom AI chips to reduce their reliance on Nvidia.
There have also been wider attempts to counter Nvidia’s dominance, spearheaded by a consortium of companies called the UXL Foundation. For example, the Foundation has developed an open-source alternative to Nvidia’s CUDA platform, and Intel has directly challenged Nvidia with its latest Gaudi 3 chip. In addition, Intel and AMD have created their own processors for laptops and computers while Qualcomm has joined the crowded field with its AI PC processor.
A Fragile Geopolitical Landscape
Taiwan, which plays a central role in the global supply of AI chips, is viewed by China as a rogue province as opposed to an independent nation. Because of this, some analysts believe a Chinese invasion could occur within the decade, which would affect TSMC’s ability to manufacture AI chips and put the entire AI industry in jeopardy.
Meanwhile, amid tensions between the United States and China, President Joe Biden rolled out a sweeping set of export controls in 2022 that dramatically limit China’s access to AI chips, chip-making equipment and chip design software (much of which is controlled by U.S. companies like Nvidia). Although companies like Intel can still introduce new AI chips in China, they must limit the performance of these chips. China has also sought homegrown alternatives to Nvidia, such as Huawei, but software bugs have frustrated these efforts.
“We want to restrict China’s military modernization, and we are concerned about the Chinese government using AI chips to develop weapons of mass destruction,” Dohmen, whose research focuses on U.S.-China tech competition, said. But it also comes down to a desire for AI dominance. “We want to be the first, we want to be the best in tech and AI innovation.”
As the U.S. works to limit China’s access to AI hardware, it is also taking steps to reduce its own reliance on chip fabrication facilities in East Asia. In addition to facilitating the two TSMC plants in Arizona, the government has secured a third TSMC site in Phoenix through the CHIPS and Science Act and also set aside more than $52 billion in federal funding and incentives to support U.S. semiconductor manufacturing, research and development.
Computational Constraints
Developers are creating bigger and more powerful models, driving up computational demands. And “chips need to keep up,” Fist said. But AI chips have finite computational resources.
“The amount of chips that you need to scale a state-of-the-art AI system is growing by about four times every year, which is huge,” Fist added. Meanwhile, the algorithmic efficiency of chips, or the ability to do more with fewer chips, is growing by two times every year. “The requirements, in terms of how many chips we need and how powerful they need to be, are outstripping what the industry is currently able to provide.”
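The two growth rates Fist cites can be compared with some simple compounding. The sketch below rests on a simplifying assumption that is not in the original: that both figures are steady year-over-year multipliers and that efficiency gains directly offset chip demand. The function name `project_gap` is hypothetical.

```python
def project_gap(years, demand_growth=4.0, efficiency_growth=2.0):
    """Compound both growth rates and report how far demand outruns efficiency."""
    rows = []
    for year in range(years + 1):
        demand = demand_growth ** year          # chips needed, relative to year 0
        efficiency = efficiency_growth ** year  # work done per chip, relative to year 0
        rows.append((year, demand, efficiency, demand / efficiency))
    return rows


if __name__ == "__main__":
    for year, demand, efficiency, gap in project_gap(3):
        print(year, demand, efficiency, gap)
```

Under these assumptions the gap doubles every year, since 4 divided by 2 is 2, so after three years demand would have outpaced efficiency by a factor of eight.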
Instead of simply throwing more chips at the problem, companies are rushing to figure out ways to improve AI hardware itself.
One key area of interest is in-memory computing, which eliminates the separation between where the data is stored (memory) and where the data is processed (logic) in order to speed things up. And AI chip designers like Nvidia and AMD have started incorporating AI algorithms to improve hardware performance and the fabrication process. All of this work is essential to keeping up with the breakneck pace at which AI is moving.
“There are all of these different exponential trends at play,” Fist said. “So there’s this big rush to figure out how do we build even more specialized chips for AI? Or, how do we innovate in other parts of the stack?”
Frequently Asked Questions
What is the difference between an AI chip and a regular chip?
While regular chips are typically general-purpose and designed to accomplish all kinds of computer functions, AI chips are made to handle the complex computations involved in AI-related tasks. Unlike regular chips, AI chips are optimized for specific AI workloads, offering improved performance, speed and energy efficiency.
What is the difference between a CPU and a GPU?
A CPU (central processing unit) is a general-purpose chip that can handle a wide range of tasks in a computer system, including running operating systems and managing applications. GPUs (graphics processing units) are also general-purpose, but they are typically built to perform parallel processing tasks. They are best suited for rendering images, running video games and training AI models.
How much does an AI chip cost?
The cost of an AI chip varies and depends on factors like performance. While AMD’s MI300X chip costs between $10,000 and $15,000, Nvidia’s H100 chip typically costs between $30,000 and $40,000 and can exceed $40,000.
Which companies make AI chips?
Nvidia dominates the AI chip market, but it faces competition from other major tech companies like Microsoft, Google, Intel, Amazon, IBM and AMD.
What’s the best AI chip?
Cerebras’ WSE-3 chip is considered the most powerful AI chip available. According to Cerebras, the AI supercomputer firm that makes it, the WSE-3 chip “surpasses all other processors in AI-optimized cores, memory speed, and on-chip fabric bandwidth.”
Where are Nvidia chips made?
Nvidia’s chips are manufactured in Taiwan by TSMC.