Friday, November 21
Breaking down AI chips, from Nvidia GPUs to ASICs by Google and Amazon

Nvidia outperformed all expectations, reporting soaring profits Wednesday thanks to its graphics processing units that excel at AI workloads. But more categories of AI chips are gaining ground.

Custom ASICs, or application-specific integrated circuits, are now being designed by all the major hyperscalers, from Google’s TPU to Amazon’s Trainium and OpenAI’s plans with Broadcom. These chips are smaller, cheaper and more accessible, and could reduce these companies’ reliance on Nvidia GPUs. Daniel Newman of the Futurum Group told CNBC that he sees custom ASICs “growing even faster than the GPU market over the next few years.”

Besides GPUs and ASICs, there are also field-programmable gate arrays, which can be reconfigured with software after they’re made for use in all sorts of applications, like signal processing, networking and AI. There’s also an entire group of AI chips that power AI on devices rather than in the cloud. Qualcomm, Apple and others have championed those on-device AI chips.

CNBC talked to experts and insiders at the Big Tech companies to break down the crowded space and the various kinds of AI chips out there.

GPUs for general compute

Once used primarily for gaming, GPUs made Nvidia the world’s most valuable public company after their use shifted toward AI workloads. Nvidia shipped some 6 million current-generation Blackwell GPUs over the past year.

Nvidia senior director of AI infrastructure Dion Harris shows CNBC’s Katie Tarasov how 72 Blackwell GPUs work together as one in a GB200 NVL72 rack-scale server system for AI at Nvidia headquarters in Santa Clara, California, on November 12, 2025.

Marc Ganley

The shift from gaming to AI started around 2012, when researchers used Nvidia’s GPUs to build AlexNet, in what many consider to be modern AI’s big bang moment. AlexNet was a neural network entered into a prominent image recognition contest. Whereas others in the contest used central processing units for their applications, AlexNet’s reliance on GPUs provided incredible accuracy and obliterated its competition.

AlexNet’s creators discovered that the same parallel processing that helps GPUs render lifelike graphics was also great for training neural networks, in which a computer learns from data rather than relying on a programmer’s code. AlexNet showcased the potential of GPUs.

Today, GPUs are often paired with CPUs and sold in server rack systems to be placed in data centers, where they run AI workloads in the cloud. CPUs have a small number of powerful cores running sequential general-purpose tasks, while GPUs have thousands of smaller cores more narrowly focused on parallel math like matrix multiplication.
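As a rough illustration of why matrix multiplication suits thousands of small cores, consider this minimal Python sketch. The function names dot and matmul are illustrative and not taken from any vendor library. Each output cell depends on just one row and one column, so in principle every cell could be computed by a different core at the same time. This plain-Python version simply computes them one after another.

```python
def dot(row, col):
    """One output cell: multiply matching entries and add them up."""
    return sum(a * b for a, b in zip(row, col))


def matmul(a, b):
    """Multiply two matrices given as lists of rows.

    Every call to dot() below is independent of the others, which is
    the property that lets a GPU hand each cell to a separate core.
    """
    cols = list(zip(*b))  # columns of b
    return [[dot(row, col) for col in cols] for row in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
result = matmul(a, b)
print(result)
```

A 2-by-2 product has only four independent cells. The matrices in a large neural network can have millions, which is where a chip with thousands of cores pulls ahead of one with a handful.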

Because GPUs can perform many operations simultaneously, they’re ideal for the two main phases of AI computation: training and inference. Training teaches the AI model to learn from patterns in large amounts of data, while inference uses the AI to make decisions based on new information.
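The difference between the two phases can be sketched with a toy model. The example below is a hypothetical, single-weight model in plain Python, nowhere near the scale of a real AI system, and the names train and infer are illustrative. Training loops over known examples many times to adjust the weight, while inference applies the finished weight once to a new input.

```python
def train(samples, steps=200, lr=0.01):
    """Training: repeatedly nudge one weight w so that w * x matches y."""
    w = 0.0
    for _ in range(steps):
        # Average gradient of the squared error over all samples.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
    return w


def infer(w, x):
    """Inference: apply the learned weight to an input it has not seen."""
    return w * x


# Known examples that follow the pattern y = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train(samples)
prediction = infer(w, 10.0)
print(round(w, 3), round(prediction, 2))
```

Here the learned weight settles very close to 2, so the prediction for an input of 10 comes out very close to 20. Training took 200 passes over the data and inference took a single multiplication, which hints at why the two phases can call for different hardware.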

GPUs are the general-purpose workhorses of Nvidia and its top competitor, Advanced Micro Devices. Software is a major differentiator between the two GPU leaders. While Nvidia GPUs are tightly optimized around CUDA, Nvidia’s proprietary software platform, AMD GPUs use a largely open-source software ecosystem.

AMD and Nvidia sell their GPUs to cloud providers like Amazon, Microsoft, Google, Oracle and CoreWeave. Those companies then rent the GPUs to AI companies by the hour or minute. Anthropic’s $30 billion deal with Nvidia and Microsoft, for example, includes 1 gigawatt of compute capacity on Nvidia GPUs. AMD has also recently landed big commitments from OpenAI and Oracle.

Nvidia also sells directly to AI companies, like a recent deal to sell at least 4 million GPUs to OpenAI, and to foreign governments, including South Korea, Saudi Arabia and the U.K.

The chipmaker told CNBC that it charges around $3 million for one of its server racks with 72 Blackwell GPUs acting as one, and ships about 1,000 each week. 
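For a sense of scale, those figures can be multiplied out. This is back-of-the-envelope arithmetic on the rounded numbers Nvidia gave CNBC ("around" $3 million, "about" 1,000 racks), not reported financial results. The per-GPU figure is only the rack price divided by 72 and should not be read as the price of a GPU, since a rack presumably includes other hardware as well.

```python
RACK_PRICE_USD = 3_000_000   # "around $3 million" per rack
GPUS_PER_RACK = 72           # Blackwell GPUs per rack
RACKS_PER_WEEK = 1_000       # "about 1,000 each week"

# Rack price spread evenly across its GPUs (not an actual GPU price).
price_share_per_gpu = RACK_PRICE_USD / GPUS_PER_RACK

# GPUs leaving the factory in racks each week.
gpus_per_week = GPUS_PER_RACK * RACKS_PER_WEEK

# Implied weekly rack sales at the quoted price.
weekly_rack_sales_usd = RACK_PRICE_USD * RACKS_PER_WEEK

print(f"Rack price per GPU: ${price_share_per_gpu:,.0f}")
print(f"GPUs shipped in racks per week: {gpus_per_week:,}")
print(f"Implied weekly rack sales: ${weekly_rack_sales_usd:,}")
```

On these rounded inputs, that works out to roughly $41,700 of rack price per GPU, 72,000 GPUs a week and about $3 billion in weekly rack sales.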

Dion Harris, Nvidia’s senior director of AI infrastructure, told CNBC he couldn’t have imagined this much demand when he joined Nvidia over eight years ago.

“When we were talking to people about building a system that had eight GPUs, they thought that was overkill,” he said.

ASICs for custom cloud AI

Training on GPUs has been key in the early boom days of large language models, but inference is becoming more crucial as the models mature. Inference can happen on less powerful chips that are programmed for more specific tasks. That’s where ASICs come in.

While a GPU is like a Swiss Army Knife able to do many kinds of parallel math for different AI workloads, an ASIC is like a single-purpose tool. It’s very efficient and fast, but hard-wired to do the exact math for one type of job.

Google released its seventh-generation TPU, Ironwood, in November 2025, a decade after making its first custom ASIC for AI in 2015.

Google

“You can’t change them once they’re already carved into silicon, and so there’s a trade-off in terms of flexibility,” said Chris Miller, author of “Chip War.”

Nvidia’s GPUs are flexible enough for adoption by many AI companies, but they cost up to $40,000 and can be hard to get. Still, startups rely on GPUs because designing a custom ASIC has an even higher up-front cost, starting at tens of millions of dollars, according to Miller.

For the biggest cloud providers who can afford them, analysts say custom ASICs pay off in the long run.

“They want to have a little bit more control over the workloads that they build,” Newman said. “At the same time, they’re going to continue to work very closely with Nvidia, with AMD, because they also need the capacity. The demand is so insatiable.”

Google was the first Big Tech company to make a custom ASIC for AI acceleration, coining the term Tensor Processing Unit when its first ASIC came out in 2015. Google said it considered making a TPU as far back as 2006, but the situation became “urgent” in 2013 as it realized AI was going to double its number of data centers. In 2017, the TPU also contributed to Google’s invention of the Transformer, the architecture powering almost all modern AI.

A decade after its first TPU, Google released its seventh-generation TPU in November. Anthropic announced it will train its LLM Claude on up to 1 million TPUs. Some people think TPUs are technically on par with or superior to Nvidia’s GPUs, Miller said.

“Traditionally, Google has only used them for in-house purposes,” Miller said. “There’s a lot of speculation that in the longer run, Google might open up access to TPUs more broadly.”

Amazon Web Services was the next cloud provider to design its own AI chips, after acquiring Israeli chip startup Annapurna Labs in 2015. AWS announced Inferentia in 2018, and it launched Trainium in 2022. AWS is expected to announce Trainium’s third generation as soon as December.

Ron Diamant, Trainium’s head architect, told CNBC that Amazon’s ASIC has 30% to 40% better price performance compared to other hardware vendors in AWS.

“Over time, we’ve seen that Trainium chips can serve both inference and training workloads quite well,” Diamant said.

CNBC’s Katie Tarasov holds one of the Amazon Web Services Trainium 2 AI chips that fill its new AI data center in New Carlisle, Indiana, on October 8, 2025.

Erin Black

In October, CNBC went to Indiana for the first on-camera tour of Amazon’s biggest AI data center, where Anthropic is training its models on half a million Trainium2 chips. AWS fills its other data centers with Nvidia GPUs to meet the demand from AI customers like OpenAI.

Building ASICs isn’t easy. This is why companies turn to chip designers Broadcom and Marvell. They “provide the IP and the know-how and the networking” to help their clients build their ASICs, Miller said.

“So you’ve seen Broadcom in particular be one of the biggest beneficiaries of the AI boom,” Miller said.

Broadcom helped build Google’s TPUs and Meta’s Training and Inference Accelerator, launched in 2023, and has a new deal to help OpenAI build its own custom ASICs starting in 2026.

Microsoft is also getting into the ASIC game, telling CNBC that its in-house Maia 100 chips are currently deployed in its data centers in the eastern U.S. Others include Qualcomm with the AI200, Intel with its Gaudi AI accelerators and Tesla with its AI5 chip. There’s also a slew of startups going all in on custom AI chips, including Cerebras, which makes huge full-wafer AI chips, and Groq, with inference-focused language processing units.

In China, Huawei, ByteDance, and Alibaba are making custom ASICs, although export controls on the most advanced equipment and AI chips pose a challenge.

Edge AI with NPUs and FPGAs

The final big category of AI chips is those made to run on devices, rather than in the cloud. These chips are typically built into a device’s main system on a chip, or SoC. Edge AI chips, as they’re called, enable devices to have AI capabilities while helping them save battery life and space for other components.

“You’ll be able to do that right on your phone with very low latency, so you don’t have to have communication all the way back to a data center,” said Saif Khan, former White House AI and semiconductor policy advisor. “And you can preserve privacy of your data on your phone.”

Neural processing units are a major type of edge AI chip. Qualcomm, Intel and AMD are making NPUs that enable AI capabilities in personal computers.

Although Apple doesn’t use the term NPU, the in-house M-series chips inside its MacBooks include a dedicated neural engine. Apple also built neural accelerators into the latest iPhone A-series chips.

“It is efficient for us. It is responsive. We know that we are much more in control over the experience,” Tim Millet, Apple platform architecture vice president, told CNBC in an exclusive September interview.

The latest Android phones also have NPUs built into their primary Qualcomm Snapdragon chips, and Samsung has its own NPU on its Galaxy phones, too. NPUs by companies like NXP and Nvidia power AI embedded in cars, robots, cameras, smart home devices and more.

“Most of the dollars are going towards the data center, but over time that’s going to change because we’ll have AI deployed in our phones and our cars and wearables, all sorts of other applications to a much greater degree than today,” Miller said.

Then there are field-programmable gate arrays, or FPGAs, which can be reconfigured with software after they’re made. Although far more flexible than NPUs or ASICs, FPGAs have lower raw performance and lower energy efficiency for AI workloads.
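What reconfiguring means in practice can be sketched with a toy model. FPGAs are built largely from small lookup tables whose contents are loaded after manufacturing. The Python class below is a deliberately simplified, hypothetical model of a single two-input lookup table, not how FPGAs are actually programmed; the class and method names are illustrative.

```python
class LookupTable2:
    """Toy model of one 2-input lookup table, a basic FPGA logic element."""

    def __init__(self, truth_table):
        self.configure(truth_table)

    def configure(self, truth_table):
        """Load a new 4-entry truth table: same hardware, new behavior."""
        if len(truth_table) != 4:
            raise ValueError("a 2-input table needs exactly 4 entries")
        self.truth_table = list(truth_table)

    def evaluate(self, a, b):
        """Use the two input bits as an index into the stored table."""
        return self.truth_table[(a << 1) | b]


AND_TABLE = [0, 0, 0, 1]
XOR_TABLE = [0, 1, 1, 0]
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

lut = LookupTable2(AND_TABLE)
and_outputs = [lut.evaluate(a, b) for a, b in INPUTS]

lut.configure(XOR_TABLE)  # "reprogram" the same element
xor_outputs = [lut.evaluate(a, b) for a, b in INPUTS]

print(and_outputs, xor_outputs)
```

The same object behaves as an AND gate and then as an XOR gate once its table is reloaded. An ASIC, by contrast, would have one of those functions fixed in silicon, trading that flexibility for speed and efficiency.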

AMD became the largest FPGA maker after acquiring Xilinx for $49 billion in 2022, with Intel in second place thanks to its $16.7 billion purchase of Altera in 2015.

These players designing AI chips rely on a single company to manufacture them all: Taiwan Semiconductor Manufacturing Company.

TSMC has a giant new chip fabrication plant in Arizona, where Apple has committed to moving some chip production. In October, Nvidia CEO Jensen Huang said Blackwell GPUs were in “full production” in Arizona, too. 

Although the AI chip space is crowded, dethroning Nvidia won’t come easily.

“They have that position because they’ve earned it and they’ve spent the years building it,” Newman said. “They’ve won that developer ecosystem.”

Watch the video to see a breakdown of how all the AI chips work: https://www.cnbc.com/video/2025/11/21/nvidia-gpus-google-tpus-aws-trainium-comparing-the-top-ai-chips.html

https://www.cnbc.com/2025/11/21/nvidia-gpus-google-tpus-aws-trainium-comparing-the-top-ai-chips.html
