Introduction
Software has advanced so rapidly that it is easy to overlook the fact that it ultimately runs on hardware.
Today’s AI models require processors capable of handling the enormous number of parameters involved in training them. Nvidia supplies much of the AI hardware running in servers, but companies are also developing their own AI chips tailored to their specific workloads, an approach that reduces latency and energy costs. Microsoft, Google, AWS, and Meta are investing heavily in custom silicon to gain a competitive edge, improve performance, and reduce reliance on generic hardware.
This article showcases the custom silicon chips each of these software giants has designed to run our online world.
There are several reasons why tech giants are pouring resources into developing their own custom AI chips:
Optimization for AI Workloads: Traditional CPUs and GPUs are good all-rounders, but they aren’t built for the specific demands of AI applications. Custom chips can be designed to excel at the tasks AI models perform, such as the matrix multiplications at the heart of deep learning (see the short sketch after this list). This translates to better performance and faster processing for AI tasks.
Efficiency Gains: Custom chips can be fine-tuned to reduce power consumption while delivering the processing power needed for AI. This translates to significant cost savings on electricity bills, especially for data centers running AI models 24/7.
Control and Flexibility: By designing their own chips, tech giants have more control over the hardware and software integration. This allows for tighter optimization and the ability to tailor future chip iterations to their specific needs.
Reduced Reliance on Others: Developing custom chips lessens dependence on other chip makers like NVIDIA. This can be a strategic advantage, especially during chip shortages or when there’s a need for specialized features not offered by existing options.
Potential New Revenue Streams: Some tech giants might aim to not only use their custom chips internally but also sell them to other companies or cloud service providers. This could open up new revenue streams in the future.
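To make the first point concrete, here is a minimal Python sketch of the operation that dominates deep learning workloads: a dense layer boils down to one large matrix multiplication, which is exactly what custom AI accelerators are built to execute quickly. The layer sizes are arbitrary, chosen only for illustration.

```python
import numpy as np

# A single dense (fully connected) layer is one matrix multiply plus a bias add --
# the pattern custom AI accelerators are optimized for.
# Sizes here are arbitrary, chosen only for illustration.
batch_size, in_features, out_features = 64, 4096, 4096

x = np.random.randn(batch_size, in_features).astype(np.float32)   # activations
w = np.random.randn(in_features, out_features).astype(np.float32) # weights
b = np.random.randn(out_features).astype(np.float32)              # bias

y = x @ w + b  # the matrix multiplication an accelerator executes in hardware

# Rough cost of this one layer: 2 * batch * in * out floating-point operations.
flops = 2 * batch_size * in_features * out_features
print(f"Output shape: {y.shape}, ~{flops / 1e9:.1f} GFLOPs for a single layer")
```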
Read More: 4 Reasons Why Big Tech Companies are Designing their Silicon Chips?
1. Microsoft’s Custom AI Chips: Azure Maia and Azure Cobalt
Microsoft has made significant strides in the custom AI chip market with the introduction of the Azure Maia AI Accelerator and the Azure Cobalt CPU.
The Azure Maia AI Accelerator is designed specifically for AI tasks, particularly generative AI. It is one of the largest chips that can be made with current technology, packing 105 billion transistors on a 5-nanometer process. Tailored for large language models, the accelerator uses a fully custom Ethernet-based network protocol, which enables better scaling and improves end-to-end workload performance.
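To see why the interconnect matters at this scale, here is a rough back-of-envelope sketch of how much data accelerators must exchange just to keep model replicas in sync during training. The parameter count, precision, and bandwidth below are illustrative assumptions, not Maia specifications.

```python
# Back-of-envelope: data each accelerator must exchange per training step
# just to synchronize gradients in plain data parallelism.
# These numbers are illustrative assumptions, not Azure Maia specifications.
params = 70e9          # assumed model size: 70B parameters
bytes_per_grad = 2     # FP16/BF16 gradients -> 2 bytes per parameter

grad_volume_gb = params * bytes_per_grad / 1e9
print(f"Gradient payload per sync: ~{grad_volume_gb:.0f} GB")

# At an assumed effective 100 GB/s of network bandwidth per accelerator,
# a naive exchange of that payload takes on the order of:
bandwidth_gb_s = 100
print(f"Naive transfer time: ~{grad_volume_gb / bandwidth_gb_s:.1f} s per step")
```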
The Azure Cobalt CPU is a custom Arm-based processor built for general-purpose compute workloads on the Microsoft Cloud. With 128 cores, it offers up to 40% better performance than current Azure Arm servers. Microsoft designed it for power efficiency, in line with its goal of optimizing and integrating every layer of the infrastructure stack.
Both chips are set to roll out to Microsoft’s datacenters, initially powering services such as Microsoft Copilot and the Azure OpenAI Service.
Read More: 5 Top AI Cloud Services You Need to Know in 2024 – techovedas
2. Google’s Custom AI Chips: Google Axion Processors
Google has introduced its custom Arm-based CPUs, the Google Axion Processors, which it says deliver industry-leading performance and energy efficiency. These processors are part of Google’s long-running investment in custom silicon, which already includes five generations of Tensor Processing Units (TPUs) and the Tensor chips for mobile devices.
The Google Axion Processors are built on the Arm Neoverse™ V2 CPU and aim to deliver instances with up to 30% better performance than the fastest general-purpose Arm-based instances available in the cloud today. They are optimized for a range of general-purpose workloads, including web and app servers, containerized microservices, open-source databases, in-memory caches, data analytics engines, media processing, and CPU-based AI training and inference.
Read More: 50% Faster and 60% More Efficient: Google Axion CPU Compared to x86 – techovedas
3. AWS’s Custom Chips: Graviton, Trainium, and Inferentia
Amazon Web Services’ custom silicon effort includes the AWS Graviton processors, custom-built Arm-based CPUs for a wide range of general cloud workloads. The latest addition, Graviton4, aims to deliver exceptional performance and energy efficiency for cloud workloads. It packs 96 Neoverse V2 cores, each with 2 MB of L2 cache, and runs at speeds of up to 3.4 GHz.
Compared to the previous-generation Graviton3, it is 40% faster for databases, 30% faster for web applications, and 45% faster for large Java applications.
AWS Trainium, the company’s second-generation ML accelerator, is designed for deep learning training of models with over 100 billion parameters. Each Trainium accelerator includes two NeuronCores and 32 GB of high-bandwidth memory, delivering up to 190 teraflops of FP16/BF16 compute. It features NeuronLink, a high-speed interconnect technology that supports efficient data and model parallelism.
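To put the 100-billion-parameter and 190-teraflop figures in perspective, here is a rough training-time estimate using the widely cited "compute ≈ 6 × parameters × tokens" rule of thumb. The token count, cluster size, and utilization below are assumptions for illustration, not AWS numbers.

```python
# Rough training-compute estimate using the common heuristic:
# total FLOPs ~= 6 * parameters * training tokens.
# Token count, cluster size, and utilization are illustrative assumptions.
params = 100e9                # 100B-parameter model (from the article)
tokens = 1e12                 # assumed 1 trillion training tokens
total_flops = 6 * params * tokens

per_chip_flops = 190e12       # up to 190 TFLOPS FP16/BF16 per Trainium (article)
num_chips = 512               # assumed cluster size
utilization = 0.4             # assumed fraction of peak actually sustained

seconds = total_flops / (per_chip_flops * num_chips * utilization)
print(f"Estimated training time: ~{seconds / 86400:.0f} days")
```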
AWS Inferentia accelerators, by contrast, are purpose-built for high-performance machine learning inference at the lowest cost. They power Amazon EC2 Inf1 instances, virtual machines designed specifically for low-cost, high-performance inference, and deliver up to 2.3x higher throughput and up to 70% lower cost per inference than comparable Amazon EC2 instances.
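The throughput and cost-per-inference comparison above boils down to two quantities: inferences per second and instance price per hour. The sketch below shows how such a comparison is typically computed; the prices and throughput figures are placeholders, not published AWS benchmarks.

```python
# How cost-per-inference comparisons are typically computed:
# cost per inference = (instance $/hour) / (inferences per hour).
# Throughput and price figures below are placeholders, not AWS benchmarks.
def cost_per_million(price_per_hour: float, inferences_per_sec: float) -> float:
    inferences_per_hour = inferences_per_sec * 3600
    return price_per_hour / inferences_per_hour * 1e6

baseline = cost_per_million(price_per_hour=3.06, inferences_per_sec=1000)  # hypothetical GPU instance
inf1 = cost_per_million(price_per_hour=2.11, inferences_per_sec=2300)      # hypothetical Inf1 instance

print(f"Baseline: ${baseline:.2f} per million inferences")
print(f"Inf1:     ${inf1:.2f} per million inferences "
      f"({(1 - inf1 / baseline) * 100:.0f}% lower)")
```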
Read More: Why Hardware Accelerators Are Essential for the Future of AI – techovedas
4. Meta’s Custom AI Chips: Meta Training and Inference Accelerator (MTIA)
Meta has introduced the next generation of its custom-made chips designed for AI workloads, the Meta Training and Inference Accelerator (MTIA).
The first-generation MTIA v1 chip, introduced in 2021, helped train and run Meta’s deep learning recommendation models across its platforms.
Meta has now unveiled the second generation of its custom AI chip, MTIA v2, which promises significantly improved performance. The MTIA v2 chip uses a 5nm process node, allowing it to pack in more transistors and reach higher clock speeds than MTIA v1, which was built on a 7nm node. It also doubles the on-chip memory, to 256MB from 128MB on the previous chip. Meta’s initial internal testing shows that MTIA v2 outperforms MTIA v1 by up to 3x on key AI models. The company plans to deploy MTIA v2 more widely in its data centers to power the ranking and recommendation systems for Facebook and Instagram.
Read More: Why AI Needs a New Chip Architecture? – techovedas
Conclusion
Forecasts suggest that the AI chip market could experience explosive growth, potentially surging to between $110 billion and $400 billion by 2027. As companies continue to innovate and invest in custom silicon, we can expect a dynamic market to develop around these chips.