What Is Huang's Law? 1000x Chip Performance in 10 Years

Huang's law is predicting that GPUs will become 1000x faster in the next 10 years.

Introduction

In a recent online talk, NVIDIA Chief Scientist Bill Dally shed light on a transformative shift in computer performance delivery in a post-Moore’s law era. A remarkable 1000x improvement in single GPU performance for AI inference over the past decade has spearheaded this shift, earning it the moniker “Huang’s Law” after NVIDIA’s CEO, Jensen Huang. The advancements are in response to the exponential growth of large language models utilized in generative AI.

(Image credit: Nvidia)

What Is Huang's Law?

Huang's law is an observation outlined by Nvidia CEO Jensen Huang: a new law of computing predicting that GPUs will become 1000x faster over the next 10 years. As a result, GPUs are now more powerful than CPUs for many tasks, including machine learning, artificial intelligence, and data science.

Huang has been a vocal advocate for the power of GPUs, and he has been instrumental in driving their rapid development.

Factors Contributing to Huang's Law

Several factors contribute to Huang's law. One is that GPUs are inherently better suited to parallel processing than CPUs. Parallel processing is the ability to divide a problem into smaller parts and then solve each part simultaneously. This is ideal for many computationally intensive tasks, such as machine learning and artificial intelligence.
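As a rough illustration of this divide-and-conquer idea (plain Python threads here stand in for the thousands of GPU cores; this is a conceptual sketch, not how GPUs are actually programmed):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, workers=4):
    # Divide the problem into roughly equal chunks, one per worker.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Solve each sub-problem concurrently, then combine the partial results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, chunks))

print(parallel_sum(list(range(1000))))  # same answer as sum(range(1000)): 499500
```

A GPU applies the same pattern at a vastly larger scale, running thousands of such sub-problems at once.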

Another factor is that the design of GPUs is constantly being improved. For example, Nvidia has developed new technologies such as Tensor cores and RT cores that are specifically designed for machine learning and graphics workloads.

Finally, the availability of high-quality software libraries and frameworks for GPUs is also playing a role in Huang’s law. These libraries and frameworks make it easier for developers to write code that takes advantage of the power of GPUs.

Huang’s law is having a major impact on the computing landscape. GPUs are now the preferred platform for many computationally intensive tasks. This is leading to a shift in the way that software is developed and deployed.

For example, machine learning models are now typically trained on GPUs, which significantly reduces training time and cost. GPUs are also accelerating the development of new AI applications such as self-driving cars and chatbots.

Additionally, various industries, including finance, healthcare, and marketing, utilize GPUs to analyze large datasets and extract valuable insights.

The New Math: A Sixteen-Fold Gain in Number Representation

The latest NVIDIA Hopper architecture, equipped with the Transformer Engine, uses a flexible mix of eight- and 16-bit math suited for modern generative AI models. Dally detailed how this unique math improves both performance and energy efficiency.
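The details of the Transformer Engine's low-precision formats are NVIDIA's own, but the general idea behind narrow number representations can be sketched with simple per-tensor 8-bit quantization (an assumption chosen for illustration, not NVIDIA's actual scheme):

```python
def quantize_int8(weights):
    # Map float weights into the signed 8-bit range [-127, 127]
    # using a single per-tensor scale (a common, simple scheme).
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from the 8-bit codes.
    return [v * scale for v in q]

weights = [0.02, -0.51, 1.27, -1.0]
q, s = quantize_int8(weights)
approx = dequantize(q, s)
# Each recovered value is close to the original,
# at a quarter of the storage of 32-bit floats.
```

Narrower numbers mean less memory traffic and cheaper arithmetic, which is where much of the efficiency gain comes from.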

Dally's team also delivered a 12.5x performance gain by crafting sophisticated instructions that optimize how the GPU organizes its work, enabling it to execute more work with less energy.

This approach enables computers to rival specialized accelerators in efficiency while preserving GPU programmability, as emphasized by Dally.

“It’s a fun time to be a computer engineer,”

NVIDIA Chief Scientist Bill Dally

(Image credit: Nvidia)

Structural Sparsity: Innovations in Simplifying AI Models

The NVIDIA Ampere architecture introduced structural sparsity, an innovative technique that simplifies the weights in AI models without compromising accuracy. This brought an additional 2x performance increase, with promising future advancements in the pipeline.
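Ampere's scheme is commonly described as 2:4 sparsity: in every group of four weights, two are zeroed out, and the sparse Tensor Cores skip those zeros. A minimal magnitude-based pruning sketch (an illustrative simplification; real pruning also involves retraining to recover accuracy):

```python
def prune_2_4(weights):
    # Enforce 2:4 structural sparsity: in every group of four weights,
    # zero out the two with the smallest magnitude.
    out = []
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        # Indices of the two largest-magnitude weights in this group.
        keep = sorted(range(len(group)), key=lambda j: abs(group[j]))[-2:]
        out.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return out

dense = [0.9, -0.1, 0.05, -0.7, 0.3, 0.2, -0.8, 0.01]
sparse = prune_2_4(dense)
print(sparse)  # [0.9, 0.0, 0.0, -0.7, 0.3, 0.0, -0.8, 0.0]
```

Because exactly half the weights in each group are zero, hardware that skips them can roughly double throughput, which is where the 2x figure comes from.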

Compound Gains: NVLink Interconnects and Networking

NVLink interconnects between GPUs within a system and NVIDIA networking among systems compounded the 1,000x gains in single GPU performance, further optimizing and streamlining computational processes.

Read More: Why Moore’s law is not a law?

The Role of Semiconductor Technology: Beyond Moore’s Law

NVIDIA's migration from 28 nm to 5 nm semiconductor nodes accounted for only 2.5x of the total gains. This is a significant departure from the past, when Moore's Law delivered exponential growth through smaller, faster transistors. The diminishing returns from Moore's Law have forced a shift toward innovative approaches in computer design.
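Treating the contributions cited in this article as independent multiplicative factors (an approximation; other breakdowns of the same decade differ slightly), the numbers compound to the headline figure:

```python
# Approximate multiplicative breakdown of the ~1000x single-GPU gain,
# using the factors cited in this article.
gains = {
    "number representation (FP32 -> FP8/FP16)": 16.0,
    "sophisticated instructions": 12.5,
    "structural sparsity": 2.0,
    "process node (28 nm -> 5 nm)": 2.5,
}

total = 1.0
for factor in gains.values():
    total *= factor

print(total)  # 1000.0
```

Notably, the process-node migration contributes only 2.5x of the 1000x; the rest comes from architecture, number formats, and sparsity, which is the core of the "beyond Moore's Law" argument.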

IEEE Spectrum coined the term "Huang's Law" for this remarkable boost, naming it after NVIDIA's CEO and founder, Jensen Huang. The term subsequently gained widespread popularity through a column in the Wall Street Journal.

Future Prospects: An Upbeat Outlook

Despite the diminishing gains from Moore’s Law, Dally expressed confidence in the continuation of Huang’s Law. He detailed future opportunities: simplifying number representation, introducing more sparsity in AI models, and enhancing memory and communication circuits. This shift in design offers exciting prospects for engineers to collaborate in successful teams, engage with brilliant minds, and create impactful designs.


The computer performance landscape is changing significantly, as demonstrated by NVIDIA's remarkable 1000x improvement in single-GPU AI inference over the last ten years. This trend, dubbed "Huang's Law," marks a shift from relying solely on Moore's Law and underscores the need for innovative computer-design approaches to meet evolving technology demands. NVIDIA's journey shows how the fusion of creativity and technology shapes the future of computer performance, presenting exciting opportunities for engineers to drive groundbreaking advancements and leave a lasting influence.

Kumar Priyadarshi

Kumar joined IISER Pune after qualifying IIT-JEE in 2012. In his 5th year, he travelled to Singapore for his master's thesis, which yielded a research paper in ACS Nano. He then joined GlobalFoundries in Singapore as a process engineer working at the 40 nm process node. Later, as a senior scientist at IIT Bombay, he led the team that built India's first memory chip with Semiconductor Lab (SCL).
