What are MAC Operations and Why Are They Useful for AI

AI models often involve large-scale matrix operations, such as multiplying input data by weight matrices and adding biases. This process forms the basis of forward and backward propagation algorithms used in training neural networks.

Introduction

The world of AI is abuzz with the latest advancements, particularly in generative models that can create realistic images, text, and even music. But behind the scenes, powering these impressive feats, lies a fundamental concept: the humble MAC operation. This article is designed to cut through the confusion and explain what MAC operations are, why they’re crucial for training AI, and how they impact the world around you.

What Does MAC Stand For?

MAC stands for multiply-accumulate. At its core, a MAC operation combines two steps: multiplication and accumulation. Imagine multiplying two numbers, say 3 and 5, to get 15. Instead of stopping there, you “accumulate” this result by adding it to another number, like 2, resulting in 17. This simple yet efficient process lies at the heart of MAC operations.

For instance, (2 * 3) + 4 = 10, where 2 and 3 are multiplied, and the result is accumulated with 4.
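As a rough sketch, the same arithmetic can be written in a few lines of Python. The helper name `mac` is made up for illustration and is not part of any library. It performs one multiply-accumulate step, and the loop shows how chaining such steps builds up a running sum of products.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: return acc + a * b."""
    return acc + a * b

# The example from the text: (2 * 3) + 4
result = mac(4, 2, 3)
print(result)  # 10

# Chaining MAC steps accumulates a sum of products (a dot product).
xs = [1, 2, 3]
ws = [4, 5, 6]
total = 0
for x, w in zip(xs, ws):
    total = mac(total, x, w)
print(total)  # 1*4 + 2*5 + 3*6 = 32
```

Each pass through the loop is one MAC operation, so a dot product of two length-3 vectors costs three MACs.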

These are two-input MAC operations. AI, on the other hand, requires a lot of parameters to predict results accurately. For example, if you are asked to guess the name of a book knowing only that it is yellow and 60 pages long, you won’t be able to guess it accurately. But if you are given more information, such as the author’s name, the publication house, the color, and the page count, you will probably be able to guess more accurately.

Here the color, publication house, author’s name, etc. are parameters. Thus, as a rule of thumb, the larger the number of parameters, the more accurate the results tend to be.

Matrix representation

This large number of parameters/inputs is arranged in the form of matrices and vectors. Consider a simple example where we have a neural network with two layers: an input layer with three neurons and a hidden layer with two neurons. The input data is represented as a vector x = [x1, x2, x3], and the weights (parameters) connecting the input layer to the hidden layer are represented as a matrix W with one weight for each input-to-hidden connection, six in total.

To compute the output of the hidden layer h, we perform matrix multiplication between the input vector x and the weight matrix W, followed by the addition of biases b.

In this example, each element of the output vector h is computed using MAC operations: multiplying each element of the input vector by the corresponding weight, accumulating the products, and adding the bias.
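The following is a minimal sketch of that hidden-layer computation in plain Python. It assumes W is stored with one row per hidden neuron (2 rows of 3 weights); the function name `dense_layer` and all numeric values are invented for illustration.

```python
def dense_layer(x, W, b):
    """Compute h[j] = b[j] + sum_i W[j][i] * x[i] using explicit MAC steps."""
    h = []
    for weights_j, bias_j in zip(W, b):
        acc = bias_j                  # start the accumulator at the bias
        for w, xi in zip(weights_j, x):
            acc += w * xi             # one MAC operation
        h.append(acc)
    return h

x = [1.0, 2.0, 3.0]                   # three input neurons
W = [[0.5, -1.0, 2.0],                # weights into hidden neuron 1 (assumed values)
     [1.5, 0.0, -0.5]]                # weights into hidden neuron 2 (assumed values)
b = [1.0, -2.0]                       # one bias per hidden neuron

h = dense_layer(x, W, b)
print(h)  # [5.5, -2.0]
```

With three inputs and two hidden neurons, the layer needs 3 × 2 = 6 MAC operations, one per weight.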

Why does the bias need to be “added”?

In a neural network, each neuron takes inputs, multiplies them by weights, and sums them up. This process helps the neuron decide whether to “fire” (activate) or not. Now, sometimes, even when all inputs are zero, we might want the neuron to fire. That’s where the bias comes in. It’s like a constant value that’s added to the sum before the neuron decides whether to activate or not. This bias gives the neuron flexibility, allowing it to learn and represent different patterns in the data more effectively.
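A small sketch can make this concrete. It assumes the simplest possible activation rule, where the neuron “fires” when its weighted sum exceeds a threshold of zero; real networks usually use smoother activation functions. The function name `neuron_fires` and the weight values are hypothetical.

```python
def neuron_fires(inputs, weights, bias, threshold=0.0):
    """Return True if bias + sum(inputs[i] * weights[i]) exceeds the threshold."""
    total = bias
    for x, w in zip(inputs, weights):
        total += x * w                # MAC step
    return total > threshold

zero_inputs = [0.0, 0.0, 0.0]
weights = [0.4, -0.2, 0.7]            # assumed values for illustration

print(neuron_fires(zero_inputs, weights, bias=0.0))  # False
print(neuron_fires(zero_inputs, weights, bias=0.5))  # True
```

With all inputs at zero the weights have no effect, so only the bias decides whether the neuron activates.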

How are MAC operations implemented in hardware?

In hardware, MAC operations involve multiplying elements of a matrix (weights) by corresponding elements of a vector (inputs), and then accumulating these products to produce an output. One way to achieve this is with resistive memory elements arranged in an array.

Resistive memory elements act as programmable resistors and form the foundation of in-memory computing architectures. Arranged in arrays, these elements enable efficient computation within the hardware itself.

A simple crossbar structure, comprising metal wires with resistive elements, allows each resistive element (or “device”) to be programmed to a specific conductance value. When voltage is applied, Ohm’s law governs the resulting current flow through the device, so the current equals the product of the applied voltage and the conductance value (I = G × V). This is the multiplication step.

Expanding the array to include multiple rows of devices enables the accumulation of currents, following Kirchhoff’s current law: the currents from the devices that share a column wire add together, mimicking the accumulation step of MAC operations.

Further extension of the array and assignment of different conductance values to each resistive element enables the creation of a matrix representing the weights (conductance matrix G). Input voltages applied to each row represent the input vector (vector V), while resulting currents from the bottom of each column depict the output vector (resultant vector I).
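The crossbar behaviour described here can be mimicked in software with a short, idealized sketch. It assumes perfect devices (no wire resistance, noise, or non-linearity) and uses arbitrary made-up conductance and voltage values; the function name `crossbar_currents` is hypothetical.

```python
def crossbar_currents(G, V):
    """Column currents of an idealized crossbar: I[j] = sum_i G[i][j] * V[i]."""
    n_cols = len(G[0])
    I = [0.0] * n_cols
    for g_row, v in zip(G, V):        # one row of devices per input voltage
        for j, g in enumerate(g_row):
            I[j] += g * v             # Ohm's law per device, summed per column
    return I

# Conductance matrix G: 3 rows (inputs) x 2 columns (outputs), arbitrary units.
G = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
V = [1.0, 0.5, 2.0]                   # input voltages, one per row

print(crossbar_currents(G, V))  # [12.5, 16.0]
```

In the physical array all of these multiplications and additions happen at once as currents flow, whereas the loop here performs them one MAC at a time.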

Real-life example: Image processing

1. From pixels to numbers

Imagine your favorite photo. Each pixel, the tiny building block of the image, holds a numerical value representing its color intensity. To work with images digitally, we convert these pixel values into a matrix, essentially a grid of numbers representing the entire image.

2. MAC operations take centre stage

Now, picture applying a filter. This involves modifying each pixel’s color based on its value and the values of its surrounding pixels. Here’s where MAC operations shine:

  • Filter Kernel: We define a small matrix called a filter kernel, whose numbers determine the filter’s effect.
  • Pixel-by-Pixel Transformation: For each pixel in the image, the MAC operation performs the following steps:
  1. Multiplication: It multiplies the value of the pixel and of each surrounding pixel by the corresponding value in the filter kernel.
  2. Accumulation: It adds up all these products, capturing the combined influence of the surrounding pixels.
  3. Result: This sum becomes the new color value for the pixel, reflecting the filter’s effect.

This process repeats for every pixel, creating a transformed image.
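The filtering steps above can be sketched in plain Python as follows. This is an illustrative version under stated assumptions: the image is a small grayscale grid, border pixels are simply left unchanged, and the kernel is a made-up 3 × 3 smoothing filter. The function name `apply_filter` is hypothetical.

```python
def apply_filter(image, kernel):
    """Apply a square kernel to a grayscale image (a list of rows).

    Border pixels, which lack a full neighborhood, keep their original values.
    """
    k = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(k, height - k):
        for c in range(k, width - k):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    # Multiply a neighbor by its kernel weight and accumulate.
                    acc += image[r + dr][c + dc] * kernel[dr + k][dc + k]
            out[r][c] = acc           # the sum becomes the new pixel value
    return out

image = [[0, 0, 0, 0],
         [0, 16, 16, 0],
         [0, 16, 16, 0],
         [0, 0, 0, 0]]

# Smoothing kernel whose weights sum to 1 (assumed for illustration).
smooth = [[1 / 16, 2 / 16, 1 / 16],
          [2 / 16, 4 / 16, 2 / 16],
          [1 / 16, 2 / 16, 1 / 16]]

blurred = apply_filter(image, smooth)
print(blurred[1][1])  # 9.0
```

Each output pixel here costs nine MAC operations, one per kernel entry, which is why filtering a large image adds up to millions of MACs.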

Conclusion

AI models often involve large-scale matrix operations, such as multiplying input data by weight matrices and adding biases. These operations are repeated millions or even billions of times during the training process. MAC operations efficiently combine the multiplication and addition steps, reducing the computational burden and speeding up the training process.

This process forms the basis of forward and backward propagation algorithms used in training neural networks. Additionally, matrix multiplication enables the parallel processing of data, making it well-suited for implementation on parallel computing architectures like GPUs and TPUs.

himansh_107