JP Data LLC
101 Jefferson Dr, 1st Floor
Menlo Park CA 94025
Phone
(408) 623 9165
Email
info at jpdata dot co
sales at jpdata dot co
Power limitations will force AI chip makers to look into alternate architectures for AI acceleration
The increasing complexity of AI algorithms has given rise to a new industry of AI chipsets. In the past two years, the top semiconductor companies, the major cloud providers, and a wave of startups have all jumped in to build chipsets. By Tractica’s estimates, there are over 100 design starts, with more being announced and new companies coming into existence every day.
The use cases for AI applications are widespread, ranging from ultra-low-power IoT devices to the enterprise/data center. This dictates different power and performance targets at different points in the market, which translates into a wide range of SoC requirements. For example, an SoC running AI algorithms for a smart camera may need to embed a video encoder alongside the AI engine, whereas an accelerator that only performs neural network (NN) computation in a PC or server needs only the NN calculation engine.
Power and silicon technology node set hard limits on the performance envelope for AI algorithm acceleration. Take enterprise chipsets for AI acceleration as an example. PCI Express is by far the standard for connecting x86 motherboards to peripheral cards. The current version of the PCI Express standard puts the power limit at 250 W with auxiliary power connectors and 75 W without them. This correlates directly with the power (and performance) that enterprise chipsets can offer. Nvidia’s high-end V100 PCIe accelerator consumes 250 W, whereas its inference-oriented T4 consumes 75 W. Nvidia could easily have added more gates and consumed more power, but the need for additional connectors would have made buyers reluctant to upgrade their PCIe setup, and thus reduced market penetration. Similarly, with the V100, Nvidia would have had to invent a new power delivery methodology to go beyond 250 W.
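The constraint described above can be sketched as a simple budget check. The 75 W slot-only and 250 W with-auxiliary-connector limits come from the text; the cards listed are the examples mentioned there, and the function name is just an illustration.

```python
# PCIe add-in card power limits, as described in the text: 75 W from the
# slot alone, 250 W when auxiliary power connectors are used.
PCIE_SLOT_ONLY_W = 75
PCIE_WITH_AUX_W = 250

def fits_pcie_budget(tdp_w, has_aux_connector):
    """Return True if a card with the given TDP fits its PCIe power budget."""
    limit = PCIE_WITH_AUX_W if has_aux_connector else PCIE_SLOT_ONLY_W
    return tdp_w <= limit

# Nvidia T4 (75 W, slot power only) and V100 PCIe (250 W, aux connectors)
assert fits_pcie_budget(75, has_aux_connector=False)      # T4 fits
assert fits_pcie_budget(250, has_aux_connector=True)      # V100 fits
assert not fits_pcie_budget(300, has_aux_connector=True)  # would need a new power scheme
```

Anything above 250 W fails the check, which is exactly the wall the text says Nvidia would hit without a new power delivery methodology.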
Every silicon technology node comes with its own power consumption characteristics. Current state-of-the-art silicon technology ranges from 12 nm to 7 nm. Physics dictates power consumption at a given node, and the total power consumed by any SoC is the sum of static and dynamic power. Static power depends on the size of the chip, whereas dynamic power depends on the number of switching events within it. Designers have many tricks for reducing power, but no matter what you do, power consumption eventually boils down to the physics of the given node, and overall chip power must conform to the requirements of the application. Measures such as liquid cooling bring their own cost and deployment issues.
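The static-plus-dynamic decomposition above follows the standard CMOS power model, which can be written as P_total = I_leak·V + α·C·V²·f. A minimal sketch, with entirely illustrative numbers (not figures for any real chip):

```python
def total_power_w(leakage_a, supply_v, activity, capacitance_f, freq_hz):
    """Standard CMOS power model: P_total = P_static + P_dynamic.

    P_static  = I_leak * V        (grows with die area / gate count)
    P_dynamic = a * C * V^2 * f   (grows with switching events)
    """
    p_static = leakage_a * supply_v
    p_dynamic = activity * capacitance_f * supply_v ** 2 * freq_hz
    return p_static + p_dynamic

# Hypothetical accelerator: 10 A leakage at 0.8 V supply, 30% average
# activity, 1e-7 F effective switched capacitance, 1.5 GHz clock.
p = total_power_w(10.0, 0.8, 0.3, 1e-7, 1.5e9)
# p_static = 8 W, p_dynamic = 28.8 W, so roughly 36.8 W total
```

Note how every lever the designer has (voltage, frequency, activity, capacitance) appears in the formula; this is why power savings ultimately come down to the physics of the node.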
The V100 is based on 12 nm semiconductor technology, and the next version of the chip may move to a smaller node, giving additional compute capacity. Today most servers and PCs use PCI Express 3, and PCI Express 4 is just starting to appear. The PCI Express 5 specification is being finalized, but there is no significant increase in allowed power in these newer standards. Thus anyone looking to penetrate the enterprise market will have to build chipsets that adhere to these numbers, and the amount of compute they can provide will ultimately be controlled by the physics of the technology node and the power budget.
Thankfully, we have not reached the saturation point of technology node scaling, so there are a few years to go before node shrinks stop. Until then, the industry can continue to use process technology as the means to add compute capacity. In the long run, however, a fundamental rethink is necessary to keep adding compute performance.
AI chip vendors are already working on fundamentally different approaches to AI acceleration that minimize power. These include processing in memory (PIM) from vendors such as Mythic, optical computing from companies such as Lightspeed, and neuromorphic computing such as Intel’s Loihi. Today these have limited ability to solve the AI algorithm acceleration problem, but in the long run they could become dominant technologies.
Compute capacity has increased drastically in the last few years, but algorithmic complexity has outpaced that increase. We still have some time before we must look into alternate architectures for AI acceleration, but the power limits imposed by different device form factors and by semiconductor technology will drive the need for alternate architectures sooner rather than later.
At the other extreme, the needs of cell phones are well known. The battery is a key component, and power is limited to a few watts at most. This budget is divided among the display, baseband, and other processing elements, leaving roughly 1 W for the CPU/AI chipset. Again, the compute performance one can fit in a chipset within this envelope is limited by the technology node and dictated by physics.
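The phone budget arithmetic above can be made concrete. The few-watts total and the ~1 W remainder for CPU/AI compute come from the text; the per-block split is illustrative.

```python
# Back-of-the-envelope phone power budget: a few watts total, divided
# among display, baseband, and other blocks. Per-block numbers are
# illustrative, not measurements.
total_budget_w = 4.0
display_w = 1.5
baseband_w = 1.0
other_w = 0.5

ai_cpu_budget_w = total_budget_w - (display_w + baseband_w + other_w)
# roughly 1 W left for the CPU/AI chipset
```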
This is not to discount mechanisms such as clock gating that companies can put in place to minimize power. However, this raises the point of chip utilization: to get maximum performance out of a chipset, you want utilization as high as possible, which means the maximum number of switching events and thus higher power consumption.
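The utilization trade-off above falls directly out of the dynamic-power term of the CMOS model (α·C·V²·f): pushing utilization up raises the switching activity factor, and dynamic power scales linearly with it. A sketch with illustrative parameters:

```python
def dynamic_power_w(utilization, capacitance_f=1e-7, supply_v=0.8, freq_hz=1.5e9):
    """Dynamic power a*C*V^2*f, treating utilization as the activity
    factor for this sketch. All parameter values are illustrative."""
    return utilization * capacitance_f * supply_v ** 2 * freq_hz

low = dynamic_power_w(0.2)   # lightly loaded chip
high = dynamic_power_w(0.9)  # near-full utilization
# 4.5x the utilization draws 4.5x the dynamic power
```

This is why clock gating helps an idle chip but cannot help a fully utilized one: at peak utilization there is nothing left to gate off.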