May 14, 2021
Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.
It is not often that a new AI benchmark is released for both datacenter and edge infrastructure, and rarer still to see a new entrant take the lead in performance per watt. Performance and power efficiency matter to organizations of every size, because every joule spent affects the bottom line. Hot off the press is the new MLPerf Inference v1.0 benchmark from MLCommons, which includes an impressive submission for the Qualcomm Cloud AI 100.
By introducing new metrics to measure power consumption, MLCommons is responding to the need to benchmark power efficiency. Datacenters must be not only powerful but also highly efficient to achieve the lowest Total Cost of Ownership (TCO). Products that deliver the highest performance at the lowest power are therefore the most valuable.
Qualcomm Technologies submitted its Qualcomm Cloud AI 100 accelerator and convincingly topped the Datacenter and Edge inference charts, delivering the highest inference performance at the lowest power among all MLPerf v1.0 submissions. The two images below illustrate the latest MLPerf v1.0 benchmark scores, with performance on the Y-axis and power consumption on the X-axis. With the leading products occupying the top right-hand side of the chart, it is clear that the Qualcomm Cloud AI 100 is the most efficient AI accelerator on the MLPerf v1.0 list, beating out the competition.
The Qualcomm Cloud AI 100 even topped the highly touted NVIDIA A100 in performance per watt by a large margin, as shown in the chart below. The full results are available on the MLCommons website.
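The efficiency metric behind these charts is straightforward: throughput divided by average total system power over the measured run. The helper and numbers below are a minimal illustrative sketch, not actual MLPerf v1.0 scores or an official MLCommons tool.

```python
def perf_per_watt(throughput_inf_per_s: float, avg_power_w: float) -> float:
    """Performance per watt: inferences per second divided by the average
    total system power (in watts) measured over the benchmark run."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return throughput_inf_per_s / avg_power_w

# Hypothetical accelerator: 20,000 inferences/s at 80 W system power.
print(perf_per_watt(20_000, 80))  # prints 250.0 (inferences/s per watt)
```

On this metric, a lower-power device can outrank a faster but hungrier one, which is why a datacenter accelerator and an edge card can be compared on the same chart.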
Beyond power/performance leadership, Qualcomm Technologies also led in the important metric of latency in edge AI inferencing. Low latency is vital to achieving the fastest response time and the best possible user experience. The Qualcomm Cloud AI 100 delivers the lowest latency (ms) at the lowest energy (joules) among all Edge power submissions. The image below plots the Edge solutions by their single-stream latency for ResNet-50. The optimal devices sit in the lower-left corner, showing that the DM.2 and DM.2e accelerators from Qualcomm Technologies lead the pack with the lowest latency and lowest energy.
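The latency-versus-energy view above follows from basic physics: energy per inference in joules is average power in watts multiplied by latency in seconds. The sketch below illustrates that relationship with hypothetical figures; it is not derived from the actual submission data.

```python
def energy_per_inference_j(latency_ms: float, avg_power_w: float) -> float:
    """Energy consumed per single-stream inference:
    joules = watts x seconds, with latency converted from ms to s."""
    return avg_power_w * (latency_ms / 1000.0)

# Hypothetical edge card: 2 ms per inference at 15 W average system power.
print(energy_per_inference_j(2.0, 15.0))  # about 0.03 J per inference
```

This is why the lower-left corner of the chart is optimal: a device that cuts both latency and power shrinks the joules spent on every inference at once.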
The MLPerf v1.0 power submission demonstrates that the Qualcomm Cloud AI 100 is the platform of choice for AI inferencing applications in both the Edge and Datacenter categories, delivering the highest number of inferences at the lowest latency and lowest energy use. We also recently broke the Peta Operations Per Second (POPS) performance barrier, along with AMD and Gigabyte, packing more than 120 POPS of AI compute into a single standard datacenter server rack. The Qualcomm Cloud AI 100 offers a unique blend of high computational performance, low latency, and low power utilization, and is well suited to a broad range of applications from Edge to Cloud.
Here’s what the media had to say:
“Nvidia's only defeats that Kharya acknowledged came in the new, separate MLPerf benchmarking for energy efficiency, in which it was narrowly bested by Qualcomm's [Cloud] AI 100 in two of six energy efficiency test categories on the basis of performance per watt.” -- Dan O’Shea, FierceElectronics
“The MLPerf V1.0 release is the first time to include power metrics, measured as total system power over at least a 10-minute run-time. While Qualcomm [Technologies] only submitted [Qualcomm Cloud] AI100 results for image classification and small object detection, the power efficiency looks good. The chip performs reasonably well, delivering up to 3X performance over the (aging) NVIDIA T4, while the more expensive and power hungry NVIDIA A100 roughly doubles the Qualcomm [Cloud AI] performance on a chip-to-chip basis, based on these limited benchmark submissions….The new Qualcomm Cloud AI100 platform delivers up to 70% better performance per watt for some data center inference workloads, at least on image classification. These submissions were run on the Gigabyte AMD EPYC server we recently mentioned.” -- Karl Freund, Forbes