OnQ Blog

Qualcomm Cloud AI 100 delivers top marks in MLPerf 1.0 tests

Results show Qualcomm Cloud AI 100 provides the most efficient AI inference acceleration across multiple AI frameworks and models.

May 14, 2021

Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.

It is not often that a new AI benchmark is released for both datacenters and edge infrastructure, and it is even rarer to see a new entrant showcase leading performance per watt. Performance and power efficiency are important metrics for organizations of any size, as every joule spent impacts the bottom line. Hot off the press is the new MLPerf Inference v1.0 benchmark from MLCommons, which includes an impressive submission for the Qualcomm Cloud AI 100.

By introducing new metrics to measure power consumption, MLCommons is responding to the need to benchmark power efficiency. Datacenters need to be not only powerful but also highly efficient to achieve the lowest Total Cost of Ownership (TCO). Products that deliver the highest performance at the lowest power are therefore the most critical.

Qualcomm Technologies submitted its Qualcomm Cloud AI 100 accelerator and convincingly topped the Datacenter and Edge inference charts with the highest inference performance at the lowest power among all MLPerf 1.0 submissions. The two images below illustrate the latest MLPerf 1.0 benchmark scores, with performance on the Y-axis and power consumption on the X-axis. With the leading products occupying the top/right-hand side of the chart, it is clear that the Qualcomm Cloud AI 100 is the most efficient AI accelerator on the MLPerf 1.0 list.

The Qualcomm Cloud AI 100 even topped the highly touted NVIDIA A100 in performance per watt by a large margin, as shown in the chart below. The full results are available on the MLCommons website.
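For readers who want to reproduce a performance-per-watt comparison from published throughput and power figures, the arithmetic can be sketched as below. This is a minimal illustration, not the MLPerf methodology itself: the `Submission` class, the function names, and the numbers in `EXAMPLE` are hypothetical placeholders and are not MLPerf results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One benchmark entry (hypothetical structure for illustration)."""
    name: str
    throughput_ips: float  # inferences per second
    avg_power_w: float     # average system power in watts


def perf_per_watt(sub: Submission) -> float:
    """Return inferences per second per watt for one entry."""
    if sub.avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return sub.throughput_ips / sub.avg_power_w


def rank_by_efficiency(subs):
    """Return entries sorted from most to least efficient."""
    return sorted(subs, key=perf_per_watt, reverse=True)


# Placeholder values chosen only to exercise the code; NOT benchmark data.
EXAMPLE = [
    Submission("system_a", 20000.0, 400.0),
    Submission("system_b", 30000.0, 1000.0),
]

if __name__ == "__main__":
    for sub in rank_by_efficiency(EXAMPLE):
        print(f"{sub.name}: {perf_per_watt(sub):.1f} inferences/s/W")
```

Note that in this made-up example the entry with the lower raw throughput ranks first, which is exactly why performance per watt is reported separately from peak performance.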

Beyond power/performance leadership, Qualcomm Technologies also led in the important metric of latency in edge AI inferencing. Low latency is vital to achieving the fastest response time and the best possible user experience. The Qualcomm Cloud AI 100 delivers the lowest latency (ms) at the lowest energy (joules) among all Edge device power submissions. The image below plots the Edge solutions by their single-stream latency for ResNet-50. The optimal devices are located in the lower left corner, showing that the DM.2 and DM.2e accelerators from Qualcomm Technologies lead the pack with the lowest latency and lowest energy.
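The "lower left corner" reading of a latency-versus-energy chart can be made concrete with a short sketch. It assumes that, in a single-stream scenario where one query is processed at a time, energy per inference is roughly average power multiplied by latency, and it treats a device as optimal when no other device is at least as good on both axes and strictly better on one. The function names and all device values are hypothetical and are not taken from the MLPerf results.

```python
def energy_per_inference_j(avg_power_w: float, latency_ms: float) -> float:
    """Approximate joules per inference as watts times seconds per query."""
    return avg_power_w * latency_ms / 1000.0


def pareto_front(points):
    """Return the names of devices not dominated on (latency_ms, energy_j).

    `points` maps a device name to a (latency_ms, energy_j) tuple.
    Lower is better on both axes.
    """
    front = []
    for name, (lat, energy) in points.items():
        dominated = any(
            (lat2 <= lat and energy2 <= energy)
            and (lat2 < lat or energy2 < energy)
            for other, (lat2, energy2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Placeholder devices for illustration only; NOT benchmark data.
EXAMPLE_POINTS = {
    "device_a": (1.0, 0.02),
    "device_b": (2.0, 0.05),
    "device_c": (0.8, 0.03),
}

if __name__ == "__main__":
    print(pareto_front(EXAMPLE_POINTS))
```

In this made-up data, `device_b` is slower and uses more energy than `device_a`, so it falls off the front, while `device_a` and `device_c` each win on one axis and both remain.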

The MLPerf 1.0 Power submission demonstrates that the Qualcomm Cloud AI 100 is the platform of choice for AI inferencing applications in both the Edge and Datacenter categories, delivering the highest number of inferences at the lowest latency and lowest energy utilization. We also recently broke the Peta Operations Per Second (POPS) performance barrier, together with AMD and Gigabyte, with over 120 POPS of our AI horsepower within a single standard datacenter server rack. The Qualcomm Cloud AI 100 provides a unique blend of high computational performance, low latency and low power utilization, and is well suited to a broad range of applications from Edge to Cloud.

Here’s what the media had to say:

“Nvidia's only defeats that Kharya acknowledged came in the new, separate MLPerf benchmarking for energy efficiency, in which it was narrowly bested by Qualcomm's [Cloud] AI 100 in two of six energy efficiency test categories on the basis of performance per watt.” -- Dan O’Shea, FierceElectronics

“The MLPerf V1.0 release is the first time to include power metrics, measured as total system power over at least a 10-minute run-time. While Qualcomm [Technologies] only submitted [Qualcomm Cloud] AI100 results for image classification and small object detection, the power efficiency looks good. The chip performs reasonably well, delivering up to 3X performance over the (aging) NVIDIA T4, while the more expensive and power hungry NVIDIA A100 roughly doubles the Qualcomm [Cloud AI] performance on a chip-to-chip basis, based on these limited benchmark submissions….The new Qualcomm Cloud AI100 platform delivers up to 70% better performance per watt for some data center inference workloads, at least on image classification. These submissions were run on the Gigabyte AMD EPYC server we recently mentioned.” -- Karl Freund, Forbes

Qualcomm Cloud AI is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Mike Vildibill

Vice President, Product Management, Qualcomm Technologies

John Kehrli

Senior Director, Product Management, Qualcomm Technologies

©2021 Qualcomm Technologies, Inc. and/or its affiliated companies.