Mar 5, 2020
Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.
Today’s apps demand a lot from our mobile devices. From games to VR, and signal processing to AI, modern apps are truly multifaceted. For mobile developers this creates a major challenge: how to optimize for performance while meeting the demand for power efficiency and long battery life. And how can we deliver on both needs when the growing requirement for computing power always seems to outpace advances in battery technology?
The key lies in heterogeneous computing, which has been at the heart of mobile processor designs from the beginning. Our flagship Qualcomm Snapdragon 8xx mobile platform offers developers a number of specialized processors, which continue to evolve with each generation. In this blog we’ll review the processors found inside Snapdragon and look at some of our tools aimed at supporting developers with heterogeneous and power efficient computing.
Heterogeneous from the core
Our Snapdragon series includes three specialized processors around which developers can optimize for heterogeneous and power-efficient computing:
- Qualcomm Kryo CPU: an ARM-based CPU featuring multiple cores configured in a big.LITTLE architecture. Our Kryo CPU supports multiple task scheduling methods.
- Qualcomm Adreno GPU: the mobile platform’s GPU. While it is primarily used for rendering, developers have the option to offload CPU operations (e.g., vector processing) to our Adreno GPU, which can run them in parallel with the CPU. (Note: our Adreno SDK provides interfaces for rendering, but not for heterogeneous computing.)
- Qualcomm Hexagon DSP: the Hexagon DSP provides both CPU and DSP functionality to which developers can offload compute-intensive tasks. Our Hexagon DSP excels at handling vector data and provides hardware multithreading. It is optimized to work under reduced clock rates while accomplishing more work per clock cycle.
Of the three processors, our Hexagon DSP is one of the most interesting because it has evolved from a DSP for image and audio processing into a vector-processing powerhouse for neural networks, virtual reality, and other process-intensive applications. Developers effectively offload functions to our Hexagon processor, which executes them using its multiple hardware threads, very long instruction word (VLIW) instruction set, and RTOS (for scheduling). Most notably, the Hexagon DSP features both a 64-bit wide scalar unit and two Hexagon Vector eXtensions (HVX) units for vector processing, each operating on vectors up to 128 bytes (1024 bits) wide.
Tools from Qualcomm Technologies
From our rich set of tools, the following help developers implement and optimize for heterogeneous computing and power efficiency:
- Snapdragon Heterogeneous Compute SDK: provides developers with the ability to allocate work to any of the three processors on Snapdragon. The SDK provides C++ APIs for the Kryo CPU and Adreno GPU, the latter of which interacts through OpenGL and OpenCL calls. For the Hexagon DSP, developers should use the Hexagon DSP SDK.
- Snapdragon Power Optimization SDK: provides C++ APIs for balancing processing performance and power efficiency on the Kryo CPU and the Adreno GPU.
- Snapdragon Profiler: a standalone tool that connects with Android devices powered by Snapdragon processors over USB, and allows developers to analyze CPU, GPU, DSP, memory, power, thermal, and network data.
While the Heterogeneous Compute SDK is primarily focused on heterogeneous computing (e.g., processor and core selection), it’s often paired with the Power Optimization SDK to fulfill power efficiency requirements, which typically go hand in hand with heterogeneous compute requirements. On top of this, developers should also use the Snapdragon Profiler to monitor performance, power, and heat while integrating these SDKs.
The following diagram summarizes these tools and their dependencies, and outlines the processors that each of these tools and SDKs interacts with:
The Heterogeneous Compute SDK provides the following constructs that abstract away low-level system calls and device interactions for heterogeneous computing, making it easy to allocate work across the Snapdragon CPU, GPU, and DSP (click the links for more detailed App Notes for each):
- Kernel: encapsulates a unit of work to be executed on one of the three processors. A CPU kernel encapsulates a function, lambda, or function pointer; a GPU kernel encapsulates OpenCL and OpenGL calls; and a DSP kernel encapsulates logic written using the Hexagon DSP SDK.
- Affinity: interfaces for selecting the Kryo CPU core on which to execute a CPU kernel. Developers can use this to take advantage of the Kryo CPU’s big.LITTLE architecture dynamically at runtime.
- Patterns: constructs for performing common operations (e.g., iterating on a collection) in parallel, either across CPU cores or across the three processors. Developers can further optimize performance using the SDK’s pattern tuners (e.g., to set the degree of concurrency across threads).
- Tasks: execute code asynchronously on any of the processors, and provide facilities to manage dependencies (e.g., one task consuming another’s output) between concurrent tasks.
- Buffers: interfaces for managing the Snapdragon memory, which is shared between the three processors. Buffers include facilities for optimizing shared memory operations between dependent tasks.
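To make the task and buffer concepts above more concrete, here is a minimal sketch of two dependent tasks sharing data, written with the C++ standard library only. It does not use the Heterogeneous Compute SDK: the `Buffer` alias and the `produce`, `consume`, and `run_pipeline` functions are hypothetical names chosen for illustration, and everything runs on CPU threads rather than across the CPU, GPU, and DSP.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a buffer shared between dependent tasks.
using Buffer = std::vector<int>;

// Producer task: fills a buffer with the values 1..n.
Buffer produce(int n) {
    assert(n >= 0);
    Buffer data(static_cast<std::size_t>(n));
    std::iota(data.begin(), data.end(), 1);
    return data;
}

// Consumer task: sums the contents of the buffer.
long long consume(const Buffer& data) {
    return std::accumulate(data.begin(), data.end(), 0LL);
}

// Launches the producer asynchronously, then a consumer that depends on the
// producer's output. The shared_future expresses the dependency between them.
long long run_pipeline(int n) {
    std::shared_future<Buffer> produced =
        std::async(std::launch::async, produce, n).share();

    std::future<long long> total =
        std::async(std::launch::async, [produced] {
            // Blocks until the producer task has finished.
            return consume(produced.get());
        });

    return total.get();
}
```

Calling `run_pipeline(100)` starts both tasks, and the consumer only proceeds once the producer’s buffer is ready. An SDK that manages tasks and buffers would presumably handle this ordering and the data placement itself; the sketch only shows the shape of the dependency.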
The Power Optimization SDK allows developers to programmatically optimize power efficiency at runtime using reactive and proactive models. In the reactive model, the system adjusts power based on thermal constraints, while the proactive model involves the use of the SDK’s static and dynamic mode APIs to dynamically optimize for power efficiency. The static mode APIs allow developers to select a power efficiency profile, while the dynamic mode APIs provide much more granular control, such as the ability to adjust the number of CPU cores and their frequencies.
To see how all of this comes together, we recommend you check out this four-part Getting Started guide. It provides an excellent introduction on how to use the Heterogeneous Compute SDK and Power Optimization SDK in conjunction with the Snapdragon Profiler for heterogeneous and power efficient computing.
The first part of the guide shows how to use the Snapdragon Profiler to profile CPU usage and frequency for iterating over 10 million values. The second part shows how the APIs of the Heterogeneous Compute SDK can be used to optimize the loop via parallel processing, and discusses the trade-off between performance and power consumption. The third part shows how the loop can then be optimized for power consumption using the Power Optimization SDK. The fourth part of the guide summarizes the optimizations, and just how easy it can be to achieve results using the SDKs.
Today’s multifaceted applications require developers to balance performance and power efficiency in their mobile apps. Our flagship Snapdragon 8xx series mobile platform provides three processors backed by a rich set of SDKs that developers can use to meet these demands. So be sure to check out the Getting Started guide mentioned above.
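As a rough illustration of the kind of loop optimization the guide walks through, the following sketch splits an iteration over a large collection across a configurable number of CPU threads. It is an assumption-laden stand-in written with the C++ standard library, not the guide’s code or the SDK’s pattern API: `parallel_for_each` and its `num_workers` parameter are hypothetical names, with `num_workers` playing the role of a degree-of-concurrency tuner.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Applies fn to every element of values, splitting the range into contiguous
// chunks and processing each chunk on its own thread.
void parallel_for_each(std::vector<float>& values, unsigned num_workers,
                       const std::function<void(float&)>& fn) {
    assert(num_workers > 0);
    const std::size_t n = values.size();
    // Ceiling division so that every element lands in exactly one chunk.
    const std::size_t chunk = (n + num_workers - 1) / num_workers;

    std::vector<std::thread> workers;
    for (unsigned w = 0; w < num_workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // No work left for further threads.
        workers.emplace_back([&values, &fn, begin, end] {
            for (std::size_t i = begin; i < end; ++i) fn(values[i]);
        });
    }
    // Wait for every chunk to finish before returning.
    for (auto& t : workers) t.join();
}
```

For example, a vector of 10 million floats could be processed with `parallel_for_each(values, 4, [](float& x) { x = x * 2.0f + 1.0f; })`. A reasonable starting point for `num_workers` is `std::thread::hardware_concurrency()`, though more threads generally means more cores awake and more power drawn, which is the trade-off the guide’s second and third parts discuss.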