OnQ Blog

Mobile Heterogeneous Computing in Action [w/videos]

October 15, 2013

Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.

Talk is cheap. If you’ve been following my series of blogs about mobile heterogeneous computing, you’ll have seen that I’ve talked about its importance for enabling breakthrough experiences and about how Qualcomm Technologies offers a comprehensive solution. Rather than continue to just talk about it, I wanted to show some real-world examples of heterogeneous computing in action.

Qualcomm Technologies looked at this issue in a recent webinar, in which third-party companies illustrated how they were able to deliver breakthrough mobile experiences at low power by taking advantage of heterogeneous computing on the Snapdragon processor. The three companies that presented were Pelican Imaging, MuseAmi, and ArcSoft. Besides explaining its application, each company emphasized how it took advantage of the compute capabilities of the programmable processing engines (the CPU, GPU, and DSP) on Snapdragon processors. (The following is based on their respective presentations.)

Pelican Imaging: Computational camera with depth-enabled imaging

Pelican Imaging spoke about all the great experiences possible with depth-enabled imaging by using an array camera. Their array camera generates 16 low resolution images and combines them into a super resolution image along with a depth map by performing complex computation. Having the depth map enables many interesting experiences such as image refocus, selective filtering and image segmentation, and the unique capability of measuring the distance to any object in the photo or video.
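As a rough illustration of how a distance can be recovered from an array of cameras, here is a minimal sketch of the textbook stereo relation, depth = focal length × baseline / disparity. This is an assumption for illustration only: the presentation does not describe Pelican Imaging's actual algorithms, and the function name and all numbers below are hypothetical.

```python
def depth_from_disparity(focal_length_px: float,
                         baseline_m: float,
                         disparity_px: float) -> float:
    """Textbook pinhole stereo relation Z = f * B / d.

    Illustrative only; this is NOT Pelican Imaging's method.
    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between two cameras in meters
    disparity_px:    horizontal shift of the same point between the two views
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point infinitely far away.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers: 1000 px focal length, cameras 1 cm apart.
near = depth_from_disparity(1000.0, 0.01, 20.0)  # large shift, close object
far = depth_from_disparity(1000.0, 0.01, 2.0)    # small shift, distant object
print(f"near object: {near:.2f} m, far object: {far:.2f} m")
```

Closer objects shift more between views, so a larger disparity yields a smaller depth. Computing such a value for every pixel is the kind of highly parallel work that benefits from being spread across several processing engines.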

The imaging algorithms Pelican Imaging developed to enable these experiences specifically run on the CPU, GPU and DSP. By using these diverse processing engines, Pelican Imaging claimed up to 100x performance improvement and up to 10x power improvement for key algorithms. By taking advantage of the heterogeneous computing capability of Snapdragon, the desired computational imaging experiences can be achieved while fitting within the performance, power, and thermal constraints of mobile devices.

MuseAmi: Computer vision and audio analysis

MuseAmi talked about the software they’ve created that can see and hear the way human beings do. MuseAmi’s technology uses machine learning and digital signal processing to create software that detects, analyzes, and categorizes both audio and images. The MusicPal application allows a person to snap a photo of notated music and then play back that music on their choice of instruments.

In addition, the application can act as a tutor by providing real-time evaluation of a student’s playback accuracy vs. the golden reference. The MusicPal application is optimized by using the CPU, GPU, and DSP for these complex algorithms. MuseAmi expects that further optimization within the GPU and CPU on the Snapdragon processor would result in even faster on-device processing, possibly 8x faster.

ArcSoft: Image processing

ArcSoft talked about the powerful image processing enabled by their software algorithms. ArcSoft develops many sophisticated imaging algorithms to provide a better camera experience, such as face recognition, high dynamic range, and improved image quality. NightHawk™ is ArcSoft’s low-light video capture technology, which greatly improves video taken in poor lighting conditions. Using only the CPU, NightHawk™ would not be able to run in real time: the frame rate would be too low for a good user experience, and the power consumption would be too high. ArcSoft claims that, by using the CPU, GPU, DSP, and ISP, the NightHawk™ application is able to run in real time with more than 30% power savings.

Be sure to check out the webinar for many more details about the applications, how the diverse processing engines are being used, and the actual benefits in terms of performance and power.

Want to learn more? Look for future blogs and webinars to learn about Qualcomm’s view on mobile heterogeneous computing.

Pat Lawlor

Senior Manager, Technical Marketing

Heterogeneous Computing: An architecture and a technique

If you’re looking to create great mobile experiences, optimization isn’t optional: it’s a crucial step that helps transform good ideas into great execution. In our previous “Start Cooking with Heterogeneous Computing Tools on QDN” blog, we discussed the concept of heterogeneous computing and how it can help you get more from mobile hardware by sending computational tasks to the best suited processor. Heterogeneous computing is designed to help you achieve better application performance while improving thermal and power efficiency.

However, not all systems capable of heterogeneous computing are created equal, and it’s important to understand why. Heterogeneous computing is both a computational technique and a hardware architecture. To achieve greater benefits, you are better served by hardware architected for heterogeneous computing from the ground up, along with a software stack that facilitates heterogeneous computing techniques. It’s the combination of purpose-built hardware and a software stack offering granular control within a larger framework of system abstraction that allows for the deep optimizations that heterogeneous computing can deliver.

The Qualcomm Snapdragon Mobile Platform is designed on these principles. This starts with the microarchitecture – the choices made in platform circuitry that include how individual processors are engineered for high performance and how to optimize compute paths between the processors. Let’s look at the main components of the Snapdragon mobile platform and a few of the microarchitecture considerations that went into its design:

Qualcomm Kryo 280 CPU

Designed to handle complex workloads like web browsing and in-game artificial intelligence, the Kryo 280 features an octa-core processor with independent high-efficiency and high-performance core clusters. During normal operation, the high-efficiency cores run most tasks, while the high-performance cores activate for anything needing more power.

Qualcomm Hexagon 682 DSP

With the Hexagon wide Vector eXtensions (HVX), the Hexagon DSP excels at applications requiring heavy vector data processing, such as 6-DOF (or Degrees of Freedom) head motion tracking for virtual reality, image processing, and neural network computations. With a 1024-bit instruction word capability and dual execution of the control code processor and the computational code processor within the DSP, Hexagon can achieve breakthrough performance without draining system power.
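To get a feel for why wide vectors help with image-style workloads, here is a small back-of-the-envelope sketch. It assumes the 1024-bit figure above can be read as the amount of data handled by one vector operation; the helper names and the 1920-pixel row are hypothetical, and nothing here uses or models the actual Hexagon SDK.

```python
import math

VECTOR_BITS = 1024  # width quoted in the text (assumed to be the vector width)


def lanes(element_bits: int) -> int:
    """Number of elements of the given width that fit in one vector."""
    return VECTOR_BITS // element_bits


def vector_ops_needed(num_elements: int, element_bits: int) -> int:
    """Vector operations needed to touch every element exactly once."""
    return math.ceil(num_elements / lanes(element_bits))


row_pixels = 1920  # hypothetical: one row of 8-bit grayscale pixels

print(lanes(8), lanes(16), lanes(32))     # 128 64 32
print(vector_ops_needed(row_pixels, 8))   # 15
```

Under these assumptions, a per-pixel operation on a 1920-pixel row takes 15 vector operations rather than 1920 scalar ones, which is the basic reason vector units suit image processing and neural network arithmetic.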

Qualcomm Adreno 540 GPU

Ideal for arithmetic-heavy workloads that require substantial, parallel number crunching like 3D graphics rendering and camcorder image stabilization, the Adreno GPU is engineered to achieve improved power efficiency and 40% better performance than predecessors. Designed to deliver up to 25% faster graphics rendering and 60x more display colors compared to previous designs, the Adreno GPU supports real-life-quality visuals, and can perform stunning visual display feats like stitching together 4K 360 video in real time.

Heterogeneous computing in microarchitecture design

Beyond the performance enhancements among the individual processors, the Snapdragon mobile platform was designed to optimize the use of the processors together. For example, the Hexagon DSP can bypass DDR memory, and the CPU cycles otherwise spent shuffling data, by streaming data directly from sensors to the DSP cache. Similarly, the Adreno GPU supports 64-bit virtual addressing, allowing for shared virtual memory (SVM) and efficient co-processing with the Kryo CPU. These are just two of the microarchitecture design choices in the Snapdragon mobile platform that make it cutting-edge for heterogeneous computing.

Software

As we noted at the beginning of this post, heterogeneous computing is also a technique. Truly supporting it requires a software stack that gives developers both the abstractions and the control to leverage the hardware optimizations according to the requirements of their application.

To program the DSP or the GPU for heterogeneous computation, and to maximize their performance, developers can use the Qualcomm Hexagon SDK and the Qualcomm Adreno SDK, respectively. These SDKs open a toolbox of controls allowing for precision manipulation of data and computational resources.

For system-wide heterogeneous computing control, the Qualcomm Symphony System Manager SDK provides software utilities designed to achieve better performance and lower power consumption from the Snapdragon mobile platform. Symphony is designed to manage the entire platform in different configurations so that the most efficient and effective combination of processors and specialized cores is chosen to get the job done as quickly as possible, with minimal power consumption.
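The idea of choosing the best-suited engine for a task can be sketched with a toy selection policy. This is not the Symphony API: the class, the function, the policy (lowest energy among engines that meet a deadline, otherwise the fastest engine), and all time and power figures are hypothetical and only meant to show the trade-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineEstimate:
    name: str
    time_ms: float   # estimated run time of the task on this engine
    power_mw: float  # estimated average power while running the task

    @property
    def energy_uj(self) -> float:
        # milliwatts * milliseconds = microjoules
        return self.time_ms * self.power_mw


def pick_engine(estimates, deadline_ms):
    """Lowest-energy engine that meets the deadline, else the fastest one."""
    on_time = [e for e in estimates if e.time_ms <= deadline_ms]
    if on_time:
        return min(on_time, key=lambda e: e.energy_uj)
    return min(estimates, key=lambda e: e.time_ms)


# Hypothetical estimates for one task; not measured values.
estimates = [
    EngineEstimate("CPU", time_ms=40.0, power_mw=900.0),
    EngineEstimate("GPU", time_ms=12.0, power_mw=1500.0),
    EngineEstimate("DSP", time_ms=20.0, power_mw=400.0),
]

relaxed = pick_engine(estimates, deadline_ms=33.0).name  # DSP: on time, least energy
tight = pick_engine(estimates, deadline_ms=15.0).name    # GPU: only engine on time
print(relaxed, tight)
```

With a relaxed deadline the slower but more frugal engine wins, while a tight deadline forces the faster, hungrier one. A real system manager would also have to account for data movement, thermal state, and contention between tasks.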

Developers can build their applications directly on top of these SDKs, and many opt for this route. However, there is a growing ecosystem of SDKs, frameworks, and supporting libraries for accelerating development within a given application domain. Two examples are QDN's Adreno SDK for Vulkan, for the Vulkan graphics API, and our recently released Snapdragon VR SDK.

How to Put Heterogeneous Computing Techniques into Practice with Tools from Qualcomm Developer Network

The Snapdragon mobile platform’s microarchitecture, software development kits and high-level frameworks layer upon each other to provide developers ultimate control of their application performance and power usage. This combination of hardware purposely designed for heterogeneous computing and software allowing the use of heterogeneous computing techniques is at the heart of delivering great user experiences.

Whether you’re a developer who has always written code that executes serially and is interested in pushing more work to other processors, or you already have heterogeneous computing know-how and want to use it more effectively (perhaps with precise control over the dynamic distribution of your workloads), visit QDN for the resources you need to level up your heterogeneous computing skills.

March 23, 2017