OnQ Blog

How on-device AI is enabling generative AI to scale [The future of AI is hybrid]

Providing cost savings as well as performance, personalization, privacy and security benefits

As generative artificial intelligence (AI) adoption grows at record-setting speeds [1] and computing demands increase [2], hybrid processing is more important than ever. Just as traditional computing evolved from mainframes and thin clients to today’s mix of cloud and edge devices, AI processing must be distributed between the cloud and devices for AI to scale and reach its full potential.

A hybrid AI architecture distributes and coordinates AI workloads between the cloud and edge devices, rather than processing in the cloud alone. The cloud and edge devices, such as smartphones, cars, personal computers, and Internet of Things (IoT) devices, work together to deliver more powerful, efficient and highly optimized AI.

The main motivation is cost savings. For instance, the cost per query of generative AI-based search is estimated to be 10 times that of traditional search methods [3], and this is just one of many generative AI applications.

Hybrid AI will allow generative AI developers and providers to take advantage of the compute capabilities available in edge devices to reduce costs. A hybrid AI architecture (or running AI on-device alone) offers the additional benefits of performance, personalization, privacy and security at a global scale.

These architectures can use different offload options to distribute processing between the cloud and devices, depending on factors such as model and query complexity. For example, if the model size, prompt length and generation length are each below a certain threshold and the model provides acceptable accuracy, inference can run completely on the device. If the task is more complex, the model can run across cloud and devices.
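
The offload decision described above can be sketched as a simple routing function. This is only an illustration: the threshold values, the names `Thresholds` and `choose_target`, and the `accuracy_ok` flag are all hypothetical, and a real system would derive its limits from the specific device and model.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    DEVICE = "device"  # inference runs completely on the device
    HYBRID = "hybrid"  # model runs across cloud and device


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values chosen for illustration only.
    model_params: int = 1_000_000_000
    prompt_tokens: int = 512
    generated_tokens: int = 256


def choose_target(model_params: int,
                  prompt_tokens: int,
                  generated_tokens: int,
                  accuracy_ok: bool,
                  thresholds: Thresholds = Thresholds()) -> Target:
    """Run on-device only if every size is under its threshold
    and the on-device model is accurate enough for the task."""
    fits_on_device = (
        model_params < thresholds.model_params
        and prompt_tokens < thresholds.prompt_tokens
        and generated_tokens < thresholds.generated_tokens
    )
    if fits_on_device and accuracy_ok:
        return Target.DEVICE
    return Target.HYBRID
```

With these assumed thresholds, a 500-million-parameter model handling a short prompt would be routed to the device, while a 7-billion-parameter model, or any request where on-device accuracy is judged insufficient, would be routed to the hybrid path.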

Hybrid AI even allows devices and the cloud to run models concurrently: devices run light versions of the model while the cloud processes multiple tokens of the full model in parallel and corrects the device answers if needed.
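
One way to picture this concurrent mode is a draft-and-verify loop, similar in spirit to what is often called speculative decoding. The sketch below is a toy under stated assumptions, not a description of any particular product: the two "models" are hypothetical lookup functions, the names `draft_tokens`, `verify_tokens` and `generate` are invented for illustration, and the verification loop stands in for what a real system could do in one parallel pass.

```python
from typing import Callable, List

EOS = "<eos>"
NextToken = Callable[[List[str]], str]  # maps a context to the next token


def draft_tokens(light_model: NextToken, context: List[str], k: int) -> List[str]:
    """Device side: generate up to k tokens, one at a time, with the light model."""
    drafted: List[str] = []
    for _ in range(k):
        token = light_model(context + drafted)
        drafted.append(token)
        if token == EOS:
            break
    return drafted


def verify_tokens(full_model: NextToken, context: List[str],
                  drafted: List[str]) -> List[str]:
    """Cloud side: check each drafted position against the full model.

    Every check depends only on the context and the drafted prefix, so the
    positions are independent and could be evaluated in parallel.
    """
    accepted: List[str] = []
    for i, token in enumerate(drafted):
        expected = full_model(context + drafted[:i])
        if token != expected:
            accepted.append(expected)  # correct the device answer, drop the rest
            break
        accepted.append(token)
    return accepted


def generate(light_model: NextToken, full_model: NextToken,
             prompt: List[str], k: int = 4, max_tokens: int = 16) -> List[str]:
    out = list(prompt)
    while len(out) - len(prompt) < max_tokens and out[-1:] != [EOS]:
        drafted = draft_tokens(light_model, out, k)
        out.extend(verify_tokens(full_model, out, drafted))
    return out


# Hypothetical toy models: the light model errs on exactly one token.
SENTENCE = "the future of ai is hybrid".split()


def full_model(context: List[str]) -> str:
    return SENTENCE[len(context)] if len(context) < len(SENTENCE) else EOS


def light_model(context: List[str]) -> str:
    token = full_model(context)
    return "cloud" if token == "ai" else token
```

Calling `generate(light_model, full_model, ["the"])` returns `["the", "future", "of", "ai", "is", "hybrid", "<eos>"]`: the first round accepts "future" and "of" and replaces the device's "cloud" with "ai", and the second round accepts the remaining drafted tokens unchanged. The output always matches what the full model alone would produce; the light model only determines how many tokens each verification round can confirm.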

 

The future of AI is hybrid — Part I: Unlocking the generative AI future with on-device and hybrid AI
Figure: In a device-centric hybrid AI architecture, the cloud is used only to offload AI tasks that the device cannot sufficiently perform.

 

Scaling generative AI with edge devices

The potential of hybrid AI grows further as powerful generative AI models become smaller while on-device processing capabilities continue to improve. AI models with more than 1 billion parameters are already running on phones with performance and accuracy levels similar to those of the cloud, and models with 10 billion parameters or more are slated to run on devices in the near future.

The hybrid AI approach is applicable to virtually all generative AI applications and device segments — including phones, laptops, extended reality headsets, cars and IoT. The approach is crucial for generative AI to scale and meet enterprise and consumer needs globally. We truly believe that the future of AI is hybrid. Read our whitepaper to learn more.

 

References

1: Buchholz, K. (Jan. 24, 2023). ChatGPT sprints to one million users. Statista. Retrieved on May 2, 2023 from https://www.statista.com/chart/29174/time-to-one-million-users/.

2: Sheth, S. (Feb. 5, 2023). Generative AI drives an explosion in compute: The looming need for sustainable AI. SiliconANGLE. Retrieved on May 2, 2023 from https://siliconangle.com/2023/02/05/generative-ai-drives-explosion-compute-looming-need-sustainable-ai/.

3: Dastin, J. et al. (Feb. 22, 2023). For tech giants, AI like Bing and Bard poses billion-dollar search problem. Reuters. Retrieved on May 2, 2023 from https://www.reuters.com/technology/tech-giants-ai-like-bing-bard-poses-billion-dollar-search-problem-2023-02-22/.

 


About the Authors
Ziad Asghar, SVP & GM, XR, Qualcomm Technologies, Inc.
Dr. Jilei Hou, Vice President, Engineering, Qualcomm Technologies, Inc.