Qualcomm® 
AI Inference Suite
For cloud and on-premises deployments.
Qualcomm®
AI Inference Suite
Get Software
chip image

A comprehensive set of ready-to-use AI applications, agents, tools, and libraries for developing and deploying AI inference on premises or via cloud deployments.  

The Qualcomm® AI Inference Suite comprises a Python SDK and OpenAI-compatible APIs. These interfaces simplify deployment of AI applications and agents powered by Qualcomm® Cloud AI inference accelerator cards to achieve industry-leading performance per watt at a low total-cost-of-ownership (TCO).

Use pre-trained generative AI models for chat, generative multimedia, and retrieval-augmented generation (RAG).

Start experimenting now in three easy steps: 

1. Run:
pip install python-imagine-sdk
2. Select an AI inference playground and generate an API key.
 
3. Visit our Getting Started Guide.

Qualcomm Dev Playground Intro

02:13
Qualcomm Dev Playground Intro

2:13

Video Player is loading.
Current Time 0:00
Duration 2:12
Loaded: 4.48%
Stream Type LIVE
Remaining Time 2:12
 
1x
  • Chapters
  • descriptions off, selected
  • captions off, selected
  • en (Main), selected

Deployment Options

Whether you want to experiment or are ready to deploy your AI inference application, choose from cloud developer playgrounds, cloud AI inference-as-a-service, or on-prem appliances powered by Qualcomm Cloud AI inference accelerators.

Developer Playgrounds

The region-specific AI inference playgrounds below provide free-to-test examples, documentation, and other learning resources so you can immediately experiment with AI inference.

Cirrascale:
United States

powered by Qualcomm Cloud AI

Core42:
United Arab Emirates

powered by Qualcomm Cloud AI

ALLaM Developer Playground:
Saudi Arabia

powered by Qualcomm Cloud AI

Cloud AI Inference Services

Ready to deploy your AI applications at scale? Qualcomm Cloud AI, as part of cloud operators’ inference services, delivers the performance and power efficiency necessary to deploy and accelerate AI inference at scale.

On-Prem Appliances

Qualcomm Dragonwing™ AI On-Prem Appliance Solution with Qualcomm Cloud AI cards and Qualcomm AI Inference Suite is designed for generative AI inference and computer vision workloads on dedicated on-premises hardware. This setup allows your sensitive customer data, fine-tuned models, and inference loads to remain on premises.

The Qualcomm Cloud AI Inference Suite is built upon lower-level frameworks and runtimes, allowing you to bring your own model if needed.

Playground GUI

Try out the inference cloud 
in a web browser.

Python SDK

Easily use models in the 
Qualcomm AI Inference Suite directly with Python.

REST API

Use the REST API with 
examples in your choice of programming language.

Benefits

AI Inference with Ease

Gain immediate access to the playgrounds and choose from a wide selection of AI models, agents, application examples, and inference services.

Top Performance and Efficiency

Use software optimized for inference on Qualcomm Cloud AI inference accelerators for leading performance and lower total cost of ownership.

Flexible Deployments at Scale

Deploy in the cloud or to on-prem appliances supported by multi-tenancy, containerization, and auto-scaling with Kubernetes.

Robust Developer Options

Programmatically interact using the Python SDK, OpenAI-compatible APIs, and familiar AI frameworks like LangChain.

Application and Agent Samples

The Qualcomm AI Inference Suite supports a variety of models for common AI inference scenarios. Once you log in to one of the providers above, you’ll find samples and code you can use.

Agent examples using the Qualcomm AI Inference Suite are provided as starting samples to understand common scenarios. For example: a research agent that can pull together content from multiple locations, chatbots based on current events, or a RAG pattern for constantly updated source data.

Summarization

simplify longer content

Code generation

accelerate software development

Interactive chat

answer typical LLM queries

Translation

from one language to another

RAG

context-aware Q&A

Image generation

based on text descriptions

Agents

automating AI tasks

Get hands-on with tutorial and demo videos

6:41

Qualcomm AI Inference Suite Introduction

5:58

Qualcomm AI Inference Suite Tour

6:58

Getting Started with Cloud AI Inference

7:48

Cloud-Based AI Inference via a Web Page

6:27

Using Google Colab to Access Qualcomm AI Inference Suite

Connect with our communities

Stay ahead of the curve

Receive the latest updates, exclusive offers, and valuable insights delivered through the Qualcomm newsletter straight to your inbox.

Stay ahead of the curve

Receive the latest updates, exclusive offers, and valuable insights delivered through the Qualcomm newsletter straight to your inbox.

Qualcomm relentlessly innovates to deliver intelligent computing everywhere, helping the world tackle some of its most important challenges. Our leading-edge AI, high performance, low-power computing, and unrivaled connectivity deliver proven solutions that transform major industries. At Qualcomm, we are engineering human progress.

Stay connected

Get the latest Qualcomm and industry information delivered to your inbox.

Subscribe
Manage your subscription

© Qualcomm Technologies, Inc. and/or its affiliated companies.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries. Qualcomm patented technologies are licensed by Qualcomm Incorporated.

Note: Certain services and materials may require you to accept additional terms and conditions before accessing or using those items.

References to "Qualcomm" may mean Qualcomm Incorporated, or subsidiaries or business units within the Qualcomm corporate structure, as applicable.

Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, research and development functions, and substantially all of our products and services businesses, including our QCT semiconductor business.

Materials that are as of a specific date, including but not limited to press releases, presentations, blog posts and webcasts, may have been superseded by subsequent events or disclosures.

Nothing in these materials is an offer to sell or license any of the services or materials referenced herein.