On-Prem AI Appliances
Qualcomm Dragonwing™ AI On-Prem Appliance

The Dragonwing AI On-Prem Appliance delivers cloud-grade AI performance with the efficiency, security, and control of local deployment. Powered by Qualcomm Technologies' hardware and software, these high-performance systems are engineered to transform operations across industries without the risks or dependencies of cloud-only infrastructure.

Get started developing with Qualcomm AI On-Prem appliances. From ready-to-use, pre‑compiled models to optimized computer vision and large language models, the resources below help you move from evaluation to production faster—no cloud required.

Benefits

Cloud-grade AI on your premises

Up to 870 TOPS AI performance, delivered locally for maximum speed, security, and control—no cloud required.

Lower TCO and higher efficiency

More than 3x performance/cost and nearly 2x performance/watt compared to other cloud-based AI products.

Enhanced data control & security

Adopt AI on your terms: private, powerful, scalable, and fully under your control.

Rapid, easy deployment

Multi-user, multi-tenant support enables broader AI adoption and enterprise-wide deployment with no IT overhaul.

Enterprise intelligence with smarter performance per dollar — at a fraction of the footprint.

Delivers industry-leading performance while optimizing for power efficiency and total cost of ownership, in an easily deployable form factor.

The Dragonwing AI On-Prem Appliance integrates hardware and software into a unified, end-to-end solution.

HARDWARE

AI Accelerator Card

Purpose-built AI accelerators deliver high performance, low latency, and efficient on-site processing, all within a compact, power-optimized footprint.

SOFTWARE

Models & AI Inference Suite

A flexible software layer powers easy deployment, orchestration, and inference for computer vision, LLMs, and other AI models—scalable to meet enterprise needs.

Enterprise-ready AI inference use cases

Dynamic information access

Deliver real-time access to maintenance guides, process documentation, and live records—right at the edge. Workers get instant, AI-powered support to troubleshoot, follow procedures, and boost productivity without relying on cloud connectivity.

Operational diagnostics

Just snap a photo—leverage AI inference combining computer vision and LLMs to analyze machine health, detect early signs of wear, and answer operational questions instantly.

Guided repair execution

Capture images of faulty equipment and receive dynamic, step-by-step repair guidance. Computer vision identifies components, LLMs interpret issues, and agentic AI adapts instructions in real time—enabling faster, more accurate fixes with minimal supervision.

Vision analytics

Deploy computer vision at the edge to monitor operations, detect anomalies, and deliver real-time, scalable insights from visual data across enterprise environments.

Multilingual translation

Run LLMs and other AI models to translate documents across languages and convert scanned data to structured text via OCR—accelerating part identification, procurement, and global supply chain workflows with real-time, on-prem inference.

On-site skill gap support

Democratize AI adoption by empowering every worker with real-time, contextual guidance, bridging skill gaps and enabling consistent execution across roles and experience levels.

Blazing memory bandwidth

544 GB/s memory bandwidth enables ultra-fast inference speeds for demanding workloads.

Large model ready

128 GB memory supports running larger models, including multi-modal and complex LLMs.

Lower power

150 W power envelope delivers high compute efficiency with a low energy footprint.

LLM powerhouse

Supports models up to 120B parameters, with 200k+ Hugging Face models validated.
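As a rough sizing check, the 128 GB memory figure and the 120B-parameter figure can be related with simple arithmetic. The sketch below is an illustrative estimate, not a Qualcomm sizing tool. It counts weight storage only (no KV cache, activations, or runtime overhead), and the precisions and bytes-per-parameter values are generic assumptions that do not come from this page.

```python
# Weights-only memory estimate.
# Assumptions (not from the source): the precision names and their
# bytes-per-parameter values; KV cache and runtime overhead are ignored.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

MEMORY_GB = 128  # memory figure quoted on this page


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate weight storage in GB, taking 1 GB as 1e9 bytes."""
    return params_billions * BYTES_PER_PARAM[precision]


def fits_in_memory(params_billions: float, precision: str,
                   memory_gb: float = MEMORY_GB) -> bool:
    """True if the weights alone fit within the memory budget."""
    return weight_memory_gb(params_billions, precision) <= memory_gb


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        size = weight_memory_gb(120, precision)
        fits = fits_in_memory(120, precision)
        print(f"120B parameters at {precision}: {size:.0f} GB, fits: {fits}")
```

Under these assumptions, a 120B-parameter model needs about 240 GB at 16-bit precision but about 120 GB at 8-bit, so the quoted upper limit suggests quantized weights. The page itself does not state which precision is used.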

Multi-tenant scalability

Supports multiple users per appliance, enabling collaborative and concurrent AI development.

High throughput inference

LLM: Up to 300 tokens/sec (Llama 3.1 8B)
CV: 23,000+ inferences/sec
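To put the throughput figures above in practical terms, the following sketch converts them into a response time and a camera-stream count. It is back-of-the-envelope arithmetic under stated assumptions: the quoted rates are treated as sustained and fully available to one workload, one inference is spent per video frame, and the 600-token response and 30 fps camera are hypothetical inputs. Real figures depend on batching, model choice, and concurrency.

```python
# Capacity arithmetic from the quoted "up to" figures.
# Assumptions (not from the source): rates are sustained and dedicated to a
# single workload, and each video frame costs exactly one inference.
LLM_TOKENS_PER_SEC = 300        # quoted for Llama 3.1 8B
CV_INFERENCES_PER_SEC = 23_000  # quoted as "23,000+"


def response_seconds(tokens: int,
                     tokens_per_sec: float = LLM_TOKENS_PER_SEC) -> float:
    """Time to generate `tokens` output tokens at the given rate."""
    return tokens / tokens_per_sec


def max_camera_streams(fps: int,
                       inferences_per_sec: int = CV_INFERENCES_PER_SEC) -> int:
    """Whole number of camera streams that fit in the inference budget."""
    return inferences_per_sec // fps


if __name__ == "__main__":
    print(f"600-token response: {response_seconds(600):.1f} s")
    print(f"30 fps camera streams: {max_camera_streams(30)}")
```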

Agentic AI capabilities

Supports autonomous agents that can reason, plan, and act across tasks—ideal for RAG, tool use, and workflow orchestration.
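For readers unfamiliar with RAG, the retrieval step can be sketched in a few lines. This is a toy illustration only: it ranks documents by keyword overlap where a production system would use embeddings, it stops at building the prompt, and it calls no Qualcomm API. The function names and sample documents are hypothetical.

```python
import re

# Toy retrieval-augmented generation (RAG) step.
# Assumptions: keyword overlap stands in for embedding search, and the
# resulting prompt would be passed to a locally hosted LLM (not shown).


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents,
                    key=lambda doc: len(query_words & tokenize(doc)),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    manuals = [
        "Replace the pump filter every 500 hours.",
        "Conveyor belt tension is checked weekly.",
    ]
    print(build_prompt("How often should the pump filter be replaced?", manuals))
```

Keeping the documents and the model on the same appliance means the retrieved context never has to leave the premises.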

Get started with these resources

Interested in reducing your inference cost and improving your performance per watt in data centers?

SOFTWARE

Evaluate the Qualcomm Cloud AI 100 Ultra

Try generative AI inference on the Qualcomm Cloud AI 100 Ultra in our developer playgrounds from Cirrascale.

CLOUD

Cirrascale Inference Cloud Powered by Qualcomm Technologies

Scale your offerings efficiently with inference as a service powered by Qualcomm Cloud AI.

HARDWARE

Run Inference On Premises

Certain applications need low-latency, secure, and private solutions at a low cost not found with current cloud inference providers.

Qualcomm relentlessly innovates to deliver intelligent computing everywhere, helping the world tackle some of its most important challenges. Our leading-edge AI, high performance, low-power computing, and unrivaled connectivity deliver proven solutions that transform major industries. At Qualcomm, we are engineering human progress.

© Qualcomm Technologies, Inc. and/or its affiliated companies.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries. Qualcomm patented technologies are licensed by Qualcomm Incorporated.

Note: Certain services and materials may require you to accept additional terms and conditions before accessing or using those items.

References to "Qualcomm" may mean Qualcomm Incorporated, or subsidiaries or business units within the Qualcomm corporate structure, as applicable.

Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, research and development functions, and substantially all of our products and services businesses, including our QCT semiconductor business.

Materials that are as of a specific date, including but not limited to press releases, presentations, blog posts and webcasts, may have been superseded by subsequent events or disclosures.

Nothing in these materials is an offer to sell or license any of the services or materials referenced herein.