Run Nexa AI agents locally on Snapdragon X PCs with Hexagon NPU
As generative AI adoption accelerates across industries, the next frontier is clear: running meaningful agentic AI workflows fully on-device, powered by local SLM and LLM inference.
Cloud-only AI no longer satisfies the requirements of latency-critical, private, and personalized computing. This is where Nexa AI, a next-generation multimodal AI framework, and Snapdragon X Series processors come together to enable multimodal AI agents that run on-device with no cloud dependency.
For Nexa AI developers and end users, the pairing with PCs powered by Snapdragon X Series processors and the Qualcomm Hexagon NPU unlocks a new level of capability: local LLMs, multimodal reasoning, and agentic workflows running at the edge, delivering strong on-device AI performance while maintaining power efficiency, long battery life, and real-time responsiveness.
A leap forward in on-device AI: why NPU acceleration matters
Traditional AI inference pipelines rely heavily on CPU or GPU execution. But while powerful, these units are not optimized for sustained inference at lower power. The Hexagon NPU is explicitly designed for this purpose:
- The Hexagon NPU in the first generation Snapdragon X Series processors is capable of up to 45 TOPS, exceeding Microsoft’s baseline Copilot+ PC requirements for AI-centric tasks.
- The next-gen Hexagon NPU in the Snapdragon X2 Series processors reaches up to 80 TOPS, a 78% peak performance uplift over the previous generation.
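The 78% figure follows directly from the two peak TOPS numbers quoted above. A quick check, using only those published figures (the helper name `uplift_percent` is ours, for illustration):

```python
def uplift_percent(old_tops: float, new_tops: float) -> float:
    """Relative increase of new_tops over old_tops, in percent."""
    return (new_tops - old_tops) / old_tops * 100.0

# Peak figures quoted above: 45 TOPS (Snapdragon X Series)
# and 80 TOPS (Snapdragon X2 Series).
print(f"{uplift_percent(45, 80):.0f}%")  # prints "78%"
```

Note that this compares peak ratings only; real-world throughput depends on the model and workload.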
Nexa AI meets Snapdragon: real performance gains
In late 2025, Nexa AI released an SDK optimized specifically for Hexagon NPUs. The update enabled multiple Nexa and third-party state-of-the-art LLMs and multimodal models to execute directly on the NPU, rather than consuming CPU or GPU cycles.
- Nexa’s own OmniNeural-4B multimodal model runs fully on the NPU with strong efficiency and responsiveness.
- Dedicated NPU-compiled versions of Ministral-3-3B, Granite-4, Microsoft Phi-4 mini, and Qwen3-4B are supported.
- These early NPU-tuned models operate within a ~4B parameter budget—ideal for edge inference on PCs.
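To see why a ~4B parameter budget suits PC-class memory, it helps to estimate the size of the weights alone. The sketch below is back-of-the-envelope arithmetic, not a statement about how the NPU builds are actually quantized: the bit widths are assumptions chosen for illustration, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

# Assumed precisions for illustration only; the article does not state
# which precision the NPU-compiled models use.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gb(4e9, bits):.1f} GB")
```

Under these assumptions, a 4B-parameter model needs roughly 8 GB at 16-bit, 4 GB at 8-bit, and 2 GB at 4-bit for weights.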
Developer reality: what running Nexa AI on PCs with Snapdragon looks like
With NPU-accelerated Nexa AI, developers can now ship apps that run offline.
Local LLM Inference
Models like Granite-4 execute at the edge, allowing:
- Offline chat & summarization
- Private document processing
- Local semantic search through user files
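As a rough illustration of the last item, local semantic search usually means embedding the query and each file, then ranking files by vector similarity. The sketch below is not Nexa AI's API: `embed` is a toy word-count stand-in for a real embedding model running on the device, and the file contents are made up. Only the ranking logic carries over.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for a local embedding model: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query: str, docs: dict[str, str], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank documents by similarity to the query, best match first."""
    q = embed(query)
    scored = [(name, cosine(q, embed(body))) for name, body in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical user files, held entirely in local memory.
files = {
    "notes.txt": "quarterly budget review and travel expenses",
    "recipe.txt": "slow cooked tomato sauce with basil",
    "todo.txt": "book travel for the offsite",
}
for name, score in search("travel budget", files, top_k=2):
    print(name, round(score, 3))
```

In a real app, the word counts would be replaced by dense vectors from an on-device embedding model, so matches would reflect meaning rather than shared words.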
Multimodal Experiences
Nexa AI supports the latest text, audio, and image reasoning models on the NPU, which suits apps such as:
- Real-time transcription
- Smart meeting assistants
- On-device vision classification
- Local media understanding
Agentic Workflows
With a Hexagon NPU, Nexa agents can perform real-time, multi-step actions without cloud round-trips, improving:
- Latency
- Reliability
- Privacy
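A multi-step agent is essentially a loop: the model picks an action, a tool runs, and the result feeds the next decision. When both the model and the tools are local, no step waits on a network round-trip. The sketch below is a minimal illustration under that assumption, not Nexa AI's agent API: `scripted_model` is a hypothetical stand-in for a local LLM call, and the tools are toy functions.

```python
from typing import Callable

# Hypothetical local tools; each runs on the device with no network call.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}

def scripted_model(history: list[str]) -> tuple[str, str]:
    """Stand-in for a local LLM: returns (tool_name, argument) or ("done", answer)."""
    if len(history) == 1:
        return "word_count", history[0]
    return "done", f"The request has {history[-1]} words."

def run_agent(task: str, model=scripted_model, max_steps: int = 5) -> str:
    """Loop until the model signals completion or the step limit is hit."""
    history = [task]
    for _ in range(max_steps):
        action, arg = model(history)
        if action == "done":
            return arg
        # Run the chosen tool locally and feed its result back to the model.
        history.append(TOOLS[action](arg))
    return "step limit reached"

print(run_agent("summarize the meeting notes"))  # prints "The request has 4 words."
```

The `max_steps` cap is a design choice worth keeping in any real agent, since it bounds latency and battery use if the model never signals completion.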
Battery-Friendly AI
PCs powered by Snapdragon X Series processors optimize power consumption by spreading workloads across the CPU, GPU, and NPU, which suits always-on and battery-constrained use cases.
What this means for the future of on-device AI
The combination of Nexa AI + PCs with Snapdragon is a preview of the next decade of computing:
Edge-Native AI Experiences
Models run where users work—not in distant data centers.
A Growing Ecosystem of NPU-Accelerated Apps
Productivity, creative, and agentic apps can increasingly target the Hexagon NPU as Windows AI APIs mature.
Responsive, Private, Local Intelligence
User data never leaves the device—critical for regulated industries and privacy-sensitive workflows.
Conclusion: Nexa AI + PCs with Snapdragon X Series = the Future of local, on-device AI agents
With the rise of Copilot+ PCs with Snapdragon X Series processors and Nexa AI’s NPU-optimized SDK, developers and users can finally experience powerful AI—without dependence on the cloud.
Real-time multimodal reasoning, background AI agents, advanced document understanding, and local LLMs are no longer theoretical: they are running today on Windows PCs with Snapdragon processors and their integrated Hexagon NPU.
This is the beginning of a new era: on-device AI that's fast, private, efficient, and always available. Nexa AI is helping lead the way on Windows PCs powered by Snapdragon X Series processors.



