Back to All
Developer Blog

NexaSDK for Android: a simple way to bring on-device AI to smartphones with Snapdragon

Sign up for Developer monthly newsletter-image

Sign up for Developer monthly newsletter

Join thousands of developers around the globe who receive latest news and updates from our monthly curated newsletter.

Sign up
Come for support, stay for the community-image

Come for support, stay for the community

Get support from experts, connect with like-minded developers, and access exclusive virtual events.

Join Developer Discord

Artificial intelligence (AI) features are becoming standard in Android apps but getting them to run well on a phone has usually required a lot of custom work. Nexa AI and Qualcomm Technologies, Inc. have now made this easier with NexaSDK for Android.

The software development kit (SDK) lets apps tap into the Qualcomm Hexagon NPU (the preferred processing engine for AI), as well as the Qualcomm Adreno GPU and Qualcomm Oryon CPU, on Snapdragon mobile platforms, giving developers an easy-to-use, single runtime that handles the heavy lifting so they can focus on building actual product features.

In Nexa's early numbers, on Samsung S25 Ultra powered by Snapdragon 8 Elite, models like Granite 4.0-h-350M reach 92 token per second on the NPU, compared to 40 tokens per second on the CPU.

Energy measurments show up to 9 times better efficiency when workloads run on the NPU instead of the CPU.

These gains make a real difference in everyday use, especially for apps that need
to stay responsive while keeping battery impact low.

How NexaSDK works with Snapdragon hardware

Snapdragon platforms combine three processing engines. When it comes to classic AI and generative AI, the Hexagon NPU is the engine of choice to run AI workloads. It is equipped with industry-leading and dedicated scalar, vector and tensor accelerators and a large shared memory.

Precision-wise, the latest Snapdragon 8 Elite Gen 5 can support INT2, INT4, INT8, INT16, FP8 and FP16. The combination of these features results in flexibility to process diverse AI workloads, energy efficiency, and memory and bandwidth savings.

For those who seek more fine tuning across the NPU, CPU and GPU, the Nexa SDK allows developers to choose their preferred engine for inferencing, without the challenges of device-specific tuning. Nexa SDK supports OpenCL and Vulcan APIs for easy access.

The Qualcomm Oyron CPU is general purpose and can be utilized for latency critical AI tasks.

Getting the most out of this mix typically requires device-specific tuning, complex configuration, and switching between different toolkits.

Nexa SDK removes this complexity by providing a unified interface: developers can simply select their preferred backend and, with just three lines of code, leverage the NPU, GPU, or CPU to run state-of-the-art models—including embedding, rerank, ASR, OCR, LLM, and VLM—directly on mobile devices.

 

NexaSDK for Android
Figure 1. NexaSDK for Android

Running SOTA models with minimal code

The SDK is built to keep setup simple. Loading and running a model only takes a few lines. The first beta supports a solid set of models that cover many common app needs:

  • Multimodal
    • OmniNeural-4B from Nexa AI
  • LLM 
    •  GPT-OSS-20B from OpenAI
    • Granite-4.0-h-350M, Granite-4.0-Micro from IBM
    • LFM2-1.2B from Liquid AI
    •  Gemma-3n-E2B, Gemman-3n-E4B from Google
    • Qwen3-4B from Alibaba Qwen
    • Phi3.5-mini, Phi4-mini from Microsoft
    • Llama3.2-3B from Meta
       
  • Embeddings 
    • EmbedNeural from Nexa AI
    • EmbeddingGemma from Google
  • ASR 
    • Parakeet-v3 from NVIDIA
    •  ConvNeXt from Meta
  • OCR 
    • PaddleOCR from Baidu
       
  • Reranking 
    • jina-reranker from Jina AI

Among them, GPT-OSS-20B has become a standout choice for enterprise customers. Sam Altman recently said that it offers “real-world performance comparable to o4-mini—and you can run it locally on your phone.”

However, many believed running a 20B-parameter model on mobile devices was still years away. Now, Nexa SDK enables GPT-OSS-20B to run entirely on-device with phones powered by Snapdragon processor equipped with Qualcomm Hexagon NPU phones (≥16GB RAM) through the Nexa Android SDK, providing private, low-latency inference with no cloud dependency.

Nexa also focuses on Day-0 availability for new models so developers don’t have to wait for backend or operator update and can start developing right away.

 

Benefits of running models on-device

Moving AI inference to the device brings several benefits that matter in shipping products:

  • No dependency on network conditions
  • Better privacy because data stays local
  • No cloud token or API charges
  • Reliable performance offline
  • Improved performance and battery life when using the NPU

These advantages are especially useful for assistants, translation tools, optical character recognition (OCR) and document apps, camera and imaging pipelines, and lightweight LLM features meant to run interactively on the phone.

A quick start for Android developer teams

The SDK includes a sample app and starter guide, so most developers can get a test build running quickly. It’s a practical option for teams experimenting with multimodal input, voice workflows, OCR, summarization, reranking or any feature that benefits from low-latency inference.

You can follow the QuickStart section of the Nexa SDK documentation and try the sample app. There’s also a tutorial video demonstrating how to run multimodal inference with the Nexa SDK within 40 seconds in Android Studio.

Nexa AI will continue refining the framework, and Qualcomm Technologies will keep adding capabilities to the Qualcomm AI Stack to further optimize for power efficient on-device.

Together, they give Android developers a more direct path to on-device intelligence without having to build their own runtime.

NexaAI-Android-SDK-video-demo

Nov 26, 2025 | 0:57

Video Player is loading.
Current Time 0:00
Duration 0:56
Loaded: 10.55%
Stream Type LIVE
Remaining Time 0:56
 
1x
  • Chapters
  • descriptions off, selected
  • captions off, selected
  • en (Main), selected

Start building on-device AI

If you want to bring AI features directly onto Android phones powered by Snapdragon, NexaSDK for Android (Beta) is an easy way to begin.

Download NexaSDK today

Nexa SDK has now earned over 6,000 GitHub stars, and our developer community is proactively expanding the ecosystem with derivative projects, such as the Android-focused nexa_ai_flutter package.

Browse
Nexa AI and Qualcomm Technologies GitHub communities



More useful resources:

NexaSDK documentation

NexaSDK Android sample app

NexaSDK onboarding tutorial video

 

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.

About the Authors
Alan Zhu
Alan ZhuProduct Lead at Nexa AI
Zack Li
Zack LiCTO & Co-founder of Nexa AI
Alex Chen
Alex ChenCEO & Founder of Nexa AI
Manoj Khilnani
Manoj Khilnani Director of Global Partner Marketing
Jerry Chang
Jerry ChangSenior Manager, Marketing, Qualcomm Technologies
Qualcomm relentlessly innovates to deliver intelligent computing everywhere, helping the world tackle some of its most important challenges. Our leading-edge AI, high performance, low-power computing, and unrivaled connectivity deliver proven solutions that transform major industries. At Qualcomm, we are engineering human progress.

Stay connected

Get the latest Qualcomm and industry information delivered to your inbox.

Subscribe
Manage your subscription

© Qualcomm Technologies, Inc. and/or its affiliated companies.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries. Qualcomm patented technologies are licensed by Qualcomm Incorporated.

Note: Certain services and materials may require you to accept additional terms and conditions before accessing or using those items.

References to "Qualcomm" may mean Qualcomm Incorporated, or subsidiaries or business units within the Qualcomm corporate structure, as applicable.

Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, research and development functions, and substantially all of our products and services businesses, including our QCT semiconductor business.

Materials that are as of a specific date, including but not limited to press releases, presentations, blog posts and webcasts, may have been superseded by subsequent events or disclosures.

Nothing in these materials is an offer to sell or license any of the services or materials referenced herein.