Bringing Edge AI performance to PyTorch developers with ExecuTorch 1.0

Written by Felix Baum and Charlotte Mallo

Oct 22, 2025

ExecuTorch 1.0, an open source solution to training and inference on the Edge, becomes available to all developers
Qualcomm Technologies contributed to the ExecuTorch repository so that developers can access the Qualcomm® Hexagon™ NPU directly
This streamlines the developer workflow and unlocks the benefits of local AI inference, from personalization to performance, privacy, and reduced reliance on cloud infrastructure

Today, the ExecuTorch Team announced the general availability of ExecuTorch 1.0, enabling developers to create seamless, high-performance edge AI experiences.

We are excited to share that Qualcomm contributed the ExecuTorch delegate for the Qualcomm Hexagon NPU directly to the ExecuTorch repository. The delegate gives developers access to the NPU, our hardware designed specifically for high-performance, power-efficient on-device inference. Developers can offload AI/ML and Gen AI inference directly to the Hexagon NPU while streamlining their development workflow.

This effort builds on our long-term collaboration with the PyTorch Edge team to bring edge AI to every developer and to address the challenges of deploying AI and Gen AI on resource-constrained edge devices.

Streamlined development workflow

With ExecuTorch 1.0, developers can port any model, from large language models (LLMs) and vision-language models (VLMs) to image detection and other AI/ML models, along with their apps, across various computing platforms, while using the same toolchain for model authoring, conversion, debugging, and deployment.

The lightweight runtime takes advantage of the device’s full hardware capabilities, including CPUs, GPUs, and NPUs. The result is experiences that can use on-device context for greater personalization, faster and more power-efficient inference, and lower cloud inference costs.
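
To make the conversion step of that workflow concrete, here is a minimal sketch of exporting a PyTorch module to an ExecuTorch `.pte` file. It is an illustration under stated assumptions, not the flow used for the Hexagon NPU: it lowers to the XNNPACK CPU delegate as a stand-in, because the Qualcomm backend needs additional SDK setup and compile options that this post does not cover. The helper name `export_to_pte` and its parameters are hypothetical.

```python
from pathlib import Path


def export_to_pte(model, example_inputs: tuple, output_path: str) -> Path:
    """Export an eager PyTorch module to an ExecuTorch .pte file.

    Hypothetical helper: lowers to the XNNPACK CPU delegate as a stand-in.
    Targeting another backend means swapping in that backend's partitioner.
    """
    # Deferred imports: torch and executorch are only required when exporting.
    import torch
    from executorch.backends.xnnpack.partition.xnnpack_partitioner import (
        XnnpackPartitioner,
    )
    from executorch.exir import to_edge_transform_and_lower

    # 1. Capture: trace the eager module into an ExportedProgram.
    exported = torch.export.export(model.eval(), example_inputs)

    # 2. Conversion: lower to the Edge dialect and hand supported subgraphs
    #    to the chosen backend partitioner.
    program = to_edge_transform_and_lower(
        exported, partitioner=[XnnpackPartitioner()]
    ).to_executorch()

    # 3. Deployment artifact: serialize the program for the on-device runtime.
    path = Path(output_path)
    path.write_bytes(program.buffer)
    return path
```

With `torch` and `executorch` installed, a call such as `export_to_pte(MyModel(), (torch.randn(1, 3, 224, 224),), "model.pte")` would produce a file the ExecuTorch runtime can load, where `MyModel` is a placeholder for your own module and the input shape is only an example.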

On-device AI performance on billions of devices

With the inclusion of the ExecuTorch Delegate for Hexagon NPU, developers can deploy or port AI-powered apps to billions of devices powered by Qualcomm hardware – including mobile phones, PCs, AI smart glasses, cars, and IoT devices.

Tapping into the power of the Hexagon NPU means developers will unlock not just performance and power efficiency gains, but many more benefits that respond to a growing demand from consumers and industries, such as:

Maintain data on devices for privacy and personalization, with access to contextual data
Reduce reliance on cloud computing, opening the door to offline use cases and more accessible products worldwide
Improve real-time responsiveness, with lower latency and higher throughput. For example, running large language models on-device on the Hexagon NPU instead of the CPU delivers 30% to 75% faster load times and 2-4x faster token rates.* For traditional models, throughput is up to 92% higher and memory footprint is up to 47% smaller.*
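
To see what those percentages mean in absolute terms, the sketch below applies them to a set of hypothetical CPU measurements. The baseline numbers are invented for illustration and are not measurements from this post. The code also assumes that "faster load time" means a proportionally shorter load time; if it is instead meant as a speed ratio, the load-time figures would differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuBaseline:
    """Hypothetical CPU measurements; replace with numbers from your own device."""

    llm_load_s: float              # LLM load time in seconds
    llm_tokens_per_s: float        # LLM decode rate
    model_inferences_per_s: float  # traditional-model throughput
    model_memory_mb: float         # traditional-model memory footprint


def reduce_by(value: float, percent: float) -> float:
    """Return value after a reduction of `percent` percent."""
    return value * (1 - percent / 100)


def increase_by(value: float, percent: float) -> float:
    """Return value after an increase of `percent` percent."""
    return value * (1 + percent / 100)


def project_npu(cpu: CpuBaseline) -> dict:
    """Apply the quoted NPU-vs-CPU figures to a CPU baseline."""
    return {
        # 30% to 75% faster load time, read here as a 30-75% shorter load
        # (assumption). Tuple order: (30% case, 75% case).
        "llm_load_s": (reduce_by(cpu.llm_load_s, 30), reduce_by(cpu.llm_load_s, 75)),
        # 2-4x faster token rate. Tuple order: (2x case, 4x case).
        "llm_tokens_per_s": (cpu.llm_tokens_per_s * 2, cpu.llm_tokens_per_s * 4),
        # "Up to" figures are best cases, so only the upper bound is computed.
        "model_inferences_per_s_best": increase_by(cpu.model_inferences_per_s, 92),
        "model_memory_mb_best": reduce_by(cpu.model_memory_mb, 47),
    }


baseline = CpuBaseline(
    llm_load_s=4.0,
    llm_tokens_per_s=10.0,
    model_inferences_per_s=50.0,
    model_memory_mb=200.0,
)
for name, value in project_npu(baseline).items():
    print(name, value)
```

Under these assumptions, a 4-second CPU load would shrink to somewhere between roughly 2.8 and 1 second, and a 200 MB footprint would drop to about 106 MB in the best case. Actual gains depend on the model and device.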

Model Coverage

Model coverage spans traditional AI use cases such as object detection, image recognition, depth estimation, OCR, ASR, and segmentation, as well as on-device text LLMs and multimodal LLMs, for example:

Llama-3.2-3B-instruct
Roberta: FacebookAI's xlm-roberta-base
Gemma3: Gemma-3-1b
Qwen3: Qwen3-1.7B
Phi4: Phi-4-mini-instruct
Whisper: OpenAI's Whisper
SmolLM3-3B

Get started now!

We will be at the PyTorch Conference! Come visit us at booth #D3 and learn more here:

Learn more: https://pytorch.org/executorch
Get started: https://docs.pytorch.org/executorch/1.0/
Download: https://pypi.org/project/executorch/
Maven: https://mvnrepository.com/artifact/org.pytorch/executorch-android/1

Qualcomm AI Stack

The potential for on-device intelligence is growing rapidly thanks to optimizations across the AI stack, from model development to deployment. The ExecuTorch delegate for Hexagon NPU adds to our existing portfolio, which already supports TFLite and ONNX.

Learn more at: Qualcomm AI Stack | Unified AI Software Portfolio | Qualcomm

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

*Compared with the same models running on the CPU on Snapdragon 8 Elite Gen 4.

Qualcomm and Hexagon are trademarks or registered trademarks of Qualcomm Incorporated.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.

About the Authors

Felix Baum, Sr. Director, Product Management, Qualcomm Technologies, Inc.