
AnythingLLM – Local AI optimized for CPU and NPU on Snapdragon X Series Devices

Co-written with Timothy Carambat, founder of Mintplex Labs.

Most large language models (LLMs) have two things in common: They are not very easy to set up and they run in the cloud or the data center. What if you want the productivity of an LLM without having to learn how to install and configure it? What if privacy is essential and you want to run AI as an application on your desktop?

That’s the sweet spot for AnythingLLM for Desktop, an all-in-one application that delivers AI in a simple, IT-compliant way. With AnythingLLM, business users and consumers can easily take advantage of AI to analyze, create and chat with documents of any type. Enterprise developers and engineers can perform tasks with AI agents, automate intricate workflows and interact with proprietary systems to produce output or consume internal data. AnythingLLM for Desktop is designed to be private and to run on the device by default.

And now Mintplex Labs, the maker of AnythingLLM, has unveiled a version that runs on Snapdragon X Series devices. The process described below enabled the application to run LLMs on the Qualcomm Oryon CPU first, and then to run further-optimized LLMs on the Qualcomm Hexagon neural processing unit (NPU).

The growth path for running LLMs locally

Consumers and ordinary users are continually hearing about the potential of AI. The addressable market is large for products that abstract away the complexity of Python scripts, API keys and tool configuration. The goal of AnythingLLM is to make users productive with AI without that complexity and without requiring any knowledge of programming.

AnythingLLM is an ideal back end for developers building an interface, app or widget. It embodies all the lessons learned about running models locally and it is fully scalable for private LLM usage. In a single app, it allows developers to work nimbly: they can run an LLM, create agents and privately embed documents in an on-device vector database. The AnythingLLM team observed that running LLMs on the NPU of devices powered by Snapdragon X Series platforms had the potential to deliver both performance and power efficiency unattainable on x86 devices. 

Porting AnythingLLM to Windows on Snapdragon

AnythingLLM is written primarily in Node.js and built in public through an open-source GitHub repo. The port required the engineers to work at a lower level, starting with the CPU on Windows on Snapdragon. Once they had successfully ported the application to the CPU, they moved on to the Snapdragon X Series NPU, working with the Qualcomm AI Engine Direct SDK (also known as the QNN SDK).
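The CPU-first, NPU-second order described above can be pictured as a simple backend choice at startup. The following is a minimal sketch in Python rather than the application's own Node.js, and it is not taken from the AnythingLLM codebase: the function name `choose_backend`, the backend labels and the `npu_runtime_available` flag are all hypothetical, and how an application actually detects a usable NPU runtime is outside the scope of this sketch.

```python
import platform


def choose_backend(machine: str, npu_runtime_available: bool) -> str:
    """Return a backend label for the given architecture string.

    Hypothetical helper: the labels and the decision order are
    illustrative assumptions, not AnythingLLM's actual logic.
    """
    is_arm64 = machine.lower() in {"arm64", "aarch64"}
    if is_arm64 and npu_runtime_available:
        return "npu"          # prefer the NPU when a runtime is present
    if is_arm64:
        return "cpu-arm64"    # native Arm CPU path
    return "cpu-generic"      # any other architecture


if __name__ == "__main__":
    # platform.machine() reports the architecture the Python process sees;
    # an emulated x64 interpreter on an Arm device may not report ARM64.
    print(choose_backend(platform.machine(), npu_runtime_available=False))
```

Keeping the decision in one pure function makes the fallback order easy to test without the target hardware.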

AnythingLLM UI - agent thinking

Using Dell Latitude 7455 devices with Snapdragon X Elite, the engineers took a few days to get NPU-enabled LLMs running locally. That included the work of powering on-device embeddings on the NPU. To exploit the power of the NPU for LLM inference, the AnythingLLM team relied on the Qualcomm AI Engine Direct SDK and its documentation and tooling.

AnythingLLM on Snapdragon X Series: Higher performance and efficiency

In general, for the LLMs and other models (embedding, reranking, etc.) that the product runs, AnythingLLM performs about 30 percent faster on the Qualcomm Oryon CPU than on x86, and much faster still than it does under x86 emulation. While testing, the AnythingLLM team also observed that traditional ML models for tasks like reranking and embedding ran significantly faster on the Qualcomm Hexagon NPU than on the Qualcomm Oryon CPU. In short, on Snapdragon X Series devices, models that handle tasks like image recognition, text classification, speech-to-text and more run significantly faster on the NPU.
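For readers who want to check a figure like "about 30 percent faster" on their own hardware, here is a minimal sketch of the arithmetic. It assumes that "faster" refers to throughput (tokens generated per second), which the text does not state explicitly; the function names are hypothetical and the numbers in the usage lines are made up for illustration, not measurements.

```python
def tokens_per_second(tokens_generated: int, elapsed_seconds: float) -> float:
    """Throughput of one generation run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tokens_generated / elapsed_seconds


def percent_faster(baseline_tps: float, candidate_tps: float) -> float:
    """Relative throughput gain of candidate over baseline, in percent."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (candidate_tps / baseline_tps - 1.0) * 100.0


if __name__ == "__main__":
    # Illustrative values only: 200 tokens vs. 260 tokens in the same 10 s.
    baseline = tokens_per_second(200, 10.0)
    candidate = tokens_per_second(260, 10.0)
    print(f"{percent_faster(baseline, candidate):.1f} percent faster")
```

Comparing the same model, prompt and output length on both machines keeps the two throughput numbers comparable.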

AnythingLLM UI - NPU embedded provider

The team took it one step further and extended their support for NPU models to their built-in document embedder as well. The embedder makes local documents readable by an LLM, so this extended support gives users an end-to-end, NPU-powered AI experience on the device.
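To make the embedder's role concrete, the sketch below shows the general idea of embedding documents and retrieving the closest one for a query. It is a toy stand-in under stated assumptions: a real embedder uses a neural model (here, one running on the NPU), whereas `toy_embed` is a hypothetical word-count vector, and `TinyVectorStore` is an illustrative in-memory class, not AnythingLLM's vector database.

```python
import math
from collections import Counter


def toy_embed(text: str) -> dict[str, float]:
    """Toy embedding: L2-normalized word counts (stand-in for a neural model)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


class TinyVectorStore:
    """Hypothetical in-memory store holding (text, vector) pairs."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, dict[str, float]]] = []

    def add(self, text: str) -> None:
        self._rows.append((text, toy_embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        """Return the k stored texts most similar to the query."""
        q = toy_embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    store = TinyVectorStore()
    store.add("hiking trails near the lake")
    store.add("quarterly sales report for the retail team")
    print(store.search("sales report"))
```

The retrieved text is what would be handed to the LLM as context; because everything here runs in local memory, the documents never leave the device, which is the property the on-device embedder is meant to preserve.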

Run powerful LLMs on NPU with AnythingLLM and Snapdragon X Elite

Jan 15, 2025 | 8:01

Next steps

AnythingLLM can now take advantage of both the Qualcomm Oryon CPU and Qualcomm Hexagon NPU in Copilot+ PCs powered by Snapdragon X Series processors.

“We offer AnythingLLM for Desktop at no cost to consumers,” says Timothy Carambat, founder of Mintplex Labs. “The Windows on Snapdragon version is currently in preview with Qualcomm Hexagon NPU support. We also offer a community hub where AnythingLLM users can share assets like workspaces, agent skills and system prompts with one another.”

“Large language models are widely touted as game changers. Snapdragon X Series-powered devices are highly performant and efficient, and open-source models are becoming smaller, faster and more accurate. Expectations around AI are rising, and both consumers and software developers want a simple tool and framework for using it,” adds Carambat.

“That is the sweet spot for AnythingLLM, and in combination with the underlying Snapdragon X Series processors, we pass the AI hardware benefits straight to our users.”

Like what you are seeing? Join Developer Discord for deeper insights and real-time conversations with fellow developers.

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

About the Author
Devang Aggarwal, Product Manager, Senior