Back to All
Developer Blog

Ollama simplifies inference with open-source models on Snapdragon X series devices

Co-written with Manoj Khilnani and Michael Chiang.

Ollama is a tool that simplifies running open-source LLMs on a variety of hardware platforms, now including Snapdragon X-series devices.

Ollama has built a reputation for making it easy to get started with LLMs. Developers can dive in and use any of the models on Ollama.com for AI inference. With a single line of code, they can invoke an API that lets their apps switch from proprietary models to open-source models, including these:

  • Meta's Llama 3.2
  • Google's Gemma 2
  • Microsoft's Phi 3.5
  • Alibaba's Qwen 2.5
  • IBM's Granite Code
  • Mistral
  • Snowflake’s Arctic Embed

The tool also runs custom models contributed by Ollama and other vendors in the AI community. Given the increasing diversity of open-source models worldwide, developers using Ollama on devices powered by Snapdragon X Series now have a big advantage in choice of both LLM and hardware platform.

Keeping pace with the evolution of hardware and AI

As the installed base of devices powered by Snapdragon® Compute Platforms grows, software makers of all sizes see new market opportunities.

For Ollama, supporting Snapdragon X Series meant first enabling the engine for Windows on Snapdragon architecture.

As a result, developers can run Ollama natively on devices like Microsoft Copilot+ PCs built on Snapdragon X Series processors. From installation at ollama.com/download, the engine runs inference on devices powered by Snapdragon – especially Snapdragon X Elite.

Screenshot of Ollama interface

Developers choose Ollama because it supports open-source LLMs and runs locally. Instead of building their applications on proprietary, cloud-based models like GPT-4 from OpenAI and Claude from Anthropic, they want variety, control and several other business advantages, such as:

  • Low latency – Running LLMs locally means there is no need to send traffic across the network. That is especially valuable when time is of the essence for input to or output from the models.
  • Privacy – Whether the application is tasked with composing a recipe or interpreting medical data, users want to keep private, personally identifiable information (PII) local.

Portability from laptop to laptop, and laptop to cloud

Portability is a hallmark of Ollama. Regardless of the hardware brand or CPU architecture, developers can move easily from one to the other and enjoy the same user experience of Ollama. That now extends to Windows on Snapdragon devices.

As an example, developers already accustomed to Microsoft Visual Studio coding extensions can use Ollama to choose the models they want for, say, coding completion. A case in point is Continue, a startup building open-source tooling to help code completion on Visual Studio code.

As Michael Chiang, co-founder of Ollama, observes, “It’s a delight to make AI models even more accessible to developers and end-users through performance achievements of the Snapdragon X Elite Compute Platform. Developers can start using Ollama and all available models on the platform, and they have the option to import custom models to a laptop or compute device running Snapdragon X Elite. With a single line of code, they can switch from paying for external services like ChatGPT and Claude to running their apps with open-source models. Plus, they’ll enjoy the same developer experience when they’re ready to deploy to the cloud with the provider or server they choose.”

Multimodal support and the roadmap ahead

Ollama supports function calling, which allows LLM-based applications to fetch information for the model or interact with external tools through API calls. For example, LLMs are not well suited to answering prompts that require mathematical operations. But through Ollama’s support for function calling, developers can tell an LLM to pick up a calculator and input values, then use the result. Or in a weather app, where the LLM will not understand current conditions, Ollama can fetch the latest weather through an API and return it to the application.

Screenshot of Ollama function calling

LLMs are paving the way toward multimodal models, which Ollama supports. Multimodal models allow AI to go beyond analyzing text to analyzing video, images, voice queries and even sensor data, resulting in more-accurate answers to a greater variety of inputs. Use cases include optical character recognition (OCR) and summarizing the contents of a picture for eyesight-impaired users. Open-source models like Moondream, for computer vision, are already supported in Llama, and Ollama is collaborating with Meta to enable vision models in Llama 3.2 on Snapdragon devices.

Ollama currently runs Llama 3.2 1B (1 billion parameters) and 3B (3 billion parameters) models, with a proof of concept on Llama 3.2 11B (11 billion parameters) showcased at Snapdragon Summit.

Finally, Ollama is capable of running on the CPU of devices powered by Snapdragon X Series. Through collaboration with Qualcomm Technologies and Microsoft, Ollama plans to enable DirectML to offload inference tasks to the Qualcomm® Adreno™ GPU and Qualcomm® Hexagon™ NPU.

Your turn

Developers can get started immediately with any Ollama model.

  1. From https://ollama.com/download, download Ollama and install it.
  2. Find the desired model and copy the run command at the top of the model’s page (for example, ollama run mistral).
  3. Open a terminal window and execute the run command.

All Ollama models will now run on devices powered by Snapdragon X Series. Get ready for a performance boost, especially on devices powered by Snapdragon X Series.

Excited to hear more updates? Join the community of like-minded developers to connect, get support and exchange ideas at Qualcomm Developer Discord.

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Snapdragon and Qualcomm-branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.

About the Author
Devang Aggarwal
Devang AggarwalProduct Manager, Senior
Qualcomm relentlessly innovates to deliver intelligent computing everywhere, helping the world tackle some of its most important challenges. Our leading-edge AI, high performance, low-power computing, and unrivaled connectivity deliver proven solutions that transform major industries. At Qualcomm, we are engineering human progress.

Stay connected

Get the latest Qualcomm and industry information delivered to your inbox.

Subscribe
Manage your subscription

© Qualcomm Technologies, Inc. and/or its affiliated companies.

Snapdragon and Qualcomm branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries. Qualcomm patented technologies are licensed by Qualcomm Incorporated.

Note: Certain services and materials may require you to accept additional terms and conditions before accessing or using those items.

References to "Qualcomm" may mean Qualcomm Incorporated, or subsidiaries or business units within the Qualcomm corporate structure, as applicable.

Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, research and development functions, and substantially all of our products and services businesses, including our QCT semiconductor business.

Materials that are as of a specific date, including but not limited to press releases, presentations, blog posts and webcasts, may have been superseded by subsequent events or disclosures.

Nothing in these materials is an offer to sell or license any of the services or materials referenced herein.