Developer Blog

Ollama simplifies inference with open-source models on Snapdragon X series devices

Written by

Devang Aggarwal

Oct 22, 2024

Co-written with Manoj Khilnani and Michael Chiang.

Ollama is a tool that simplifies running open-source LLMs on a variety of hardware platforms, now including Snapdragon X-series devices.

Ollama has built a reputation for making it easy to get started with LLMs. Developers can dive in and use any of the models on Ollama.com for AI inference. With a single line of code, they can invoke an API that lets their apps switch from proprietary models to open-source models, including these:

Meta's Llama 3.2
Google's Gemma 2
Microsoft's Phi 3.5
Alibaba's Qwen 2.5
IBM's Granite Code
Mistral
Snowflake’s Arctic Embed

The tool also runs custom models contributed by Ollama and other vendors in the AI community. Given the increasing diversity of open-source models worldwide, developers using Ollama on devices powered by Snapdragon X Series now have a big advantage in choice of both LLM and hardware platform.

Keeping pace with the evolution of hardware and AI

As the installed base of devices powered by Snapdragon® Compute Platforms grows, software makers of all sizes see new market opportunities.

For Ollama, supporting Snapdragon X Series meant first enabling the engine for Windows on Snapdragon architecture.

As a result, developers can run Ollama natively on devices like Microsoft Copilot+ PCs built on Snapdragon X Series processors. From installation at ollama.com/download, the engine runs inference on devices powered by Snapdragon – especially Snapdragon X Elite.

Developers choose Ollama because it supports open-source LLMs and runs locally. Instead of building their applications on proprietary, cloud-based models like GPT-4 from OpenAI and Claude from Anthropic, they want variety, control and several other business advantages, such as:

Low latency – Running LLMs locally means there is no need to send traffic across the network. That is especially valuable when time is of the essence for input to or output from the models.
Privacy – Whether the application is tasked with composing a recipe or interpreting medical data, users want to keep private, personally identifiable information (PII) local.

Portability from laptop to laptop, and laptop to cloud

Portability is a hallmark of Ollama. Regardless of the hardware brand or CPU architecture, developers can move easily from one to the other and enjoy the same user experience of Ollama. That now extends to Windows on Snapdragon devices.

As an example, developers already accustomed to Microsoft Visual Studio coding extensions can use Ollama to choose the models they want for, say, coding completion. A case in point is Continue, a startup building open-source tooling to help code completion on Visual Studio code.

As Michael Chiang, co-founder of Ollama, observes, “It’s a delight to make AI models even more accessible to developers and end-users through performance achievements of the Snapdragon X Elite Compute Platform. Developers can start using Ollama and all available models on the platform, and they have the option to import custom models to a laptop or compute device running Snapdragon X Elite. With a single line of code, they can switch from paying for external services like ChatGPT and Claude to running their apps with open-source models. Plus, they’ll enjoy the same developer experience when they’re ready to deploy to the cloud with the provider or server they choose.”

Multimodal support and the roadmap ahead

Ollama supports function calling, which allows LLM-based applications to fetch information for the model or interact with external tools through API calls. For example, LLMs are not well suited to answering prompts that require mathematical operations. But through Ollama’s support for function calling, developers can tell an LLM to pick up a calculator and input values, then use the result. Or in a weather app, where the LLM will not understand current conditions, Ollama can fetch the latest weather through an API and return it to the application.

LLMs are paving the way toward multimodal models, which Ollama supports. Multimodal models allow AI to go beyond analyzing text to analyzing video, images, voice queries and even sensor data, resulting in more-accurate answers to a greater variety of inputs. Use cases include optical character recognition (OCR) and summarizing the contents of a picture for eyesight-impaired users. Open-source models like Moondream, for computer vision, are already supported in Llama, and Ollama is collaborating with Meta to enable vision models in Llama 3.2 on Snapdragon devices.

Ollama currently runs Llama 3.2 1B (1 billion parameters) and 3B (3 billion parameters) models, with a proof of concept on Llama 3.2 11B (11 billion parameters) showcased at Snapdragon Summit.

Finally, Ollama is capable of running on the CPU of devices powered by Snapdragon X Series. Through collaboration with Qualcomm Technologies and Microsoft, Ollama plans to enable DirectML to offload inference tasks to the Qualcomm® Adreno™ GPU and Qualcomm® Hexagon™ NPU.

Your turn

Developers can get started immediately with any Ollama model.

From https://ollama.com/download, download Ollama and install it.
Find the desired model and copy the run command at the top of the model’s page (for example, ollama run mistral).
Open a terminal window and execute the run command.

All Ollama models will now run on devices powered by Snapdragon X Series. Get ready for a performance boost, especially on devices powered by Snapdragon X Series.

Excited to hear more updates? Join the community of like-minded developers to connect, get support and exchange ideas at Qualcomm Developer Discord.

Windows on Snapdragon Open Source AI

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Snapdragon and Qualcomm-branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.

About the Author

Devang AggarwalProduct Manager, Senior