Build and run 100% local AI agents – Snapdragon X Elite demo video
Sign up for Developer monthly newsletter
Join thousands of developers around the globe who receive latest news and updates from our monthly curated newsletter.
Sign upWe’ve posted recently about running AI inference locally on devices powered by Snapdragon X Series processors. And now Paul Couvert has published a step-by-step video on how you can create and run an AI agent 100% locally on a laptop running Windows on Snapdragon. The AI agent is designed to automate an ordinary, manual procedure.
Paul points out the same advantages we’ve been emphasizing for running AI locally, including:
- No need to be connected to the internet; you can work entirely off line.
- No need to be a software developer.
- Complete privacy for your data, prompts and results.
- You can do this with open-source software; no need to pay a proprietary provider like OpenAI or Anthropic for an API key.
Below, we summarize the steps in the demo video to show how easy it is to try this on your own.
1. Setting up local chat with an LLM
First, you’ll configure your laptop or PC so you can chat locally with an LLM.
Paul walks through downloading and installing LM Studio, a cost-free tool for working with models. LM Studio offers an installer for Windows on Snapdragon.
You’ll then select and load an LLM to run locally in LM Studio; Paul recommends the Google Gemma 3 12B, a new model that is powerful for its size, from Bartkowski. LM Studio offers multiple quantizations (options and sizes) for the model, and Paul uses Q4_0 because it’s optimized for laptops powered by Snapdragon. The model is 8+ GB in size.
Once you’ve installed the model, you’ll have an LLM you can chat with locally.
2. The manual procedure to be automated
You can design and create AI agents that save you time by automatically executing manual procedures. For this demo, the manual task is similar to scraping websites and online documents for specific text and placing the text in a file. The steps are:
- Go to the Models section of Hugging Face
- Filter AI models by type of license
- Sort the resulting models by Most Likes
- Extract the names of the 5 most popular models
- Copy the names and paste them into a new .txt file
Come for support, stay for the community
Get support from experts, connect with like-minded developers, and access exclusive virtual events.
You’ll build an AI agent to automate the procedure and help you discover similar use cases.
3. Getting the open-source software
Besides LM Studio and the Google Gemma model, you’ll download and install other open-source, free-of-charge components:
- Smolagents, a small Python library for easily building AI agents
- Python, to run Smolagent
- Visual Studio Code, an integrated development environment (IDE)
If you prefer, you may use other tools with similar functions.
4. Creating and executing the prompt
To generate the AI agent you want, you’ll have to give LM Studio and the Gemma 3 model a prompt. In this case, the prompt consists of three parts:
- Natural language instructions – The text explains to the model that you want to use Smolagents to build an AI model that will build AI agents. You want the result to explain step by step how you can run the AI agent on Windows 11 in a uv Python environment. You mention that VS Code and Python are already installed, and that you want to run locally using LM Studio.
- Sample script
- LiteLLM documentation – From this bit of documentation, the model will learn how to connect the AI agent to LM Studio and run locally, without relying on proprietary APIs.
You copy the entire prompt, paste it into the “Type a message...” box in LM Studio, then select the Gemma 3 model downloaded earlier and click Send. At this point, the model loads and executes the prompt entirely locally. You can even disconnect from the network without interrupting execution.
The video shows the Windows Task Manager and the distribution of work across the local CPU, GPU and NPU. Paul points out that the laptop powered by Snapdragon can perform compute-intensive AI inference against a full model while simultaneously recording and encoding video.
Within a minute, the Assistant field in LM Studio begins scrolling with the results from the model:
When LM Studio is finished, you’ll have not only step-by-step instructions for creating the AI agent but also the Python script for the agent. Copy the entire script in preparation for the next step.
5. Running the script for the AI agent in Visual Studio Code
Switch to Visual Studio Code (VS Code) and paste the Python script into a new file (in the demo, app.py) in a VS Code folder (here, BLOGAGENT).
To power the AI agent, you must connect it to LM Studio. In the VS Code editor, modify a couple of values (model_id and api_base) in the Python script. Then, from the Terminal pane, launch the Python script (here, you run python app.py).
Over in LM Studio, the Developer Log shows debug messages from the AI agent:
In VS Code, the Terminal pane displays the progress:
When the Python script finishes, it saves the new file with the results in the BLOGAGENT folder:
Your turn
That’s how you can run an AI agent 100% locally thanks to LM Studio, Visual Studio Code, Python and Smolagents from Hugging Face. The video proves that devices powered by Snapdragon X and Snapdragon X Elite processors can easily perform 100% local AI inference even in airplane mode. They ensure strong privacy and run cost-free, open-source software.
Ready to try AI inference on your own laptop? Most of what you need to know is in this post, but you can follow along to the demo video from Paul Couvert (17 minutes).
You’ll find links above to all the software you need.

