Local LLM Coding with VS Code and Qwen3-Coder
Use VS Code with locally-running Qwen3-Coder for private code assistance.
Overview
Coding agents are powerful tools that empower developers through collaboration with AI agents backed by Large Language Models (LLMs). They can be embedded into the development environment, such as the terminal or VS Code, allowing seamless integration into a developer’s workflow.
This tutorial demonstrates how to use Cline, VS Code, and LM Studio to run a coding agent entirely on your local machine.
What You’ll Learn
- How to run VS Code with the Cline coding agent to aid in software engineering tasks.
- How to configure Cline to communicate with LM Studio for local inference of coding agents.
- How to use local coding agents to solve real-world software engineering tasks.
Core Dependencies
LM Studio
- Download the installer from here: https://lmstudio.ai/download
- Install.
- Download the appimage from here: https://lmstudio.ai/download?os=linux
- run
sudo apt install libfuse2 - run
cd ~/Downloads - run
chmod +x LM-Studio-*.AppImage - run
./LM-Studio-*.AppImage
Visual Studio Code
These are the instructions for how to install VS Code version 1.08.2.
- Download the Windows installation executable from: https://update.code.visualstudio.com/1.108.2/win32-x64-user/stable.
- Click on the downloaded file
VSCodeUserSetup-x64-1.108.2.exeto install VS Code.
- Download the Debian installation package from: https://update.code.visualstudio.com/1.108.2/linux-deb-x64/stable.
- Click on the downloaded file
code_1.108.2-1769004815_amd64.debto install VS Code.
Launch and Configure LM Studio
We will use LM Studio to serve the LLM powering the coding agent.
- In the search bar, search for
LM Studioand launch the application. You will be greeted by the following page.

Next, we must load the LLM on the system. We are going to use the Qwen3-Coder-30B-A3B model with a large context length.
- Click on the search bar on the top of the LM Studio window or press
CTRL+L. Click the switchManually choose model load parametersand then click on the Qwen3-Coder-30B-A3B model. - Change the context length from
4096to32768, and make sureGPU Offloadis at the max. Then, clickLoad Model

We use a large context length so that the agent can process large codebases and remember changes that have been made.

Next, we need to enable the LM Studio Server.
- Click the Developer tab or press
CTRL+2in LM Studio on the left. - Check the status toggle and ensure it is set to
Running.

Launch and Configure VS Code
We will install the Cline Extension in VS Code and connect it to the LM Studio server we just made.
- In the search bar, search for
VS Codeand launch the application. - Click on the
Extensionsicon on the left column of VS Code and search forCline. Then, click theInstallbutton.

- A Cline icon should be present on the left. Click on that open Cline. There will be a window asking
How will you use Cline?As we are going to be using a local LLM running via LM Studio, selectBring my own API Keyand hitContinue.

Next, we need to configure Cline to communicate with the LM Studio server that we setup.
- Set the API Provider to
LM Studioand the model toQwen3-Coder-30B-A3B-GGUF.

Creating your first project
Let’s use our local agent to create a website! Open VSCode to a directory of your choice where Cline will create the files.
- To do this, go to
File -> Open Folderon the top-left of VS Code and choose a folder likeDocuments.

Now we are ready to prompt the local coding agent.
- Click on the Cline extension on the left column and enter a prompt to kickoff the agent. As an example, let’s use the following prompt:
Create a website showcasing the ability to run local large-language models on an AMD device.The agent will then start to create files according to the prompt. As a user, you can watch the code be generated in VS Code as shown below. You may have to click Save each time Cline wants to create a file.

After generating the software, the agent is complete and you can run the application. In this case, the agent wrote to three files: index.html, script.js, and styles.css. By simply double clicking on the HTML file we can load and interact with the generated website.
Next Steps
After generating the website, you can continue to work with Cline to improve the website. Two possible improvements are:
- Documentation: Prompting the agent with
Add a READMEis all that is needed for the agent to generate aREADME.mdfile that documents the website. - Animation: Prompt the model with
Add an animation that visually represents a large language model running on a laptop.to generate an animation to the website.
We encourage the reader to try to generate other applications using this setup. Below are some fun examples we have tried:
- Retro Arcade Games: Try some other prompts. It can also be fun for the agent to create retro-style games in Python using the
PyGamepackage with the following prompt:
Create a simple pong game using the PyGame python package.- Data Analysis: One area where coding agents are particularly useful is that of scripting and data analysis. This is a prompt to showcase the local models ability to generate data analysis software for stock price visualization:
Write a Python script that fetches daily price data for AMD (ticker: AMD) from an online API (use the yfinance library so no API key is needed). Loads the last 365 calendar days of data into a Pandas DataFrame. Computes 20-day and 50-day simple moving averages of the closing price. Store the data in a sqlite database and when the script is first run check to see if the sqlite database contains the requested data, if not, fetch it from the API. Plots a single matplotlib line chart with: Close, SMA-20, and SMA-50. Include a title, axis labels, and a legend. Saves the figure to amd_price_sma.png in the current directory and prints the path when done. Allow the user to pass in command line arguments for the total time period of data, the time period for the simple moving average to calculate, as well as to provide different tickers.Resources
Below are some additional resources to learn more about Coding Agents, Cline, and running workloads on
- More information about the AMD LM Studio partnership and integration: https://www.amd.com/en/ecosystem/isv/consumer-partners/lm-studio.html
- AMD Blog walking through running Cline on AMD Ryzen™ AI and Radeon™ Graphics Cards: https://www.amd.com/en/blogs/2025/how-to-vibe-coding-locally-with-amd-ryzen-ai-and-radeon.html
- Cline Blog on running coding agents locally on AI PCs: https://cline.bot/blog/local-models-amd
Need help with this playbook?
Run into an issue or have a question? Open a GitHub issue and our team will take a look.