Skip to content
Device Family
Device
OS

Local LLM Coding with VS Code and Qwen3-Coder

Use VS Code with locally-running Qwen3-Coder for private code assistance.

Overview

Coding agents are powerful tools that empower developers through collaboration with AI agents backed by Large Language Models (LLMs). They can be embedded into the development environment, such as the terminal or VS Code, allowing seamless integration into a developer’s workflow.

This tutorial demonstrates how to use Cline, VS Code, and LM Studio to run a coding agent entirely on your local machine.

What You’ll Learn

  • How to run VS Code with the Cline coding agent to aid in software engineering tasks.
  • How to configure Cline to communicate with LM Studio for local inference of coding agents.
  • How to use local coding agents to solve real-world software engineering tasks.

Core Dependencies

LM Studio

  1. Download the installer from here: https://lmstudio.ai/download
  2. Install.
  1. Download the appimage from here: https://lmstudio.ai/download?os=linux
  2. run sudo apt install libfuse2
  3. run cd ~/Downloads
  4. run chmod +x LM-Studio-*.AppImage
  5. run ./LM-Studio-*.AppImage

Visual Studio Code

These are the instructions for how to install VS Code version 1.08.2.

  1. Download the Windows installation executable from: https://update.code.visualstudio.com/1.108.2/win32-x64-user/stable.
  2. Click on the downloaded file VSCodeUserSetup-x64-1.108.2.exe to install VS Code.
  1. Download the Debian installation package from: https://update.code.visualstudio.com/1.108.2/linux-deb-x64/stable.
  2. Click on the downloaded file code_1.108.2-1769004815_amd64.deb to install VS Code.

Launch and Configure LM Studio

We will use LM Studio to serve the LLM powering the coding agent.

  • In the search bar, search for LM Studio and launch the application. You will be greeted by the following page.

LM Studio Initial Screen

Next, we must load the LLM on the system. We are going to use the Qwen3-Coder-30B-A3B model with a large context length.

  • Click on the search bar on the top of the LM Studio window or press CTRL+L. Click the switch Manually choose model load parameters and then click on the Qwen3-Coder-30B-A3B model.
  • Change the context length from 4096 to 32768, and make sure GPU Offload is at the max. Then, click Load Model

Selecting Model

We use a large context length so that the agent can process large codebases and remember changes that have been made.

Configuring Model

Next, we need to enable the LM Studio Server.

  • Click the Developer tab or press CTRL+2 in LM Studio on the left.
  • Check the status toggle and ensure it is set to Running.

Server Status

Launch and Configure VS Code

We will install the Cline Extension in VS Code and connect it to the LM Studio server we just made.

  • In the search bar, search for VS Code and launch the application.
  • Click on the Extensions icon on the left column of VS Code and search for Cline. Then, click the Install button.

Installing Cline Extension

  • A Cline icon should be present on the left. Click on that open Cline. There will be a window asking How will you use Cline? As we are going to be using a local LLM running via LM Studio, select Bring my own API Key and hit Continue.

Account Creation

Next, we need to configure Cline to communicate with the LM Studio server that we setup.

  • Set the API Provider to LM Studio and the model to Qwen3-Coder-30B-A3B-GGUF.

Model Configuration

Creating your first project

Let’s use our local agent to create a website! Open VSCode to a directory of your choice where Cline will create the files.

  • To do this, go to File -> Open Folder on the top-left of VS Code and choose a folder like Documents.

VS Code Empty Folder

Now we are ready to prompt the local coding agent.

  • Click on the Cline extension on the left column and enter a prompt to kickoff the agent. As an example, let’s use the following prompt:
Create a website showcasing the ability to run local large-language models on an AMD device.

The agent will then start to create files according to the prompt. As a user, you can watch the code be generated in VS Code as shown below. You may have to click Save each time Cline wants to create a file.

Cline Code Generation

After generating the software, the agent is complete and you can run the application. In this case, the agent wrote to three files: index.html, script.js, and styles.css. By simply double clicking on the HTML file we can load and interact with the generated website.

Next Steps

After generating the website, you can continue to work with Cline to improve the website. Two possible improvements are:

  • Documentation: Prompting the agent with Add a README is all that is needed for the agent to generate a README.md file that documents the website.
  • Animation: Prompt the model with Add an animation that visually represents a large language model running on a laptop. to generate an animation to the website.

We encourage the reader to try to generate other applications using this setup. Below are some fun examples we have tried:

  • Retro Arcade Games: Try some other prompts. It can also be fun for the agent to create retro-style games in Python using the PyGame package with the following prompt:
Create a simple pong game using the PyGame python package.
  • Data Analysis: One area where coding agents are particularly useful is that of scripting and data analysis. This is a prompt to showcase the local models ability to generate data analysis software for stock price visualization:
Write a Python script that fetches daily price data for AMD (ticker: AMD) from an online API (use the yfinance library so no API key is needed). Loads the last 365 calendar days of data into a Pandas DataFrame. Computes 20-day and 50-day simple moving averages of the closing price. Store the data in a sqlite database and when the script is first run check to see if the sqlite database contains the requested data, if not, fetch it from the API. Plots a single matplotlib line chart with: Close, SMA-20, and SMA-50. Include a title, axis labels, and a legend. Saves the figure to amd_price_sma.png in the current directory and prints the path when done. Allow the user to pass in command line arguments for the total time period of data, the time period for the simple moving average to calculate, as well as to provide different tickers.

Resources

Below are some additional resources to learn more about Coding Agents, Cline, and running workloads on

Need help with this playbook?

Run into an issue or have a question? Open a GitHub issue and our team will take a look.