In my previous post, I took a Bash-based Model Context Protocol (MCP) server and embedded it inside a VS Code extension. Before that, I built the standalone MCP server to solve the problem of Copilot making inconsistent metadata choices.
Both approaches shared something important – neither one required access to VS Code’s internal APIs. The MCP server reads Hugo content files and returns structured JSON. It doesn’t care whether it’s running inside VS Code, a CLI tool, or some future editor that supports MCP. I chose to package it in the extension, but the code is not dependent on the extension or VS Code to run.
But what if I need more? What if my tools want to read the active editor’s selection, render interactive UI in the chat panel, or control how the AI prompt gets assembled? For those scenarios, VS Code provides a set of AI extension APIs that go beyond what MCP can offer.
When MCP isn’t enough
An MCP server runs as a separate process. It communicates over standard input/output or HTTP, receives structured requests, and returns structured responses. This architecture is intentionally simple and portable – any MCP-capable client can use it. The trade-off is that the server has no awareness of the editor it’s running inside. It can’t see which file you have open, read your VS Code settings, trigger commands, or display UI elements.
For many use cases, that’s perfectly fine. But consider these scenarios where editor awareness changes everything:
- You want to create a conversational AI persona that guides users through a complex, interactive workflow
- Your tool needs to read the currently selected text or the active file’s language to provide context-aware results or suggestions as the user edits
- You need to control how the AI prioritizes the context window to ensure critical information isn’t lost
- You want to embed AI reasoning directly inside your extension’s logic – generating summaries, suggesting names, or pre-processing content – without user interaction
Each of these requires capabilities that live inside VS Code and are not available to an external process. That’s where VS Code’s AI extension APIs come in. They let you interact with both AI and VS Code APIs. There are several types available, and they build on each other – so let’s start with the foundation and work our way up.
Language Model API
The Language Model API is the foundation that everything else builds on. It gives your extension direct access to language models – Copilot’s models, specifically – so your code can select a model, send a prompt, and process the response. This can happen entirely behind the scenes, without the user ever opening the chat panel.
Picture an extension that suggests meaningful names during a rename, generates commit message summaries, or validates configuration files by having the model reason about potential issues. The user interacts with your extension’s UI, not with a chat window. The AI is an implementation detail powering smarter features.
The API handles consent and rate limiting, and it supports streaming responses so you can process results as they arrive. Because you control the full prompt, you’re also responsible for managing what goes into it – system instructions, context, user input – and making sure the total fits within the model’s token budget. That responsibility grows as your prompts get more complex, which brings us to the next feature.
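As a sketch of what a background call can look like, here is an extension asking a Copilot model to draft a commit message without any chat UI involved. The function and the diff string are hypothetical; the `vscode.lm` calls follow the Language Model API:

```typescript
import * as vscode from 'vscode';

// Hypothetical helper: ask a Copilot model to draft a one-line commit
// message for a diff, entirely behind the scenes.
async function suggestCommitMessage(
  diff: string,
  token: vscode.CancellationToken
): Promise<string> {
  // Pick an available Copilot model; the selector is a preference, not a guarantee.
  const [model] = await vscode.lm.selectChatModels({ vendor: 'copilot' });
  if (!model) {
    throw new Error('No language model available');
  }

  const messages = [
    vscode.LanguageModelChatMessage.User(
      'Write a one-line commit message for this diff:\n' + diff
    ),
  ];

  // Responses stream in fragments, so results can be processed as they arrive.
  const response = await model.sendRequest(messages, {}, token);
  let result = '';
  for await (const fragment of response.text) {
    result += fragment;
  }
  return result.trim();
}
```

Note that `selectChatModels` can return an empty array – for example, if the user hasn’t granted consent yet – so the empty case always needs handling.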
Prompt TSX
Prompt TSX (@vscode/prompt-tsx) solves the problem you’ll run into as soon as your Language Model API prompts get sophisticated. Your prompt might include system instructions telling the model how to behave, the user’s current question, conversation history from earlier turns, tool results with potentially large datasets, and contextual information like file contents. All of these need to fit within the model’s context window – a hard token limit that varies by model.
With simple string concatenation, you have no good way to handle overflow. If the combined prompt exceeds the token limit, something will get silently truncated, and you have no control over what gets cut. Prompt TSX solves this by letting you declare your prompt as a tree of TSX components, each with an assigned priority. When the rendered prompt exceeds the model’s token budget, the library automatically prunes the lowest-priority elements first while preserving the order of what remains.
In practice, this means you can set your system instructions to the highest priority (they always survive), give the user’s current question the next-highest priority, and then determine how to prioritize recent history, older history, and supplementary context data. The library also supports flexible sizing – elements can dynamically expand to fill remaining budget, which is useful for variable-length content like file contents where you want to include as much as possible without exceeding limits.
Related pieces of content can be linked together so they’re included or excluded as a unit – for instance, keeping a tool call request paired with its response. And if you’re targeting different models with different context window sizes, a single prompt definition adapts automatically.
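To make the priority model concrete, here is a sketch of a prompt component built with `@vscode/prompt-tsx`. The component name, props, and priority numbers are illustrative, not prescriptive – higher priorities survive pruning longer, and `flexGrow` lets an element expand into whatever budget remains:

```tsx
import {
  BasePromptElementProps,
  PromptElement,
  SystemMessage,
  UserMessage,
} from '@vscode/prompt-tsx';

interface MetadataPromptProps extends BasePromptElementProps {
  userQuery: string;
  history: string;
  fileContents: string;
}

class MetadataPrompt extends PromptElement<MetadataPromptProps> {
  render() {
    return (
      <>
        {/* Highest priority: system instructions always survive pruning */}
        <SystemMessage priority={100}>
          You help maintain Hugo front matter. Answer concisely.
        </SystemMessage>
        {/* Lowest priority: older conversation is pruned first */}
        <UserMessage priority={40}>{this.props.history}</UserMessage>
        {/* flexGrow lets file contents fill whatever budget is left */}
        <UserMessage priority={60} flexGrow={1}>
          {this.props.fileContents}
        </UserMessage>
        {/* The current question outranks everything but the instructions */}
        <UserMessage priority={90}>{this.props.userQuery}</UserMessage>
      </>
    );
  }
}
```

When the rendered tree exceeds the model’s token budget, the history element is dropped first and the file contents are trimmed to fit, while the instructions and current question remain intact.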
You probably don’t need Prompt TSX for simple prompts with a single instruction and a user question – the overhead isn’t worth it. But the moment you’re juggling multiple context sources and need a structured way to manage how the prompt is assembled, it becomes an essential companion to the Language Model API.
Language model tools
Now that you understand how to call a model and compose its prompts, the next question is: what if the AI needs to call your code? That’s where language model tools come in.
Language model tools are equivalent to MCP tools, but they run inside the extension host instead of in a separate process. You register them through the VS Code extension API, and the AI models can invoke them automatically during conversations. Conceptually, the purpose is the same: the AI recognizes it needs to get specific information or perform an operation, and it calls your tool. Since language model tools run in-process with VS Code, they have full access to the vscode API – reading open editors, querying debug state, accessing terminal contents, modifying workspace settings, or triggering any VS Code command.
If you don’t need VS Code APIs, or you want to reuse the tool across different environments, MCP is the better choice. Language model tools are the right choice when the tool needs VS Code APIs to do its job – direct access to the editor, workspace, debugging context, or configuration data.
Like MCP tools, language model tools define an input schema and provide confirmation messages before execution. They can also be scoped with when clauses so they only appear in relevant contexts – for example, a debugging tool can be hidden unless there is an active debug session.
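A minimal registration might look like the sketch below. The tool name and input shape are hypothetical, and a real tool also needs a matching `languageModelTools` contribution in `package.json`; the `vscode.lm.registerTool` call and result types follow the extension API:

```typescript
import * as vscode from 'vscode';

// Hypothetical input schema: whether to append the document's language id.
interface ISelectionToolInput {
  includeLanguage?: boolean;
}

export function registerSelectionTool(context: vscode.ExtensionContext) {
  context.subscriptions.push(
    vscode.lm.registerTool<ISelectionToolInput>('myExt_getSelection', {
      async invoke(options, _token) {
        // Full access to editor state – something an external MCP process can't see.
        const editor = vscode.window.activeTextEditor;
        if (!editor) {
          return new vscode.LanguageModelToolResult([
            new vscode.LanguageModelTextPart('No active editor.'),
          ]);
        }
        const text = editor.document.getText(editor.selection);
        const suffix = options.input.includeLanguage
          ? ` (${editor.document.languageId})`
          : '';
        return new vscode.LanguageModelToolResult([
          new vscode.LanguageModelTextPart(text + suffix),
        ]);
      },
    })
  );
}
```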
Chat participants
There is another, more visible AI integration available in VS Code: chat participants. You’ve probably seen @workspace or @terminal in VS Code’s chat panel. These are chat participants – specialized AI personas that handle conversations in a specific domain. When you type @workspace, you’re directing your question to a participant that understands your project structure, can search across files, and orchestrates multiple internal tools to answer.
Extensions can register their own chat participants, defining a name (the @ handle), a description, and a request handler that receives the user’s prompt. The participant is then responsible for determining the user’s intent and how to respond. A participant can also contribute slash commands that act as shortcuts for specific functionality (such as @workspace /explain to ask the workspace participant to explain a piece of code). Like the other extension points, participants can fully interact with the VS Code APIs, giving them access to the editor environment.
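Registration itself is compact. In this sketch, the participant id and @hugo handle are hypothetical and must match a `chatParticipants` contribution in `package.json`; the handler simply forwards the prompt to the model the user picked in the chat UI and streams the reply back:

```typescript
import * as vscode from 'vscode';

export function activate(context: vscode.ExtensionContext) {
  // Hypothetical participant: @hugo, focused on Hugo site questions.
  const participant = vscode.chat.createChatParticipant(
    'myExt.hugo',
    async (request, _chatContext, stream, token) => {
      const messages = [
        vscode.LanguageModelChatMessage.User(
          'You answer questions about Hugo sites and front matter.'
        ),
        vscode.LanguageModelChatMessage.User(request.prompt),
      ];
      // request.model is whichever model the user selected in the chat panel.
      const response = await request.model.sendRequest(messages, {}, token);
      for await (const fragment of response.text) {
        stream.markdown(fragment); // render the reply as it streams in
      }
    }
  );
  context.subscriptions.push(participant);
}
```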
Under the hood, most participants also use the Language Model API to send requests to a model, process the responses, and stream back the results. The participant decides what to show the user, how to format the results, and whether to offer follow-up questions or actions. Participants also have another powerful capability – detection. They can provide sample messages relevant to their domain, and VS Code can use these examples to automatically route messages without the user specifying the @ handle. For instance, if a user asks “Why is my test failing?” VS Code might route that to a @testing participant based on its detection examples, even if the user didn’t explicitly invoke it.
In some ways, they are similar to an MCP prompt, but they also include the ability to interpret the prompt, decide when and how to interact with the AI, and control the response. They can orchestrate more complex user and AI interactions. Unlike local file-based prompts, VS Code can incorporate them into a chat based on context rather than file paths.
Like before, if you’re building something that should work across editors or outside the IDE entirely, an MCP server (or a combination of both) is a better fit. If you need to have more control over how the prompting is handled, need to refine the context it creates, need it to have more dynamic capabilities, or need it to expose deterministic behaviors as commands, a chat participant is the way to go.
The new MCP Apps feature can provide an editor-agnostic way to have rich, interactive experiences rendered directly in the conversation. Once this is more broadly supported, it will be an interesting alternative to chat participants for scenarios where you want a custom UI and interaction model, desire portability across editors, and don’t need VS Code APIs.
Choosing the right approach
The evolution of my extension provides an important lesson: start with the simplest approach that solves your problem. I started with simpler prompt-based approaches, then moved to MCP as I needed more deterministic behaviors. If I later need a tighter integration with VS Code, then I can use these extensibility points to achieve that.
Keep in mind that these layers can build on each other. You might use an embedded MCP server for portable data access, the Language Model API for background processing, Prompt TSX for assembling prompts, language model tools for editor-aware operations, and a chat participant for the user-facing conversation. Each layer is useful on its own, but combined, they let you build AI experiences that are deeply aware of the developer’s context and respond in ways that feel genuinely helpful rather than generic. I like to think of the options this way:
- Language Model API allows extensions to think and interact with AI
- Prompt TSX makes the AI thinking more precise and controlled
- Language model tools provide a way to create AI functionality that interacts with the editor
- Chat participants tie it all together, providing a way to refine and focus conversations
If you’ve been building MCP servers and wondering what comes next, this is it. The editor isn’t just a host for your tools anymore. It’s a platform for crafting AI-powered experiences that understand the full picture of what a developer is doing and optimize their workflow.
