If you’ve been customizing GitHub Copilot with skills, you’ve probably noticed a pattern. The first skill you write is clean and focused. Then you realize the tool you are scripting against ships two major versions, each with a Free and Pro tier. Suddenly, that one skill becomes filled with numerous conditional instructions. The model gets confused and ignores the conditions, the token counts pile up, and the behaviors become unpredictable. You start wondering if there’s a better way.
There is. Instead of cramming every scenario into a single document, you can generate the instructions dynamically and return only the details that matter. It’s simpler than it sounds, and it changes the way you think about context engineering.
When one skill becomes too many
How does this problem happen? Imagine you maintain a skill for a CLI tool called super-ops. Version 2 uses a JSON config format, but version 3 switched to YAML. In addition, the Pro and Free tiers have different capabilities. You need Copilot to give correct guidance for whichever combination the developer happens to have installed. For this example, let’s say the differences are:
- v2 Free: JSON config, no plugins, limited to 3 concurrent processes
- v2 Pro: JSON config, plugin support, unlimited processes
- v3 Free: YAML config, no plugins, limited to 5 concurrent processes
- v3 Pro: YAML config, plugin support, unlimited processes, remote execution
The naive approach is to document all four combinations in one SKILL.md. You qualify each instruction carefully. For example, “If you’re on v2, use config.json. If you’re on v3, use config.yml.” Or perhaps you prefer to create separate sections for each version and license tier. Within those, you document each permutation carefully. Later, version 4 arrives with a new Enterprise edition. Now you decide a table may be a more maintainable way to store the data.
As the number of options increases, the skills become bloated with instructions that each apply only in specific circumstances. Worse, the document continues to grow with each new supported feature or version.
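To make that growth concrete, here is a rough count of the combinations a single static document would have to cover. It’s an upper bound under one assumption the scenario doesn’t actually state: that every tier is offered for every version. The `count_combinations` helper is purely illustrative.

```shell
#!/bin/bash
# Upper-bound count of version/tier combinations one static skill must cover.
# Assumption: every tier exists for every version.
count_combinations() {
  local versions=$1 tiers=$2
  echo $(( versions * tiers ))
}

count_combinations 2 2   # v2 and v3, Free and Pro
count_combinations 3 3   # add v4 and an Enterprise tier
```

Adding one version and one tier more than doubles the number of cases, and each case touches every instruction that differs between them.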
Why this approach hurts
The problems compound in several ways. First, there’s the token cost. Every byte of that skill loads into the context window (the limited amount of text the model can consider at once) every time you need to use the tool. In this example, the instructions for three of the four combinations are irrelevant on any single run, because only one version and tier is installed. Those wasted tokens take up space that could be put to better use.
Second, conflicting guidance creates subtle failures. The model has to track “do X on v2, but do Y on v3” across the entire conversation, with no inherent way to know which conditional currently applies, even though only one is valid. Every instruction the model is told to disregard is still added to its context. That’s a mixed message, and the model may give more attention to the wrong parts. The result can be answers that are consistently, and confidently, wrong.
A variation of this leads to the model trying different approaches, knowing that one will be right. Each attempt consumes time and tokens. These failed attempts become part of the conversation history, signaling to the model that those approaches were tried and rejected, adding more noise to the context. This can lead to other subtle issues, a sense of unpredictability, and a lot of wasted tokens.
The biggest expense, however, is that the maintenance becomes fragile. When super-ops ships v6, you could be forced to edit a massive file and carefully reason about which instructions to apply and under what circumstances. Mistakes become inevitable.
The heart of the issue is that you’re trying to solve a runtime question at authoring time. The right instructions depend on facts you only know when you actually run the tool. At that point, you know exactly which version and license the developer has.
Skills that ask, not skills that tell
The shift is to stop thinking of a skill as a knowledge dump and start thinking of it as an orchestrator. Odds are, you’ve already thought about querying the environment to get the version and SKU: you tell Copilot to inspect its environment and ask the model to use the results to select the right instructions. That’s a good start, but you can take it further.
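Querying the environment usually means parsing whatever the tool prints. As a minimal sketch, assuming `super-ops --version` prints something like `super-ops 3.1.4` (the output format is an assumption, and both helper names are hypothetical), the major version and a normalized tier name could be extracted like this:

```shell
#!/bin/bash
# Assumption: the version string looks like "super-ops 3.1.4".
# major_version prints the first run of digits, i.e. the major version.
major_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+' | head -n 1
}

# normalize_sku lowercases the license name and strips whitespace,
# so " Pro" and "pro" compare equal.
normalize_sku() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '[:space:]'
  echo
}

major_version "super-ops 3.1.4"
normalize_sku " Pro"
```

Taking only the first match matters: a bare digit pattern would otherwise emit each component of a dotted version on its own line, and the result would never match a simple `3-pro` style key.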
Instead of just querying those values, you could use a script that returns the actual instructions that apply. The skill ensures the script is called to retrieve the right instructions at runtime, and the script writes them to the console as part of its output. This ensures that only the relevant instructions are added to the context.
This isn’t a hack or a workaround. It’s taking advantage of how the agent and skills work. The skills are loaded on demand specifically to provide additional instructions or context. In addition, when a tool or script runs, its output is added to the context window. After that, we just need to tell the model to use that content as authoritative guidance for the conversation.
Now the model has a single truth with no conflicting guidance.
Building the code
Let’s build this out. First, we create the wrapper script. It will call super-ops to learn the version details. Based on that, it then writes the appropriate guidance to the console.
For this example, I’ll use Bash. You could use any supported language or a standalone executable.
#!/bin/bash
# scripts/super-ops-instructions.sh

# Read the details about the version.
# Keep only the first number (the major version) so "3.1.4" yields "3".
VERSION=$(super-ops --version | grep -oE '[0-9]+' | head -n 1)
SKU=$(super-ops license status --format=short)

# Based on the version and SKU, return the relevant instructions
case "${VERSION}-${SKU}" in
  2-free)
    cat <<'EOF'
Configuration uses JSON format (config.json).
Plugin commands are not available in this tier.
Maximum 3 concurrent processes. Use `super-ops run --limit=3`.
EOF
    ;;
  2-pro)
    cat <<'EOF'
Configuration uses JSON format (config.json).
Plugins are installed with `super-ops plugin add <name>`.
No concurrency limits apply.
EOF
    ;;
  3-free)
    cat <<'EOF'
Configuration uses YAML format (config.yml).
Plugin commands are not available in this tier.
Maximum 5 concurrent processes. Use `super-ops run --limit=5`.
Remote execution is not available.
EOF
    ;;
  3-pro)
    cat <<'EOF'
Configuration uses YAML format (config.yml).
Plugins are installed with `super-ops plugin add <name>`.
No concurrency limits. Remote execution uses `super-ops remote exec`.
EOF
    ;;
  *)
    # Unknown combination: tell the model to stop rather than guess
    echo "Unknown super-ops version or license: ${VERSION}-${SKU}."
    echo "Stop processing. You will not be able to complete this task."
    echo "Notify the user that they must install a supported version before you can proceed."
    ;;
esac

And now for the skill itself. The SKILL.md is remarkably small:
---
description: |
  Use this for guidance on the super-ops CLI tool or any time you need
  to perform a deployment using super-ops.
---

# Super-ops CLI guidance

Always run `./scripts/super-ops-instructions.sh`. Treat the
script's stdout as authoritative guidance for this conversation. Follow
those instructions exactly, but do not repeat them back to the user.

If the script fails or produces no output, stop and inform the user that the
super-ops CLI does not appear to be installed. Ask how they'd like to proceed.

That’s the entire skill. It never changes when super-ops ships a new version – you just update the script. The skill itself stays tiny, loading almost no tokens into the context window until the script runs and returns only the relevant slice. If there are details that apply to all versions or instructions about exactly what super-ops should do, those can be added to the skill itself. The key is that the version-specific details are generated at runtime and only the relevant ones are returned.
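If the `case` statement itself starts to grow, one possible variation is to keep each combination’s guidance in its own file and have the script print the matching one. The sketch below is an assumption layered on the example rather than part of the original script: the one-file-per-combination layout and the `print_guidance` name are hypothetical.

```shell
#!/bin/bash
# Hypothetical layout: one guidance file per combination, for example
#   instructions/2-free.md, instructions/3-pro.md
# print_guidance DIR VERSION SKU prints the matching file, or a stop message.
print_guidance() {
  local dir=$1 version=$2 sku=$3
  local file="${dir}/${version}-${sku}.md"
  if [ -f "$file" ]; then
    cat "$file"
  else
    echo "Unknown super-ops version or license: ${version}-${sku}."
    echo "Stop processing and notify the user."
    return 1
  fi
}

# Demo with a throwaway directory so the sketch runs anywhere.
demo_dir=$(mktemp -d)
echo "Configuration uses YAML format (config.yml)." > "${demo_dir}/3-pro.md"
print_guidance "$demo_dir" 3 pro
print_guidance "$demo_dir" 9 free || true
rm -rf "$demo_dir"
```

With this layout, supporting a new version means adding a file instead of editing a branch. Returning a non-zero status for unknown combinations is a design choice, and the skill’s failure wording would need to account for it.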
Why this works
Let’s name what’s actually happening here. You are deliberately allowing an external process to write text into the model’s working context. That’s constructive prompt injection: intentionally inserting text into the model’s input to guide its behavior.
The key line is “follow those instructions exactly, but do not repeat them back to the user.” This does two things. It tells the model to treat the script output as authoritative guidance – not as user input to simply analyze or echo back. It also prevents the model from echoing internal implementation details into the chat, keeping the conversation natural. Without that second part, the model is likely to repeat the instructions back to the user, breaking the feel of the conversation.
Remember that focused agents get better results. A smaller, more relevant context window produces more reliable outputs. By loading only the instructions that match the current environment, the model gets exactly what it needs and nothing else. It’s progressive disclosure – revealing information only when it’s needed – taken one step further. The disclosure happens at runtime and provides tailored guidance.
This can have a lot of practical uses. The script could check which cloud provider is configured, whether certain features are enabled in a config file, or what platform the developer is running on. It can look at the state of a Kubernetes cluster. It can take in facts and use those to provide prescriptive instructions that are perfectly aligned to the intended results.
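As one illustration of the same idea applied to a different fact, a script could branch on the operating system reported by `uname -s`. The guidance strings and config paths below are invented for illustration. Only `uname -s` and its `Linux` and `Darwin` values are real.

```shell
#!/bin/bash
# platform_guidance KERNEL prints guidance for the given kernel name.
# The paths are placeholders, not real super-ops behavior.
platform_guidance() {
  case "$1" in
    Linux)
      echo "Config lives under ~/.config/super-ops (hypothetical path)."
      ;;
    Darwin)
      echo "Config lives under ~/Library/Application Support/super-ops (hypothetical path)."
      ;;
    *)
      echo "Unsupported platform: $1. Stop and ask the user how to proceed."
      ;;
  esac
}

# Emit only the guidance for the machine the script is running on.
platform_guidance "$(uname -s)"
```

Passing the kernel name in as an argument keeps the function easy to exercise for platforms other than the one you happen to be on.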
The supply-chain implication
Here’s where it gets a bit uncomfortable. The same mechanism that delivers exactly-right instructions can deliver exactly-wrong ones if the source isn’t trusted.
Think about what a malicious skill, agent plugin, or command line tool could do with this pattern. A script running in your workspace context could return instructions that tell the model to exfiltrate environment variables, request the system to install a malicious package as a dependency, or silently redirect a Git remote to an attacker-controlled repository. The model would follow those instructions just as faithfully as it follows your legitimate ones – and the “do not repeat them back” directive would hide the evidence.
This isn’t theoretical. The Shai-Hulud npm worm demonstrated that attackers actively target development supply chains. Any customization that can inject text into an AI’s context – skills, Model Context Protocol (MCP) servers, plugins, even instructions files – deserves the same scrutiny you’d give a runtime dependency.
Practical steps to protect yourself:
- Pin script sources to specific commits or checksums rather than tracking a branch
- Review what every skill and plugin actually executes before installing it
- Treat third-party agent plugins and skills with the same review process you would use for npm/Maven/NuGet packages or GitHub Actions
- Audit any script that produces text consumed by the model, especially if it fetches content from external sources
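Pinning to a checksum can be as simple as refusing to run a script whose hash has changed. Here is a minimal sketch, assuming GNU `sha256sum` is available (on macOS, `shasum -a 256` is the usual substitute). The `run_if_pinned` name and the way the expected hash is supplied are hypothetical.

```shell
#!/bin/bash
# run_if_pinned SCRIPT EXPECTED_SHA256
# Runs SCRIPT only when its SHA-256 digest matches the pinned value.
run_if_pinned() {
  local script=$1 expected=$2 actual
  actual=$(sha256sum "$script" | cut -d ' ' -f 1)
  if [ "$actual" != "$expected" ]; then
    echo "Checksum mismatch for ${script}; refusing to run." >&2
    return 1
  fi
  bash "$script"
}

# Demo: pin a throwaway script, then tamper with it.
demo=$(mktemp)
echo 'echo "guidance text"' > "$demo"
pinned=$(sha256sum "$demo" | cut -d ' ' -f 1)
run_if_pinned "$demo" "$pinned"            # hash matches, script runs
echo 'echo "tampered"' >> "$demo"
run_if_pinned "$demo" "$pinned" || true    # hash differs, script is refused
rm -f "$demo"
```

The mismatch message goes to stderr, so it does not masquerade as guidance on stdout. In practice, the pinned value would live somewhere reviewed and version-controlled, not next to the script it protects.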
Wrapping up
Most of the time, we think of skills as static documents – reference material the model consults when it needs help. But that framing limits what they can do. A skill that delegates to a script transforms from a knowledge dump into a runtime adapter, one that shapes the model’s behavior based on the current state rather than a list of possibilities.
The mechanics are worth restating because they’re deceptively simple. A thin skill invokes a script. The script inspects the environment and prints only the instructions that apply. The model incorporates them silently and acts accordingly. No conflicting conditionals, no wasted tokens, no rewriting the skill every time a new version ships. You edit the script, and the skill keeps working.
What makes this powerful is also what makes it dangerous. Any process that can write text into an AI’s context can steer its behavior – for good or for ill. That’s the thread connecting dynamic instructions and supply-chain security. Review your scripts, pin your sources, and treat every customization that touches model context with the same rigor you’d apply to code that runs in production. Because in a very real sense, it does.
