Ken Muse

Deploying Services on GitHub Runner Custom Images


Throughout this series on custom runner images, you’ve seen how to pre-cache Docker images, set up pre-job hooks, and store repositories on your runners. Today, let’s explore something even more powerful: running services directly on your custom runner images. Because you have administrative rights during image creation, you can start services that persist into every workflow run – opening up creative possibilities for speeding up builds and reducing external dependencies.

Understanding why this works

When you build a custom runner image, your workflow runs with elevated privileges on the virtual machine. This means you can install software, modify system configurations, and start background services. Anything you configure during the image build gets captured in the snapshot.

But here’s where it gets interesting: you can also start services during pre-job scripts. As you learned earlier in this series, pre-job hooks run before your workflow steps begin. Services started during those hooks remain available throughout the entire job. This gives you two options for deploying services:

  • Image creation time
    Start the service during the image build so it’s always running when the runner boots
  • Pre-job script time
    Start the service dynamically when a job begins, giving you more control

Both approaches have their place, and you might even combine them for different scenarios.
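To make the pre-job option concrete, here's a minimal sketch of a hook script. The file path is illustrative; the `ACTIONS_RUNNER_HOOK_JOB_STARTED` environment variable is the runner's standard mechanism for pointing at a pre-job script. Guarding the startup makes the hook safe to invoke at the beginning of every job:

```shell
#!/bin/bash
# Hypothetical pre-job hook (e.g. /opt/runner/hooks/start-services.sh),
# wired up via ACTIONS_RUNNER_HOOK_JOB_STARTED in the runner's .env file.

ensure_registry() {
  # Skip quietly on machines without Docker (e.g. when testing locally)
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not available; skipping registry startup"
    return 0
  fi

  # Only start the container if one named "registry" is not already running
  if ! docker ps --format '{{.Names}}' | grep -qx 'registry'; then
    docker run -d -p 5000:5000 \
      --restart=always \
      -v /opt/registry:/var/lib/registry \
      --name registry registry:2.7
  fi
}

ensure_registry
```

Because the function is idempotent, it doesn't matter whether the service was already started at boot or by a previous job on the same runner.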

Example: Running a local Docker registry

One of the most practical uses for this capability is running a local Docker registry on your runner. Instead of pulling images from a remote registry (and paying for egress), your workflows can pull from localhost. Here’s how you can set this up.

During your image creation workflow, you can add a step to the pre-job script that starts a Docker registry container:

docker run -d -p 5000:5000 \
  --restart=always -v /opt/registry:/var/lib/registry \
  --name registry registry:2.7

This command starts a registry on port 5000. The --restart=always flag is crucial: it ensures the registry comes back up whenever the custom image launches a new runner (or if the container stops unexpectedly). The volume mount ensures that any layers loaded into the registry during image creation are available to your workflows.

Since the image runs on a full VM, you don’t have to start services in the pre-job script. You could also configure processes to start when the VM boots. Starting services during the pre-job gives you more control over what’s available for each workflow.
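For the boot-time option, a systemd unit captured in the image works well. This is a sketch only; the unit name and paths are illustrative, and it assumes Docker is installed on the VM:

```ini
# /etc/systemd/system/local-registry.service (illustrative name and path)
[Unit]
Description=Local Docker registry for runner workflows
After=docker.service
Requires=docker.service

[Service]
# Remove any stale container from a previous boot, then run in the foreground
ExecStartPre=-/usr/bin/docker rm -f registry
ExecStart=/usr/bin/docker run --rm -p 5000:5000 \
  -v /opt/registry:/var/lib/registry \
  --name registry registry:2.7
Restart=always

[Install]
WantedBy=multi-user.target
```

Enabling the unit during image creation (`systemctl enable local-registry`) means the service is already running by the time the runner registers with GitHub.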

Pre-staging container images

Taking this one step further, you can push images to your local registry during image creation. Those images become immediately available to your workflows without any network calls. For example, after starting the registry, you might push commonly used images:

# Start a registry using a known path for persistence
docker run --rm -d -p 5000:5000 -v /opt/registry:/var/lib/registry --name registry registry:2.7

# Pull from the remote registry
docker pull alpine:latest

# Tag for local registry
docker tag alpine:latest localhost:5000/alpine:latest

# Push to local registry
docker push localhost:5000/alpine:latest

# Stop the registry and terminate the container
docker stop registry

Now your workflows can use these pre-staged images. Here’s a workflow job that runs in a container pulled from the local registry:

container-task:
  permissions:
    contents: read
  runs-on: my-custom-image
  container: localhost:5000/alpine:latest
  steps:
    - name: Show container info
      run: cat /etc/os-release

The container starts almost instantly because Docker doesn’t need to download anything – the image is already present locally. You also avoid egress costs from the original registry, since you’re pulling from localhost.
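If you want to confirm the pre-staging worked, the standard Docker Registry HTTP API makes it easy to inspect from a workflow step. A small sketch (the fallback message keeps the step from failing on machines where the registry isn't running):

```shell
# List the repositories the local registry currently holds.
# /v2/_catalog is the standard Docker Registry V2 API catalog endpoint.
CATALOG=$(curl -s --max-time 2 http://localhost:5000/v2/_catalog \
  || echo 'registry not reachable')
echo "$CATALOG"
```

With the alpine image staged as shown above, the catalog response would include an `"alpine"` entry; `/v2/alpine/tags/list` would show its tags.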

Other service possibilities

The local registry example is just one application of this pattern. You can apply the same approach to other services:

  • Package caches
    Run a local proxy for npm, Maven, or pip packages. Pre-populate it with known-good versions of your dependencies to speed up installs and reduce supply chain risks.
  • Proxies
    Deploy caching proxies that reduce external network calls or improve build times
  • VPN
    Silently connect the runner to internal resources based on the workflow’s needs.
  • Logging and Metrics
Launch a background service that collects and reports metrics about your builds or processes on the server. Expose the data using OpenTelemetry, push logs to a centralized SIEM, or run a post-job analysis and write the results to $GITHUB_STEP_SUMMARY.

Each of these can be configured during image creation or started dynamically through pre-job hooks, depending on your needs.
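As one illustration of the package-cache idea, you could run Verdaccio (a widely used npm registry proxy) on the runner and point npm at it. The port, volume path, and image tag below are assumptions for the sketch, not values from this series:

```shell
#!/bin/bash
# Illustrative sketch: run a local npm proxy on the runner.

start_npm_proxy() {
  # Skip quietly on machines without Docker
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not available; skipping proxy startup"
    return 0
  fi
  docker run -d -p 4873:4873 --restart=always \
    -v /opt/verdaccio:/verdaccio/storage \
    --name npm-proxy verdaccio/verdaccio
}

start_npm_proxy

# During image creation or in a pre-job hook, direct npm at the proxy:
# npm config set registry http://localhost:4873/
```

The same pattern translates to Maven (a local Nexus or Artifactory proxy) or pip (a devpi server): start the cache as a container, persist its storage on the VM, and point the package manager's configuration at localhost.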

Important considerations

Just because you can run services on your runners doesn’t always mean you should. Before implementing this pattern, think carefully about:

  • Security implications
Any service running on your runner is accessible to workflows using that runner. Make sure you trust all the code that will execute. Pre-staged images and packages should come from verified sources to avoid supply chain attacks. Remember that anything added to a cache becomes a permanent part of the image until it's rebuilt, so cache poisoning can introduce long-lived vulnerabilities.
  • Maintenance implications
    Services need updates. A local registry running an old version might have vulnerabilities. Plan for regular image rebuilds to keep everything current.
  • Complexity implications
    Adding services creates another layer to understand and debug. In most cases, it’s simpler to use remote sources and – when necessary – job/service containers hosted on registries. For many scenarios, GitHub’s hosted runners with built-in caching and services are sufficient. Before implementing this pattern, consider whether a local Azure Container Registry paired with a Larger Hosted Runner and Azure VNet injection might provide the same benefits with less complexity.
  • Cost implications
    Services consume CPU, memory, and storage. Evaluate whether the performance benefits outweigh the resource costs for your specific use case.
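The maintenance concern above is easy to automate. A scheduled workflow that rebuilds the image keeps cached services and pre-staged layers current. A sketch, assuming you already have a reusable image-build workflow (the `build-image.yml` name is illustrative) from earlier in this series:

```yaml
# .github/workflows/rebuild-image.yml (illustrative)
name: Scheduled image rebuild
on:
  schedule:
    - cron: '0 4 * * 1'   # every Monday at 04:00 UTC
  workflow_dispatch:       # allow manual rebuilds as well

jobs:
  rebuild:
    uses: ./.github/workflows/build-image.yml
```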

The bottom line

Custom runner images give you remarkable flexibility to optimize your CI/CD pipelines. Running services directly on your runners can centralize dependencies, speed up builds, and reduce external calls. However, it's essential to weigh the benefits against the potential risks and maintenance overhead.

With great power comes great responsibility, but also lots of fun ways to customize your runners. Experiment with these patterns, measure the impact on your workflows, and find the right balance for your team’s needs.