Continuing with last week’s discussion of learning Azure, I wanted to explore how you can start to build an understanding of Azure VMs. In a previous article, I discussed how to correctly size Azure VMs. That’s an important part of the discussion, but it doesn’t dive too deep into the concept of SKUs. Today, we explore that.
Many people start working with virtual machines and look for the lowest price for comparable memory, storage, and cores. They are then surprised to see a significant performance difference. This is where VM SKUs come in.
Not all systems are equal
Azure offers lots of different SKUs, each prefixed with a letter. If you’re coming from a traditional virtualization background, those can be quite confusing. In fact, you’re probably used to thinking of VMs as just a number of cores, some memory, and storage. Consider this: not all CPUs are created equal. For example, is one core from a 1990s Pentium the same as one core from a Xeon processor? Memory is also not equal: it has different speeds and optimizations, with cache memory outperforming DIMMs. Even storage can be broken down into different types: local NVMe (Non-Volatile Memory Express), networked solid-state drives (SSD), local SSD, networked classic hard disk drives (HDD), and many more.
In short, if you want to get the most out of your VMs, you want to understand the different performance characteristics available to you.
The right tool
Taking that a step further, a computer can be built and optimized for a number of different tasks. Perhaps it needs to do heavy machine learning. In that case, it needs to be optimized for accessing the graphics processing unit (GPU) and its numerous, parallelized cores. Some applications need to be optimized for CPU access, while others need more memory or faster disk access (IOPS). Each of these scenarios requires a different balance of memory, CPU, and storage to optimize the machine for the task.
This is the concept behind Azure SKUs. A SKU is nothing more than a virtual machine that has been optimized to perform well with a specific workload. The family letters often originated as a mnemonic that helps you remember the primary workload characteristics. Over time, Microsoft added more letters to indicate additional features or optimizations.
Mastering the letters
I haven’t found a good reference to the mnemonics that describe each SKU family, so I decided to document those. If you can remember the mnemonics, you’ll find that you can quickly and easily identify the optimal series for any workload without much issue.
| Family | Mnemonic, workload, and notes |
|---|---|
| A | Alpha / AMD: entry-level VMs for dev/test. Originally AMD Opteron-based. |
| B | Optimized for burst workloads, such as web servers. Most of the time, the VM uses only a fraction of its resources. During that time, it builds up credits that allow it to scale up to full performance for a limited period. |
| D | General-purpose enterprise applications. These systems provide a local SSD temporary disk, which is optimized for performance. |
| E | Memory comparable to the G-series, but with less CPU, giving a higher memory-to-CPU ratio. |
| F | For workloads that need faster CPUs or a higher CPU-to-memory ratio. This uses the same CPU as the D-series and is optimal for compute-focused workloads. The mnemonic is said to originate from the Ford F-series pickup trucks, which are general-purpose vehicles that try to balance cost with the ability to handle a broad array of tasks. |
| G | Goliath / Gargantua: memory and compute. More memory, compute, and SSD support than the D-series, optimized for large operations and demanding workloads, including data warehousing. Until 2021, a G5 represented a fully isolated machine in a rack, so it was often the choice for guaranteeing hardware isolation. As such, it could be subdivided to support other machine types. |
| H | High Performance / HPC: intended for high-end computation and simulation, such as molecular modeling. Optimized for memory bandwidth (up to 800 GB/s) and block performance, relying on NDR InfiniBand and RDMA (remote direct memory access) support to enable supercomputer-scale workloads. |
| L | Low-latency disk access. While most Azure VMs rely on triple-redundant networked disks, this series uses local NVMe drives to minimize disk latency. This can make it ideal for NoSQL solutions. The tradeoff is that you must implement your own replication strategy to ensure that data isn’t lost if the hardware fails. As a result, these are often used as members of replicating clusters. |
| M | High-memory, compute-intensive workloads. These support the highest vCPU count and largest memory. They can be ideal for large databases or applications that need significant compute and memory resources. |
| N | High-performance computing and analysis. This series provides access to NVIDIA GPU cards to enable graphics rendering, video editing, visualization, and AI workloads. With the arrival of the P subfamily (NP-series), these machines are no longer purely NVIDIA-based. |
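If it helps to have the families in a form you can search, here is a minimal Python sketch of the same idea. The family letters follow Microsoft's published series names, and the one-line summaries are my own condensed wording rather than official descriptions, so treat `FAMILY_SUMMARY` and `families_matching` as hypothetical names for illustration only.

```python
from typing import Dict, List

# Assumption: letters follow Microsoft's series naming; summaries are
# condensed by hand for illustration and are not official descriptions.
FAMILY_SUMMARY: Dict[str, str] = {
    "A": "entry-level dev/test",
    "B": "burstable workloads such as web servers",
    "D": "general-purpose enterprise applications with a local SSD temp disk",
    "E": "higher memory-to-CPU ratio",
    "F": "compute-focused, higher CPU-to-memory ratio",
    "G": "large memory and compute, such as data warehousing",
    "H": "high-performance computing and simulation",
    "L": "low-latency local NVMe storage",
    "M": "highest vCPU counts and largest memory",
    "N": "GPU and accelerator workloads",
}


def families_matching(keyword: str) -> List[str]:
    """Return the family letters whose summary mentions the keyword."""
    needle = keyword.lower()
    return sorted(
        letter
        for letter, summary in FAMILY_SUMMARY.items()
        if needle in summary.lower()
    )


print(families_matching("burst"))  # ['B']
print(families_matching("gpu"))    # ['N']
```

A plain substring match is crude (searching for "memory" returns several families), but it is enough to show how the letter narrows the search before you look at individual sizes.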
Deeper down the rabbit hole
Understanding the family is the most important step in understanding what the VM was intended to support. That said, the complete name of a SKU provides even more insight into supported features, restricted functionality, and available vCPUs.
The full name of a SKU is generally assembled from the following components, in order:

| Component | Description |
|---|---|
| Tier prefix | Standard_ (modern) or Basic_ (legacy) VM type. When not specified, Standard_ is implied. |
| Family | Single-letter SKU family from the table above. |
| Subfamily | Single letter that indicates a specific subtype in the family. As an example, the N-family contains several subfamilies, including V, which represent specific NVIDIA graphics cards. In the D-series and E-series machines, the C subfamily indicates confidential computing and uses a Trusted Execution Environment (TEE) to restrict execution to only authorized code. |
| vCPUs | Number of available vCPU cores. In older VMs (typically v2 and earlier), each vCPU represented a full core and its hyper-thread (if available). In newer models, it takes two vCPUs to equal one full core. |
| Constrained vCPUs | Indicates a reduced number of available cores, but otherwise identical specs. For example, E8-2 would mean it’s an E8, but with only two of the cores available. This allows you to purchase the desired workload support but reduce the number of cores (and cost). |
| Additive features | A series of lowercase letters detailing additional supported features or functionality. See the table below. |
| Accelerator type | Uppercase letters and numbers that specify a hardware accelerator for specialized GPU SKUs. For example, NC-series VMs typically use V100 GPUs. A T4 accelerator indicates that the VM uses T4 GPUs, with more cores and support for accelerated networking. |
| Version | A version specifier (lowercase v and a number) that indicates a specific hardware revision (configuration), with higher versions representing newer hardware configurations. |
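To make the component order concrete, here is a small Python sketch that splits a SKU name into those pieces with a regular expression. The pattern is my own approximation of the convention, not an official grammar, and `SkuName` and `parse_sku` are hypothetical names. The sample names are illustrative. Older names that put an uppercase S after the family letter (DS-style sizes) would be read as a subfamily here, so expect to adjust it for real-world use.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: a hand-written approximation of the naming convention,
# mirroring the component order described in the article.
SKU_PATTERN = re.compile(
    r"^(?:(?P<tier>Standard|Basic)_)?"        # optional tier prefix
    r"(?P<family>[A-Z])"                      # family letter
    r"(?P<subfamily>[A-Z]*)"                  # optional subfamily letter(s)
    r"(?P<vcpus>\d+)"                         # vCPU count
    r"(?:-(?P<constrained>\d+))?"             # optional constrained vCPU count
    r"(?P<features>[a-z]*)"                   # additive feature letters
    r"(?:_(?P<accelerator>[A-Z][A-Z0-9]*))?"  # optional accelerator, e.g. T4
    r"(?:_v(?P<version>\d+))?$"               # optional version, e.g. v5
)


@dataclass(frozen=True)
class SkuName:
    tier: str
    family: str
    subfamily: str
    vcpus: int
    constrained_vcpus: Optional[int]
    features: str
    accelerator: Optional[str]
    version: Optional[int]


def parse_sku(name: str) -> SkuName:
    """Split a SKU name into its components, or raise ValueError."""
    match = SKU_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized SKU name: {name!r}")
    g = match.groupdict()
    return SkuName(
        tier=g["tier"] or "Standard",  # Standard_ is implied when omitted
        family=g["family"],
        subfamily=g["subfamily"],
        vcpus=int(g["vcpus"]),
        constrained_vcpus=int(g["constrained"]) if g["constrained"] else None,
        features=g["features"],
        accelerator=g["accelerator"],
        version=int(g["version"]) if g["version"] else None,
    )


print(parse_sku("Standard_E8-2ads_v5"))
print(parse_sku("NC4as_T4_v3"))
```

For the first name, the parser reports family E, 8 vCPUs constrained to 2, feature letters "ads", and version 5. The second shows the implied Standard tier, the C subfamily, and a T4 accelerator.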
Certain lowercase letters in a SKU represent additional supported features or characteristics. They can also indicate additional functionality or support for more specialized workloads.
| Letter | Meaning |
|---|---|
| a | Uses AMD processors instead of Intel. |
| b | Block storage performance. These use the standard remote disk storage, but are optimized to provide up to a 300% performance improvement with remote storage. |
| d | The VM has a local temp disk available. |
| i | A single-tenant compute unit. Previously, this was only supported by purchasing a G5 (or better). |
| l | The amount of memory is reduced. |
| m | Contains the highest amount of memory in a particular size. |
| p | Uses Ampere Altra ARM cloud-native processors instead of Intel. |
| t | Contains the smallest amount of memory in a particular size. |
| s | Supports premium storage (SSD and potentially Ultra SSD). |
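As a quick way to read a run of feature letters, the following Python sketch expands each letter into a short description. The letter assignments are taken from Microsoft's published naming convention as I understand it, and the descriptions are shortened, so treat `FEATURE_LETTERS` and `describe_features` as hypothetical names and the mapping as an assumption to verify against the current documentation.

```python
from typing import Dict, List

# Assumption: letter meanings follow Microsoft's naming convention;
# descriptions are shortened for illustration.
FEATURE_LETTERS: Dict[str, str] = {
    "a": "AMD processor",
    "b": "block storage performance",
    "d": "local temp disk",
    "i": "isolated (single-tenant)",
    "l": "low memory",
    "m": "memory intensive",
    "p": "ARM processor",
    "t": "tiny memory",
    "s": "premium storage capable",
}


def describe_features(letters: str) -> List[str]:
    """Expand each lowercase feature letter, flagging unknown ones."""
    return [
        FEATURE_LETTERS.get(ch, f"unrecognized letter '{ch}'")
        for ch in letters
    ]


print(describe_features("ads"))
# ['AMD processor', 'local temp disk', 'premium storage capable']
```

Unknown letters are reported rather than rejected, since new feature letters appear over time and a lookup like this will inevitably lag behind.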
The big picture
Understanding the SKU family is the first step to mastering your understanding of Azure VMs. From there, you can learn to use the additional specification details to quickly identify what the virtual machine supports and the workloads it targets. Make sure to review my article on sizing VMs to dive deeper into understanding the specifications. You’ll also learn about using Azure Compute Units (ACU) to understand relative performance.