Mastering Azure Virtual Machines

Published: May 26, 2023 Reading Time: 7 min

Continuing with last week’s discussion of learning Azure, I wanted to explore how you can start to build an understanding of Azure VMs. In a previous article, I discussed how to correctly size Azure VMs. That’s an important part of the discussion, but it doesn’t dive too deep into the concept of SKUs. Today, we explore that.

Many people start working with virtual machines and look for the lowest price for comparable memory, storage, and cores. They are then surprised to see a significant performance difference. This is where VM SKUs come in.

Not all systems are equal

Azure offers lots of different SKUs, each prefixed with a letter. If you’re coming from a traditional virtualization background, those can be quite confusing. In fact, you’re probably used to thinking of VMs as just a number of cores, some memory, and storage. Consider this: not all CPUs are created equal. For example, is one core from a 1990s Pentium the same as one core from a Xeon processor? Memory is also not equal: it has different speeds and optimizations, with cache memory outperforming DIMMs. Even storage can be broken down into different types, including local NVMe (Non-Volatile Memory Express), networked solid-state drives (SSD), local SSD, networked classic hard disk drives (HDD), and many more.

In short, if you want to get the most out of your VMs, you want to understand the different performance characteristics available to you.

The right tool

Taking that a step further, a computer can be built and optimized for a number of different tasks. Perhaps it needs to do heavy machine learning. In that case, it needs to be optimized for accessing the graphics processing unit (GPU) and its numerous, parallelized cores. Some applications need to be optimized for CPU access, while others need more memory or faster disk access (IOPS). Each of these scenarios requires a different balance of memory, CPU, and storage to optimize the machine for the task.

This is the concept behind Azure SKUs. A SKU is nothing more than a virtual machine that has been optimized to perform well with a specific workload. The letters often originated with a mnemonic that helps you remember the primary workload characteristics. Over time, Microsoft added more letters to indicate additional features or optimizations.

Mastering the letters

I haven’t found a good reference to the mnemonics that describe each SKU family, so I decided to document those. If you can remember the mnemonics, you’ll find that you can quickly and easily identify the optimal series for any workload without much issue.

Series	Mnemonic	Usage	Details
A	Alpha / AMD	Economy Computing	Entry-level VMs for dev/test. Originally AMD Opteron-based.
B	Burst	Burst loads	Optimized for burst workloads, such as web servers. On average, the VM utilizes a smaller number of resources. During this time, it builds up credits that allow it to scale up to full performance for a period of time.
D	Disk	Enterprise applications	General purpose enterprise applications. These systems provide a local SSD temporary disk, which is optimized for performance.
E	Extended memory	Memory-intensive apps	Memory is comparable to the G-series, but with less CPU, giving a higher memory-to-CPU ratio.
F	Ford F-Series	Compute-bound apps	Workloads that need faster CPUs or a higher CPU-to-memory ratio. This uses the same CPU as the D-series and is optimal for compute-focused workloads. The mnemonic is said to originate from the Ford F-series pickup trucks, which are general-purpose vehicles that try to balance cost with the ability to handle a broad array of tasks.
G	Goliath / Gargantua	Memory and compute	More memory, compute, and SSD support than the D-series, optimized for supporting large operations and demanding workloads, including data warehousing. Until 2021, a G5 represented a fully isolated machine in a rack, so it was often the choice for guaranteeing hardware isolation. As such, it could be subdivided to support other machine types.
H	High Performance / HPC	High-performance compute	Intended for high-end computation and simulation, such as molecular modeling. Optimized for memory bandwidth (up to 800 GB/s) and block performance, relying on NDR InfiniBand and RDMA (remote direct memory access) support to enable supercomputer-scale workloads.
L	Low-Latency	Low-latency disk access	While most Azure VMs rely on triple-redundant networked disks, this series utilizes local NVMe drives to minimize disk latency. This can make it ideal for NoSQL solutions. The tradeoff is that you must implement your own replication strategy to ensure that data isn’t lost if the hardware fails. As a result, these are often used as members of replicating clusters.
M	Monster	High-memory, compute-intensive workloads	These support the highest vCPU count and largest memory. They can be ideal for large databases or applications which need significant compute and memory resources.
N	NVIDIA	GPU applications	High-performance computing and analysis. This series provides access to NVIDIA GPU cards to enable graphics rendering, video editing, visualization, and AI workloads. With the arrival of the P subfamily (NP-series), these machines are no longer purely NVIDIA-based.
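
To make the table easier to use in scripts, here is a minimal sketch that encodes it as a lookup. The names `FAMILIES`, `describe_family`, and `find_families` are hypothetical helpers invented for this illustration; the data is simply copied from the table above and is not pulled from any Azure API.

```python
from typing import Dict, List, Tuple

# Family letter -> (mnemonic, usage), copied from the table above.
# This is a hand-maintained illustration, not an official Azure data source.
FAMILIES: Dict[str, Tuple[str, str]] = {
    "A": ("Alpha / AMD", "Economy computing"),
    "B": ("Burst", "Burst loads"),
    "D": ("Disk", "Enterprise applications"),
    "E": ("Extended memory", "Memory-intensive apps"),
    "F": ("Ford F-Series", "Compute-bound apps"),
    "G": ("Goliath / Gargantua", "Memory and compute"),
    "H": ("High Performance / HPC", "High-performance compute"),
    "L": ("Low-Latency", "Low-latency disk access"),
    "M": ("Monster", "High-memory, compute-intensive workloads"),
    "N": ("NVIDIA", "GPU applications"),
}


def describe_family(letter: str) -> str:
    """Return a one-line description for a family letter (case-insensitive)."""
    key = letter.strip().upper()
    if key not in FAMILIES:
        raise ValueError(f"Unknown VM family: {letter!r}")
    mnemonic, usage = FAMILIES[key]
    return f"{key}-series ({mnemonic}): {usage}"


def find_families(keyword: str) -> List[str]:
    """Return the family letters whose usage text mentions the keyword."""
    needle = keyword.lower()
    return [
        letter
        for letter, (_mnemonic, usage) in FAMILIES.items()
        if needle in usage.lower()
    ]


if __name__ == "__main__":
    print(describe_family("d"))
    print(find_families("memory"))
```

A keyword search like this only narrows the candidates. You still need to compare the individual sizes within each family.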

Deeper down the rabbit hole

Understanding the family is the most important step in understanding what the VM was intended to support. That said, the complete name for a SKU will provide even more insights into supported features, restricted functionality, and available vCores. These are included in the naming of each individual SKU.

Naming conventions

The convention for naming the VMs is detailed here, but the full names of SKUs generally follow the convention: N_FSP-CXA_V.

Placeholder	Meaning	Required	Details
N_	Class	No	Indicates `Standard_` (modern) or `Basic_` (legacy) VM type. When not specified, `Standard_` is implied.
F	Family	Yes	Single letter SKU from the table above
S	Subfamily	No	Single letter that indicates a specific subtype in the family. As an example, the N-family contains the `C`, `D`, `P`, and `V` subfamilies (most of which correspond to specific NVIDIA graphics cards). In the D-series and E-series machines, the `C` subfamily indicates confidential computing and uses a Trusted Execution Environment (TEE) to restrict execution to only authorized code.
P	vCPU Cores	Yes	Number of available vCPUs. In older VMs (typically v2 and earlier), each vCPU represented a full core and hyper-thread (if available). In newer models, it takes two vCPUs to equal one full core.
-C	Constraint	No	Indicates a reduced number of available cores, but otherwise identical specs. For example, E8-2 would mean it’s an E8, but with only two of the cores available. This allows you to purchase the desired workload support while reducing the number of cores (and cost).
X	Additional features	No	A series of lowercase letters detailing additional supported features or functionality. See the table below.
A	Accelerator	No	Uppercase letters and numbers that specify a hardware accelerator for specialized GPU SKUs. For example, NC-series VMs typically use V100 GPUs. The `T4` accelerator indicates it uses T4 GPUs with more cores and support for accelerated networking.
_V	Version	Yes	A version specifier (lowercase `v` and a number) which indicates a specific hardware revision (configuration), with higher versions representing newer hardware configurations.
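
The convention above can be sketched as a regular expression that splits a SKU name into its parts. This is an illustrative parser, not an official one: `SKU_PATTERN`, `VmSku`, and `parse_sku` are hypothetical names, and the sketch makes two assumptions of its own. It expects the accelerator to be separated by an underscore (as in a name like `Standard_NC4as_T4_v3`), and it tolerates names without a version suffix.

```python
import re
from dataclasses import dataclass
from typing import Optional

# One named group per placeholder in N_FSP-CXA_V.
SKU_PATTERN = re.compile(
    r"""
    ^(?:(?P<tier>Standard|Basic)_)?         # N_  class prefix (optional)
    (?P<family>[A-Z])                       # F   family letter
    (?P<subfamily>[A-Z]?)                   # S   subfamily letter (optional)
    (?P<vcpus>\d+)                          # P   vCPU count
    (?:-(?P<constrained>\d+))?              # -C  constrained vCPU count (optional)
    (?P<features>[a-z]*)                    # X   feature letters (optional)
    (?:_(?P<accelerator>[A-Z][A-Z0-9]*))?   # A   accelerator such as T4 (assumed "_" separator)
    (?:_v(?P<version>\d+))?                 # _V  version (treated as optional in this sketch)
    $
    """,
    re.VERBOSE,
)


@dataclass
class VmSku:
    tier: str
    family: str
    subfamily: Optional[str]
    vcpus: int
    constrained_vcpus: Optional[int]
    features: str
    accelerator: Optional[str]
    version: Optional[int]

    @property
    def usable_vcpus(self) -> int:
        """vCPUs actually available once any -C constraint is applied."""
        if self.constrained_vcpus is not None:
            return self.constrained_vcpus
        return self.vcpus


def parse_sku(name: str) -> VmSku:
    """Split a SKU name into its components, or raise ValueError."""
    match = SKU_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Not a recognizable VM SKU name: {name!r}")
    constrained = match["constrained"]
    version = match["version"]
    return VmSku(
        tier=match["tier"] or "Standard",  # Standard_ is implied when omitted
        family=match["family"],
        subfamily=match["subfamily"] or None,
        vcpus=int(match["vcpus"]),
        constrained_vcpus=int(constrained) if constrained else None,
        features=match["features"],
        accelerator=match["accelerator"],
        version=int(version) if version else None,
    )


if __name__ == "__main__":
    print(parse_sku("Standard_E8-2ds_v4"))
    print(parse_sku("Standard_NC4as_T4_v3"))
```

Because the accelerator group must start with an uppercase letter and the version with a lowercase `v`, the two suffixes cannot be confused with each other. Names that use longer subfamilies or other variations would need a looser pattern.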

Features

Certain lowercase letters in a SKU represent additional supported features or characteristics. They can also indicate additional functionality or support for more specialized workloads.

Letter	Mnemonic	Feature
a	AMD	Uses AMD processors instead of Intel.
b	Block	Block storage performance. These use the standard remote disk storage, but are optimized to provide up to a 300% performance improvement with remote storage.
d	“Diskful”	The VM has a local temp disk available.
i	Isolated	A single-tenant compute unit. Previously, this was only supported by purchasing a G5 (or better).
l	Low memory	The amount of memory is reduced.
m	Memory intensive	Contains the highest amount of memory in a particular size.
p	Processor	Uses Ampere Altra ARM cloud-native processors instead of Intel.
t	Tiny memory	Contains the smallest amount of memory in a particular size.
s	Storage	Supports premium storage (SSD and potentially Ultra SSD).
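
As a quick illustration, the feature letters can be decoded with a small lookup. `FEATURES` and `explain_features` are hypothetical names for this sketch, and the descriptions are shortened from the table above. Letters that are not in the table are reported as unknown rather than guessed.

```python
from typing import Dict, List

# Feature letter -> short description, summarized from the table above.
FEATURES: Dict[str, str] = {
    "a": "AMD processors",
    "b": "Block storage performance",
    "d": "Local temp disk",
    "i": "Isolated (single-tenant)",
    "l": "Low memory",
    "m": "Memory intensive",
    "p": "Ampere Altra ARM processors",
    "t": "Tiny memory",
    "s": "Premium storage support",
}


def explain_features(letters: str) -> List[str]:
    """Describe each lowercase feature letter, in the order given."""
    return [
        FEATURES.get(letter, f"unknown feature '{letter}'")
        for letter in letters
    ]


if __name__ == "__main__":
    # The "ds" suffix from a name such as E8-2ds_v4.
    for line in explain_features("ds"):
        print(line)
```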

The big picture

Understanding the SKU family is the first step to mastering your understanding of Azure VMs. From there, you can learn to use the additional specification details to quickly identify what the virtual machine supports and the workloads it targets. Make sure to review my article on sizing VMs to dive deeper into understanding the specifications. You’ll also learn about using Azure Compute Units (ACU) to understand relative performance.

Happy DevOp’ing!