What is YAML?

Overview

The recursive YAML acronym stands for 'YAML Ain't Markup Language,' and trust me, it's not your typical boring markup language. It's like a cool kid who hangs out with all the popular languages and picks up their best traits along the way. For example:

  • Scalars, lists, and associative arrays are based on Perl, so you know it's flexible and powerful.
  • The document separator '---' is based on MIME, so it's all about data transmission and organization.
  • Escape sequences are based on C, so it's got that geeky, low-level charm.
  • Whitespace wrapping is based on HTML, so it's easy to read and write, just like your favourite web page!

Still wondering what YAML is? That's all right! This blog will cover it all.

Introduction

Welcome to the world of YAML, a powerful yet simple data serialization language that is changing the game in modern technologies. With its minimalistic syntax and whitespace-based structure, YAML provides an easy-to-read and easy-to-write way to represent data.

From its relevance in the DevOps and containerization era to its growing popularity in technologies like Ansible and Kubernetes, YAML is a key tool for modern development workflows. So, let's unlock the full potential of YAML and discover why it's a game-changer in the world of data serialization!

YAML Syntax

We know that YAML has a really simple and easy-to-read syntax. It's time for us to discuss how. The fundamentals of YAML are indentation and whitespace. Just like in Python, if something is indented beneath another entity, it is treated as part of that block.

A YAML document usually starts with three dashes '---'. These hyphens mark the beginning of a new YAML document. Yes, YAML supports multiple documents within a single file, and YAML parsers can separate these documents by using these very hyphens.

YAML Syntax Example

What is a YAML file, and how do you write one? Well, a YAML file is simply a text file (usually with a .yaml or .yml extension) that a YAML parser reads data from. Here's an example to demonstrate the format of a YAML file:
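The sketch below is only an illustration. The record, its keys, and its values are all made up to show the general shape of a small YAML file:

```yaml
---
# A hypothetical application record
name: demo-app
version: 2
enabled: true
tags:
  - web
  - internal
owner:
  team: platform
  email: platform@example.com
```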

See how minimal and easy to understand the file is? Indentations make it easy to cluster data in our minds and ingest it effortlessly. Meanwhile, here's the same data in JSON format:
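As a rough illustration, a hypothetical record with a name, a version, a flag, a list of tags, and a nested owner would look something like this in JSON (every key and value here is invented for the comparison):

```json
{
  "name": "demo-app",
  "version": 2,
  "enabled": true,
  "tags": ["web", "internal"],
  "owner": {
    "team": "platform",
    "email": "platform@example.com"
  }
}
```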

Right away, there's a lot more visual clutter in JSON: braces, brackets, quotes, and commas. With its minimal approach, YAML is a serialization format that is very easy to use and understand.

What is YAML Used For?

YAML is commonly used as a data serialization format in modern technologies. Two popular use cases of YAML are Ansible and Kubernetes.

YAML in Ansible

Ansible, a widely used IT automation tool, uses YAML to define configuration files called "playbooks." Playbooks contain instructions to configure and manage systems, making them a crucial part of Ansible workflows. YAML's simplicity and human-readable syntax make it ideal for defining playbooks, making them easy to understand and modify.

Playbooks in Ansible are written in YAML format and define the desired state of a system. They contain tasks, which are defined as lists of key-value pairs, making it easy to express complex configurations. YAML allows for easy nesting of tasks, variables, and loops, making it flexible and powerful for automating IT operations with Ansible.
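To make this concrete, here is a minimal sketch of what a playbook might look like. The play name, the webservers host group, and the choice of nginx are assumptions made for illustration; only the overall structure (a play containing a list of tasks, each a set of key-value pairs) is the point:

```yaml
---
- name: Install and start a web server   # hypothetical play
  hosts: webservers                      # assumed inventory group
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.apt:
        name: nginx
        state: present
    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
```

Notice how each task is one item in a list, and the module arguments are nested one level deeper as key-value pairs.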

YAML for Kubernetes

Kubernetes, a popular container orchestration platform, also uses YAML extensively for defining and managing resources. YAML manifests are used to declare the desired state of Kubernetes objects such as pods, services, and volumes. These manifests are then used by Kubernetes controllers to create and manage these objects.
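As a small sketch, a manifest that declares a single pod could look roughly like this. The pod name, label, and container image are placeholders chosen for illustration, not taken from any real cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod        # placeholder name
  labels:
    app: demo           # placeholder label
spec:
  containers:
    - name: web
      image: nginx:1.25 # placeholder image
      ports:
        - containerPort: 80
```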

YAML's declarative nature makes it well-suited for defining complex configurations in Kubernetes, and its human-readable syntax makes it easy to understand and manage resources in a Kubernetes cluster.

In summary, YAML plays a significant role in both Ansible and Kubernetes, providing a simple, readable, and flexible way to define configurations and manage systems in modern technologies.

Outline Indentation and Whitespaces in YAML

When we answered what YAML is, we mentioned that one of the key aspects of YAML's syntax is its use of indentation and whitespace to define data structure and hierarchy. Unlike programming languages or data serialization formats that use symbols or braces, YAML relies on indentation to represent the relationships between data elements.

In YAML, indentation is used to denote nested or child elements. Indentation must be made of spaces; tab characters are not allowed for indentation. The number of spaces is not strictly defined, but elements at the same level must line up at the same indentation, so it pays to stay consistent throughout the file. Typically, two spaces or four spaces are used for indentation in YAML files.

Here's an example to illustrate the use of indentation in YAML:
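A minimal sketch, with illustrative values:

```yaml
fruits:
  - apple
  - banana
  - orange
```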

In this example, the "fruits" key has three values that are indented with two spaces each, indicating that they are part of the "fruits" list.

YAML's use of indentation makes it easy to read and visually appealing. However, it's important to be mindful of the indentation while writing YAML files, as a small mistake in indentation can result in syntax errors.

In addition to indentation, YAML also uses whitespace to separate keys and values, as well as elements within lists and associative arrays. A space is required after the colon that follows a key and after the hyphen that introduces a list item. This whitespace provides visual separation between different elements, making YAML files easy to read and understand.

Overall, YAML's use of indentation and whitespaces contributes to its simplicity and human-readable nature, making it a popular choice for configuration files, data serialization, and other use cases in modern technologies like DevOps, containerization, and beyond.

Comments

Comments in YAML are used to provide explanations, add context, or document decisions within the code. They are denoted by the '#' symbol and can be placed at the end of a line or on a new line. For example:
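A short sketch, using made-up keys, that shows a comment on its own line, a comment at the end of a line, and a longer note split across several single-line comments:

```yaml
# This comment sits on its own line
name: demo-app   # this comment follows a value
# Longer notes are written as several
# consecutive single-line comments
port: 8080
```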

Comments in YAML are purely for human readability and documentation purposes and have no impact on the data structure or functionality of the YAML file. They are ignored by the YAML parser during parsing.

It's important to note that YAML does not support multi-line comments. Each comment must be placed on a single line and cannot span multiple lines. However, you can use multiple single-line comments to provide longer explanations or break down complex sections of your YAML code.

Comments can help provide context, document decisions or assumptions, or add instructions for other developers who may be working with the YAML file. They contribute to making your YAML code more understandable, maintainable, and professional.

YAML Data Types

YAML supports a variety of data types, making it a versatile serialization language for representing structured data. These data types include key-value pairs and dictionaries for organizing and nesting data, numeric types for representing numbers, strings for storing text, nulls for representing the absence of a value, booleans for representing true or false values, and arrays for representing lists of values.

In this section, we will explore the different data types supported by YAML and how they can be used to represent data in a flexible and human-readable manner.

Key-Value Pairs and Dictionaries

In YAML, key-value pairs are defined using a colon (":") followed by a space to separate the key and the value. The key is usually a string, and the value can be any valid YAML data type, such as a string, number, boolean, null, array, or another dictionary. Here's an example of a key-value pair in YAML:
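A minimal sketch with a made-up key and value:

```yaml
city: Amsterdam
```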

In addition to single key-value pairs, YAML also supports dictionaries (also called mappings), which are collections of key-value pairs. A dictionary is most often written in block style, with one indented key-value pair per line; it can also be written inline using curly braces ("{}"), with the key-value pairs separated by commas. Here's an example of a dictionary in YAML:
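The sketch below uses invented keys and shows the same kind of dictionary twice, once in block style and once inline with curly braces:

```yaml
# Block style: one key-value pair per indented line
person:
  name: Ada
  age: 36

# Inline (flow) style: curly braces and commas
point: {x: 1, y: 2}
```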

Dictionaries can be nested within other dictionaries, allowing for the representation of complex and hierarchical data structures. The use of key-value pairs and dictionaries in YAML provides a flexible and intuitive way to represent structured data in a human-readable format.

Numeric Types

YAML supports various numeric types, including integers and floating-point numbers. Integers in YAML can be represented as decimal, hexadecimal, or octal values. Additionally, YAML allows the representation of special numeric values such as "not-a-number" (NAN) or infinity. Here's an example:
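A small sketch with illustrative values follows. One assumption to flag: octal notation differs between YAML versions, so how the last line is read depends on the parser:

```yaml
foo: 12345    # decimal integer
bar: 0x1F     # hexadecimal integer (decimal 31)
plop: 0o14    # octal integer in YAML 1.2 (decimal 12); YAML 1.1 parsers expect 014
```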

The value of foo is a decimal integer, bar is a hexadecimal integer, and plop is an octal integer.

YAML also allows the representation of special numeric values:
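A minimal sketch:

```yaml
foo: .inf     # positive infinity
bar: -.inf    # negative infinity
plop: .nan    # not-a-number
```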

In this example, foo represents positive infinity, bar represents negative infinity, and plop represents not-a-number (NAN).

Strings

YAML strings are represented as Unicode characters and can be used to store text data. In most cases, you don't need to specify them in quotes:
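For instance, with a made-up key:

```yaml
foo: this is a normal string
```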

However, if you want to include special characters or escape sequences in strings, you can use double quotes:
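A sketch of the difference, using illustrative keys. In the double-quoted value the \n is turned into a real newline, while in the unquoted value the backslash and the n stay as two literal characters:

```yaml
foo: "this is not a normal string\n"   # ends with a newline character
bar: this is not a normal string\n     # ends with a literal backslash and n
```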

YAML does not process escape sequences inside single-quoted strings, but single quotes can be used to prevent string contents from being interpreted as YAML syntax. Additionally, string values can span multiple lines using special indicator characters.

The fold (greater than) character allows you to specify a string in a block. Line breaks inside the block are folded into spaces, so the value is read as a single line:
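A minimal sketch with made-up text. With the greater-than sign, a parser reads the four lines below as one line of text, joined by spaces, with a single newline at the end:

```yaml
bar: >
  this is not a normal string it
  spans more than
  one line
  see?
```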

Similarly, the pipe (block) character also allows multi-line strings, but unlike the fold character, it retains the line breaks, so YAML interprets the field exactly as is:
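A minimal sketch with made-up text. With the pipe, each line break inside the block is kept, so the value is four separate lines:

```yaml
bar: |
  this is not a normal string it
  spans more than
  one line
  see?
```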

These features make it convenient to work with strings in YAML and represent text data in a flexible and readable manner.

Nulls

In YAML, null values represent the absence of a value or an empty value. You can enter nulls using the tilde (~) or the unquoted null string literal "null":
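A minimal sketch with illustrative keys:

```yaml
foo: ~
bar: null
```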

Both the tilde (~) and the unquoted word null indicate a null value in YAML. They can be used interchangeably to represent the absence of a value in a YAML document. A quoted "null", by contrast, is an ordinary string.

Booleans

Boolean values in YAML are indicated using keywords such as "True", "On", and "Yes" for true values, and "False", "Off", and "No" for false values:
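A sketch with made-up keys. Note the assumption in the comments: how On and Off are read depends on the YAML version the parser follows:

```yaml
foo: True
bar: False
light: On    # boolean for YAML 1.1 parsers; may be a plain string under YAML 1.2
TV: Off      # same caveat as above
```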

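YAML 1.1 parsers accept all of these keywords interchangeably. The YAML 1.2 core schema only recognizes true and false (in lowercase, capitalized, or uppercase form), so on/off and yes/no may be read as plain strings there. Sticking to true and false is the safest choice.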

Arrays

In YAML, arrays or lists can be represented on a single line using square brackets [], with the items separated by commas:
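A minimal sketch of the single-line form, with illustrative values:

```yaml
items: [1, 2, 3, 4, 5]
names: ["one", "two", "three", "four"]
```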

Alternatively, you can also represent arrays using the hyphen - followed by each item on a new line, which makes it more readable:
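A minimal sketch of the multi-line form, again with illustrative values:

```yaml
items:
  - 1
  - 2
  - 3
names:
  - "one"
  - "two"
  - "three"
```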

YAML allows you to choose the representation that best suits your preference and makes your YAML documents more readable.

Dictionaries

In YAML, dictionaries or maps can be represented inline or span multiple lines. For example, inline representation:
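A minimal sketch with made-up keys:

```yaml
foo: {thing1: huey, thing2: louie, thing3: dewey}
```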

Or representation on multiple lines:
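A minimal sketch with made-up keys:

```yaml
foo:
  thing1: huey
  thing2: louie
  thing3: dewey
```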

Dictionaries in YAML can be nested and hold any value, providing a flexible and powerful way to structure and organize your data.

Advanced Options

Chomp Modifiers

Multiline values in YAML documents may end with trailing line breaks, which may or may not need to be preserved depending on how the document is processed. To control this, YAML provides the strip chomp and preserve chomp operators. To retain all trailing line breaks, a plus sign can be added to the fold or block operator; to remove them, a minus sign is added instead. With neither sign, a single trailing line break is kept.

For example, if the multiline value ends with whitespace, such as a newline, YAML will preserve it when the preserve operator is used:
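A sketch of the plus sign in use, with made-up text. The blank line after the block is part of the value here, so bar ends with two newlines; the next key is only there to make that blank line visible:

```yaml
bar: >+
  this is not a normal string it
  spans more than
  one line

next: value
```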

On the other hand, to strip the trailing line breaks, the strip operator (a minus sign) can be used:
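A sketch of the minus sign in use, with made-up text. The value keeps its internal line breaks but has no newline at the very end:

```yaml
bar: |-
  this is not a normal string it
  spans more than
  one line
```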

This allows for precise control over how trailing whitespace is handled in YAML documents.

Multiple Documents

In YAML, multiple documents can be included within a single file. Each document starts with three dashes (---) and ends with three periods (...). While some YAML processors may require the document start operator, the document end operator is usually optional.

Here's an example:
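A minimal sketch with placeholder keys and values:

```yaml
---
bar: foo
foo: bar
...
---
one: two
three: four
```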

In this example, two separate YAML documents are included in the same file, with the first document containing keys bar and foo, and the second document containing keys one and three. The use of document start and end operators allows for multiple documents to be organized and processed within a single YAML file.

Conclusion

  1. YAML is a human-readable data serialization format used for configuration files, data exchange, and more.
  2. YAML supports various data types, including strings, numbers, booleans, nulls, arrays, and dictionaries.
  3. YAML offers advanced features such as multi-line strings, escape sequences, chomp modifiers, and support for multiple documents in a single file.
  4. YAML provides different styles for representing data, including flow style, block style, and literal style.
  5. Proper understanding and usage of YAML syntax and features ensure accurate and reliable data representation in YAML-based applications and systems.