YAML Tutorial | Getting Started With YAML

YAML Basics

By Karthick

Whether you work as a developer, administrator, cloud engineer, or DevOps engineer, you will come across YAML sooner or later, so it is important to understand what YAML is and how to use it. This article is written to make you comfortable with the fundamentals of YAML.

Before getting started with YAML basics, allow me to give you a brief introduction to data serialization.

What is data serialization?

To put it in simple terms, data serialization is the process of converting your data into a format that can be stored, transferred over the network, and later interpreted by an application.
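As a minimal sketch of this idea, the snippet below serializes a Python dictionary into JSON text and then reads it back, using only Python's standard json module. The record contents are made up for illustration.

```python
import json

# Hypothetical in-memory data, used only for this illustration.
record = {"site": "ostechnix", "topics": ["yaml", "json"], "published": True}

# Serialize: convert the Python object into text that can be stored or sent.
serialized = json.dumps(record)

# Deserialize: rebuild an equivalent Python object from that text.
restored = json.loads(serialized)

print(serialized)
print(restored == record)
```

The same round trip applies to YAML; only the text format in the middle changes.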

The three most common data serialization formats are XML, JSON, and YAML. There are also other formats such as BSON, MessagePack, and Protobuf.

This guide focuses only on what YAML is and how to work with it, with practical examples.

What is YAML?

YAML, which stands for "YAML Ain’t Markup Language", is a data serialization language and, as of version 1.2, a superset of JSON. YAML is popular because of its simplicity: YAML files are easier to create and read than XML or JSON files.
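One way to see the relationship with JSON is to feed a JSON document to a YAML parser. The sketch below assumes the PyYAML package is installed (it provides the yaml module) and uses a made-up JSON string. PyYAML implements YAML 1.1, so some JSON edge cases may behave differently, but simple documents like this one load to the same data.

```python
import json
import yaml  # provided by the PyYAML package (assumed to be installed)

# A small JSON document with hypothetical content.
json_text = '{"name": "ostechnix", "tags": ["yaml", "json"], "stable": true, "count": 3}'

from_json = json.loads(json_text)      # parsed by the JSON parser
from_yaml = yaml.safe_load(json_text)  # the same text parsed as YAML

print(from_yaml)
print(from_json == from_yaml)
```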

Modern tools like Ansible, Docker, Kubernetes, and Chef, and cloud environments like AWS, Azure, and GCP, use YAML. You will see a lot of configuration and deployment files written in YAML. For example, Ansible playbooks are written in YAML.

Many popular programming languages have libraries for working with YAML, so it can be integrated into almost any environment.

YAML extension

YAML files should be saved with the .yml or .yaml extension. Popular text editors like VS Code, Atom, Vim, and Sublime Text have YAML language support.

You can also install extensions such as a YAML linter, a YAML to JSON/XML converter, or a YAML beautifier in your text editor, which offer more features when you work with YAML.

Basic YAML structure

The basic structure of a YAML document is either a sequence or a dictionary (also called a mapping). A sequence is similar to a Python list, and a dictionary is similar to a Python dictionary.

Dictionaries are key-value pairs. The key is usually a string, and the value can be a scalar, a sequence, or another dictionary. To separate the key and the value, use a colon followed by a space.

Site_name: ostechnix
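To make the two basic structures concrete, here is a small sketch that loads one dictionary and one sequence and prints the Python type of each. It assumes the PyYAML package is installed; the sequence content is made up for illustration.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

# A dictionary (mapping) document and a sequence document.
mapping = yaml.safe_load("Site_name: ostechnix")
sequence = yaml.safe_load("- ostechnix\n- yaml\n")

print(type(mapping).__name__, mapping)
print(type(sequence).__name__, sequence)
```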

Each document in a YAML stream is delimited by three dashes (---) and three dots (...). Three dashes (---) mark the start of a document and three dots (...) mark the end of a document in the stream. If the file contains only a single document, the dashes and dots are not required.

---
operating_system: Redhat
version: 8
Same_family:
 - Rocky Linux
 - Alma Linux
 - Fedora Linux
...

---
operating_system: Debian
version: 11
Same_family:
 - Ubuntu
 - Linux Mint
 - Pop!_OS
...
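A stream with several documents has to be read with a function that returns all of them. The sketch below assumes PyYAML is installed and uses a shortened version of the two documents above. With PyYAML, safe_load expects exactly one document and raises an error for a stream like this, while safe_load_all yields one Python object per document.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

stream = """\
---
operating_system: Redhat
version: 8
...
---
operating_system: Debian
version: 11
...
"""

# safe_load_all returns a generator, so collect the documents into a list.
documents = list(yaml.safe_load_all(stream))

for doc in documents:
    print(doc["operating_system"], doc["version"])
```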

YAML uses indentation to define the structure of the data. Tabs are not allowed for indentation; use spaces instead. The number of spaces is up to you as long as it is consistent within a block, and two spaces is a common convention. When you press the Tab key in a text editor while working on a YAML file, it will typically insert spaces instead, or at least this is the case with VS Code.

Figure: YAML indentation

You will learn more about indentation in the upcoming section.
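If you want to see how a parser reacts to a tab, the following sketch loads the same small document twice, once indented with spaces and once with a tab character. It assumes PyYAML is installed; the key names are made up, and other parsers may word their errors differently.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

with_spaces = "parent:\n  child: 1\n"  # indented with two spaces
with_tab = "parent:\n\tchild: 1\n"     # indented with a tab character

print(yaml.safe_load(with_spaces))

try:
    yaml.safe_load(with_tab)
    tab_rejected = False
except yaml.YAMLError as err:
    # PyYAML reports a scanner error because a tab cannot start a token here.
    tab_rejected = True
    print("parse error:", type(err).__name__)
```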

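I am going to run the following piece of Python code, which parses the YAML file and converts each value into the matching Python data type. The yaml module used here is provided by the PyYAML package.

#!/usr/bin/python3

import yaml  # provided by the PyYAML package

# Load input.yaml (a single document with a dictionary at the top level)
# and print every top-level key with its value and Python data type.
with open("input.yaml", "r") as f:
    data = yaml.safe_load(f)  # safe_load builds only plain Python types

for k, v in data.items():
    print(str(k) + " : " + str(v))
    print("data type", " = ", type(v))
    print()

You can either use this code or an online interpreter to run and test the YAML examples in this article.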

YAML comments

Comments make a YAML file easier to understand for anyone reading it. To add comments to your YAML file, use the # symbol. YAML does not support multi-line comments, so if you wish to write a comment that spans several lines, you have to prefix each line with the # symbol.

# CONFIGURATION FILE BASED ON Pop!_OS COSMIC DESKTOP

OS_NAME: "Pop!_OS"
VERSION: 21.04       # VERSION 21.10 REACHED EOL
CODE_NAME: COSMIC

In the above example, there are two comments. The first is a full-line comment on line one, and the second is an inline comment on line four.
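Comments are meant for human readers only; the parser discards them. A short sketch, assuming PyYAML is installed, loads the example above and shows that only the three keys and their values survive.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
# CONFIGURATION FILE BASED ON Pop!_OS COSMIC DESKTOP

OS_NAME: "Pop!_OS"
VERSION: 21.04       # VERSION 21.10 REACHED EOL
CODE_NAME: COSMIC
"""

# Both the full-line comment and the inline comment are dropped while parsing.
config = yaml.safe_load(text)
print(config)
```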

YAML string type

Strings in YAML can be written with or without quotes. YAML is smart enough to infer the data type on its own. In the examples below, both the key and the value are strings.

Important points to remember:

  • Create strings without quotes unless quotes are necessary.
  • Use double quotes if the string contains escape sequences (such as \t or \n) that should be interpreted.
  • Use single quotes if special characters should be treated as literal text.

User1_review: Pop_!os is great to work with
User2_review: "Pop_!os \t is great to work with"
User3_review: 'Pop_!os \t is great to work with'

Figure: YAML string type
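The difference between the three quoting styles is easiest to see by printing the parsed values with repr. This is a small sketch that assumes PyYAML is installed; the key names are made up for illustration.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

# A raw string, so that Python leaves the backslashes for the YAML parser.
text = r"""
plain: Pop!_OS \t is great
double: "Pop!_OS \t is great"
single: 'Pop!_OS \t is great'
"""

reviews = yaml.safe_load(text)
for key, value in reviews.items():
    # repr makes a real tab character visible as \t and a literal backslash as \\.
    print(key, repr(value))
```

Only the double-quoted value ends up with a real tab character; the plain and single-quoted values keep the backslash and the letter t as they are.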

As mentioned earlier, YAML infers the data type by default. There is also an explicit way to specify the data type: use the !! symbol followed by the data type, and then the value.

User1_review: !!str Pop_!os is great to work with

Figure: Explicit definition
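Explicit tags matter most when the inferred type is not the one you want. In the sketch below (PyYAML assumed, key names made up), the same digits load as an integer by default and as a string with !!str, and a whole number is forced to a float with !!float.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
implicit_number: 2021
explicit_string: !!str 2021
explicit_float: !!float 8
"""

values = yaml.safe_load(text)
for key, value in values.items():
    print(key, repr(value), type(value).__name__)
```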

If you have a long string and wish to write it across multiple lines in your YAML file, you can do so using the > symbol. This is called folded style. The parser joins the lines with spaces and interprets the string as a single line, even though you have written it on multiple lines.

User4_review: >
 Among all the distribution
 I used
 PoP_!OS looks great

Figure: Folded style multi-line string

When a multi-line string needs to be preserved exactly as written, including its line breaks, use the pipe (|) symbol. This is called literal style.

User5_review: |
 Among all the distribution
 I used
 PoP_!OS looks great

Figure: Literal style multi-line string
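To compare the two block styles side by side, the following sketch loads the same three lines once with > and once with | and prints the results with repr so the line breaks are visible. It assumes PyYAML is installed; the key names are made up.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
folded: >
 Among all the distribution
 I used
 PoP_!OS looks great
literal: |
 Among all the distribution
 I used
 PoP_!OS looks great
"""

reviews = yaml.safe_load(text)
print(repr(reviews["folded"]))   # inner line breaks become spaces
print(repr(reviews["literal"]))  # inner line breaks are kept
```

In both cases a single line break remains at the end of the value, which is the default behaviour when no chomping indicator is given.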

You can use chomping indicators to strip or preserve the line breaks at the end of a block value. Add the "-" symbol after > or | to strip the final line break and any trailing blank lines.

User4_review: >-
 Among all the distribution
 I used
 PoP_!OS looks great.
User5_review: |-
 Among all the distribution
 I used
 PoP_!OS looks great.

If you want to preserve the final line break along with any trailing blank lines, add the "+" symbol after | or >.

User4_review: >+
 Among all the distribution
 I used
 PoP_!OS looks great.
User5_review: |+
 Among all the distribution
 I used
 PoP_!OS looks great.
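The effect of the chomping indicators only shows up at the very end of the value, so it helps to print the results with repr. This sketch assumes PyYAML is installed and uses made-up key names; each block is followed by one blank line so that the difference between the three modes is visible.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
strip: |-
  PoP_!OS looks great.

clip: |
  PoP_!OS looks great.

keep: |+
  PoP_!OS looks great.

end: marker
"""

values = yaml.safe_load(text)
print(repr(values["strip"]))  # no line break at the end
print(repr(values["clip"]))   # exactly one line break (the default)
print(repr(values["keep"]))   # final line break plus the trailing blank line
```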

YAML numeric type

YAML supports integer and floating-point numeric types, and integers can also be written in hexadecimal or octal notation. By default the YAML parser detects the data type, but there is also an explicit way to define int and float types, as shown in the example below.

int1: 98765
int2: !!int 56789 # Explicit Integer definition

float1: 20.0481
float2: !!float 20.0482 # Explicit Float definition

Figure: Int and Float data type

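Hexadecimal and octal values are converted to decimal values by the parser. Note that a leading 0 denotes octal only in YAML 1.1, which PyYAML follows; YAML 1.2 writes octal with a 0o prefix (0o14442) and reads 014442 as a decimal integer.

hex1: 0x14d3
oct1: 014442

Figure: Hexadecimal and Octal values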
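Here is a short sketch of how these numeric values come out in Python, assuming PyYAML is installed. The octal result relies on PyYAML's YAML 1.1 behaviour; a YAML 1.2 parser would read 014442 as the decimal number 14442.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
int1: 98765
float1: 20.0481
hex1: 0x14d3
oct1: 014442
"""

numbers = yaml.safe_load(text)
for key, value in numbers.items():
    print(key, value, type(value).__name__)
```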

YAML Boolean type

YAML supports the boolean values "True" and "False". With parsers that follow YAML 1.1, such as PyYAML, you can also use "yes" or "on" for true and "no" or "off" for false; YAML 1.2 treats these words as plain strings. You can also explicitly define the data type using !!bool.

Boolean values can be written in lowercase, capitalized, or uppercase form (for example true, True, or TRUE); other mixed-case spellings are treated as strings. In the example below, I have written the true value in several forms, and the parser interprets each of them as True.

upgrade: True
Reboot_After_Upgrade: TRUE
Enable_Firewall: on
Set_Power_Profile: yes

Figure: YAML Boolean true value

The same applies to the "False" value.

upgrade: False
Reboot_After_Upgrade: FALSE
Enable_Firewall: off
Set_Power_Profile: no

Figure: YAML Boolean false value

Heads Up: If you try to enclose the boolean value with quotes, it will be treated as a string.
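The boolean rules are a common source of surprises, so here is a sketch that puts the cases next to each other. It assumes PyYAML is installed (YAML 1.1 rules); the key names are made up for illustration.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
upgrade: True
reboot: TRUE
firewall: on
power_profile: yes
quoted: "yes"
mixed_case: tRuE
country_code: no
"""

flags = yaml.safe_load(text)
for key, value in flags.items():
    print(key, repr(value))
```

Notice that an unquoted no, for example a country code, is loaded as False, so quote such values when you mean the text.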

YAML Null type

To make a value null, you can use either the "~" symbol or the "null" keyword. You can also define the key and leave the value empty, which is treated as null. You can also make an explicit definition using !!null.

As with booleans, the "null" keyword can be written in lowercase, capitalized, or uppercase form. In the example below, the null keyword is written in three different cases.

upgrade: !!null null
Reboot_After_Upgrade: NULL
Enable_Firewall: Null
Set_Power_Profile: ~
Set_Network_Interface: # NO VALUES PASSED

Figure: YAML Null type

In Python, a null value is loaded as None, and if you convert YAML to JSON, it is written as null.
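Both statements can be checked with a few lines of Python. This sketch assumes PyYAML is installed and uses the standard json module for the JSON conversion.

```python
import json
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
upgrade: !!null null
Reboot_After_Upgrade: NULL
Enable_Firewall: Null
Set_Power_Profile: ~
Set_Network_Interface: # NO VALUES PASSED
"""

settings = yaml.safe_load(text)
print(settings)            # every value is None

as_json = json.dumps(settings)
print(as_json)             # every value is written as null
```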

YAML Sequence type

A YAML sequence is a list of values stored in order. Think of sequences as Python lists or Perl arrays, where a single name holds one or more values.

There are a couple of ways to define a sequence in YAML. The first is flow style, where you give a key name followed by a list of values inside square brackets. This is similar to a Python list.

# FLOW STYLE SEQUENCE
app_to_be_updated: [ "firefox", "timeshift"]

Figure: Flow style sequence

The second way is to create a sequence using block style. Each element is prefixed with a dash followed by a space and the element value, and each element is written on a separate line. You can either indent the elements under the key or write them without indentation, as long as you are consistent within the sequence. As a best practice, stick with indentation.

# BLOCK STYLE SEQUENCE

app_to_be_installed:
 - vscode
 - virtualbox
 - tilix

app_to_be_removed:
- pycharm
- stacer
- ufw

Figure: Block style sequence
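All of these notations produce the same Python list. The sketch below, which assumes PyYAML is installed and uses made-up key names, loads one flow style sequence and two block style sequences (indented and unindented) and compares them.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
flow_style: [ "firefox", "timeshift"]
indented_block:
 - firefox
 - timeshift
unindented_block:
- firefox
- timeshift
"""

apps = yaml.safe_load(text)
for key, value in apps.items():
    print(key, value)

print(apps["flow_style"] == apps["indented_block"] == apps["unindented_block"])
```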

It is also possible to nest sequences inside other structures, as shown below. Here each element of the outer sequence is a dictionary whose value is another sequence.

# NESTED SEQUENCE
applications:
 - Productivity:
   - vscode
   - vagrant
   - docker
   - python3
 - Browser:
   - firefox
   - chrome
   - brave

Figure: Nested sequence
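Once loaded, a nested structure like this is navigated with ordinary Python indexing. The following sketch assumes PyYAML is installed and uses a shortened version of the example above.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
applications:
 - Productivity:
   - vscode
   - vagrant
 - Browser:
   - firefox
   - chrome
"""

data = yaml.safe_load(text)

# The outer value is a list; each item is a dictionary holding an inner list.
first_group = data["applications"][0]
browsers = data["applications"][1]["Browser"]

print(first_group)
print(browsers)
```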

YAML Dictionary type

We have already seen dictionaries in the initial sections of this article. A dictionary is a collection of key-value pairs and is one of the core building blocks of YAML. Keys are usually strings, and values can be of any type, including sequences and other dictionaries.

Like a sequence, a dictionary can be written in multiple ways. The first is flow style, which is similar to the representation of Python dictionaries.

application: { Install: "Vscode", Remove: Stacer, Update: Firefox}

Figure: Dictionary - Flow style

Dictionaries can also be created using block style.

application1:
 Install: Vscode
 Remove: Stacer
 Update: Firefox

Figure: Dictionary - Block style
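The two dictionary notations are interchangeable. As a quick check, this sketch (PyYAML assumed) loads the two dictionary examples from this section together and confirms that they produce equal Python dictionaries.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
application: { Install: "Vscode", Remove: Stacer, Update: Firefox}
application1:
 Install: Vscode
 Remove: Stacer
 Update: Firefox
"""

apps = yaml.safe_load(text)
print(apps["application"])
print(apps["application"] == apps["application1"])
```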

You can also nest dictionaries and sequences inside a dictionary.

application2:
 Install:
   python: 3.9
   Vscode: 1.58.2
 Remove:
   - Stacer
   - pycharm
 Update: Firefox

Figure: Nested dictionary
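It is worth looking at the types that come out of this example. In the sketch below (PyYAML assumed), 3.9 is loaded as a float, while 1.58.2 is not a valid number and therefore stays a string. If you want a version such as 3.9 to remain text, put it in quotes.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
application2:
 Install:
   python: 3.9
   Vscode: 1.58.2
 Remove:
   - Stacer
   - pycharm
 Update: Firefox
"""

app = yaml.safe_load(text)["application2"]

for name, version in app["Install"].items():
    print(name, repr(version), type(version).__name__)
print(app["Remove"])
```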

YAML anchors and aliases

You can apply the DRY (Don’t Repeat Yourself) principle in your YAML file using anchors and aliases.

An anchor is denoted by the "&" symbol and an alias by the "*" symbol.

An anchor is similar to a variable in programming. You define an anchor using the & symbol followed by a name, and later use an alias (the * symbol followed by the same name) to reuse the anchored value.

Take a look at the example below. &x is defined as the anchor, and its value is later reused through the alias *x.

User4_review: &x Among all the distribution I used PoP_!OS looks great.
User5_review: *x
User6_review: *x

Figure: Anchors and aliases
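After parsing, the aliases are replaced by the anchored value, so the program reading the file never sees & or *. A minimal sketch, assuming PyYAML is installed:

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
User4_review: &x Among all the distribution I used PoP_!OS looks great.
User5_review: *x
User6_review: *x
"""

reviews = yaml.safe_load(text)
for key, value in reviews.items():
    print(key, "->", value)
```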

You can override particular values when reusing an anchored dictionary by merging it with the << key. The merge key is a YAML 1.1 feature that parsers such as PyYAML support. Here, I am overriding the value of version from 21.10 to 21.04.

PoP_OS: &pos
 version: 21.10
 code_name: cosmic desktop
 d_flavour: gnome

rewrite:
 <<: *pos
 version: 21.04

Figure: Override values
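The sketch below, which assumes PyYAML is installed, loads this example and prints both dictionaries. The merged dictionary receives every key from the anchor, and the key written locally wins. One detail to watch for: the unquoted 21.10 is read as the float 21.1, so quote version numbers if you need them exactly as written.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

text = """\
PoP_OS: &pos
 version: 21.10
 code_name: cosmic desktop
 d_flavour: gnome

rewrite:
 <<: *pos
 version: 21.04
"""

data = yaml.safe_load(text)
print(data["PoP_OS"])
print(data["rewrite"])
```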

Conclusion

In this article, we have seen what YAML is and how to work with its different data types. Throughout this guide, I have used Python as the YAML parser.

I hope you find this YAML tutorial useful. If you are new to YAML, I suggest practicing what we have shown in this article, which will help you get comfortable with YAML so that you can start using it in your own environment.
