Whether you work as a developer, administrator, cloud engineer, or DevOps engineer, you will come across YAML sooner or later, so it is important to understand what it is and how it works. This article is designed to make you comfortable with the fundamentals of YAML.
Before getting started with YAML basics, allow me to give you a brief introduction to data serialization.
What is data serialization?
To put it in simple terms, data serialization is the process of converting your data into a format that can be stored, transferred over the network, and interpreted by an application.
The three most common data serialization formats are XML, JSON, and YAML. There are also other formats such as BSON, MessagePack, and Protobuf.
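As a rough illustration of the idea, the sketch below serializes a Python dictionary to YAML text and loads it back. It assumes the PyYAML package is installed, and the `record` dictionary is a made-up example, not data from this article.

```python
import yaml  # PyYAML, assumed to be installed (pip install pyyaml)

# A hypothetical in-memory record, used only for illustration
record = {"site_name": "ostechnix", "topics": ["linux", "yaml"], "year": 2021}

# Serialize: Python objects -> YAML text that can be stored or sent over a network
text = yaml.safe_dump(record, sort_keys=False)
print(text)

# Deserialize: YAML text -> Python objects again
restored = yaml.safe_load(text)
print(restored == record)
```

The round trip gives back an equal dictionary, which is the whole point of a serialization format.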
This guide focuses only on what YAML is and how to work with it, with practical examples.
What is YAML?
YAML, which stands for "YAML Ain’t Markup Language", is a data serialization language. Since version 1.2 of the specification, it is a superset of JSON. YAML is popular because of its simplicity: YAML files are easier to create and read than XML or JSON.
Modern tools like Ansible, Docker, Kubernetes, and Chef, and cloud environments like AWS, Azure, and GCP, use YAML. You will see a lot of configuration and deployment files written in YAML. For example, in Ansible, the playbook is written in YAML.
Many popular programming languages have libraries for working with YAML, so it can easily be integrated into most environments.
YAML extension
YAML files should be saved with the .yml or .yaml extension. Every popular text editor, such as VS Code, Atom, Vim, and Sublime Text, has YAML language support.
You can also install extensions such as a YAML linter, a YAML to JSON/XML converter, or a YAML beautifier in your text editor, which offer more features when you work with YAML.
Basic YAML structure
The basic structure of a YAML document is either a sequence or a dictionary. A sequence is similar to a Python list, and a dictionary is similar to a Python dictionary.
Dictionaries are key-value pairs. The key is usually a string, and the value can be a scalar, a sequence, or another dictionary. Use a colon followed by a space to separate the key from the value.
Site_name: ostechnix
Documents in a YAML stream are separated using three dashes (---) and, optionally, three dots (...). Three dashes (---) mark the start of a document and three dots (...) mark the end of a document in the stream. If the file holds only a single document, the dashes and dots are not required.
---
operating_system: Redhat
version: 8
Same_family:
  - Rocky Linux
  - Alma Linux
  - Fedora Linux
...
---
operating_system: Debian
version: 11
Same_family:
  - Ubuntu
  - Linux Mint
  - Pop!_OS
...
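To read a stream with more than one document in Python, PyYAML provides `yaml.safe_load_all`, which yields one object per document. The sketch below uses a shortened version of the stream above and assumes PyYAML is installed.

```python
import yaml

# A shortened two-document stream, separated by --- and ...
stream = """\
---
operating_system: Redhat
version: 8
...
---
operating_system: Debian
version: 11
...
"""

# safe_load_all returns a generator with one Python object per document
documents = list(yaml.safe_load_all(stream))
names = [doc["operating_system"] for doc in documents]
print(len(documents), names)
```

Note that `yaml.safe_load` expects exactly one document and raises an error if the stream contains several.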
YAML uses indentation to define the structure of the object. Tab characters are not allowed for indentation, so use spaces instead. Any consistent number of spaces works, and two spaces per level is the common convention. Most editors insert spaces when you press the Tab key in a YAML file, or at least this is the case with VS Code.
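The following sketch shows how a parser reacts to a tab used for indentation. It assumes PyYAML, and the two small inputs are invented for the demonstration.

```python
import yaml

# The same mapping, indented once with spaces and once with a tab character
spaces = "os:\n  name: Debian\n"
tabs = "os:\n\tname: Debian\n"

parsed = yaml.safe_load(spaces)
print(parsed)

try:
    yaml.safe_load(tabs)
    tab_rejected = False
except yaml.YAMLError as err:
    # PyYAML refuses the tab and reports where it was found
    tab_rejected = True
    print("Parser error:", err)
```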
You will learn more about indentation in the upcoming section.
I am going to run the following piece of Python code, which parses the YAML file and converts each value into the corresponding Python data type.
#!/usr/bin/python3
import yaml  # provided by the PyYAML package

# Parse input.yaml and print every top-level key with its value and Python type
with open("input.yaml", "r") as f:
    data = yaml.safe_load(f)  # safe_load only builds plain Python objects

for k, v in data.items():
    print(str(k) + " : " + str(v))
    print("data type", " = ", type(v))
    print()
You can either use this code or an online interpreter to run and test the YAML snippets in this article.
YAML comments
Comments make a YAML file easier to understand for anyone reading it. To add a comment, use the # symbol. YAML does not support multi-line comments, so if you want a comment to span several lines, you have to prefix each line with the # symbol.
# CONFIGURATION FILE BASED ON PoP_!OS COSMIC DESKTOP
OS_NAME: "Pop!_OS"
VERSION: 21.04 # VERSION 21.10 REACHED EOL
CODE_NAME: COSMIC
In the above example, there are two comments. The first is a full-line comment at the top of the file, and the second is an inline comment on the VERSION line.
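Comments are meant for human readers only, and the parser discards them. A small sketch, assuming PyYAML, that loads the example above:

```python
import yaml

source = """\
# CONFIGURATION FILE BASED ON PoP_!OS COSMIC DESKTOP
OS_NAME: "Pop!_OS"
VERSION: 21.04 # VERSION 21.10 REACHED EOL
CODE_NAME: COSMIC
"""

# Both the full-line comment and the inline comment are ignored by the parser
config = yaml.safe_load(source)
print(config)
```

Only the three key-value pairs end up in the resulting dictionary.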
YAML string type
Strings in YAML can be written with or without quotes. The parser infers the data type from the value. In the examples below, both the key and the value are strings.
Important points to remember:
- Create strings without using quotes unless it is necessary.
- Use double quotes if the string contains escape sequences, such as \t, that should be interpreted.
- Use single quotes if the special characters are to be treated as literal text.
User1_review: Pop_!os is great to work with
User2_review: "Pop_!os \t is great to work with"
User3_review: 'Pop_!os \t is great to work with'
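The difference between the quoting styles is easiest to see on the parsed values. The sketch below, assuming PyYAML, prints each value with `repr()` so that a real tab character and a literal backslash can be told apart.

```python
import yaml

# A raw string keeps the backslashes exactly as they would appear in a YAML file
source = r"""
User1_review: Pop_!os is great to work with
User2_review: "Pop_!os \t is great to work with"
User3_review: 'Pop_!os \t is great to work with'
"""

reviews = yaml.safe_load(source)
for key, value in reviews.items():
    # repr() makes escape characters visible
    print(key, repr(value))
```

In the double-quoted value, \t becomes a tab character. In the single-quoted value, it stays as a backslash followed by the letter t.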
As I mentioned earlier, YAML infers the data type by default. There is also an explicit way to specify the data type: use the !! symbol followed by the type name, and then the value.
User1_review: !!str Pop_!os is great to work with
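An explicit tag matters most when the value would otherwise be read as another type. In this sketch (PyYAML assumed, key names invented for the example), the same text is loaded once as an integer and once as a string.

```python
import yaml

source = """\
inferred: 2021
forced: !!str 2021
"""

values = yaml.safe_load(source)
# The untagged value becomes an int, the tagged one stays a string
print(type(values["inferred"]), type(values["forced"]))
```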
If you have a long string and wish to write it on multiple lines inside your YAML file, you can do so using the > symbol. This is called folded style. The parser folds the line breaks into spaces and interprets the string as a single line, even though you have written it on multiple lines.
User4_review: >
  Among all the distribution
  I used PoP_!OS looks great
When a multiline string needs to be kept exactly as written, line breaks included, use the pipe (|) symbol. This is called literal style.
User5_review: |
  Among all the distribution
  I used PoP_!OS looks great
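Here is a short comparison of the two styles, assuming PyYAML. The key names `folded` and `literal` are chosen for this sketch only.

```python
import yaml

source = """\
folded: >
  Among all the distribution
  I used PoP_!OS looks great
literal: |
  Among all the distribution
  I used PoP_!OS looks great
"""

text = yaml.safe_load(source)
# Folded style joins the lines with a space, literal style keeps the line break
print(repr(text["folded"]))
print(repr(text["literal"]))
```

Both values end with a single line break, which is the default behavior when no chomping indicator is given.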
You can use chomping indicators to strip or preserve the line breaks at the end of the value. Add the "-" symbol after > or | to strip the final line break and any trailing blank lines.
User4_review: >-
  Among all the distribution
  I used PoP_!OS looks great.
User5_review: |-
  Among all the distribution
  I used PoP_!OS looks great.
If you want to preserve the trailing line breaks, add the "+" symbol after | or >.
User4_review: >+
  Among all the distribution
  I used PoP_!OS looks great.
User5_review: |+
  Among all the distribution
  I used PoP_!OS looks great.
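The effect of the indicators only shows up at the end of the value, so the sketch below puts a blank line after each block and prints the result with `repr()`. It assumes PyYAML, and the keys `strip`, `clip`, `keep`, and `last` are made up for the demonstration.

```python
import yaml

source = """\
strip: |-
  looks great

clip: |
  looks great

keep: |+
  looks great

last: end
"""

ends = yaml.safe_load(source)
for key in ("strip", "clip", "keep"):
    # "-" drops every trailing line break, no indicator keeps one, "+" keeps all
    print(key, repr(ends[key]))
```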
YAML numeric type
YAML supports integer and floating-point numbers, and integers can also be written in hexadecimal or octal notation. By default the parser detects the data type, but you can also define int and float types explicitly, as shown in the example below.
int1: 98765
int2: !!int 56789 # Explicit Integer definition
float1: 20.0481
float2: !!float 20.0482 # Explicit Float definition
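Hexadecimal and octal values will be converted to decimal values by the interpreter. Note that the leading-zero octal form shown here is YAML 1.1 syntax, which PyYAML follows. YAML 1.2 writes octal values with a 0o prefix instead.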
hex1: 0x14d3
oct1: 014442
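The sketch below loads the numeric examples with PyYAML (assumed to be installed) and compares the results with the matching Python literals. Other parsers, especially YAML 1.2 ones, may treat the leading-zero octal form differently.

```python
import yaml

source = """\
int1: 98765
float1: 20.0481
hex1: 0x14d3
oct1: 014442
"""

numbers = yaml.safe_load(source)
for key, value in numbers.items():
    # Hexadecimal and octal inputs are stored as ordinary Python ints
    print(key, value, type(value).__name__)
```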
YAML Boolean type
YAML supports the boolean values "True" and "False". With parsers that follow YAML 1.1, such as PyYAML, you can also use "yes" or "on", which map to true, and "no" or "off", which map to false. You can also explicitly define the data type using !!bool.
Boolean values can be written in lowercase, capitalized, or uppercase form. In the example below, you can see I have written the true value in several ways, and the parser interprets every one of them as "True".
upgrade: True
Reboot_After_Upgrade: TRUE
Enable_Firewall: on
Set_Power_Profile: yes
The same applies to the "False" value.
upgrade: False
Reboot_After_Upgrade: FALSE
Enable_Firewall: off
Set_Power_Profile: no
Heads Up: If you enclose a boolean value in quotes, it will be treated as a string.
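A quick check of these rules, assuming PyYAML, which follows the YAML 1.1 boolean rules. The `quoted` and `mixed_case` keys are additions for this sketch. They show that a quoted value, or a spelling outside the accepted forms, is loaded as a plain string.

```python
import yaml

source = """\
upgrade: True
Reboot_After_Upgrade: TRUE
Enable_Firewall: on
Set_Power_Profile: yes
quoted: "yes"
mixed_case: tRuE
"""

flags = yaml.safe_load(source)
for key, value in flags.items():
    print(key, repr(value))
```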
YAML Null type
To make a value null, you can use either the "~" symbol or the "null" keyword. You can also define the key and leave the value empty, which will be treated as null. You can also make an explicit definition using !!null.
Similar to booleans, the "null" keyword can be written in lowercase, capitalized, or uppercase form. In the example below, the null keyword is written in three different cases.
upgrade: !!null null
Reboot_After_Upgrade: NULL
Enable_Firewall: Null
Set_Power_Profile: ~
Set_Network_Interface: # NO VALUES PASSED
In Python, a null value is loaded as None, and if you convert YAML to JSON, null values are written as null.
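The sketch below loads the null examples with PyYAML (assumed) and then writes them out with the standard `json` module to show both representations.

```python
import json
import yaml

source = """\
upgrade: !!null null
Reboot_After_Upgrade: NULL
Enable_Firewall: Null
Set_Power_Profile: ~
Set_Network_Interface: # NO VALUES PASSED
"""

settings = yaml.safe_load(source)
print(settings)  # every value is None

# Converting the loaded data to JSON writes each None as null
as_json = json.dumps(settings)
print(as_json)
```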
YAML Sequence type
A YAML sequence is a list of values stored in order. Think of sequences as Python lists or Perl arrays, where one name holds one or more values.
There are a couple of ways to define a sequence in YAML. The first is flow style, where you give a key name followed by a list of values inside square brackets. This is similar to a Python list.
# FLOW STYLE SEQUENCE
app_to_be_updated: [ "firefox", "timeshift"]
The second way is to create a sequence using block style. Each element is prefixed with a dash followed by a space and the element value, and each element is written on a separate line. You can either indent the elements by two spaces or write them without indentation under the key. As a best practice, stick with indentation.
# BLOCK STYLE SEQUENCE
app_to_be_installed:
  - vscode
  - virtualbox
  - tilix
app_to_be_removed:
- pycharm
- stacer
- ufw
It is also possible to create a nested sequence, as shown below.
# NESTED SEQUENCE
applications:
  - Productivity:
      - vscode
      - vagrant
      - docker
      - python3
  - Browser:
      - firefox
      - chrome
      - brave
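Both sequence styles load into the same Python list, and nested sequences become lists inside dictionaries inside lists. A minimal sketch, assuming PyYAML, with shortened lists and key names (`flow`, `block`) picked for the example:

```python
import yaml

source = """\
flow: [vscode, virtualbox, tilix]
block:
  - vscode
  - virtualbox
  - tilix
applications:
  - Productivity:
      - vscode
      - docker
  - Browser:
      - firefox
      - brave
"""

data = yaml.safe_load(source)

# Flow style and block style produce identical lists
same = data["flow"] == data["block"]
print(same)

# Index into the outer list, then the dictionary, then the inner list
first_browser = data["applications"][1]["Browser"][0]
print(first_browser)
```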
YAML Dictionary type
We have already seen dictionaries in the initial sections of this article. A dictionary is a collection of key-value pairs and is one of the core building blocks in YAML. The keys are usually strings, and the values can be scalars, sequences, or other dictionaries.
Similar to sequences, dictionaries can be written in multiple ways. The first is flow style, which is similar to the representation of Python dictionaries.
application: { Install: "Vscode", Remove: Stacer, Update: Firefox}
Dictionaries can also be created using block style.
application1:
  Install: Vscode
  Remove: Stacer
  Update: Firefox
You can also nest dictionaries and sequences inside a dictionary.
application2:
  Install:
    python: 3.9
    Vscode: 1.58.2
  Remove:
    - Stacer
    - pycharm
  Update: Firefox
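Loading the nested example shows how each level maps to a Python type. The sketch assumes PyYAML. One detail worth noticing is that 3.9 is loaded as a float, while 1.58.2 is not a valid number and stays a string.

```python
import yaml

source = """\
application2:
  Install:
    python: 3.9
    Vscode: 1.58.2
  Remove:
    - Stacer
    - pycharm
  Update: Firefox
"""

app = yaml.safe_load(source)["application2"]

# A nested dictionary, a nested list, and a plain string value
print(type(app["Install"]).__name__, type(app["Remove"]).__name__, type(app["Update"]).__name__)
print(repr(app["Install"]["python"]), repr(app["Install"]["Vscode"]))
```

If a version number such as 3.9 should be kept as text, put it in quotes.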
YAML anchors and aliases
You can implement DRY (Don’t Repeat Yourself) in your YAML file using anchors and aliases.
An anchor is denoted using the "&" symbol and an alias is denoted using the "*" symbol.
An anchor is similar to a variable in programming. You define an anchor using the & symbol followed by a name, and later use an alias (* followed by the same name) to expand the anchor value.
Take a look at the example below. &x is defined as the anchor and later it is expanded using the alias *x.
User4_review: &x Among all the distribution I used PoP_!OS looks great.
User5_review: *x
User6_review: *x
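After parsing, an alias is simply replaced by the anchored value. A short sketch, assuming PyYAML:

```python
import yaml

source = """\
User4_review: &x Among all the distribution I used PoP_!OS looks great.
User5_review: *x
User6_review: *x
"""

reviews = yaml.safe_load(source)
# All three keys end up with the same string
for key, value in reviews.items():
    print(key, "->", value)
```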
You can override particular values when reusing an anchored dictionary by using the << symbol, known as the merge key. Here, I am overriding the value of version from 21.10 to 21.04.
PoP_OS: &pos
  version: 21.10
  code_name: cosmic desktop
  d_flavour: gnome
rewrite:
  <<: *pos
  version: 21.04
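The sketch below loads the merge example with PyYAML (assumed), which supports the << merge key. The multi-line layout used here is a reconstruction of the example above.

```python
import yaml

source = """\
PoP_OS: &pos
  version: 21.10
  code_name: cosmic desktop
  d_flavour: gnome
rewrite:
  <<: *pos
  version: 21.04
"""

data = yaml.safe_load(source)

# rewrite inherits code_name and d_flavour and replaces only version
print(data["rewrite"])

# The anchored dictionary itself is left unchanged
print(data["PoP_OS"])
```

Keep in mind that an unquoted 21.10 is loaded as the float 21.1, so quote version numbers if the trailing zero matters.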
Conclusion
In this article, we have seen what YAML is and how to work with the different data types in YAML. Throughout this guide, I have used Python as the YAML parser.
I hope you find this YAML tutorial useful. If you are new to YAML, I suggest practicing everything shown in this article. It will help you get comfortable with YAML so you can start using it in your own environment.