References | YAML
YAML is a computer data serialization language.
A YAML document represents a computer program's native data structure in a human-readable text form. A node in a YAML document can be one of three basic kinds:
- Scalar: Atomic data types like strings, numbers, booleans and null
- Sequence: A list of nodes
- Mapping: A map of nodes to nodes. Also known as Hashes, Hash Maps, Dictionaries or Objects. Unlike in many programming languages, a key can be more than just a string. It can be a sequence or mapping itself.
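As a rough sketch of the three kinds, the fragment below reuses values from the invoice example further down. The last entry uses the explicit `?` key indicator to write a sequence as a key. The value `gift bundle` is made up for illustration, and some loaders may reject non-string keys when building native data structures.

```yaml
name: Santa Claus        # scalar
order items:             # sequence
  - Sled
  - Wrapping Paper
address:                 # mapping
  city: North Pole
# A sequence used as a mapping key ("gift bundle" is a hypothetical value)
? - Sled
  - Wrapping Paper
: gift bundle
```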
On top of that, YAML can serialize other data types and classes:
- Alias and Anchor: For serializing References / Pointers, including circular references.
- Tag: With Tags it's possible to define custom types/classes. For example, in many languages a Regular Expression is a builtin data type or object. Some languages have only arrays, which are represented by the basic sequence type. But some have tuples, which need a custom tag.
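A minimal sketch of what tags can look like. `!!str` is one of the standard tags. `!tuple` is a hypothetical application-specific tag that a loader would have to be taught, and loaders that do not know it may refuse the document.

```yaml
# Standard tag: force a string instead of an integer
zip: !!str 12345
# Hypothetical custom tag: the application decides what !tuple means
dice roll: !tuple [2, 3]
```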
In addition to the indentation-based Block Style, there is a more compact Flow Style syntax.
One YAML File (or Stream) can consist of more than one Document.
The following examples will introduce you to YAML syntax elements step by step.
Let's write an invoice.
It has a number, a name and an address, order items and more.
The most common top-level data type is a mapping. A mapping maps keys to values. Keys and values are separated by a colon and a space (`: `).
- Each Key/Value pair is on its own line.
invoice number: 314159
name: Santa Claus
address: North Pole
An alternative way to write it:
---
invoice number: 314159
name: Santa Claus
address: North Pole
- The `---` explicitly starts a Document.
- It marks the following content as YAML, but it is optional.
- It has some use cases, and it is needed when you have multiple Documents in one file.
- Read more about it in the Document Chapter.
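To illustrate the multi-document case, here is a small sketch of one stream holding two documents. The second invoice (number and name) is invented for the example.

```yaml
---
invoice number: 314159
name: Santa Claus
---
# Hypothetical second invoice in the same file
invoice number: 314160
name: Mrs. Claus
```

A loader that supports streams would typically return these as two separate data structures.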
Now we replace the `address` string with another mapping. In that case the colon is followed by a linebreak.
- In Block Style, mapping values that are not scalars must always start on a new line.
- Nested items must always be indented more than the parent node, with at least one space.
- The typical indentation is two spaces.
- Tabs are forbidden as indentation.
invoice number: 314159
name: Santa Claus
address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole
- Don't forget the indentation.
If you write it like this:
invoice number: 314159
name: Santa Claus
address:
street: Santa Claus Lane
zip: 12345
city: North Pole
... then it will actually mean this:
invoice number: 314159
name: Santa Claus
address: null
street: Santa Claus Lane
zip: 12345
city: North Pole
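The `null` in the listing above comes from the empty value after `address:`. As a small sketch, these three spellings are usually loaded as the same null value under the common YAML schemas. The key names are hypothetical and only differ because keys in one mapping should be unique.

```yaml
address a:          # empty value
address b: null
address c: ~
```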
A sequence is a list (or array) of scalars (or other sequences or mappings).
- A sequence item starts with a hyphen and a space (`- `).
Here is the list of YAML inventors:
- Oren Ben-Kiki
- Clark Evans
- Ingy döt Net
Now back to our invoice. We map a list of scalars to the key `order items`.
- The sequence must start on the next line:
invoice number: 314159
name: Santa Claus
address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole
order items:
  - Sled
  - Wrapping Paper
Because the `- ` counts as indentation, you can also write it like this:
invoice number: 314159
name: Santa Claus
address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole
order items:
- Sled
- Wrapping Paper
You can also nest sequences. The typical example is a List of Dice Rolls.
- The nested sequence items can follow directly on the same line:
---
- - 2
  - 3
- - 3
  - 6
YAML lets you write that in a more compact way, the Flow Style:
---
- [ 2, 3 ]
- [ 3, 6 ]
- Read more about it in the Flow Style Chapter.
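Flow Style is not limited to sequences. As a sketch, the invoice from above could also use a Flow Style mapping for the address and a Flow Style sequence for the order items:

```yaml
invoice number: 314159
name: Santa Claus
address: { street: Santa Claus Lane, zip: 12345, city: North Pole }
order items: [ Sled, Wrapping Paper ]
```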
Let's add a billing address to the invoice.
- In our case it is the same as the shipping address.
We rename address to shipping address and add billing address:
invoice number: 314159
name: Santa Claus
shipping address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole
billing address:
  street: Santa Claus Lane
  zip: 12345
  city: North Pole
order items:
  - Sled
  - Wrapping Paper
- That's a bit of wasted space. If it's the same address, you don't need to repeat it. Use an Alias.
In the native data structure of a programming language, this would be a reference, pointer, or alias.
Before an Alias can be used, it has to be created with an Anchor:
invoice number: 314159
name: Santa Claus
shipping address: &address   # Anchor
  street: Santa Claus Lane   # ┐
  zip: 12345                 # │ Anchor content
  city: North Pole           # ┘
billing address: *address    # Alias
order items:
  - Sled
  - Wrapping Paper
When loaded into a native data structure, the `shipping address` and `billing address` point to the same data structure.
- How this is implemented internally depends on the capabilities of the programming language.
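Anchors are not restricted to mappings. The sketch below anchors a sequence and reuses it. The key `packing list` and the anchor name `items` are made up for illustration. Note that an Alias can only refer to an Anchor that appears earlier in the same document.

```yaml
order items: &items      # Anchor on a sequence
  - Sled
  - Wrapping Paper
packing list: *items     # Alias: refers to the same list
```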
YAML is used in all kinds of applications as a configuration language.
One category is the configuration of Continuous Integration systems.
Here is a minimal example of a GitHub Actions workflow.
name: Linux
on: [push] # Compact Flow Style Sequence
jobs:
  build:
    name: Run Tests
    runs-on: ubuntu-latest
    steps:
    - name: Say Hello
      run: echo hello
- The value for `steps` is a list of mappings. A mapping can start directly on the same line as the `-`.
- Usually a step has a `name`, which will be shown as the title when running the job, and a `run`, which is a shell command, or multiple commands.
Let's add a more realistic scenario, with one step to checkout the code, and one with multiple commands.
If you use Double Quotes, which work like JSON strings, it looks like this:
steps:
# Plugin provided by GitHub to checkout the code
- uses: actions/checkout@v2
# Run multiple commands
- name: Run Tests
  run: "./configure\nmake\nmake test\n"
One of the advantages of YAML here is that this can be formatted in a way that's easy to write and read with Block Scalars:
steps:
- uses: actions/checkout@v2
- name: Run Tests
  run: | # Literal Block Scalar
    ./configure
    make
    make test
The Literal Block Scalar, as the name says, contains the literal content of the string. Tabs and similar characters are always literal. All trailing spaces will be kept.
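To make the relationship to the double-quoted form concrete, here is a small sketch with hypothetical key names. The first two keys should hold the same string. The third uses the `-` chomping indicator, which drops the final line break.

```yaml
# Both hold "./configure\nmake\nmake test\n"
literal: |
  ./configure
  make
  make test
double quoted: "./configure\nmake\nmake test\n"
# Holds "./configure\nmake\nmake test" (no final line break)
stripped: |-
  ./configure
  make
  make test
```

Comments are placed outside the block scalars on purpose: a `#` inside the indented content would be part of the string.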
Let's say you have a number of longer commands that you would like to break up into multiple lines for readability:
steps:
- uses: actions/checkout@v2
- name: Install dependencies
  run: > # Folded Block Scalar
    apt-get update
    && apt-get install -y
    git tig vim jq tmux tmate git-subrepo cpanminus

    cpanm -n -l local
    YAML::PP YAML::XS ...
The Folded Block Scalar is like the Literal Block Scalar, but with special folding rules.
Consecutive lines starting at the same indentation level will be folded with spaces, and empty lines create a linebreak.
Read more about Block Scalars and all other ways of quoting in the Quoting Chapter.
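The folding rules can be checked against an equivalent double-quoted string. In this sketch the key names and the shortened commands are illustrative only. Both keys should hold the same value: the two lines of the first command are joined with a space, and the empty line becomes a single line break.

```yaml
# Both hold "apt-get update && apt-get install -y git\ncpanm -n YAML::PP\n"
folded: >
  apt-get update
  && apt-get install -y git

  cpanm -n YAML::PP
double quoted: "apt-get update && apt-get install -y git\ncpanm -n YAML::PP\n"
```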
YAML itself has no concept of "variables" or "functions".
Systems like GitHub Actions usually provide a way to access certain information and environment variables with a Templating Syntax.
We set up a "matrix" test to build the code with gcc and clang.
strategy:
  matrix:
    compiler: [gcc, clang]
steps:
- ...
The `strategy.matrix` entry will create two jobs instead of one, providing the `compiler` in a "context" item that we can pass as an environment variable to the step:
strategy:
  matrix:
    compiler: [gcc, clang]
steps:
- uses: actions/checkout@v2
- name: Run Tests
  env:
    CC: ${{ matrix.compiler }}
  run: |
    ./configure
    make
    make test
This sets the environment variable `CC` to `gcc` or `clang`, respectively.
The `${{ matrix.compiler }}` syntax is not a special YAML syntax.
It is a simple plain scalar that could also have been written in quotes:
env:
  CC: '${{ matrix.compiler }}'
It's the GitHub Actions application that recognizes such variables and replaces them with their content at runtime.
Such variables can look different, depending on the application.
For example, Ansible uses the Jinja2 templating engine, where variables look like this:
with_items: '{{ user.names }}'
It is important to add quotes here, because the `{` at the start would otherwise start a Flow Style Mapping.
So it's clever that GitHub Actions chose the `${{ ... }}` syntax, because the `$` at the start is not special in YAML and doesn't need quotes.
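The quoting rules for the two template styles can be put side by side. This is only a sketch: the key `also ok` is hypothetical, and the values are passed through as plain strings for the respective application to expand.

```yaml
# Jinja2 style: quotes are required, single or double
with_items: '{{ user.names }}'
also ok: "{{ user.names }}"
# GitHub Actions style: the leading "$" makes this a plain scalar
CC: ${{ matrix.compiler }}
```

Without quotes, a value starting with `{{` would be read as nested Flow Style Mappings rather than a string, or rejected by the loader.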
Further Readings:
- YAML Ain’t Markup Language (YAML™) version 1.2
- https://www.yaml.info/ - offers various useful information for learning about YAML.
- YAML Tutorial: Everything You Need to Get Started in Minutes
- Complete YAML Course - Beginner to Advanced for DevOps and more! - YouTube
- Get started with Docker Compose
- Docker Compose Explained
- Use Docker Compose in VS Code