This tutorial explains what Ansible facts are and how to gather system information (facts) in Ansible playbooks.
A Brief Introduction to Ansible Facts
Facts are pieces of information about the managed hosts.
When you run a playbook, Ansible collects system-related information about each managed host and keeps it in memory until the playbook completes.
This information includes IP addresses, the operating system, filesystems and more. The gathering is done by the "setup" module.
You do not need to call the setup module yourself. By default, Ansible runs it automatically at the start of each play.
Have a look at the following example.
Create a playbook and add the following play definition. The play definition contains only the play name and the target hosts. No task is created for this play.
- name: Stats collection
  hosts: all
As the following output shows, when I run the playbook, Gathering Facts runs as the first task even though we have not defined any task in the playbook.
PLAY [Stats collection] *******************************************************************************************

TASK [Gathering Facts] ********************************************************************************************
ok: [rocky.anslab.com]
ok: [master.anslab.com]
ok: [ubuntu.anslab.com]

PLAY RECAP ********************************************************************************************************
master.anslab.com : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
rocky.anslab.com  : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
ubuntu.anslab.com : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
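Facts are made available to the playbook as variables, so you can reference them in tasks, conditionals and templates through the ansible_facts dictionary.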
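As a minimal sketch of reading a fact like any other variable, the play below prints two values from the ansible_facts dictionary. The play and task names are made up for illustration, and the sketch assumes fact gathering is left at its default (enabled).

```yaml
# Illustrative play: names are hypothetical, facts are gathered by default.
- name: Show a couple of facts
  hosts: all
  tasks:
    - name: Print distribution and architecture
      debug:
        msg: "{{ inventory_hostname }} runs {{ ansible_facts['distribution'] }} on {{ ansible_facts['architecture'] }}"
```

Inside ansible_facts the keys drop the ansible_ prefix, so the value reported by the setup module as ansible_distribution is read as ansible_facts['distribution'].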
Disabling Facts Gathering
Not every playbook requires facts to be collected. You can skip facts collection by adding "gather_facts: false" to the play definition.
- name: Stats collection
  hosts: all
  gather_facts: false
How to Use Setup Module
As mentioned in the previous section, Ansible uses the setup module to collect facts. Let's first look at some ways to use the setup module in ad hoc commands, and then see how to use it explicitly in a playbook.
The output of the setup module is in JSON format. Run the following command against a target host to get the facts output.
$ ansible m1 -m setup | less
As the output shows, a whole lot of system-related information is collected. Some of it may be useful to you and some may not.
Facts Data Type
The values stored in the facts mostly fall into three data types.
- AnsibleUnsafeText
- Dictionary
- List
The following playbook demonstrates the data types found in the facts output. I am using the type_debug filter to find the data type.
- name: Facts Data type
  hosts: m2
  gather_facts: true
  tags: datatype
  tasks:
    - name: AnsibleUnsafeText
      debug:
        var: ansible_facts['distribution'] | type_debug

    - name: dict
      debug:
        var: ansible_facts['eth0']['ipv4'] | type_debug

    - name: List
      debug:
        var: ansible_facts['eth0']['ipv6'] | type_debug
Sample Output:
$ ansible-playbook facts_data_type.yml

TASK [AnsibleUnsafeText] ****************************************************************************************************************************************************************************************
ok: [rocky.anslab.com] => {
    "ansible_facts['distribution'] | type_debug": "AnsibleUnsafeText"
}

TASK [dict] *****************************************************************************************************************************************************************************************************
ok: [rocky.anslab.com] => {
    "ansible_facts['eth0']['ipv4'] | type_debug": "dict"
}

TASK [List] *****************************************************************************************************************************************************************************************************
ok: [rocky.anslab.com] => {
    "ansible_facts['eth0']['ipv6'] | type_debug": "list"
}
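Because the last two values are a dictionary and a list, their contents can be reached with a key lookup or a loop. The sketch below assumes, like the playbook above, that the host has an interface named eth0. The address and prefix keys are the ones the setup module normally reports for an interface, but check your own facts output before relying on them.

```yaml
# Illustrative play: assumes an eth0 interface with IPv4 and IPv6 addresses.
- name: Read values inside dict and list facts
  hosts: m2
  gather_facts: true
  tasks:
    # A dictionary fact is read with a key lookup.
    - name: Read one key from the ipv4 dictionary
      debug:
        msg: "IPv4 address: {{ ansible_facts['eth0']['ipv4']['address'] }}"

    # A list fact is read one element at a time with a loop.
    - name: Loop over the ipv6 list
      debug:
        msg: "IPv6 address: {{ item['address'] }}/{{ item['prefix'] }}"
      loop: "{{ ansible_facts['eth0']['ipv6'] }}"
```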
Facts Filter and Gather_subset
You can apply filters to grab a section of data from the facts output. For example, to grab only the machine architecture, the command is as follows.
$ ansible m2 -m setup -a "filter=ansible_architecture"
rocky.anslab.com | SUCCESS => {
    "ansible_facts": {
        "ansible_architecture": "x86_64",
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false
}
You can also filter the data using patterns.
$ ansible m2 -m setup -a "filter=*architecture"
rocky.anslab.com | SUCCESS => {
    "ansible_facts": {
        "ansible_architecture": "x86_64",
        "ansible_userspace_architecture": "x86_64",
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false
}
You can also use gather_subset to get only a specified subset of facts. For example, if you want only the network-related information, you can use "gather_subset=network".
$ ansible m2 -m setup -a "gather_subset=network"
Likewise, there are other subsets such as all, min, hardware, network, and virtual. The default is "all".
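You can prefix any subset with the ! symbol to exclude it from collection. For example, the command below collects all facts except the network-related information. The argument is in single quotes because an interactive bash shell would otherwise treat !network inside double quotes as a history expansion.
$ ansible m2 -m setup -a 'gather_subset=!network'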
Writing The Facts to File
You can write the facts to files using the --tree flag. Pass a directory path as the argument to the --tree flag.
$ ansible all -m setup --tree /home/vagrant/facts
Here Ansible collects the facts for every host in the all group and stores each host's output in a file named after that host in the directory /home/vagrant/facts.
$ ls -l ~/facts
total 68
-rw-rw-r-- 1 vagrant vagrant 21092 Apr 27 05:36 master.anslab.com
-rw-rw-r-- 1 vagrant vagrant 18336 Apr 27 05:36 rocky.anslab.com
-rw-rw-r-- 1 vagrant vagrant 21093 Apr 27 05:36 ubuntu.anslab.com
Using Setup Module in the Playbook
Your playbooks will rarely use all the data collected in the facts. Instead, you can call the setup module explicitly and collect only the piece of data you are interested in.
In the playbook below, the only data I am interested in is the distribution of the managed node. I have written a conditional statement that uses that information.
- name: Explicitly using setup module
  hosts: all
  gather_facts: false
  tasks:
    - name: Setup Module
      setup:
        filter: ansible_distribution

    - name: Doing something
      debug:
        msg: "Spotted RHEL based distro. Proceeding with activity..."
      when: ansible_distribution | lower in ["rocky","redhat","centos"]
In the output below, you can see that the task ran only on Rocky Linux and was skipped on the Ubuntu-based hosts.
TASK [Setup Module] ***********************************************************************************
ok: [rocky.anslab.com]
ok: [master.anslab.com]
ok: [ubuntu.anslab.com]

TASK [Doing something] ********************************************************************************
skipping: [master.anslab.com]
skipping: [ubuntu.anslab.com]
ok: [rocky.anslab.com] => {
    "msg": "Spotted RHEL based distro. Proceeding with activity..."
}
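The same condition can also be written against the ansible_facts dictionary, where the key drops the ansible_ prefix. This is a sketch of an alternative form of the task, not a change to the playbook above, and it assumes a setup task that collects the distribution fact has already run in the play.

```yaml
# Alternative form of the conditional task; the task name is hypothetical.
- name: Doing something (ansible_facts form)
  debug:
    msg: "Spotted RHEL based distro. Proceeding with activity..."
  when: ansible_facts['distribution'] | lower in ["rocky", "redhat", "centos"]
```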
You can also use filters and gather_subset in the playbook.
- name: Applying filter
  setup:
    filter:
      - 'ansible_distribution'
      - 'ansible_eth[0-3]'

- name: Using subsets
  setup:
    gather_subset:
      - '!all'
      - 'network'
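If the goal is only to limit what the automatic fact gathering collects, gather_subset can also be set as a play keyword instead of calling setup in a task. A minimal sketch, with a made-up play name; it assumes the host has a default IPv4 route so that the default_ipv4 fact is populated.

```yaml
# Illustrative play: restricts the automatic Gathering Facts step to network facts.
- name: Gather only network facts
  hosts: all
  gather_facts: true
  gather_subset:
    - '!all'
    - 'network'
  tasks:
    - name: Show the default IPv4 address
      debug:
        var: ansible_facts['default_ipv4']['address']
```

Even with '!all', Ansible still collects a small minimum set of facts alongside the requested subset.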
Run Time Variation with and without Facts Collection
When you run a playbook against a large set of managed hosts, fact collection can slow the play down. Using the time command, you can compare the playbook run time with and without facts.
The output below shows the difference. My playbook has only a few tasks and 3 managed hosts, so the gap in run time is small.
# With facts
$ time ansible-playbook facts.yml

real    0m3.096s
user    0m1.264s
sys     0m0.340s
# Without facts
$ time ansible-playbook facts.yml

real    0m2.112s
user    0m0.887s
sys     0m0.265s
Conclusion
In this article, we discussed what facts are, how to run ad hoc commands with the setup module, and how to use the setup module explicitly in a playbook.
Fact gathering takes time when you run against a large number of targets, so disable it or limit it to the subsets you need when the playbook does not use the rest.