Introduction
At work we have a central configuration build from which we pick parts and pieces for our ‘deployments’. As you can read in my resume, I work as a Virtualisation Engineer, and I build most of the deployment code there. Our base structure uses the vCenter as its primary key.
What does that mean?
Well, if you take a vCenter as the main key, then everything that makes up a vCenter (ESXi hosts, clusters, distributed switches, storage, etc.) sits under the vCenter configuration. So you target a vCenter, not the individual hosts underneath it.
Yeah so?
That means you need inflexible loops to go over the storage items or the ESXi hosts. Imagine that storage and ESXi hosts both have to be addressed because they are related: then you might need two loops to get the implementation done. That is fine with a couple of hosts and a small amount of storage, but it does not scale if you need to do it en masse. You cannot write loops as easily as you can in Python or PowerShell, for example.
Bottom line: this takes a lot of time to process, and when you deploy something you want it done as quickly as possible.
To the rescue
I am not referring to Ansible’s rescue mode (block, rescue, always; you’ll know this if you have done the course ;-)). Recently I got a tip from one of my colleagues in the Linux team that Ansible has a concept of ‘virtual host groups’ (ansible.builtin.add_host). With it you can loop over the configuration, build virtual host objects, and add them to a virtual group.
If you then split parts of your ‘sequential’ playbook into smaller subsections and put them in their own play (basically still the same tasks, just started from a different play), you can target the virtually created group in one go. Without a limit, Ansible pushes the work to as many (virtual) host objects at once as the forks setting allows.
So instead of sequentially looping over each ESXi host and then over the storage, you can collect all ESXi hosts, create the configuration each one needs, put them in a virtual ESXi group, and run your play against that group. If one of the config items is a large configuration block for storage, you can then loop over that, or, if you restructure it smartly, you might be able to use different ‘primary keys’ to match the data against. This saves at least one slow iteration, and in my case it speeds up a large part of the building blocks by a factor of 10, which reduces the implementation time hugely.
What does that look like?
Again, I cannot show how we do that at work, but take a configuration structure like this:
configuration:
  cluster:
    - name: clusterA
      hosts:
        - name: HostA
          ip: 127.0.0.1
          description: This is host A under cluster A
        - name: HostB
          ip: 127.0.0.2
          description: This is host B under cluster A
    - name: ClusterB
      hosts:
        - name: HostC
          ip: 127.0.0.3
          description: This is host C under cluster B
        - name: HostD
          ip: 127.0.0.4
          description: This is host D under cluster B
  storage:
    hosts:
      - name: StorageA
        fqdn: storage-a.your.domain
        datastores:
          - name: datastoreA
            size: 1GB
            amount: 10
          - name: datastoreB
            size: 10GB
            amount: 10
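The playbooks in this post refer to this structure as the variable configuration, but they do not show how it is loaded. One possible way, as a minimal sketch, is vars_files. The file name configuration.yaml is an assumption made for this example; it is not how we load it at work.

```yaml
---
- name: Build storage
  hosts: localhost
  gather_facts: false
  vars_files:
    # Hypothetical file name: a file that contains the structure shown above
    - configuration.yaml
  tasks:
    - name: Show the configured cluster names
      ansible.builtin.debug:
        msg: "{{ configuration.cluster | map(attribute='name') | list }}"
```

Variables from vars_files only exist in the play that loads them. A later play has to load the file again, or receive the data it needs in another way.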
You could have a playbook that has:
---
- name: Build storage
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Get all data
      ansible.builtin.debug:
        msg:
          - "storagename: {{ storage.0.name }}"
          - "datastorename: {{ storage.1.name }} with size: {{ storage.1.size }} and how many times {{ storage.1.amount }}"
      loop_control:
        loop_var: storage
      loop: "{{ query('subelements', configuration.storage.hosts, 'datastores') }}"
This will result in:
[...]
ok: [localhost] => (item=[{'name': 'StorageA', 'fqdn': 'storage-a.your.domain'}, {'name': 'datastoreA', 'size': '1GB', 'amount': 10}]) => {
    "msg": [
        "storagename: StorageA",
        "datastorename: datastoreA with size: 1GB and how many times 10"
    ]
}
ok: [localhost] => (item=[{'name': 'StorageA', 'fqdn': 'storage-a.your.domain'}, {'name': 'datastoreB', 'size': '10GB', 'amount': 10}]) => {
    "msg": [
        "storagename: StorageA",
        "datastorename: datastoreB with size: 10GB and how many times 10"
    ]
}
But if you need to do something with the hosts as well, you cannot reach them from this loop, because they live on a different level/path in the configuration.
So you might need another loop that includes a task file, to target these hosts with the data from the loop above.
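That sequential approach could be sketched roughly as follows. The task file name per_host_storage.yaml is made up for this example, and the debug task stands in for whatever real work would be done per host.

```yaml
# In the playbook: one include per host, executed one after the other
- name: Handle the storage for every host
  ansible.builtin.include_tasks: per_host_storage.yaml  # hypothetical file name
  loop: "{{ query('subelements', configuration.cluster, 'hosts') }}"
  loop_control:
    loop_var: cluster
```

```yaml
# per_host_storage.yaml (hypothetical): inner loop over all datastores
- name: Show every datastore for the current host
  ansible.builtin.debug:
    msg: "{{ cluster.1.name }} gets {{ storage.1.name }} from {{ storage.0.name }}"
  loop: "{{ query('subelements', configuration.storage.hosts, 'datastores') }}"
  loop_control:
    loop_var: storage
```

Everything here runs on localhost, so the inner loop for HostB only starts once the inner loop for HostA has finished.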
You can also get a list of all hosts. Add the following task to the deploy YAML:
    - name: Get all nodes
      ansible.builtin.debug:
        msg:
          - "clustername: {{ cluster.0.name }}"
          - "hostname: {{ cluster.1.name }}"
      loop_control:
        loop_var: cluster
      loop: "{{ query('subelements', configuration.cluster, 'hosts') }}"
Now you also have a list of the clusters and the nodes underneath each cluster.
You can then take that data and create a specific host configuration (the example below is a dummy; you should be able to see the idea behind it, or contact me if not ;-)):
---
- name: Build storage
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Get all data
      ansible.builtin.debug:
        msg:
          - "storagename: {{ storage.0.name }}"
          - "datastorename: {{ storage.1.name }} with size: {{ storage.1.size }} and how many times {{ storage.1.amount }}"
      loop_control:
        loop_var: storage
      loop: "{{ query('subelements', configuration.storage.hosts, 'datastores') }}"
    - name: Get all nodes
      ansible.builtin.debug:
        msg:
          - "clustername: {{ cluster.0.name }}"
          - "hostname: {{ cluster.1.name }}"
          - "storagedata: {{ configuration.storage.hosts }}"
      loop_control:
        loop_var: cluster
      loop: "{{ query('subelements', configuration.cluster, 'hosts') }}"
    - name: Add virtual hostgroup
      ansible.builtin.add_host:
        groups: 'virtual_hostgroup'
        name: "{{ cluster.1.name }}"
        cluster_name: "{{ cluster.0.name }}"
        storagedata: "{{ configuration.storage.hosts }}"
      loop_control:
        loop_var: cluster
      loop: "{{ query('subelements', configuration.cluster, 'hosts') }}"
## New play only targeting the host objects
- name: Build storage for host
  hosts: virtual_hostgroup
  gather_facts: false
  tasks:
    - name: Print host
      ansible.builtin.debug:
        msg:
          - "{{ inventory_hostname }}"
          - "storages: {{ storagedata }}"
This will give the output of:
TASK [Print host] ***********************************************************************************************************************************************************************************************************************
task path: demo.yaml:42
ok: [HostA] => {
    "msg": [
        "HostA",
        "storages: [{'name': 'StorageA', 'fqdn': 'storage-a.your.domain', 'datastores': [{'name': 'datastoreA', 'size': '1GB', 'amount': 10}, {'name': 'datastoreB', 'size': '10GB', 'amount': 10}]}]"
    ]
}
ok: [HostB] => {
    "msg": [
        "HostB",
        "storages: [{'name': 'StorageA', 'fqdn': 'storage-a.your.domain', 'datastores': [{'name': 'datastoreA', 'size': '1GB', 'amount': 10}, {'name': 'datastoreB', 'size': '10GB', 'amount': 10}]}]"
    ]
}
ok: [HostC] => {
    "msg": [
        "HostC",
        "storages: [{'name': 'StorageA', 'fqdn': 'storage-a.your.domain', 'datastores': [{'name': 'datastoreA', 'size': '1GB', 'amount': 10}, {'name': 'datastoreB', 'size': '10GB', 'amount': 10}]}]"
    ]
}
ok: [HostD] => {
    "msg": [
        "HostD",
        "storages: [{'name': 'StorageA', 'fqdn': 'storage-a.your.domain', 'datastores': [{'name': 'datastoreA', 'size': '1GB', 'amount': 10}, {'name': 'datastoreB', 'size': '10GB', 'amount': 10}]}]"
    ]
}
From there you can loop over storagedata, or over more complex data, and do it in parallel for each host (instead of sequentially per host). You can also pass along only the information that is needed for this run, as host variables on the virtual host.
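As a minimal sketch, the second play from the playbook above could loop over storagedata like this. It assumes the first play has already filled virtual_hostgroup through add_host and that storagedata arrives on each virtual host as a list; the debug task again stands in for real work.

```yaml
- name: Build storage for host
  hosts: virtual_hostgroup
  gather_facts: false
  tasks:
    # Each virtual host walks through the datastores on its own,
    # so the hosts no longer have to wait for each other
    - name: Show every datastore for this host
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} ({{ cluster_name }}): {{ storage.1.name }} on {{ storage.0.name }}, size {{ storage.1.size }}"
      loop: "{{ query('subelements', storagedata, 'datastores') }}"
      loop_control:
        loop_var: storage
```

The loop over the datastores is still sequential within one host, but the hosts themselves are handled side by side.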
Of course our setup is much, much more complex and has a lot more data, so it is not comparable at all. But this at least gives an idea of how you can approach something like it. In the examples above you could also use the storage as the primary key and then do something with that when you loop over the hosts (the other way around from this example). It all depends on what you need and how you need it. There might be thousands of hosts, thousands of datastores, thousands of whatever, and combining them wisely lets you do more things in parallel. Instead of:
loop over all hosts (*1000)
    loop over all datastores (*1000)
        loop over all whateverdata (*1000)
you do:
1000*host
    loop over all datastores (*1000)
        loop over all whateverdata (*1000)
You can do the first level in parallel and take out a sequential wait of ‘1000’. If every host iteration takes a second, that saves you up to 1000 seconds, or just shy of 17 minutes. How many hosts really run at the same time is capped by the forks setting (5 by default), so raise it if your control node can handle it.
I did not experiment with this, but you might be able to create secondary virtual hostgroups and attack the datastores in parallel as well (play with forks or serial to prevent overloading your system ;-)), reducing the time even more.
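An untested sketch of what such a secondary group could look like is shown here. The group name virtual_datastoregroup and the variables storage_fqdn and datastore are invented for this example, and configuration is assumed to be available in the first play in the same way as in the playbook above.

```yaml
---
- name: Collect datastores into a second virtual group
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Add one virtual host per datastore
      ansible.builtin.add_host:
        groups: virtual_datastoregroup   # hypothetical group name
        name: "{{ storage.0.name }}_{{ storage.1.name }}"
        storage_fqdn: "{{ storage.0.fqdn }}"
        datastore: "{{ storage.1 }}"
      loop: "{{ query('subelements', configuration.storage.hosts, 'datastores') }}"
      loop_control:
        loop_var: storage

- name: Work on the datastores in parallel
  hosts: virtual_datastoregroup
  gather_facts: false
  serial: 10   # handle at most 10 virtual hosts per batch
  tasks:
    - name: Print datastore
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }}: {{ datastore.amount }} x {{ datastore.size }} on {{ storage_fqdn }}"
```

The number of forks can be raised on the command line, for example with ansible-playbook -f 20 demo.yaml, while serial limits how many virtual hosts are in flight per batch.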
Summary
For us this is a huge performance gain. We can target the things that take a lot of time, combine the data into virtual host objects in a virtual hostgroup, target that group in a separate play, and run the activities on those hosts in parallel.
As always, if you have questions, please contact me.