If you've been following along at all, you've probably noticed that TigerIQ is a big fan of Ansible and Ansible Tower. Recently I encountered a little problem. We were dealing with an environment with both Windows and Linux machines. With the CloudForms integration with Tower, once you configure the provider, you get a nice little dynamic inventory set up for you. The issue we were facing is that the default method of accessing those machines is SSH, and we needed to set the Windows machines to use WinRM.
what we're after
We want Windows boxes to use WinRM and Linux boxes to use the default SSH, and we have one dynamic inventory. We could, of course, set that up at the host level, but we want a way to automate the full process of deploying either a Linux or Windows machine and including Tower actions as a part of the process, without changing any variables manually on the Tower side.
So it turns out, the way Tower handles this is pretty awesome. Not a big surprise if you've played with Ansible in the past. As I mentioned above, the CloudForms integration with Ansible gets you a dynamic inventory for free. You can see this by looking under the provider in the configuration menu. It should look like this:
This matches what we would find in Tower:
But if you look closely at one of the hosts, you'll see something else. Here is an example of what you might find under "variables" for any given host. This is what CloudForms passes to Tower, and this is what Tower uses to add the host to the dynamic inventory:
ansible_ssh_host: XX.XXX.XX.XX
cloudforms:
  ansible_ssh_host: XX.XXX.XX.XX
  boot_time: "2017-02-01T11:42:23Z"
  cloud: false
  connection_state: connected
  cpu_limit: -1
  cpu_reserve: 0
  cpu_reserve_expand: false
  cpu_shares: 2000
  cpu_shares_level: normal
  created_on: "2017-02-01T11:42:12Z"
  ems_cluster_id: 1000000000002
  ems_id: 1000000000001
  ems_ref: vm-124
  ems_ref_obj: vm-124
  evm_owner_id: 1000000000001
  fault_tolerance: false
  guid: 78644a08-e873-11e6-a817-0050569bf95e
  host_id: 1000000000005
  href: "https://my-awesome-cfme/api/vms/1000000000025"
  id: 1000000000025
  ipaddresses:
    - XX.XXX.XX.XX
  last_perf_capture_on: "2017-02-02T08:19:20Z"
  last_scan_attempt_on: "2017-02-01T11:42:22Z"
  last_scan_on: "2017-02-01T11:42:51Z"
  last_sync_on: "2017-02-01T11:42:51Z"
  linked_clone: true
  location: ansible-test/ansible-test.vmx
  memory_limit: -1
  memory_reserve: 0
  memory_reserve_expand: false
  memory_shares: 40960
  memory_shares_level: normal
  miq_group_id: 1000000000002
  name: ansible-test
  power_state: "on"
  previous_state: poweredOff
  raw_power_state: poweredOn
  standby_action: checkpoint
  state_changed_on: "2017-02-01T11:42:23Z"
  storage_id: 1000000000003
  tags:
    - href: "https://my-awesome-cfme/api/vms/1000000000025/tags/1000000000132"
      id: 1000000000132
      name: "/managed/folder_path_blue/datacenters:dc01:vm"
    - href: "https://my-awesome-cfme/api/vms/1000000000025/tags/1000000000131"
      id: 1000000000131
      name: /managed/folder_path_yellow/datacenters
  template: false
  tenant_id: 1000000000001
  tools_status: toolsOk
  type: "ManageIQ::Providers::Vmware::InfraManager::Vm"
  uid_ems: 421bb126-13b6-0369-13c9-fc1c5ba51110
  updated_on: "2017-02-02T08:19:36Z"
  vendor: vmware
Pay close attention to this bit:
tags:
  - href: "https://my-awesome-cfme/api/vms/1000000000025/tags/1000000000132"
    id: 1000000000132
    name: "/managed/folder_path_blue/datacenters:dc01:vm"
  - href: "https://my-awesome-cfme/api/vms/1000000000025/tags/1000000000131"
    id: 1000000000131
    name: /managed/folder_path_yellow/datacenters
So now we know that CloudForms passes tags to Tower. But here's the really cool part. Tower automatically compartmentalizes the dynamic inventory into groups based on these tags. So any tag that Ansible picks up in the variables will create a new group in the dynamic inventory. You can view and create groups in the Tower UI:
This is great news, because if you didn't know, we can override variables at the group level, including ansible_connection, which determines whether Ansible uses SSH or WinRM to connect to the target host.
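To make the tag-to-group idea concrete, here is a minimal Ruby sketch of how hosts could be bucketed into groups by their tags. The naming rule in `group_name_for` (strip `/managed/`, replace other punctuation with underscores) is an assumption for illustration, as is the `win-test` host; the group names Tower actually generates may differ, so check them in the Tower UI.

```ruby
# Sketch only: the group-naming rule below is an assumption, not Tower's documented behavior.

# Hypothetical helper: turn a CloudForms tag path into an inventory-safe group name.
def group_name_for(tag_name)
  tag_name.sub(%r{\A/managed/}, "").gsub(/[^A-Za-z0-9_]/, "_")
end

# Build a { group_name => [host names] } map from per-host tag lists.
def groups_from_tags(hosts)
  groups = Hash.new { |hash, key| hash[key] = [] }
  hosts.each do |host|
    host.fetch(:tags, []).each do |tag|
      groups[group_name_for(tag[:name])] << host[:name]
    end
  end
  groups
end

hosts = [
  { name: "ansible-test", tags: [{ name: "/managed/folder_path_yellow/datacenters" }] },
  { name: "win-test",     tags: [{ name: "/managed/operating_system/windows" }] } # hypothetical host
]

groups_from_tags(hosts).each do |group, members|
  puts "#{group}: #{members.join(', ')}"
end
```

The point is only that one tag per host is enough to get a dedicated group, which is what the rest of this post relies on.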
So now we have a game plan. We need to:
- apply a tag to every Windows host that gets created
- make sure that tag creates a new group in the dynamic inventory
- override the ansible_connection parameter at that group's level
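The plan above boils down to a lookup: start from the default connection and let group-level variables win. The Ruby sketch below models that in a deliberately simplified way. It is not Ansible's full precedence logic, the group name `operating_system_windows` is assumed, and the default is modelled as `ssh` to match the behavior described in this post.

```ruby
# Sketch only: a simplified model of group-level overrides, not Ansible's full precedence rules.
DEFAULT_VARS = { "ansible_connection" => "ssh" }.freeze

# Hypothetical group variables, keyed by (assumed) group name.
GROUP_VARS = {
  "operating_system_windows" => { "ansible_connection" => "winrm" }
}.freeze

# Merge the variables of every group a host belongs to over the defaults.
def effective_vars(host_groups)
  host_groups.reduce(DEFAULT_VARS) do |vars, group|
    vars.merge(GROUP_VARS.fetch(group, {}))
  end
end

puts effective_vars(["folder_path_yellow_datacenters"])["ansible_connection"]
puts effective_vars(["operating_system_windows"])["ansible_connection"]
```

Linux hosts never join the Windows group, so they keep the default and need no per-host changes.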
First let's set up the tag. You can name this anything you want of course, but in this example we're going to create a tag category called operating_system and a tag called windows. To do this, navigate to the top right Configuration menu and then select the Region at the top of the accordion on the left. Then select [My Company] Categories. Click on the top row to create a category, name it operating_system, give it a description of Operating System, and click Add. You should see something like this:
Now move one tab over to [My Company] Tags. Select your fancy new tag category from the drop down, click New Entry, and name the tag windows with a description of Windows Service. Click Add. It should look like this:
Now we need to make sure that any Windows VM that gets provisioned by CloudForms gets this tag applied to it during the process, before Ansible Tower picks it up in the dynamic inventory. We want this to apply both to VMs provisioned via the Lifecycle menu and to anything provisioned through a Service Catalog, so first we need to copy the "Provision VM from Template" instance and the Methods class from the ManageIQ domain to our own custom domain. The instance is found here:
Infrastructure/VM/Provisioning/StateMachines/VMProvision_VM/Provision VM from Template (template)
and the class is found here:
Highlight these, one at a time, and select Configuration/Copy this [Instance|Class]. Make sure you leave the "Copy to same path" box checked.
Under the Methods class we need an instance and a method. Highlight it and select "Add a New Instance". Name it "AddTags". We inherited the schema from the Methods class we copied, so now we need to edit the common_meth1 entry. Enter "add_tags" for the value. It should look like this:
Now we need a method. Highlight the Methods class again and on the right, click the second tab, Methods, and select Configuration/Add a New Method. Name the new method "add_tags". In the Data field, we need to put the actual code. It should look like this:
#
# Add a tag to create a new group in Ansible that enables winrm as the default ansible_connection
#
def add_tags_for_ansible_groups(vm)
  unless vm.nil?
    vm.tag_assign("operating_system/windows") if vm.platform == "windows"
  end
end

begin
  vm = $evm.root['miq_provision'].vm
  add_tags_for_ansible_groups(vm)
end
The whole thing should look like this:
Validate and click "Save".
It's probably worth mentioning that if you don't fully understand the line:
vm = $evm.root['miq_provision'].vm
it would be worth your time to read about the lifecycle of the Request and Task objects during the automate process. These concepts are explained in detail in Peter McGowan's indispensable book.
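If you want to poke at the tagging logic without running a full provision, a rough Ruby sketch like the one below can help. `FakeVm` and `FakeProvision` are hypothetical stand-ins for the CloudForms service-model objects, and the plain hash stands in for `$evm.root`. Only `tag_assign`, `platform`, and `vm` are taken from the method above.

```ruby
# Sketch only: FakeVm and FakeProvision are hypothetical stand-ins for the
# CloudForms service-model objects, so the logic can run outside automate.
FakeVm = Struct.new(:name, :platform, :tags) do
  # Records the tag instead of calling into CloudForms.
  def tag_assign(tag)
    tags << tag
  end
end
FakeProvision = Struct.new(:vm)

def add_tags_for_ansible_groups(vm)
  return if vm.nil?
  vm.tag_assign("operating_system/windows") if vm.platform == "windows"
end

# Mimics $evm.root: a hash-like object holding the provisioning task.
root = { "miq_provision" => FakeProvision.new(FakeVm.new("win-test", "windows", [])) }

provision = root["miq_provision"]
vm = provision && provision.vm # nil when there is no task or no VM yet
add_tags_for_ansible_groups(vm)
puts vm.tags.inspect
```

The nil guard on `provision` is an extra precaution in this sketch; the original method assumes the provisioning task is always present at this state.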
Now we need to point to our new instance in our state machine. Edit the schema of the "Provision VM from Template" instance and add a new state, "AddTags". We can either set the value as a default value in the schema or by editing the instance itself. We're going to do the former, so in the value, enter:
Save this, and then edit the Schema sequence and put "AddTags" right between the "CheckProvisioned" and "PostProvision" states. The end result should look like this:
Now we should make sure it works. We'll do that by provisioning a Windows VM. Once that completes, we can navigate to the VM information page, and if all went as planned we should see our new tag referenced in the Smart Management section in the bottom right of the page. Something like this:
The CloudForms/ManageIQ portion of our journey is complete. On to Ansible Tower.
If we go back to our groups inside the Dynamic Inventory, we see a new one, created automatically based on our new tag:
If we edit that new group, we see that we can set variables at the group level:
There are a few things going on here but the important line is line 2:
Now the default connection for everything in this group will be WinRM instead of SSH. So if we have a Job Template in Tower that runs a playbook that installs an Internet Information Services (IIS) server, and we associate that job with a service bundle in CloudForms that provisions a Windows server, everything should work without any issue, right?
some other things
WinRM can be a little tricky, so you might need to do some troubleshooting. If you get timeouts or other errors while testing this, running the win_ping module from Tower is really useful. WinRM is also notoriously terse in its error messages, so it's a good idea to bump up the verbosity. If you edit the Job Template in Tower, you can change the verbosity to 5 (WinRM Debug). This should give you some better clues. Keep in mind that the issue is very likely a problem with the target host, like not having "basic auth" enabled or something similar. Troubleshooting Windows boxes is beyond the scope of this post, but there are tons of WinRM troubleshooting tips online. This blog, however, was particularly useful.
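Before digging into verbose logs, it can be worth confirming that the WinRM listener is reachable at all. The Ruby sketch below is a rough reachability check, assuming the default WinRM ports (5985 for HTTP, 5986 for HTTPS). The `host` value is a placeholder, and `port_open?` is a helper made up for this example. It only tests the TCP connection and says nothing about authentication.

```ruby
require "socket"

# Sketch only: a TCP reachability check, not a replacement for the win_ping module.
# 5985 and 5986 are the default WinRM ports for HTTP and HTTPS.
WINRM_PORTS = { http: 5985, https: 5986 }.freeze

# Returns true when a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout: 3)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

host = "127.0.0.1" # placeholder; replace with the address of the Windows host
WINRM_PORTS.each do |scheme, port|
  status = port_open?(host, port) ? "reachable" : "unreachable"
  puts "#{scheme} (#{port}): #{status}"
end
```

If both ports come back unreachable, the problem is more likely a firewall or a missing listener than anything in Tower.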