Cross Cluster Migrations


We've shown several examples in the past of CloudForms and ManageIQ adding value to Red Hat OpenShift, like here and here.

A lot of people have asked whether we could create a service in CFME that moves a workload from one OCP cluster to another. This could be useful as an end goal in and of itself, as a component of a larger pipeline, or as part of a cloud autoscaling strategy. Either way, there was enough interest that we sat down with Tero Ahonen of Nordic OpenShift fame to see if we could knock something out.

what we're after

We're assuming a few things to start:

  • Two OpenShift Clusters that were deployed in a standard fashion.
  • A common container registry that is used by both clusters (we'll be using Docker Hub).
  • A CloudForms appliance with access to both clusters (obviously).

We're going to start with a basic application running in the first OpenShift cluster, and the goal is to create a service catalog that moves that workload into the second cluster. To keep things simple, we're going to use this app, which essentially deploys an application with a link that sends you back to this website so you can trigger a time loop.

the code

To make it easy, Tero has a nice containerized version of our app hosted on Docker Hub. To deploy your base project, execute the following command on your first OpenShift cluster:
oc new-app tpahonen/tigerinapod

Now you'll need the automate code. Everything is available here, including instructions on how to pull the Domain in via Git. Keep in mind that Domains imported via Git default to locked, so you'll need to create a new domain, make sure it's enabled, and copy all the Classes, Instances, and Methods over. You'll also need to add the credentials and URL endpoints for your two OCP clusters in the Methods Class under TigerIQ/Integration/OpenShift/Methods.

You'll also need a Service Catalog Item that will trigger this. The one we use here is simple, and just takes two variables:

  • delete_project: This is a string that is either 'true' or 'false'. If it is 'true', the original project is deleted after the migration is complete.
  • project_name: This needs to be the name of the project that you want to migrate.

Use the CrossClusterMigration State Machine as the entry point for the Service Catalog Item. This State Machine has five states. We'll walk through each one now to give an overview of how this process works.

GetResources (code)

The first thing we need is to get all the information about the existing Project. We'll need the following:

  • Image data
  • Deployment Config data
  • Services data
  • Route data

The GetResources Instance calls the get_resources method, which collects all four of these objects from the OpenShift API in the form of JSON. It then stores them as state_vars, which allows us to access them later in the automation process. One oddity to note in this method, which we'll see again, is the following code which appears in the 'call_openshift' method:
query = resource=="services" ? ":8443/api/v1/namespaces/" : ":8443/oapi/v1/namespaces/"
This is a ternary expression; you can read more about those here.
This is necessary because OpenShift has two APIs, the Kubernetes API and the OpenShift API. The Services call we make goes to the Kubernetes API, which uses the 'api' path. Everything else we do here uses the OpenShift API, which uses the 'oapi' path. (H/T Peter McGowan)
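
To make the path selection concrete, here is a minimal sketch of how a full request URL might be assembled around that ternary. The helper name `namespace_url`, the host name, and the exact URL layout after the namespace are assumptions for illustration, not the code from the repository.

```ruby
# Hypothetical helper; the name, signature, and URL layout are illustrative only.
KUBERNETES_RESOURCES = %w[services].freeze

def namespace_url(host, resource, project)
  # Services live under the Kubernetes API ('api'); the other resources used
  # in this post live under the OpenShift API ('oapi').
  api_path = KUBERNETES_RESOURCES.include?(resource) ? "api" : "oapi"
  "https://#{host}:8443/#{api_path}/v1/namespaces/#{project}/#{resource}"
end

puts namespace_url("ocp1.example.com", "services", "tigerinapod")
puts namespace_url("ocp1.example.com", "routes", "tigerinapod")
```

Keeping the Kubernetes-side resources in a list means that adding another one later is a one-word change rather than a longer conditional.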

ModifyJson (code)

We need to take the objects we pulled down in the last step and turn them into something we can use to deploy an identical Project. The ModifyJson Instance calls the modify_json method, which does this.

This is the trickiest part of the process, since the API returns lists, and we need to take certain things out of those lists. We also need to remove some things like the resourceVersion and status information, but the code handles all that. Ruby hashes keep their keys in insertion order, so to get the fields in the right order, each method creates a temporary hash and then adds the relevant bits one at a time. Then we store the new, modified objects as state_vars, which we can access later.
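
As a rough illustration of that cleanup, the sketch below pulls the first item out of a list response, drops resourceVersion and status, and rebuilds the object key by key. The helper name `clean_first_item`, the sample JSON, and the extra metadata keys it strips (uid, selfLink, creationTimestamp) are assumptions. The real modify_json method may handle more fields per resource type.

```ruby
require "json"

# Hypothetical helper: takes the JSON list the API returned and builds a
# cleaned object that could be posted to another cluster.
def clean_first_item(list_json, kind)
  item = JSON.parse(list_json).fetch("items").first

  # Drop cluster-specific metadata. resourceVersion comes from the post;
  # the other keys are assumed here for illustration.
  metadata = item.fetch("metadata", {}).reject do |key, _|
    %w[resourceVersion uid selfLink creationTimestamp].include?(key)
  end

  # Build a fresh hash, adding keys in the order we want them serialized.
  cleaned = {}
  cleaned["kind"] = kind
  cleaned["apiVersion"] = "v1"
  cleaned["metadata"] = metadata
  cleaned["spec"] = item["spec"] # status is deliberately left out
  cleaned
end

# Made-up sample response, trimmed to the fields the sketch touches.
sample = '{"kind":"RouteList","items":[{"metadata":{"name":"tigerinapod",' \
         '"namespace":"demo","resourceVersion":"12345"},' \
         '"spec":{"host":"tiger.example.com"},"status":{"ingress":[]}}]}'

route = clean_first_item(sample, "Route")
puts JSON.pretty_generate(route)
```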

CreateProject (code)

Nick has done more elaborate versions of this, among other cool things, here, but in our case we simply need to create an OCP Project with the same name as the base Project on the second OCP Cluster. This stage does that.

DeployApp (code)

Now that we have an empty Project with the correct name on the second OCP Cluster, we can fill it up with all the modified JSON goodness that we sorted out in ModifyJson. We add the resources in the following order:

  • Image
  • Deployment Config
  • Service
  • Route

That's it, the new app is up and running.
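
The ordering above could be captured in a small table like the one sketched below. The constant and method names are hypothetical, and mapping "Image" to the imagestreams endpoint is an assumption about which resource the repository code actually posts.

```ruby
# Hypothetical ordering table: label from the list above => plural resource
# name used in the API path. "imagestreams" for Image is an assumption.
DEPLOY_ORDER = [
  ["Image",             "imagestreams"],
  ["Deployment Config", "deploymentconfigs"],
  ["Service",           "services"],
  ["Route",             "routes"]
].freeze

def deploy_plan(project)
  DEPLOY_ORDER.map do |label, resource|
    # Same api/oapi split as in call_openshift: only services use 'api'.
    api = resource == "services" ? "api" : "oapi"
    { label: label, path: "/#{api}/v1/namespaces/#{project}/#{resource}" }
  end
end

deploy_plan("tigerinapod").each do |step|
  puts "POST #{step[:path]}  (#{step[:label]})"
end
```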

DeleteApp (code)

This stage deletes the original Project on the first OCP Cluster. It only runs if the delete_project variable from the Service Dialog is 'true'. If it's 'false', or anything else for that matter, the original Project will remain intact.
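
Because the dialog value arrives as a string rather than a boolean, the check boils down to a string comparison. A minimal sketch, assuming a strict case-sensitive match (the method name `delete_original?` is hypothetical):

```ruby
# Hypothetical guard: only the exact string 'true' triggers deletion.
def delete_original?(delete_project)
  delete_project == "true"
end

["true", "false", "True", nil, "yes"].each do |value|
  puts "#{value.inspect} => #{delete_original?(value)}"
end
```

Treating everything except 'true' as "keep the project" is the safer default for a destructive step.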

some other things

This is intended to be a framework, a starting point for similar automation activities that might add value. We used a pretty simple app, but the basics should be consistent. If this process were used on a more complicated app, or made flexible enough to handle multiple types of apps, the bulk of the work would need to be done in the ModifyJson stage, since application complexity affects exactly how we need to manipulate the various JSON objects to achieve this.