TKG with Multiple vCenters and VMC pt. 2

When I wrote part 1 I didn’t anticipate there would be a part 2. In that post I got stumped trying to split a TKG cluster across multiple vCenter Servers, one being VMC’s hosted vCenter Server and the other a vCenter Server I installed in VMC to manage remote ESXi hosts. Well, I’m happy to report that with some help from our Tanzu team I’ve got it working. As it turns out, the spec for the vSphere CPI already includes support for multiple vCenter Servers; it’s just not exposed in TKG.

The examples in this post are taken from my home lab rather than VMC directly, since the VMC values contain client names and scrubbing them would make the post harder to follow, but this works the same in VMC as it does in any other multi-vCenter setup.

TKG doesn’t support multiple credentials, so we need a common username and password across our vCenter Server instances. In this case I used Active Directory over LDAP as the external identity provider for both vCenter Servers. An AD account needs to be given the correct permissions in each vCenter Server; we can then use that account to deploy our management cluster. The management cluster is deployed the same as any other management cluster.
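
For reference, the credential portion of the management cluster config file looks something like the snippet below. The variable names are the standard TKG vSphere settings; the account name here is a placeholder, so substitute the AD account you granted permissions to.

VSPHERE_SERVER: hou-vc02.vcdx71.net
VSPHERE_USERNAME: tkg-svc@vcdx71.net
VSPHERE_PASSWORD: <password>
VSPHERE_TLS_THUMBPRINT: B3:15:33:51:B5:E2:16:5E:61:D3:D4:19:B7:69:C4:0E:97:1D:EA:74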

Once it’s up and running we need to edit the CPI secret, which updates the vsphere-cloud-config ConfigMap. This is the change that allows the management cluster to reconcile nodes on other vCenter Servers. Directly editing the values isn’t supported, so we have to use an overlay to do it for us.

First we need to generate a yaml of the existing secret so we have something to work with. Make sure your context is set to your management cluster and run: kubectl -n tkg-system get secrets vcdx71m01-vsphere-cpi-addon -o yaml > vsphere-cpi-addon.yaml, where vcdx71m01 is the name of your management cluster. Now open vsphere-cpi-addon.yaml in your favorite yaml editor. We need to add an overlay that defines all of the vCenter Servers and their datacenters.
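
If you want to see what the CPI is currently configured with before editing, you can decode the existing values.yaml from the secret; a quick sketch using kubectl’s jsonpath output and base64:

kubectl -n tkg-system get secret vcdx71m01-vsphere-cpi-addon -o jsonpath='{.data.values\.yaml}' | base64 -d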

Remove all the metadata from the file except the name, namespace, and labels; these must remain. Leave your existing base64-encoded string for values.yaml. The section we’re adding begins at stringData and goes to the end. The thumbprint for each vCenter Server is required. The Workspace section defines the defaults for storage placement: server, datastore, resource pool, and folder.

apiVersion: v1
kind: Secret
type: tkg.tanzu.vmware.com/addon
metadata:
  name: vcdx71m01-vsphere-cpi-addon
  namespace: tkg-system
  labels:
    clusterctl.cluster.x-k8s.io/move: ""
    tkg.tanzu.vmware.com/addon-name: vsphere-cpi
    tkg.tanzu.vmware.com/cluster-name: vcdx71m01
data:
  values.yaml: <encodedData>
stringData:
  overlays.yaml: |
    #@ load("@ytt:overlay", "overlay")
    #@ load("/vsphereconf-custom.lib.txt", "vsphere_conf")
    #@overlay/match by=overlay.subset({"kind": "ConfigMap", "metadata": {"name": "vsphere-cloud-config"}})
    ---
    #@overlay/replace
    data:
      vsphere.conf: #@ vsphere_conf()
  vsphereconf-custom.lib.txt: |
    (@ def vsphere_conf(): -@)
    [Global]
    user = "username"
    password = "password"
    port = "443"
    datacenters = "vcdx71-m01-dc01, vcdx71-w01-DC"
    [VirtualCenter "hou-vc02.vcdx71.net"]
    datacenters = "vcdx71-m01-dc01"
    thumbprint = "B3:15:33:51:B5:E2:16:5E:61:D3:D4:19:B7:69:C4:0E:97:1D:EA:74"
    [VirtualCenter "hou-vc03.vcdx71.net"]
    datacenters = "vcdx71-w01-DC"
    thumbprint = "3F:06:2E:30:CD:34:F9:2A:89:37:4D:9E:6A:0C:43:5A:51:D8:DF:55"
    [Workspace]
    server = "hou-vc03.vcdx71.net"
    thumbprint = "3F:06:2E:30:CD:34:F9:2A:89:37:4D:9E:6A:0C:43:5A:51:D8:DF:55"
    datacenter = "vcdx71-w01-DC"
    default-datastore = "vcdx71-w01-hou-vc03-vcdx71-w01-c01-vsan01"
    resourcepool-path = "/vcdx71-w01-DC/host/vcdx71-w01-c01/Resources"
    folder = "TKG"
    [Disk]
    scsicontrollertype = pvscsi
    (@ end -@)
  

Once that file is saved, apply it to your cluster with kubectl apply -f vsphere-cpi-addon.yaml. Now run kubectl get apps -A; you should see Reconcile succeeded for vsphere-cpi. If there is an error, you can get more information by running kubectl -n tkg-system describe apps vsphere-cpi.
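
The output should look roughly like this (other add-ons and the timing columns will vary by environment):

NAMESPACE    NAME          DESCRIPTION           SINCE-DEPLOY   AGE
tkg-system   vsphere-cpi   Reconcile succeeded   28s            16m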

Now check the ConfigMap with kubectl -n kube-system get configmaps vsphere-cloud-config -o yaml; you should see your overlay from the secret in the vsphere.conf section.

Now that this has been updated we can deploy a workload cluster that crosses vCenter Server boundaries. The tanzu cluster create command does not support this, so we need to generate a template and then apply it using kubectl.

To do this we need the yaml file that was used to create the management cluster; by default this is saved to ~/.tanzu/tkg/clusterconfigs/. My yaml file is named vx11yx1ynj.yaml, so to generate a workload cluster yaml we run: tanzu cluster create ClusterName -f ~/.tanzu/tkg/clusterconfigs/vx11yx1ynj.yaml --vsphere-controlplane-endpoint 172.18.10.10 -d > clusterName.yaml

ClusterName is the name you want to give your workload cluster, and the IP for vsphere-controlplane-endpoint is the VIP for the control plane.

In this file we need to edit the VSphereMachineTemplate and CPI sections. In the VSphereMachineTemplate sections we change the vSphere information to point to the vCenter Server instance we want to deploy each node to. During the initial deployment we can only deploy workers to a single vCenter Server; if you need to have workers in both vCenter Servers, I’ll cover that later in this post.

In the CPI section we need to add the same overlay we used in the management cluster secret. For this to work you must specify the thumbprint for each vCenter Server in the VSphereMachineTemplate; failing to do so will cause the nodes not to reconcile, and Cluster API will keep attempting to fix the issue, usually by deleting and re-creating the worker nodes.
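
If you need to look up a vCenter Server’s SHA-1 thumbprint, one quick way (assuming the vCenter is reachable on port 443 from your workstation) is with openssl:

openssl s_client -connect hou-vc02.vcdx71.net:443 < /dev/null 2>/dev/null | openssl x509 -fingerprint -sha1 -noout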

This template deploys the control plane on hou-vc02, and the worker node on hou-vc03.

Due to the size of this file, over 1000 lines, I won’t display it inline, but you can download it here.

VSphereMachineTemplate

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: vcdx71w01-control-plane
  namespace: default
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /vcdx71-m01-dc01
      datastore: /vcdx71-m01-dc01/datastore/vcdx71-m01-vsan
      diskGiB: 40
      folder: /vcdx71-m01-dc01/vm/TKG
      memoryMiB: 8192
      network:
        devices:
        - dhcp4: true
          networkName: tkg
      numCPUs: 2
      resourcePool: /vcdx71-m01-dc01/host/vcdx71-m01-cl01/Resources
      server: hou-vc02.vcdx71.net
      thumbprint: B3:15:33:51:B5:E2:16:5E:61:D3:D4:19:B7:69:C4:0E:97:1D:EA:74
      storagePolicyName: ""
      template: /vcdx71-m01-dc01/vm/Templates/photon-3-kube-v1.20.5+vmware.2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: vcdx71w01-worker
  namespace: default
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /vcdx71-w01-DC
      datastore: /vcdx71-w01-DC/datastore/vcdx71-w01-hou-vc03-vcdx71-w01-c01-vsan01
      diskGiB: 40
      folder: /vcdx71-w01-DC/vm/TKG
      memoryMiB: 8192
      network:
        devices:
        - dhcp4: true
          networkName: vcdx71-w01-k8s01
      numCPUs: 2
      resourcePool: /vcdx71-w01-DC/host/vcdx71-w01-c01/Resources
      server: hou-vc03.vcdx71.net
      thumbprint: 3F:06:2E:30:CD:34:F9:2A:89:37:4D:9E:6A:0C:43:5A:51:D8:DF:55
      storagePolicyName: ""
      template: /vcdx71-w01-DC/vm/Templates/photon-3-kube-v1.20.5+vmware.2

CPI

apiVersion: v1
kind: Secret
metadata:
  annotations:
    tkg.tanzu.vmware.com/addon-type: cloud-provider/vsphere-cpi
  labels:
    tkg.tanzu.vmware.com/addon-name: vsphere-cpi
    tkg.tanzu.vmware.com/cluster-name: vcdx71w01
  name: vcdx71w01-vsphere-cpi-addon
  namespace: default
data:
  values.yaml: <encodedData>
stringData:
  overlays.yaml: |
    #@ load("@ytt:overlay", "overlay")
    #@ load("/vsphereconf-custom.lib.txt", "vsphere_conf")
    #@overlay/match by=overlay.subset({"kind": "ConfigMap", "metadata": {"name": "vsphere-cloud-config"}})
    ---
    #@overlay/replace
    data:
      vsphere.conf: #@ vsphere_conf()
  vsphereconf-custom.lib.txt: |
    (@ def vsphere_conf(): -@)
    [Global]
    user = "username"
    password = "password"
    port = "443"
    datacenters = "vcdx71-m01-dc01, vcdx71-w01-DC"
    [VirtualCenter "hou-vc02.vcdx71.net"]
    datacenters = "vcdx71-m01-dc01"
    thumbprint = "B3:15:33:51:B5:E2:16:5E:61:D3:D4:19:B7:69:C4:0E:97:1D:EA:74"
    [VirtualCenter "hou-vc03.vcdx71.net"]
    datacenters = "vcdx71-w01-DC"
    thumbprint = "3F:06:2E:30:CD:34:F9:2A:89:37:4D:9E:6A:0C:43:5A:51:D8:DF:55"
    [Workspace]
    server = "hou-vc03.vcdx71.net"
    thumbprint = "3F:06:2E:30:CD:34:F9:2A:89:37:4D:9E:6A:0C:43:5A:51:D8:DF:55"
    datacenter = "vcdx71-w01-DC"
    default-datastore = "vcdx71-w01-hou-vc03-vcdx71-w01-c01-vsan01"
    resourcepool-path = "/vcdx71-w01-DC/host/vcdx71-w01-c01/Resources"
    folder = "TKG"
    [Disk]
    scsicontrollertype = pvscsi
    (@ end -@)
type: tkg.tanzu.vmware.com/addon

Now that we have our template created, we can create the cluster by running kubectl apply -f filename.yaml. After a couple of minutes we’ll have our workload cluster up and running.
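
To verify, pull the workload cluster’s kubeconfig and confirm that nodes from both vCenter Servers have joined, for example:

tanzu cluster kubeconfig get ClusterName --admin
kubectl config use-context ClusterName-admin@ClusterName
kubectl get nodes -o wide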

In some use cases we may need worker nodes in the same vCenter Server as our control plane, such as a 5G stack that requires the CU (centralized unit) and DU (distributed unit) to be in the same k8s cluster, where the CU is in a central datacenter or VMC and the DU is out at a cell tower. To accomplish this we create additional VSphereMachineTemplate and MachineDeployment objects.

To create the new VSphereMachineTemplate we’ll export our current one, give it a new name, change the vSphere information, and apply it. Run kubectl get vspheremachinetemplates to get a list of all your VSphereMachineTemplates, then export the one for the workload cluster you’re going to add nodes to: kubectl get vspheremachinetemplates vcdx71w01-worker -o yaml > vcdx71w01-worker2.yaml

Open the output in your yaml editor, change the name, and edit the template section; make sure to update the thumbprint. Your file should look similar to:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  generation: 1
  name: vcdx71w01-worker2
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    name: vcdx71w01
    uid: f9167f5d-981d-4007-8ce8-71e549783b3b
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /vcdx71-m01-dc01
      datastore: /vcdx71-m01-dc01/datastore/vcdx71-m01-vsan
      diskGiB: 40
      folder: /vcdx71-m01-dc01/vm/TKG
      memoryMiB: 8192
      network:
        devices:
        - dhcp4: true
          networkName: tkg
      numCPUs: 2
      resourcePool: /vcdx71-m01-dc01/host/vcdx71-m01-cl01/Resources
      server: hou-vc02.vcdx71.net
      thumbprint: B3:15:33:51:B5:E2:16:5E:61:D3:D4:19:B7:69:C4:0E:97:1D:EA:74
      storagePolicyName: ""
      template: /vcdx71-m01-dc01/vm/Templates/photon-3-kube-v1.20.5+vmware.2

Create the template: kubectl apply -f vcdx71w01-worker2.yaml

Now we need to create a MachineDeployment to deploy our new template. We’ll export the current one and edit it: kubectl get machinedeployments vcdx71w01-md-0 -o yaml > vcdx71w01-md-2.yaml. Edit the name and the infrastructureRef. Your file should look similar to:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  generation: 3
  labels:
    cluster.x-k8s.io/cluster-name: vcdx71w01
  name: vcdx71w01-md-2
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    name: vcdx71w01
    uid: f9167f5d-981d-4007-8ce8-71e549783b3b
spec:
  clusterName: vcdx71w01
  minReadySeconds: 0
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: vcdx71w01
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: vcdx71w01
        node-pool: vcdx71w01-worker-pool
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: vcdx71w01-md-0
      clusterName: vcdx71w01
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: VSphereMachineTemplate
        name: vcdx71w01-worker2
      version: v1.20.5+vmware.2

Create the machine deployment: kubectl apply -f vcdx71w01-md-2.yaml. You should see a clone job in the vCenter Server for this node, and after a few minutes it will be added to the workload cluster.
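
From the management cluster context you can watch the new machine come up, for example:

kubectl get machinedeployments
kubectl get machines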

To scale the number of worker nodes up or down, edit the replicas count on the MachineDeployment associated with the vCenter Server you’re working with.
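
MachineDeployments support the scale subresource, so this can also be done in one command; for example, to scale the deployment we just created to two workers:

kubectl scale machinedeployment vcdx71w01-md-2 --replicas=2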

That’s it for now, I’ll cover upgrades in a future post.
