VMware Cloud Director Service Crashes During CSE Communication with VCD

Introduction

In the sphere of cloud management, ensuring uninterrupted service is of paramount importance. However, challenges can emerge, affecting the smooth operation of services. Recently, a noteworthy issue surfaced with a customer – the ‘VMware Cloud Director service crashing when Container Service Extension communicates with VCD.’ This article delves into the symptoms, causes, and, most crucially, the solution to address this challenge.

It’s important to note that the workaround provided here is not an official recommendation from VMware. It should be applied at your discretion. We anticipate that VMware will release an official KB addressing this issue in the near future. The product versions under discussion in this article are VCD 10.4.2 and CSE 4.1.0.

Symptoms:

  • The VCD service crashes on VCD cells when traffic from CSE servers is permitted.
  • The count of ‘BEHAVIOR_INVOCATION’ operation in VCD DB is quite high (more than 10000).
vcloud=# select count(*) from jobs where operation = 'BEHAVIOR_INVOCATION';
 count
--------
 385151
(1 row)
  • In the logs, you may find the following events added in cell.log:
Successfully verified transfer spooling area: VfsFile[fileObject=file:///opt/vmware/vcloud-director/data/transfer] 
Cell startup completed in 1m 39s
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /opt/vmware/vcloud-director/logs/java_pid14129.hprof ...
Dump file is incomplete: No space left on device
log4j:ERROR Failed to flush writer,
java.io.IOException: No space left on device
at java.base/java.io.FileOutputStream.writeBytes(Native Method)

Cause:

The root cause of this issue lies in VCD generating memory heap dumps due to an ‘OutOfMemoryError’ due to this, in turn, leads to the storage space being exhausted and ultimately results in the VCD service crashing.

Resolution:

The good news is that the VMware has identified this as a bug within VCD and plans to address it in the upcoming update releases of VCD. While we eagerly await this update, the team has suggested a workaround in case you encounter this issue:

  1. SSH into each VCD cell.
  2. Check the “/opt/vmware/vcloud-director/logs” directory for java heap dump files (.hprof) on each cell.

cd /opt/vmware/vcloud-director/logs

  1. Remove the files with the “.hprof” extension.
[ /opt/vmware/vcloud-director/logs ]# rm java_xxxxx.hprof
  1. Connect to the VCD database:
   sudo -i -u postgres psql vcloud
  1. Delete records of the operations ‘BEHAVIOR_INVOCATION’ from the ‘jobs’ table:
   vcloud=# delete from jobs where operation = 'BEHAVIOR_INVOCATION';
  1. Perform a service restart on all the VCD cells serially:
   service vmware-vcd restart

By following these steps, you can mitigate the issue and keep your VCD service running smoothly until the official bug fix is released in VCD.

How to validate TKGm Cluster?

Please find the steps to validate a TKGm cluster deployed through VMware Container Service Extension.

Step 1 : Download kubeconfig file

  • Download the Kubeconfig file to a windows machine which has access to the Native Kubernetes cluster.
  • Create folder .kube under $HOME.

$HOME\.kube

  • Copy the configfile dowloaded to .kube folder.
  • Rename the file to ‘config’ without any extensions.

Step 2 : Download kubectl

  • Download Kubectl for Windows from
https://dl.k8s.io/release/v1.22.0/bin/windows/amd64/kubectl.exe
  • Create folder $HOME\kubectl and copy kubectl.ext to the folder. Add the folder to the ‘Path’ User variable in Environment Variables.

Run kubectl

Step 3: Run a ‘hello world’ application in the cluster.

Follow the steps from following article to deploy a Hello World applicaiton in the K8S cluster created.

Exposing an External IP Address to Access an Application in a Cluster | Kubernetes

Note: In the following command use NodePort instead of LoadBalancer

kubectl expose deployment hello-world --type=LoadBalancer --name=my-service

How to create NSX-T Routed network in VCD for Tanzu Kubernetes Grid (TKG) clusters?

Please find the steps for configuring the Network in VCD for deploying TKG clusters.

Add the public IP to the Static IP Pool of T0 GW

  • Login to VCD Provider portal.
  • Navigate to Resources > Cloud Resources > Tier-0 Gateways.
  • Select the T0 Gateway.
  • Select ‘Network Specification’
  • Edit
  • Add the Public IP(s) to the ‘Static IP Pools’

Create Edge Gateway (T1 Router)

  • Login to VCD Provider portal.
  • Navigate to Resources > Cloud Resources. > Edge Gateways
  • Select New
  • Select the Org VDC and click Next
  • Provide a name for the Edge.
  • Select the appropriate T0 Gateway
  • Choose the appropriate Edge Cluster option for your environment.
  • Assign the Public IP for SNAT as Primary IP
  • Cleck Next review the settings and click Finish.

Create Organization Network

  • From provider portal select the Test organization.
  • Navigate to Networking > Networks.
  • Click New
  • Select Org VDC
  • Select Network Type ‘Routed
  • Select the Edge Gateway (T1)
  • Provide the Name and Gateway CIDR
  • Provide the DNS server accessible from the Org Network created. The DNS server should be able to resolve the FQDNS in the public domain/internet.
  • Click Next, review the settings and click on Finish.

Create SNAT

  • From provider portal select the Test organization.
  • Navigate to Networking > Edge Gateways
  • Select the Edge Gateway (T1)
  • Navigate to Services > NAT
  • Click New
  • Provide the details as mentioned in the screenshot.

Modify default Firewall rule

  • From provider portal select the Test organization.
  • Navigate to Networking > Edge Gateways
  • Select the Edge Gateway (T1)
  • Navigate to Services > Firewall
  • Select ‘Edit Rules’
  • Select the ‘default_rule’
  • Edit
  • Select Allow as Action.

How to run VMware Container Service Extension (CSE) as Linux Service?

After installing CSE please follow the steps below to run it as a service.

Create cse.sh file

Create cse.service file. You can copy the following code or create new one based on following link.
container-service-extension/cse.sh at master · vmware/container-service-extension (github.com)

# vi /opt/vmware/cse/cse.sh
#!/usr/bin/env bash
export CSE_CONFIG=/opt/vmware/cse/encrypted-config.yaml
export CSE_CONFIG_PASSWORD=<passwd>
cse run

Copy encrypted-config.yaml to /opt/vmware/cse directory.

Change the file permission

chmod +x /opt/vmware/cse/cse.sh

Create cse.service file

Create cse.service file. You can copy the following code or create new one based on following link.
container-service-extension/cse.service at master · vmware/container-service-extension (github.com)

vi /etc/systemd/system/cse.service
[Unit]
Description=Container Service Extension for VMware Cloud Director

[Service]
ExecStart=/opt/vmware/cse/cse.sh
User=root
WorkingDirectory=/opt/vmware/cse
Type=simple
Restart=always

[Install]
WantedBy=default.target

Enable and start the service

# systemctl enable cse.service
# systemctl start cse.service

Check the service status

# systemctl status cse.service
  cse.service - Container Service Extension for VMware Cloud Director
   Loaded: loaded (/etc/systemd/system/cse.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2021-11-24 14:43:56 +01; 1min 9s ago
 Main PID: 770 (bash)
   CGroup: /system.slice/cse.service
           ├─770 bash /opt/vmware/cse/cse.sh
           └─775 /usr/local/bin/python3.7 /usr/local/bin/cse run

Nov 24 14:44:06 cse01.lab.com cse.sh[770]: Validating CSE installation according to config file
Nov 24 14:44:06  cse.sh[770]: MQTT extension and API filters found
Nov 24 14:44:06 cse01.lab.com cse.sh[770]: Found catalog 'cse-site1-k8s'
Nov 24 14:44:06 cse01.lab.com  cse.sh[770]: CSE installation is valid
Nov 24 14:44:06 cse01.lab.com cse.sh[770]: Started thread 'MessageConsumer' (140229531580160)
Nov 24 14:44:06 cse01.lab.com l cse.sh[770]: Started thread 'ConsumerWatchdog' (140229523187456)
Nov 24 14:44:06 cse01.lab.com  cse.sh[770]: Container Service Extension for vCloud Director
Nov 24 14:44:06 cse01.lab.com  cse.sh[770]: Server running using config file: /opt/vmware/cse/encrypted-config.yaml
Nov 24 14:44:06 cse01.lab.com  cse.sh[770]: Log files: /root/.cse-logs/cse-server-info.log, /root/.cse-logs/cse-server-debug.log
Nov 24 14:44:06 cse01.lab.com  cse.sh[770]: waiting for requests (ctrl+c to close)

How to enable TKG in Container Service Extension (CSE) 3.1.1?

Step1: Download the TKG OVA

Starting CSE 3.1.1, CSE allows providers to import Ubuntu 20.04 based VMware Tanzu Kubernetes Grid OVA into VCD via CSE server cli. The following link provide different TKG Templates. Download the template with the K8s version you need.
Note: Ubuntu 20.04 Kubernetes OVAs from VMware Tanzu Kubernetes Grid Versions 1.4.0, 1.3.1, 1.3.0 are supported.
Kubernetes OVAs for VMware Tanzu Kubernetes Grid 1.4.0 are available here


I’ve downloaded Ubuntu 2004 Kubernetes v1.21.2 OVA since that’s the lates available version.
File Name : ubuntu-2004-kube-v1.21.2+vmware.1-tkg.1-7832907791984498322.ova

Step2: Import TKG OVA to VCD Catalog

Upload the downloaded OVA to the CSE server. Use the following command to import the OVA in Catalog.

# cse template import -c encrypted-config.yaml -F ubuntu-2004-kube-v1.21.2+vmware.1-tkg.1-7832907791984498322.ova

Required Python version: >= 3.7.3
Installed Python version: 3.7.12 (default, Nov 23 2021, 15:49:55)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
Password for config file decryption:
Decrypting 'encrypted-config.yaml'
Validating config file 'encrypted-config.yaml'
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised.
Connected to vCloud Director (vcd.lab.com:443)
Connected to vCenter Server 'demovc.local' as '[email protected]' (demovc.local)
Config file 'encrypted-config.yaml' is valid
Uploading 'ubuntu-2004-kube-v1.21.2+vmware.1-tkg.1-7832907791984498322' to catalog 'cse-site1-k8s'
Uploaded 'ubuntu-2004-kube-v1.21.2+vmware.1-tkg.1-7832907791984498322' to catalog 'cse-site1-k8s'
Writing metadata onto catalog item ubuntu-2004-kube-v1.21.2+vmware.1-tkg.1-7832907791984498322.
Successfully imported TKGm OVA.

Step3: Restart CSE service.

I assume you’ve configured CSE to run as service. If yest restart the service.

Step4: Confirm TKG is available as option for Kubernetes Runtime

Login to the tenant portal and navigate to More > Kubernetes Container Clusters.

Click on New

How to delete the failed TKGm or Native k8s cluster in CSE?

In CSE 3.1.1, delete operation on a cluster (Native or TKG) that is in an error state (RDE.state = RESOLUTION_ERROR (or) status.phase = :FAILED), may fail with Bad request (400) or the Delete process will be stuck in ‘DELETEIN_PROGRESS’ state. The steps are given below to resolve the issue.

Step1: Assign API explorer privilege to the CSE Service Account.

Login to VCD Provider portal as Administrator.

Edit the CSE Service Role.

Navigate to Administration > Provider Access Control > Roles > CSE Service Role.

In the tenant portal check if there’re any stale vApp entries for the failed clusters. If so, please delete them.

Step2: Get the failed cluster ID through vcd cli

# vcd cse cluster list
Name        Org             Owner     VDC            K8s Runtime    K8s Version            Status
----------  --------------  --------  -------------  -------------  ---------------------  ------------------
test-tkg-1  CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg         CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkgtest     CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg-test    CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg-test3   CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS

# vcd cse cluster info tkg-test3 | grep uid
  uid: urn:vcloud:entity:cse:nativeCluster:9364bf18-0faa-49ce-8be7-7e92af692d1b

Step3: Run GET call to check the status

Login to VCD Provider portal with the CSE service account which has CSE Service Role assigned.
Open API Explorer.
Click on GET in difinedEntity section.

Click on TryitOut
In Description, provide the cluster UID from last step.
In the output we can see the state as PRE_CREATED.

Step3: Run the POST call resolve to resolve

Select the POST call from definedEntity section.

/1.0.0/entities/{id}/resolve
Validates the defined entity against the entity type schema.

Provide the cluster ID and run the call. The state will be changed to RESOLVED.


Step4: Run the DELETE call to delete RDE.

Povide the cluser ID and ‘false’ as value for inovkeHooks.

Please check and confirm the failed Cluster is deleted now.

#vcd cse cluster list
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised.
Name      Org             Owner     VDC            K8s Runtime    K8s Version            Status
--------  --------------  --------  -------------  -------------  ---------------------  ------------------
tkg       CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkgtest   CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS
tkg-test  CSE-Site1-Test  orgadmin  CSE-TEST-OVDC  TKGm           TKGm v1.21.2+vmware.1  DELETE:IN_PROGRESS


How to deploy Container Service Extension (CSE) 3.1.1?

Please find the steps to deploy Container Service Extension 3.1.1.

Step 1: Deploy CentOS 7 VM

Selected CentOS 7 as the Operating System for CSE server. CentOS 7 has higher EOL than CentOS 8. You can find the installations steps for CentOS 7 here.

Please find more details on CentOS releases below.

Kindly ensure following configurations are done in CSE VM.

  • Configure DNS.
  • Configure NTP.
  • Configure SSH.
  • SSH access for root user is enabled.

Please note the following network connections are required for rest of the configurations.

  • Access to VCD URL (https) from CSE Server.
  • The Internet access from CSE server.

Step 2: Take a snapshot of CSE server VM.

It’s recommended to take a snapshot of CSE server before continuing with Python installation. It’s an optional step.

Step 3: Install Python 3.7.3 or greater

Install python 3.7.3 or greater in 3.7.x series. Please note that python 3.8.0 and above is not supported (ref: CSE doc)
The built-in python version in CentOS 7 is 2.7. So, we’ve to install the latest in 3.7.x series, at the moment version 3.7.12 is the latest. Please follow the below steps to install Python.

yum update -y
yum install -y yum-utils
yum groupinstall -y development
yum install -y gcc openssl-devel bzip2-devel libffi-devel zlib-devel xz-devel

#Install sqlite3

cd /tmp/
curl -O https://www.sqlite.org/2020/sqlite-autoconf-3310100.tar.gz
tar xvf sqlite-autoconf-3310100.tar.gz
cd sqlite-autoconf-3310100/
./configure
make install

# Install Python

cd /tmp/
curl -O https://www.python.org/ftp/python/3.7.12/Python-3.7.12.tgz
tar -xvf Python-3.7.12.tgz
cd Python-3.7.12
./configure --enable-optimizations
make altinstall
alternatives --install /usr/bin/python3 python3 /usr/local/bin/python3.7 1
alternatives --install /usr/bin/pip3 pip3 /usr/local/bin/pip3.7 1
alternatives --list

# Check Python and pip3 versions

python3 --version
pip3 --version


Step 4: Install vcd-cli

# Install and verify vcd-cli
pip3 install vcd-cli
vcd version
     vcd-cli, VMware vCloud Director Command Line Interface, 24.0.1

Step 5: Install CSE

# Install and verify cse
pip3 install container-service-extension
cse version
CSE, Container Service Extension for VMware vCloud Director, version 3.1.1

Step 6: Enable CSE client

# Create ~/.vcd-cli directory
mkdir ~/.vcd-cli

# Create profiles.yaml
cat > ~/.vcd-cli/profiles.yaml << EOF
extensions: 
- container_service_extension.client.cse
EOF

Step 7: Create CSE Service Role for CSE server management

[root@test ~]# cse create-service-role <vcd fqdn> -s
Username for System Administrator: administrator
Password for administrator:
Connecting to vCD:  <vcd fqdn>
Connected to vCD as system administrator: administrator
Creating CSE Service Role...
Successfully created CSE Service Role

Step 7: Create service account for CSE in VCD

Create a Service Account in VCD with the role ‘CSE Service Role’

Step 8: Create service account for CSE in vCenter

Create new role in vCenter with Power User + Guest Operations privilege. Assign the role to the service account for CSE.

  • Clone ‘Virtual Machine Power User (sample) role
  • Edit role
  • Select Virtual machine > Guest operations.

Step 9: Create a sample CSE config file and update it.

cse sample -o config.yaml

# vi config.yaml

mqtt:
  verify_ssl: false

vcd:
  host: vcd.vmware.com
  log: true
  password: my_secret_password
  port: 443
  username: administrator
  verify: false

vcs:
- name: vc1
  password: my_secret_password
  username: [email protected]
  verify: false

service:
  enforce_authorization: false
  legacy_mode: false
  log_wire: false
  no_vc_communication_mode: false
  processors: 15
  telemetry:
    enable: true

broker:
  catalog: cse
  ip_allocation_mode: pool
  network: my_network
  org: my_org
  remote_template_cookbook_url: https://raw.githubusercontent.com/vmware/container-service-extension-templates/master/template_v2.yaml
  storage_profile: '*'

Step 7: Encrypt Configuration file

cse encrypt config.yaml --output encrypted-config.yaml

Step 8: Validate the Configuration file

chmod 600 encrypted-config.yaml
cse check encrypted-config.yaml

Step 9: Install CSE

cse install -c encrypted-config.yaml --skip-template-creation

Step 10: Validate CSE Installation

cse check encrypted-config.yaml --check-install

Step 10: List the available templates

# cse template list -c encrypted-config.yaml
Password for config file decryption:
Decrypting 'encrypted-config.yaml'
Retrieved config from 'encrypted-config.yaml'
name                                revision  local    remote      cpu    memory  description                                                        deprecated
--------------------------------  ----------  -------  --------  -----  --------  -----------------------------------------------------------------  ------------
ubuntu-16.04_k8-1.21_weave-2.8.1           1  No       Yes           2      2048  Ubuntu 16.04, Docker-ce 20.10.7, Kubernetes 1.21.2, weave 2.8.1    No
ubuntu-16.04_k8-1.20_weave-2.6.5           2  No       Yes           2      2048  Ubuntu 16.04, Docker-ce 19.03.15, Kubernetes 1.20.6, weave 2.6.5   No
ubuntu-16.04_k8-1.19_weave-2.6.5           2  No       Yes           2      2048  Ubuntu 16.04, Docker-ce 19.03.12, Kubernetes 1.19.3, weave 2.6.5   No
ubuntu-16.04_k8-1.18_weave-2.6.5           2  No       Yes           2      2048  Ubuntu 16.04, Docker-ce 19.03.12, Kubernetes 1.18.6, weave 2.6.5   No
photon-v2_k8-1.14_weave-2.5.2              4  No       Yes           2      2048  PhotonOS v2, Docker-ce 18.06.2-6, Kubernetes 1.14.10, weave 2.5.2  Yes

Step 11: Import the latest K8S template

cse template install TEMPLATE_NAME TEMPLATE_REVISION

# cse template install ubuntu-16.04_k8-1.21_weave-2.8.1 1 -c encrypted-config.yaml

It will take a while to complete the download of template, be patient.

Downloading file from 'https://cloud-images.ubuntu.com/releases/xenial/release-20180418/ubuntu-16.04-server-cloudimg-amd64.ova' to 'cse_cache/ubuntu-16.04-server-cloudimg-amd64.ova'...
Download complete
Uploading 'ubuntu-16.04-server-cloudimg-amd64.ova' to catalog 'cse-site1-k8s'
Uploaded 'ubuntu-16.04-server-cloudimg-amd64.ova' to catalog 'cse-site1-k8s'
Deleting temporary vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Creating vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Found data file: /root/.cse_scripts/2.0.0/ubuntu-16.04_k8-1.21_weave-2.8.1_rev1/init.sh
Created vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Customizing vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp', vm 'ubuntu-1604-k8s1212-weave281-vm'
Found data file: /root/.cse_scripts/2.0.0/ubuntu-16.04_k8-1.21_weave-2.8.1_rev1/cust.sh
Waiting for guest tools, status: "vm='vim.VirtualMachine:vm-2296', status=guestToolsNotRunning
Waiting for guest tools, status: "vm='vim.VirtualMachine:vm-2296', status=guestToolsNotRunning
Waiting for guest tools, status: "vm='vim.VirtualMachine:vm-2296', status=guestToolsNotRunning
Waiting for guest tools, status: "vm='vim.VirtualMachine:vm-2296', status=guestToolsRunning
.....
......
......

waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (1)
waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (2)
waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (3)
waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (4)
waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (5)
waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (6)
waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (7)
waiting for process 1611 on vm 'vim.VirtualMachine:vm-2296' to finish (8)
...
...
..
/etc/kernel/postinst.d/x-grub-legacy-ec2:
Searching for GRUB installation directory ... found: /boot/grub
Searching for default file ... found: /boot/grub/default
Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
Searching for splash image ... none found, skipping ...
Found kernel: /boot/vmlinuz-4.4.0-119-generic
Found kernel: /boot/vmlinuz-4.4.0-210-generic
Found kernel: /boot/vmlinuz-4.4.0-119-generic
Updating /boot/grub/menu.lst ... done

/etc/kernel/postinst.d/zz-update-grub:
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.4.0-210-generic
Found initrd image: /boot/initrd.img-4.4.0-210-generic
Found linux image: /boot/vmlinuz-4.4.0-119-generic
Found initrd image: /boot/initrd.img-4.4.0-119-generic
done
customization completed

Customized vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp', vm 'ubuntu-1604-k8s1212-weave281-vm'
Creating K8 template 'ubuntu-16.04_k8-1.21_weave-2.8.1_rev1' from vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Shutting down vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Successfully shut down vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Capturing template 'ubuntu-16.04_k8-1.21_weave-2.8.1_rev1' from vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Created K8 template 'ubuntu-16.04_k8-1.21_weave-2.8.1_rev1' from vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Successfully tagged template ubuntu-16.04_k8-1.21_weave-2.8.1_rev1 with placement policy native.
Deleting temporary vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'
Deleted temporary vApp 'ubuntu-16.04_k8-1.21_weave-2.8.1_temp'

Step 12: Confirm the template is available in CSE catalog

Login to CSE Tenant portal.
Navigate to the Libraries > Catalogs > vApp Templates. We can see the newly created K8S upstream template.

Step 13: Enable Organizations for Native deployments.

The provider must explicitly enable organizational virtual datacenter(s) to host native deployments, by running the command: vcd cse ovdc enable.

vcd login <vcd> system administrator -i
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised.
Password:
administrator logged in, org: 'system', vdc: ''

# vcd cse ovdc enable <orgvdc> -n -o <organization>
# vcd cse ovdc enable TEST-OVDC -n -o Site1-Test

InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised.
OVDC Update: Updating OVDC placement policies
task: 10e70b37-5aa6-4cf9-b437-ef478bd9f06a, Operation success, result: success

Step 14: Check Create New Native Cluster is available now

Login to the VCD Tenant portal and navigate to More > Kubernetes Container Clusters.
Click on New.

We can see the option to ‘Create New Native Cluster’.

Step 15: Publish Right Bundle ‘cse:nativeCluster Entitlement’

  • The following article has details on differences between right bundle and roles.
  • Login to VCD as Provider and navigate to Administration > Right Bundles
  • Select ‘cse:nativeCluster Entitlement’
  • Create a backup of right bundle by Cloning.
    • Select the Rights Bundle cse:nativeCluster Entitlement.
    • Select Clone.
    • Keep the auto generaed name ‘Clone:cse:nativeCluster Entitlement’
  • Edit the right bundle cse:nativeCluster Entitlement
  • Select the following rights
  • Select ‘PUBLISH’
  • Select the specific Tenants from the list.

Step 16: Add CSE rights to Global Role ‘Organization Administrator’

  • Login as Provider and edit the Global Role ‘Organization Administrator’
  • Select same rights we selected in the last step.

Upgrade VMware Cloud Director App Launchpad from 2.0 to 2.1

Please find the steps to upgrade VMware Cloud Director App Launchpad from version 2.0 to 2.1

  1. Download VMware Cloud Director App Launchpad 2.1 RPM package from here.
  2. Upload it to the App Launchpad VM.
  3. Open an SSH connection to the App Launchpad VM and log in as root.
  4. Upgrade the RPM package.
[root@test ~]# rpm -U vmware-alp-2.1.0-18834930.x86_64.rpm
warning: vmware-alp-2.1.0-18834930.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 001e5cc9: NOKEY
Upgrading...

Execute 'alp upgrade' to upgrade ...

  Append the excute permission to the existing logs...

5. Run the following command to upgrade App Launchpad.

[root@test ~]# alp upgrade --admin-user administrator@system --admin-pass 'passwd'
Upgraded the plugin of App Launchpad successfully.

Upgraded the management service successfully.
  [Upgrade Task]
     CREATE_ENTITY_TYPE_CATALOG_INFO : true
     MIGRATE_CATALOGS : true
     CREATE_ENTITY_TYPE_SIZING_TEMPLATE : true
     MIGRATE_LEGACY_SIZING_TEMPLATES : true
     CREATE_ENTITY_TYPE_MARKETPLACE_BANNER : true
     CREATE_ENTITY_TYPE_ORG_METRICS : true
     UPGRADE_SERVICE_ROLE : true

6. Restart alp service and confirm its running.

[root@test~]# systemctl restart alp
[root@test ~]# systemctl status alp
● alp.service - VMware ALP Management Service
   Loaded: loaded (/usr/lib/systemd/system/alp.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2021-11-18 11:46:14 +01; 14s ago
 Main PID: 29334 (java)
   CGroup: /system.slice/alp.service
           └─29334 java -jar /opt/vmware/alp/alp.jar --logging.path=log

Nov 18 11:46:14 bd1-srp-al01.acs.local systemd[1]: Stopped VMware ALP Management Service.
Nov 18 11:46:14 bd1-srp-al01.acs.local systemd[1]: Started VMware ALP Management Service.

7. Diagnose deployment errors by running the /opt/vmware/alp/bin/diagnose executable file.

The diagnose tool verifies that the services are up and running and that all configuration
requirements are met.

[root@test ~]# /opt/vmware/alp/bin/diagnose

Step 1: System diagnose
--------------------------------------------------------------------------------
- App Launchpad service is initialized.


Step 2: Cloud Director diagnose
--------------------------------------------------------------------------------
- Service Account for App Launchpad is good.
- App Launchpad's extension is ready.


Step 3: MQTT diagnose
--------------------------------------------------------------------------------
- Cloud Director MQTT for extensibility is ready.


Step 4: Integration diagnose
--------------------------------------------------------------------------------
- App Launchpad API is up, and version is 2.1.0-18834930.


Step 5: App Launchpad diagnose
--------------------------------------------------------------------------------
- App Launchpad service is listening on port 8086.


8. Confirm the ALP version.

[root@test ~]# alp
NAME:
        alp - The Cloud Director App Launchpad
        (ALP) Command-line tool

USAGE:
        alp <subcommand> [flags]

VERSION:
        '2.1.0-18834930'